Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CBRD-23866] Fix not to reuse heap when creating a table with REUSE_OID #2584

Merged
merged 2 commits into from
Mar 8, 2021

Conversation

wrlawodms
Copy link
Contributor

@wrlawodms wrlawodms self-assigned this Mar 2, 2021
@wrlawodms
Copy link
Contributor Author

The code changed is based on the previous things and to make it consistent with 9.X version.

9.3.9

 5696   if (prm_get_bool_value (PRM_ID_DONT_REUSE_HEAP_FILE) == false)
 5697     { 
 5698       /*
 5699        * Try to reuse an already mark deleted heap file
 5700        */
 5701 
 5702       vpid.volid = hfid->vfid.volid;
 5703       if (file_type != FILE_HEAP_REUSE_SLOTS
 5704           && file_reuse_deleted (thread_p, &hfid->vfid, file_type,
 5705                                  &hfdes) != NULL
 5706           && file_find_nthpages (thread_p, &hfid->vfid, &vpid, 0, 1) == 1)
 5707         {
 5708           hfid->hpgid = vpid.pageid;
 5709           if (heap_reuse (thread_p, hfid, &hfdes.class_oid,
 5710                           reuse_oid) != NULL)

@wrlawodms
Copy link
Contributor Author

wrlawodms commented Mar 3, 2021

@byungwook-kim Hi, this fix makes a sql test failed. You can check the ci/cicleci: test_sql to see the test, which prints query plans. This patch makes only DONT_REUSE_OID table to be able to reuse a heap file. This patch possibly makes the number of allocated pages and slots different than before. Could it have an impact on the query plan? Please let me have your thoughts. Thank you.

@beyondykk9
Copy link
Contributor

beyondykk9 commented Mar 3, 2021

@byungwook-kim Hi, this fix makes a sql test failed. You can check the ci/cicleci: test_sql to see the test, which prints query plans. This patch makes only DONT_REUSE_OID table to be able to reuse a heap file. This patch possibly makes the number of allocated pages and slots different than before. Could it have an impact on the query plan? Please let me have your thoughts. Thank you.

Please check the below code sequence in query optimization:

build_query_graph
	build_graph_for_entity
		qo_add_node
			qo_get_class_info
				grok_classes
					qo_estimate_statistics
						statblock->heap_num_pages = NOMINAL_HEAP_SIZE (class_mop);
						statblock->heap_num_objects = (statblock->heap_num_pages * DB_PAGESIZE) / NOMINAL_OBJECT_SIZE (class_mop);

@wrlawodms
Copy link
Contributor Author

wrlawodms commented Mar 4, 2021

I think the answer file is wrong and should be changed together.
The failed test case fails also on the original version like the result by this PR if it is run alone, not as a part of sql full tests. It is because the factors that affect the query plan are affected by the status of the database, which is being changed during full tests. To be specific, this difference is caused by reusing heap and the cache.

With this PR, it is no longer affected by the previous dropped heap file. So it seems to reveal the wrong answer of the test case. As mentioned by @byungwook-kim, one of the factors for generating a query plan is the number of pages in the heap file, and this is different like below:

Num_pages (by SHOW HEAP CAPACITY):
Original version: 356
This PR version (and if ran alone): 1

I need your confirmation because query optimization is not my area, @byungwook-kim. Do you agree that this different number of pages affects the query plan and the answer file should be changed along?

@beyondykk9
Copy link
Contributor

I think the answer file is wrong and should be changed together.
The failed test case fails also on the original version like the result by this PR if it is run alone, not as a part of sql full tests. It is because the factors that affect the query plan are affected by the status of the database, which is being changed during full tests. To be specific, this difference is caused by reusing heap and the cache.

With this PR, it is no longer affected by the previous dropped heap file. So it seems to reveal the wrong answer of the test case. As mentioned by @byungwook-kim, one of the factors for generating a query plan is the number of pages in the heap file, and this is different like below:

Num_pages (by SHOW HEAP CAPACITY):
Original version: 356
This PR version (and if ran alone): 1

I need your confirmation because query optimization is not my area, @byungwook-kim. Do you agree that this different number of pages affects the query plan and the answer file should be changed along?

Ok, I agree with you. If the query plan need to change by this issue then please request QA team to change the test case.

@wrlawodms
Copy link
Contributor Author

@byungwook-kim thanks for replying.

@@ -5187,7 +5187,7 @@ heap_create_internal (THREAD_ENTRY * thread_p, HFID * hfid, const OID * class_oi

memset (&des, 0, sizeof (des));

if (prm_get_bool_value (PRM_ID_DONT_REUSE_HEAP_FILE) == false)
if (prm_get_bool_value (PRM_ID_DONT_REUSE_HEAP_FILE) == false && file_type != FILE_HEAP_REUSE_SLOTS)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. How about using '==' like the first condition of &&? I mean that the 'file_type == FILE_HEAP' expression is more readable I think.
  2. Codes that check FILE_HEAP_REUSE_SLOTS remain in the functions called when the 'if condition' is true. This is no longer unnecessary and needs to be removed. However, I ask you to keep a separate record for this because it is more efficient to deal with this problem in the reuse heap improvement.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the code sequence, the heap_cache_class_info is called. the heap_cache_class_info asserts that file type is FILE_HEAP. I think therefore it needs check the file type is FILE_HEAP or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hornetmj

  1. I partially agree with that. I did like you said, but I changed because in pre-version like 10.1, file_type is already filtered and this code is based on it. If file_type == FILE_HEAP is used, the code will be different by version although the purpose is the same. What do you think about it?
  2. Do you mean all the code blocks not covered due to this patch should be removed together? For example, file_tracker_item_reuse_heap() and even heap_reuse()? I'm not sure it's right because that functions are designed to cover both file types and it'd be better to leave it for future use.

Copy link
Contributor Author

@wrlawodms wrlawodms Mar 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@byungwook-kim Could you let me know more clearly? Do you have the same opinion with @hornetmj and give me one more reason?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wrlawodms

  1. I think it's right to move in the right direction. And as the code is changed to cpp, the logic is the same, but there are many cases where the code has changed.
  2. I partially agree with your opinion. The important thing is to write it down so we can consider it if necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I agree

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants