
Tracking how much memory is committed per object heap #36731

Merged

Conversation

@cshung cshung (Member) commented May 20, 2020

The overall goal of this PR (and more to come) is to provide a per-object-heap level hard limit.

In particular, this PR introduces a new field named committed_by_oh with 4 entries, corresponding to the small object heap, the large object heap, the pinned object heap, and the supporting data structures. The values are the amounts of memory committed, in bytes.

The key change here is passing an oh parameter to the various virtual_* methods, which tells each method which object heap the virtual operation is meant for. Each call site needs to provide that value; most of the time the call's purpose can be determined from the locals or from the flags on the associated segment.
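As a rough illustration of the shape of that change — committed_by_oh and check_commit_cs are names from the PR discussion, but everything else here (the enum values, the function bodies) is simplified and invented; the real code lives in gc.cpp and uses the GC's own OS abstraction:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical, simplified names for the object heap kinds.
enum gc_oh_num
{
    soh = 0,          // small object heap
    loh = 1,          // large object heap
    poh = 2,          // pinned object heap
    bookkeeping = 3,  // supporting data structures
    total_oh_count = 4
};

static size_t committed_by_oh[total_oh_count]; // bytes committed per object heap
static std::mutex check_commit_cs;             // guards the shared counters

// Each call site now tells virtual_commit which object heap the
// operation is for, so the per-object-heap counter can be maintained.
bool virtual_commit(void* address, size_t size, gc_oh_num oh)
{
    (void)address;
    std::lock_guard<std::mutex> guard(check_commit_cs);
    committed_by_oh[oh] += size;
    // ... the actual OS commit (e.g. VirtualAlloc with MEM_COMMIT) goes here ...
    return true;
}

bool virtual_decommit(void* address, size_t size, gc_oh_num oh)
{
    (void)address;
    std::lock_guard<std::mutex> guard(check_commit_cs);
    assert(committed_by_oh[oh] >= size);
    committed_by_oh[oh] -= size;
    // ... the actual OS decommit goes here ...
    return true;
}
```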

The rest of the code is meant to make sure these passed values are correct. In particular, all the code under the VERIFY_COMMITTED_BY_OH definition is optional and meant for testing only. The amount of memory committed for an object heap should match the committed memory calculated from the segments associated with that object heap. However, that is not always true: from the time a virtual_* operation starts until the segment objects are updated (or threaded), the two values disagree. The unsaved_virtual_operation_count makes sure we do not assert on the values being identical when we know a virtual_* operation has already happened but the data structures are not yet updated. The begin_virtual_operation and complete_virtual_operation pair updates that variable, and also pairs up these operations via the character argument.
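A minimal sketch of that verification bracket — the begin/complete pair, the pending-operation counter, and the character-argument pairing are all described above, but the bodies here are invented for illustration:

```cpp
#include <cassert>
#include <cstddef>

// Verification-only sketch (the PR guarded this under VERIFY_COMMITTED_BY_OH).
// Between a virtual_* call and the update of the segment objects, the tracked
// counter and the segment-derived sum legitimately disagree; the pending
// operation count suppresses the assert during that window.
static int unsaved_virtual_operation_count = 0;

// The char argument exists only to pair up a begin with its complete.
void begin_virtual_operation(char tag)    { (void)tag; unsaved_virtual_operation_count++; }
void complete_virtual_operation(char tag) { (void)tag; unsaved_virtual_operation_count--; }

// Compare the tracked counter against the value recomputed from segments,
// but only when no virtual operation is in flight.
void verify_committed(size_t tracked, size_t computed_from_segments)
{
    if (unsaved_virtual_operation_count == 0)
        assert(tracked == computed_from_segments);
}
```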

The testing was a bit more ambitious than it had to be. We only need to make sure the passed oh parameter values (and hence the committed_by_oh values) are correct in the heap_hard_limit case, but I expanded the check so that the values hold even when heap_hard_limit is not set. This is to avoid accidental misuse of those parameter values in the future. A slight twist: if VERIFY_COMMITTED_BY_OH is not defined, committed_by_oh is not maintained at all unless heap_hard_limit is on. This is for performance; I wanted to avoid taking the critical section unnecessarily.
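That conditional maintenance could be sketched as follows; heap_hard_limit, VERIFY_COMMITTED_BY_OH, and the critical section are concepts from the PR, but the wiring here is a simplified assumption:

```cpp
#include <cstddef>
#include <mutex>

// Sketch of the perf consideration described above: without
// VERIFY_COMMITTED_BY_OH, committed_by_oh is only maintained when a hard
// limit is configured, so the critical section is not taken unnecessarily.
static size_t heap_hard_limit = 0; // 0 means no hard limit configured
static size_t committed_by_oh[4];
static std::mutex check_commit_cs;

void account_commit(int oh, size_t size)
{
#ifdef VERIFY_COMMITTED_BY_OH
    bool track = true;                   // always track when verifying
#else
    bool track = (heap_hard_limit != 0); // otherwise only under a hard limit
#endif
    if (track)
    {
        std::lock_guard<std::mutex> guard(check_commit_cs);
        committed_by_oh[oh] += size;
    }
}
```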

@ghost ghost commented May 20, 2020

Tagging subscribers to this area: @Maoni0
Notify danmosemsft if you want to be subscribed.

@VSadov VSadov (Member) commented May 20, 2020

I am wondering about the ultimate purpose of this accounting. Is it for diagnostics, or do we expect users to actually specify separate individual limits per kind of allocation?

I.e. a user could allow XX megs for small objects, YY megs for large, ZZ megs for pinned...

@cshung cshung (Member, Author) commented May 20, 2020

I am wondering about the ultimate purpose of this accounting. Is it for diagnostics, or do we expect users to actually specify separate individual limits per kind of allocation?

I.e. a user could allow XX megs for small objects, YY megs for large, ZZ megs for pinned...

Yes, the overall goal of this PR (and more to come) is to provide (i.e. let the user specify) a per-object-heap hard limit.

@Maoni0 Maoni0 (Member) commented May 21, 2020

This verification code seems unnecessarily complicated. I would propose a much simpler way to verify.

The reason virtual_commit and virtual_decommit take a global critical section is that they need to care about the global use of committed memory so we don't exceed the hard limit. But for verification you can just verify per heap. The methods that could modify a heap_segment's committed are already synchronized per heap: they are either PER_HEAP or take a heap number or a heap. So you can keep a committed_by_oh_per_heap, and you don't need to take any locks because these operations are already guarded by the per-heap lock when needed. You can get rid of begin/end_virtual_operation and just put the verification after each update of committed. Deleting segments is a non-issue for hard limit, but if you really want to verify the non-hard-limit scenario too, you can manually update committed_by_oh_per_heap when we delete segments.

You can also verify, in a GC, that committed_by_oh is the sum of committed_by_oh_per_heap over all heaps if you want.
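The suggested scheme might look roughly like this; only the committed_by_oh_per_heap idea and the lock-free-per-heap reasoning come from the comment, while the heap and segment layout below is invented for illustration:

```cpp
#include <cassert>
#include <cstddef>

// Sketch of the suggested per-heap scheme: each heap keeps its own
// committed_by_oh_per_heap, updated under that heap's existing lock, and
// verification just recomputes the sum from the heap's segments.
struct heap_segment
{
    size_t committed_bytes;
    heap_segment* next;
};

struct gc_heap
{
    size_t committed_by_oh_per_heap[4]; // per-heap, per-object-heap counters
    heap_segment* segments[4];          // segment list per object heap kind

    // No lock needed: callers already hold this heap's lock.
    void verify(int oh) const
    {
        size_t from_segments = 0;
        for (heap_segment* s = segments[oh]; s != nullptr; s = s->next)
            from_segments += s->committed_bytes;
        assert(committed_by_oh_per_heap[oh] == from_segments);
    }
};

// Cross-check at GC time: the global counter should equal the per-heap sum.
size_t sum_over_heaps(const gc_heap* heaps, int n_heaps, int oh)
{
    size_t total = 0;
    for (int i = 0; i < n_heaps; i++)
        total += heaps[i].committed_by_oh_per_heap[oh];
    return total;
}
```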

What do you think?

@cshung cshung (Member, Author) commented May 21, 2020

what do you think?

I think that is a very interesting idea; it can kill three birds with one stone:

  • Simplify the logic,
  • Take check_commit_cs less frequently, and
  • Verify (in addition to the committed_by_oh values) that the heap (number) is also correct.

The verification code was placed under VERIFY_COMMITTED_BY_OH for a reason: I knew it was non-trivial (and therefore could have bugs or complexity issues), so it was designed to be optional.

In the interest of time, I think we can proceed with this plan:

  • Delete the code under VERIFY_COMMITTED_BY_OH
  • Merge the PR
  • File a tracking issue to do the testing work, and
  • Complete the per object heap limit config switch feature.

I ran the stress tests (for both workstation and server GC, both without heap_hard_limit) with VERIFY_COMMITTED_BY_OH turned on. No assert was recorded over 24 hours each, so I am fairly confident the code is correct.

@cshung cshung force-pushed the public/dev/andrew/track-commit-by-object-heap branch 2 times, most recently from b41e8da to a0c8a9b Compare May 21, 2020 18:34
@Maoni0 Maoni0 (Member) commented May 21, 2020

I would suggest not merging the PR as is. It's true that it's under an optional define, but we should not check in code, even under an optional define, when we know we would change it and the change doesn't require much effort (I think this should be much easier to implement than the one you did). I would estimate that, since you have now looked through the relevant code paths, it wouldn't take more than a few days, right?

@cshung cshung (Member, Author) commented May 21, 2020

I would suggest not merging the PR as is. It's true that it's under an optional define, but we should not check in code, even under an optional define, when we know we would change it

I think you might have misunderstood what I just said. I have pushed the change to delete the code under the definition.

it wouldn't take more than a few days, right?

I think it shouldn't take more than a week. I was just thinking it is probably a better idea to do the config switches first. Either way works fine for me.

@mjsabby mjsabby (Contributor) commented May 21, 2020

@cshung can you make sure Large Pages is tested in this case? Even manually? I'll be able to put it on bing.com as soon as your build shows up in rolling builds.

@Maoni0 Maoni0 (Member) commented May 21, 2020

Sorry @cshung, I misread. I'm totally fine with checking in the rest of the code, but I have not looked at it in detail yet; let me do that. I see the description of the PR still mentions the code you deleted, BTW.

Review comments (now outdated and resolved) were left on src/coreclr/src/gc/gc.cpp and src/coreclr/src/gc/gcpriv.h.
@cshung cshung force-pushed the public/dev/andrew/track-commit-by-object-heap branch from cfcab1f to a68babc Compare May 22, 2020 17:02
@cshung cshung (Member, Author) commented May 22, 2020

@cshung can you make sure Large Pages is tested in this case? even manually? I'll be able to put it on bing.com as soon as your build shows up in rolling builds.

@mjsabby I am very happy that Bing is going to try it out. This is going to be a very good validation of the work.

This PR is indeed trying to solve the 3x commit problem we discovered earlier.

However, this PR alone is insufficient, as it only tracks the committed memory per object heap. In particular, it doesn't have the new configuration switches yet.

I am currently working on introducing the new switches; with those, I will alter the initial reservation logic and add the per-object-heap commit limit checks. That will solve the 3x commit problem.
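As a purely hypothetical sketch of what such a per-object-heap limit check could look like — hard_limit_by_oh is an invented name, and this PR itself only adds the tracking, not the check:

```cpp
#include <cstddef>

// A commit is refused when it would push that object heap past its
// configured limit; a zero limit means "no limit configured".
static size_t committed_by_oh[4];
static size_t hard_limit_by_oh[4];

bool check_against_oh_limit(int oh, size_t size)
{
    if (hard_limit_by_oh[oh] != 0 &&
        committed_by_oh[oh] + size > hard_limit_by_oh[oh])
    {
        return false; // would exceed the per-object-heap hard limit
    }
    committed_by_oh[oh] += size;
    return true;
}
```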

I will make sure large pages are tested when I introduce the switches.

@Maoni0 Maoni0 (Member) left a review

LGTM!

@cshung cshung force-pushed the public/dev/andrew/track-commit-by-object-heap branch 3 times, most recently from ea0f6c7 to 8cd12c8 Compare May 26, 2020 14:45
@cshung cshung force-pushed the public/dev/andrew/track-commit-by-object-heap branch from 8cd12c8 to 5fdb407 Compare May 26, 2020 14:47
@cshung cshung force-pushed the public/dev/andrew/track-commit-by-object-heap branch from 5fdb407 to f50de72 Compare May 26, 2020 14:58
@cshung cshung merged commit 68fd4fb into dotnet:master May 26, 2020
@cshung cshung deleted the public/dev/andrew/track-commit-by-object-heap branch May 26, 2020 18:09
@ghost ghost locked as resolved and limited conversation to collaborators Dec 9, 2020
4 participants