Make try_init_allocator() a ctor #338

Closed

TerrorJack
Contributor

We can make try_init_allocator() a ctor and even avoid calling it in dlmalloc(). Giving the ctor the highest priority lets regular ctors call malloc() safely, and Wizer-like optimizations can then strip out a bit more code.
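A minimal sketch of the ordering idea, not wasi-libc code: every name below is hypothetical, and the priority is 101 rather than 0 because ordinary code outside libc should stay out of the range reserved for the implementation. It assumes GCC or Clang, where a ctor with a lower priority number runs earlier and ctors without an explicit priority run last.

```c
#include <assert.h>

/* Hypothetical stand-ins for "the allocator is set up" and
   "a regular ctor observed that it was set up". */
int allocator_ready = 0;
int user_ctor_saw_allocator = 0;

/* Prioritized ctor: runs before any ctor that has no explicit priority.
   The PR itself uses priority 0; 101 is used here only because this
   sketch is not part of libc. */
__attribute__((constructor(101))) static void init_allocator_sketch(void) {
    allocator_ready = 1;
}

/* A "regular" ctor with default priority. By the time it runs, the
   prioritized ctor above has already finished. */
__attribute__((constructor)) static void user_ctor_sketch(void) {
    user_ctor_saw_allocator = allocator_ready;
}

/* Can be called from main() to confirm the ordering. */
void check_ctor_order(void) {
    assert(allocator_ready == 1);
    assert(user_ctor_saw_allocator == 1);
}
```

Calling check_ctor_order() from main() should pass on toolchains that honor constructor priorities.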

@@ -4562,7 +4562,7 @@ static void* tmalloc_small(mstate m, size_t nb) {

 #if __wasilibc_unmodified_upstream // Forward declaration of try_init_allocator.
 #else
-static void try_init_allocator(void);
+__attribute__((constructor(0))) static void try_init_allocator(void);

Member

I believe the entire range 0-100 is reserved for system functions like this. I'm not convinced 0 is the best choice here, but it's not hard to change in the future.

In emscripten we document the priorities to keep track of the order in which this stuff happens: https://github.com/emscripten-core/emscripten/blob/main/system/lib/README.md?plain=1#L9-L28

Member

Practically speaking, right now whenever we add a new constructor in wasi-libc, we just grep for all the constructors in the wasi-libc tree to figure out the ordering constraints, which seems ok for now, as there aren't very many.

Contributor Author

I believe priority 0 is the correct choice, given:

  • dlmalloc is a service that other ctors are likely to use
  • try_init_allocator itself doesn't rely on any other ctor having run first

We could also add documentation somewhere that lists all our ctors. I can add it in this PR if you folks think it's appropriate, though I'm not sure where the best place for it is.

Member

I think it's OK to simply use grep as Dan says. I was just pointing out that it can get complicated, and that one needs to be careful when choosing the order of these things once there is more than one.

There are also some things that might (one day) need to happen before malloc initialization (e.g. TLS setup, application of relocations in PIC code) although I don't think any of those apply yet in wasi-libc.

@sunfishcode
Member

I think this makes sense. We usually avoid static ctors because they slow down startup by doing extra work up front. In this case, though, most programs already call malloc early in startup, so that work has to be done anyway.

@sbc100
Member

sbc100 commented Oct 12, 2022

Out of interest, why does lazy-initialization inhibit "wizer-like optimization"?

@sunfishcode
Member

The static call graph has dlmalloc calling try_init_allocator, so try_init_allocator can't be easily deleted. If we remove that call and make it a constructor, then wizer can delete it after it runs.
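To illustrate the call-graph point, here is a rough sketch using a toy bump allocator. It is not dlmalloc or wasi-libc code, and every name in it is hypothetical. The lazy entry point references the init function on every call, while the ctor-based entry point never mentions it, which is the property that lets a post-initialization snapshot tool drop the init code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy arena; stands in for the real allocator state. */
static unsigned char arena[1024];
static size_t arena_used;
static int arena_initialized;
int init_calls; /* counts how often the init body actually runs */

static void try_init_sketch(void) {
    if (arena_initialized) return; /* the load + branch paid per call */
    arena_used = 0;
    arena_initialized = 1;
    init_calls++;
}

static void *bump(size_t n) {
    if (n > sizeof arena - arena_used) return NULL;
    void *p = &arena[arena_used];
    arena_used += n;
    return p;
}

/* Lazy variant: the allocator statically calls the init function,
   so the init code stays reachable for the life of the program. */
void *alloc_lazy(size_t n) {
    try_init_sketch();
    return bump(n);
}

/* Ctor variant: init runs once at startup (priority 101 is an
   arbitrary choice for this sketch)... */
__attribute__((constructor(101))) static void init_ctor_sketch(void) {
    try_init_sketch();
}

/* ...and the allocation path has no reference to the init code. */
void *alloc_after_ctor(size_t n) {
    assert(arena_initialized);
    return bump(n);
}
```

Both variants live in one file here only so they can be compared side by side; a real allocator would pick one.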

@sbc100
Member

sbc100 commented Oct 12, 2022

> The static call graph has dlmalloc calling try_init_allocator, so try_init_allocator can't be easily deleted. If we remove that call and make it a constructor, then wizer can delete it after it runs.

Nice. That makes sense now.

Presumably in such cases dlmalloc itself would persist, and I assume its size would massively dominate that of try_init_allocator? But I guess we care about every byte.

@sunfishcode
Member

Yeah, the code size win there likely isn't huge, but it's something. Likewise the win of not doing the load+branch in every malloc call is likely not huge, but it's something.

@sunfishcode
Member

I am now thinking about how this PR interacts with upcoming use cases for wasm which involve calling malloc etc. before user code runs. For example, in the component model canonical ABI, value imports aren't spec'd yet, but when they are, they're expected to work by having the outside call in to allocate space before calling the first user function. I don't think this is necessarily a show stopper, but I do want to think it through here.

Is the motivation for this PR to avoid the problem that the malloc init code doesn't know where the memory starting at __heap_base ends, and so calls sbrk(0), assuming that no one else has done memory.grow before that point? I think we really do want to add a __heap_end symbol to mark the end to fix that problem, as that's fully robust in the presence of memory.grow happening at any time.

@sunfishcode
Member

I've now posted https://reviews.llvm.org/D136110 to add a __heap_end symbol.

@sbc100
Member

sbc100 commented Oct 17, 2022

> I am now thinking about how this PR interacts with upcoming use cases for wasm which involve calling malloc etc. before user code runs. For example, in the component model canonical ABI, value imports aren't spec'd yet, but when they are, they're expected to work by having the outside call in to allocate space before calling the first user function. I don't think this is necessarily a show stopper, but I do want to think it through here.
>
> Is the motivation for this PR to avoid the problem that the malloc init code doesn't know where __heap_base memory ends, and calls sbrk(0), assuming that no one else has done memory.grow before that point? I think we really do want to add a __heap_end symbol to mark the end to fix that problem, as that's fully robust in the presence of memory.grow happening at any time.

Are you interested in knowing the original heap end that was set at link time by wasm-ld, or the current size of the memory? (I guess it's the former, since you can use an intrinsic to get the current memory size.)

@sunfishcode
Member

> Are you interested in knowing the original heap end that was set at compile time, or the current size of the memory? (I guess it's the former since you can use an intrinsic to know the current memory size).

It's to know the original heap end. The idea is that between the time wasm-ld runs and the time the wasm malloc init code runs, someone may have used memory.grow to allocate memory. If that happens, the extra memory should be considered in use, and malloc shouldn't clobber it. So malloc needs to know the extent of what wasm-ld allocated.
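The difference between the two approaches can be sketched with plain integers in place of real linker symbols and wasm intrinsics. The struct, function names, and addresses below are all hypothetical; the point is only that sizing the initial heap from the current end of memory can swallow a region someone else obtained with memory.grow, while sizing it from a link-time end cannot.

```c
#include <stdint.h>

/* Hypothetical snapshot of the layout seen by malloc's init code. */
typedef struct {
    uintptr_t heap_base;  /* start of free space chosen by the linker */
    uintptr_t heap_end;   /* end of the memory the linker allocated */
    uintptr_t memory_end; /* current end of linear memory; may have grown */
} memory_layout_sketch;

typedef struct {
    uintptr_t start;
    uintptr_t end; /* exclusive */
} region_sketch;

/* sbrk(0)-style init: assumes everything up to the current end is free. */
region_sketch heap_from_current_end(const memory_layout_sketch *m) {
    region_sketch r = { m->heap_base, m->memory_end };
    return r;
}

/* __heap_end-style init: claims only what the linker set aside. */
region_sketch heap_from_link_time_end(const memory_layout_sketch *m) {
    region_sketch r = { m->heap_base, m->heap_end };
    return r;
}

/* True if the two half-open ranges share at least one byte. */
int regions_overlap(region_sketch a, region_sketch b) {
    return a.start < b.end && b.start < a.end;
}
```

With a layout where memory has grown by one 64 KiB page past heap_end, the first function returns a region that overlaps the grown page and the second does not.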

@TerrorJack
Contributor Author

Thanks for all the review comments! Closing this one since #377 is moving in the right direction.

@TerrorJack TerrorJack closed this Jan 9, 2023
@TerrorJack TerrorJack deleted the patch-2 branch January 9, 2023 15:06
3 participants