Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialize global random generator initialization when invoking hypothesis under concurrent loads #4094

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

andfoy
Copy link

@andfoy andfoy commented Aug 27, 2024

See #4028 (comment)

This PR adds a lock around the initialization of the Hypothesis global random generator, this prevents issues such as #4028 (comment) from occurring under free-threaded Python.

@jobh
Copy link
Contributor

jobh commented Aug 28, 2024

Thanks for your contribution @andfoy!

While protecting the initialization fixes the crash, I think there is still a race on seed-restore that will make the random generator non-deterministic in this case - which will lead to more subtle failures.

I think making _hypothesis_global_random thread-local could be a possibility?

@jobh
Copy link
Contributor

jobh commented Aug 28, 2024

I think making _hypothesis_global_random thread-local could be a possibility?

...hm, the seeder also manages e.g. numpy.random and other potentially global instances.

Maybe avoiding the crash is enough, possibly adding a warning somewhere in the docs that threaded runs are known to cause non-deterministic PRNG. I'll defer to @Zac-HD when he's back.

@Zac-HD
Copy link
Member

Zac-HD commented Sep 5, 2024

  • I think that @jobh is correct that making _hypothesis_global_random thread-local is the best solution for that particular case.
    • Hypothesis is explicitly not threadsafe, but we're happy to accept patches to behave reasonably well in common multi-threaded scenarios and I think that remaining deterministic when two parallel threads are running is worth it.
    • We should probably write a test that running with settings(derandomize=True) gives the same results even when other threads are running tests at the same time.
  • There's just nothing we can do to make shared global state like that in the random module or numpy.random consistent under concurrency.
    • Perhaps we should issue a warning when known-global PRNG state is changed, and we know there are concurrent threads active? This would be a bit fiddly but we've certainly gone to further lengths for a good error message before. We could further reduce impact by targeting it at the free-threading builds...

@andfoy
Copy link
Author

andfoy commented Sep 6, 2024

Thanks for the review @Zac-HD, b3ec9ae should move random generator registration to be unique per thread. Please let me know if the changes are enough

I'm aware of the test failures, I'm going to fix it in the next commit

@Zac-HD
Copy link
Member

Zac-HD commented Sep 11, 2024

I don't think we should make RANDOMS_TO_MANAGE threadlocal; at least some of them (e.g. Numpy'd default PRNG) are explicitly global and there we should instead warn if there are any changes (ie uses) of those globals while we're in a multi-threaded setting.

Which in turn has some non-trivial interaction with the newly-threadlocal hypothesis global prng state that I'd have to think through...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants