Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

High level graph bug #92

Merged
merged 7 commits into from
Aug 17, 2022
Merged

Conversation

multimeric
Copy link
Contributor

It seems that the optimize hook is sometimes passed a dictionary, and sometimes a HighLevelGraph instance, which behaves largely like a dictionary but isn't one. When this happens, we try to set a key with dsk[key] = CachedComputation() which fails and the whole thing crashes. I've included a basic reproducible example of this as a test.

Now dask doesn't really document how the optimizers work very well, but considering that all of the dask.optimization functions return a new dict that they create, I think we can just use HighLevel.to_dict() as I have done.

@codecov-commenter
Copy link

codecov-commenter commented Aug 11, 2022

Codecov Report

Merging #92 (b27577f) into master (1c27c8c) will increase coverage by 2.38%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #92      +/-   ##
==========================================
+ Coverage   87.44%   89.83%   +2.38%     
==========================================
  Files           3        3              
  Lines         247      246       -1     
  Branches       41       41              
==========================================
+ Hits          216      221       +5     
+ Misses         17       15       -2     
+ Partials       14       10       -4     
Impacted Files Coverage Δ
src/graphchain/core.py 92.00% <100.00%> (+0.45%) ⬆️
src/graphchain/utils.py 79.54% <0.00%> (+11.36%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@multimeric
Copy link
Contributor Author

@lsorber?

@lsorber
Copy link
Member

lsorber commented Aug 14, 2022

Hi @multimeric, this looks like a nice improvement, thanks for the contribution! I reviewed and approved the PR. If it's ready to merge for you, I'll merge. After that, I want to upgrade the scaffolding to the latest Poetry Cookiecutter and then release a new version of Graphchain.

@multimeric
Copy link
Contributor Author

multimeric commented Aug 14, 2022

Yep it's all ready on my end. I'd love this to be included in the next release since it fixes a large category of bugs.

@lsorber
Copy link
Member

lsorber commented Aug 14, 2022

Actually, before we proceed with this could you have a look at the code around line 40? I recall we encountered this issue in the past and monkey patched HLG with a set item method at the time.

I’d prefer not to mutate the HLG into a dict if we can avoid it. Do you think we can update the monkey patch to work again?

If we cannot make the monkey patch work, then we can continue with your approach of converting to dict, but in that case we can remove the monkey patch entirely.

@multimeric
Copy link
Contributor Author

Hmm interesting point. There must have been some issue with the monkey patch such that it wasn't working on my code. You'll note that the test fails if we remove to to_dict() call, which suggests this.

@multimeric
Copy link
Contributor Author

Okay so it seems what is happening is that we aren't patching all of the copies of the same key in the graph, because you have that break statement:
https://github.com/radix-ai/graphchain/blob/1c27c8cf9d2277dc6331420f91eb161d72890960/src/graphchain/core.py#L41-L46

Incidentally, if I remove the break, all the tests pass, including the new one I introduced. I can do this, which would be a super easy change, but maybe there was some reason you added it in the first place?

@lsorber
Copy link
Member

lsorber commented Aug 15, 2022

Incidentally, if I remove the break, all the tests pass, including the new one I introduced. I can do this, which would be a super easy change, but maybe there was some reason you added it in the first place?

I'm not sure there's a good reason no, I think it might have just been a mistake.

@multimeric
Copy link
Contributor Author

Okay, I'll implement it that way, I think it's a better solution.

@multimeric
Copy link
Contributor Author

All done!

@multimeric
Copy link
Contributor Author

@lsorber.

Copy link
Member

@lsorber lsorber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@lsorber lsorber merged commit c7cede2 into superlinear-ai:master Aug 17, 2022
@lsorber
Copy link
Member

lsorber commented Aug 17, 2022

I'll be updating the scaffolding soon, and then I'll release a new version. Thanks for the contributions @multimeric!

@lsorber
Copy link
Member

lsorber commented Aug 29, 2022

@multimeric FYI: I just published Graphchain v1.4.0 on PyPI!

@multimeric
Copy link
Contributor Author

That's great news! Thanks for the update.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants