-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make StableHasher::finish
return a small hash
#6
Conversation
@@ -378,14 +378,13 @@ impl SipHasher128 { | |||
} | |||
|
|||
#[inline] | |||
pub fn finish128(mut self) -> [u64; 2] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have an idea if this affects performance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I extracted finish128
method on Godbolt and look at the diff with the previous version and there was significant increase in "predicted cycles" and instructions, which was due to the ptr::write_bytes
call generating a memcpy
instead of a small instructions because the size wasn't a constant anymore.
I therefore changed that part (Godbolt) to copy the "last" and "last + 1" in a array as to have the size a constant as before and now the diff is very small, llvm-mca indicates 15100ins -> 15400ins for 100 iterations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for investigating!
Since this code is super hot in the compiler and the u64 version of the finish method probably isn't going to be used much, I'd prefer if we didn't regress things at all.
How about extracting a finish128_inner(nbuf: usize, buf: &mut [MaybeUninit<u64>; BUFFER_WITH_SPILL_CAPACITY], state: State, processed: usize)
function? Then Hasher::finish
could would only need to copy state
and finish128
could keep taking self
by value (?_
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at the generated assembly on Godbolt and it's very very similar, mainly some registry naming changes, some instructions moving place and one more stack variable.
Compared to the other version (with the last and last+1 copy), it's roughly similar, in terms of instructions per llvm-mca.
One interesting thing, with the inner variant the number of cycles needed per llvm-mca is dropping significantly from 4410 to 3810, maybe because of less memory pressure. That seems like a win.
Pushed the inner variant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. I'm wondering if we should have some kind of performance testing as part of this repo. But I'm not sure how to do that in a stable, useful way.
@@ -496,7 +502,11 @@ impl Hasher for SipHasher128 { | |||
} | |||
|
|||
fn finish(&self) -> u64 { | |||
panic!("SipHasher128 cannot provide valid 64 bit hashes") | |||
let mut buf = self.buf.clone(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we still need to clone the buffer 🤷
Hasher::finish
is just defined in an unfortunate way.
Thank you, @Urgau! |
This PR changes
SipHasher128
andStableHasher
so that theirfinish
implementation are no longer fatal and return a "small hash".Implements #3 (comment)
r? @michaelwoerister