-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make StableHasher::finish
return a small hash
#6
Merged
Merged
Changes from all commits
Commits
Show all changes
3 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,5 @@ | ||
# Unreleased | ||
|
||
- `StableHasher::finish` now returns a small hash instead of being fatal (#6) | ||
- Remove `StableHasher::finalize` (#4) | ||
- Import stable hasher implementation from rustc ([db8aca48129](https://github.com/rust-lang/rust/blob/db8aca48129d86b2623e3ac8cbcf2902d4d313ad/compiler/rustc_data_structures/src/)) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -377,41 +377,47 @@ impl SipHasher128 { | |
} | ||
} | ||
|
||
#[inline] | ||
#[inline(always)] | ||
pub fn finish128(mut self) -> [u64; 2] { | ||
debug_assert!(self.nbuf < BUFFER_SIZE); | ||
SipHasher128::finish128_inner(self.nbuf, &mut self.buf, self.state, self.processed) | ||
} | ||
|
||
// Process full elements in buffer. | ||
let last = self.nbuf / ELEM_SIZE; | ||
#[inline] | ||
fn finish128_inner( | ||
nbuf: usize, | ||
buf: &mut [MaybeUninit<u64>; BUFFER_WITH_SPILL_CAPACITY], | ||
mut state: State, | ||
processed: usize, | ||
) -> [u64; 2] { | ||
debug_assert!(nbuf < BUFFER_SIZE); | ||
|
||
// Since we're consuming self, avoid updating members for a potential | ||
// performance gain. | ||
let mut state = self.state; | ||
// Process full elements in buffer. | ||
let last = nbuf / ELEM_SIZE; | ||
|
||
for i in 0..last { | ||
let elem = unsafe { self.buf.get_unchecked(i).assume_init().to_le() }; | ||
let elem = unsafe { buf.get_unchecked(i).assume_init().to_le() }; | ||
state.v3 ^= elem; | ||
Sip13Rounds::c_rounds(&mut state); | ||
state.v0 ^= elem; | ||
} | ||
|
||
// Get remaining partial element. | ||
let elem = if self.nbuf % ELEM_SIZE != 0 { | ||
let elem = if nbuf % ELEM_SIZE != 0 { | ||
unsafe { | ||
// Ensure element is initialized by writing zero bytes. At most | ||
// `ELEM_SIZE - 1` are required given the above check. It's safe | ||
// to write this many because we have the spill and we maintain | ||
// `self.nbuf` such that this write will start before the spill. | ||
let dst = (self.buf.as_mut_ptr() as *mut u8).add(self.nbuf); | ||
let dst = (buf.as_mut_ptr() as *mut u8).add(nbuf); | ||
ptr::write_bytes(dst, 0, ELEM_SIZE - 1); | ||
self.buf.get_unchecked(last).assume_init().to_le() | ||
buf.get_unchecked(last).assume_init().to_le() | ||
} | ||
} else { | ||
0 | ||
}; | ||
|
||
// Finalize the hash. | ||
let length = self.processed.debug_strict_add(self.nbuf); | ||
let length = processed.debug_strict_add(nbuf); | ||
let b: u64 = ((length as u64 & 0xff) << 56) | elem; | ||
|
||
state.v3 ^= b; | ||
|
@@ -496,7 +502,11 @@ impl Hasher for SipHasher128 { | |
} | ||
|
||
fn finish(&self) -> u64 { | ||
panic!("SipHasher128 cannot provide valid 64 bit hashes") | ||
let mut buf = self.buf.clone(); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Seems like we still need to clone the buffer 🤷 |
||
let [a, b] = SipHasher128::finish128_inner(self.nbuf, &mut buf, self.state, self.processed); | ||
|
||
// Combining the two halves makes sure we get a good quality hash. | ||
a.wrapping_mul(3).wrapping_add(b).to_le() | ||
} | ||
} | ||
|
||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have an idea if this affects performance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I extracted
finish128
method on Godbolt and look at the diff with the previous version and there was significant increase in "predicted cycles" and instructions, which was due to theptr::write_bytes
call generating amemcpy
instead of a small instructions because the size wasn't a constant anymore.I therefore changed that part (Godbolt) to copy the "last" and "last + 1" in a array as to have the size a constant as before and now the diff is very small, llvm-mca indicates 15100ins -> 15400ins for 100 iterations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for investigating!
Since this code is super hot in the compiler and the u64 version of the finish method probably isn't going to be used much, I'd prefer if we didn't regress things at all.
How about extracting a
finish128_inner(nbuf: usize, buf: &mut [MaybeUninit<u64>; BUFFER_WITH_SPILL_CAPACITY], state: State, processed: usize)
function? ThenHasher::finish
could would only need to copystate
andfinish128
could keep takingself
by value (?_There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at the generated assembly on Godbolt and it's very very similar, mainly some registry naming changes, some instructions moving place and one more stack variable.
Compared to the other version (with the last and last+1 copy), it's roughly similar, in terms of instructions per llvm-mca.
One interesting thing, with the inner variant the number of cycles needed per llvm-mca is dropping significantly from 4410 to 3810, maybe because of less memory pressure. That seems like a win.
Pushed the inner variant.