Make `StableHasher::finish` return a small hash #6

Urgau · 2024-06-29T17:14:47Z

This PR changes SipHasher128 and StableHasher so that their finish implementations no longer panic and instead return a "small hash".

Implements #3 (comment)
r? @michaelwoerister

michaelwoerister · 2024-07-03T08:09:59Z

src/sip128.rs

@@ -378,14 +378,13 @@ impl SipHasher128 {
    }

    #[inline]
-    pub fn finish128(mut self) -> [u64; 2] {


Do we have an idea if this affects performance?

I extracted the finish128 method on Godbolt and looked at the diff against the previous version. There was a significant increase in "predicted cycles" and instructions, which was due to the ptr::write_bytes call generating a memcpy instead of a few small instructions, because the size was no longer a constant.

I therefore changed that part (Godbolt) to copy the "last" and "last + 1" elements into an array, so that the size is a constant as before. Now the diff is very small: llvm-mca indicates 15100 -> 15400 instructions for 100 iterations.

Thanks for investigating!

Since this code is super hot in the compiler and the u64 version of the finish method probably isn't going to be used much, I'd prefer if we didn't regress things at all.

How about extracting a finish128_inner(nbuf: usize, buf: &mut [MaybeUninit<u64>; BUFFER_WITH_SPILL_CAPACITY], state: State, processed: usize) function? Then Hasher::finish would only need to copy state, and finish128 could keep taking self by value (?)

I looked at the generated assembly on Godbolt and it's very similar: mainly some register naming changes, some instructions moving place, and one more stack variable.

Compared to the other version (with the last and last + 1 copy), it's roughly the same in terms of instructions per llvm-mca.

One interesting thing: with the inner variant, the number of cycles reported by llvm-mca drops significantly, from 4410 to 3810, maybe because of less memory pressure. That seems like a win.

Pushed the inner variant.
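For readers following along, here is a minimal sketch of the "inner function" shape discussed above. It is not the code from this PR: `ToyHasher`, `State`, `mix`, and `BUF_LEN` are hypothetical stand-ins, the mixing function is a toy and not SipHash, and the buffer is a plain byte array rather than `[MaybeUninit<u64>; BUFFER_WITH_SPILL_CAPACITY]`. Which half of the 128-bit result is returned as the small hash is also an arbitrary choice made for this sketch. It only illustrates how a consuming `finish128` and a non-consuming `Hasher::finish` can share one finalization routine.

```rust
use std::hash::Hasher;

// Hypothetical buffer size for this sketch (one 64-bit word).
const BUF_LEN: usize = 8;

// Hypothetical stand-in for the hasher's internal state; cheap to copy.
#[derive(Clone, Copy)]
struct State {
    v0: u64,
    v1: u64,
}

struct ToyHasher {
    nbuf: usize,        // number of bytes currently buffered
    buf: [u8; BUF_LEN], // pending input that has not been mixed in yet
    state: State,
    processed: usize,   // number of bytes already mixed into `state`
}

impl ToyHasher {
    fn new() -> Self {
        ToyHasher { nbuf: 0, buf: [0; BUF_LEN], state: State { v0: 1, v1: 2 }, processed: 0 }
    }

    // Toy mixing step. This is NOT SipHash; it only gives the sketch
    // something deterministic to compute.
    fn mix(state: &mut State, word: u64) {
        state.v0 = (state.v0 ^ word).rotate_left(13).wrapping_mul(0x9E37_79B9_7F4A_7C15);
        state.v1 = state.v1.wrapping_add(state.v0).rotate_left(17);
    }

    // Shared finalization. It takes the pieces it needs explicitly, so the
    // caller decides whether they are moved out of `self` or copied.
    fn finish128_inner(
        nbuf: usize,
        buf: &mut [u8; BUF_LEN],
        mut state: State,
        processed: usize,
    ) -> [u64; 2] {
        // Zero-pad the unused tail of the buffer before consuming it.
        for b in &mut buf[nbuf..] {
            *b = 0;
        }
        Self::mix(&mut state, u64::from_le_bytes(*buf));
        Self::mix(&mut state, (processed + nbuf) as u64);
        [state.v0, state.v1]
    }

    // Consuming variant: works directly on the hasher's own buffer.
    pub fn finish128(mut self) -> [u64; 2] {
        Self::finish128_inner(self.nbuf, &mut self.buf, self.state, self.processed)
    }
}

impl Hasher for ToyHasher {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.buf[self.nbuf] = b;
            self.nbuf += 1;
            if self.nbuf == BUF_LEN {
                Self::mix(&mut self.state, u64::from_le_bytes(self.buf));
                self.processed += BUF_LEN;
                self.nbuf = 0;
            }
        }
    }

    // Non-consuming variant: `&self` forbids mutating the buffer in place,
    // so finalization runs on a copy.
    fn finish(&self) -> u64 {
        let mut buf = self.buf;
        let [small, _rest] =
            Self::finish128_inner(self.nbuf, &mut buf, self.state, self.processed);
        small
    }
}
```

In this sketch, calling `finish` leaves the hasher untouched, so it can be called repeatedly or followed by `finish128`, while `finish128` avoids the extra buffer copy because it owns `self`.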

michaelwoerister

This looks good to me. I'm wondering if we should have some kind of performance testing as part of this repo. But I'm not sure how to do that in a stable, useful way.

michaelwoerister · 2024-07-03T13:49:15Z

src/sip128.rs

@@ -496,7 +502,11 @@ impl Hasher for SipHasher128 {
    }

    fn finish(&self) -> u64 {
-        panic!("SipHasher128 cannot provide valid 64 bit hashes")
+        let mut buf = self.buf.clone();


Seems like we still need to clone the buffer 🤷
Hasher::finish is just defined in an unfortunate way.
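As background for the remark above: `std::hash::Hasher::finish` takes `&self` and does not reset the hasher, so an implementation whose finalization needs to modify its buffer has to work on a copy. The small sketch below shows that contract using the standard library's `DefaultHasher` (not `SipHasher128`); the function name `finish_twice_then_continue` is made up for this example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Calls `finish` twice, keeps writing, then calls `finish` again.
// Because `finish` only borrows the hasher, none of these calls
// consume or reset it.
fn finish_twice_then_continue() -> (u64, u64, u64) {
    let mut h = DefaultHasher::new();
    h.write(b"abc");
    let first = h.finish();
    let second = h.finish(); // same input so far, same result
    h.write(b"def");
    let third = h.finish(); // reflects everything written so far
    (first, second, third)
}
```

The first two values are equal, and the third matches what a fresh `DefaultHasher::new()` yields after the same two `write` calls.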

michaelwoerister · 2024-07-03T13:50:42Z

Thank you, @Urgau!

Urgau requested a review from michaelwoerister July 3, 2024 07:16

michaelwoerister reviewed Jul 3, 2024

michaelwoerister approved these changes Jul 3, 2024

Urgau force-pushed the finish-non-fatal branch from 3a72324 to ddced90 Compare July 3, 2024 11:06

Urgau added 3 commits July 3, 2024 14:33

Switch SipHasher128::finish to a non-consuming method

b541bac

Make SipHasher128::finish non-fatal by returning a small-hash

dc5fda0

Return inner small hash in StableHasher::finish

3c1f1fc

Urgau force-pushed the finish-non-fatal branch from ddced90 to 3c1f1fc Compare July 3, 2024 12:43

michaelwoerister reviewed Jul 3, 2024

michaelwoerister merged commit c2b3deb into rust-lang:main Jul 3, 2024
7 checks passed

weihanglo mentioned this pull request Jul 9, 2024

feat!: use stable hash from rustc-stable-hash rust-lang/cargo#14116

Draft
