initial version of checksum based freshness #14137

Open
wants to merge 27 commits into
base: master

@Xaeroxe Xaeroxe commented Jun 25, 2024

Implementation for #14136 and resolves #6529

This PR implements the use of checksums in cargo fingerprints as an alternative to using mtimes. This is most useful on systems with poor mtime implementations.

This depends on rust-lang/rust#126930. Checksum-based checks are expected to increase the time it takes to declare a build fresh. Still, this loss in performance may be preferable to the issues the ecosystem has had with using mtimes to determine freshness.
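
As a rough illustration of the idea (not the code in this PR), checksum-based freshness can be sketched as: record each input file's length and a content hash after a build, then treat the file as fresh only if both still match. In the sketch below, `RecordedFile`, `record`, and `is_fresh` are hypothetical names, and a 64-bit FNV-1a hash stands in for a real algorithm such as SHA-256 or BLAKE3 so that the example needs no dependencies.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// What was recorded about an input file at the end of the previous build.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RecordedFile {
    pub len: u64,
    pub checksum: u64,
}

/// 64-bit FNV-1a. This is only a dependency-free stand-in for a real
/// content hash such as SHA-256 or BLAKE3.
pub fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Record the length and checksum of `path` as it exists right now.
pub fn record(path: &Path) -> io::Result<RecordedFile> {
    let bytes = fs::read(path)?;
    Ok(RecordedFile {
        len: bytes.len() as u64,
        checksum: fnv1a(&bytes),
    })
}

/// A file is fresh if its length and content hash both match the record.
/// The mtime is never consulted.
pub fn is_fresh(path: &Path, recorded: &RecordedFile) -> io::Result<bool> {
    // Cheap check first: a different length always means different content.
    if fs::metadata(path)?.len() != recorded.len {
        return Ok(false);
    }
    // Same length: the whole file has to be read and hashed.
    let bytes = fs::read(path)?;
    Ok(fnv1a(&bytes) == recorded.checksum)
}
```

Rewriting a file with identical bytes changes its mtime but leaves `is_fresh` returning `true`, while a same-size edit makes it return `false`. Reading and hashing the whole file is where the extra cost of this approach comes from.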

@rustbot
Collaborator

rustbot commented Jun 25, 2024

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @weihanglo (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. Namely, to keep review lag to a minimum, PR authors and assigned reviewers should keep the review labels (S-waiting-on-review and S-waiting-on-author) up to date, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@rustbot rustbot added A-build-execution Area: anything dealing with executing the compiler A-cli Area: Command-line interface, option parsing, etc. A-configuration Area: cargo config files and env vars A-rebuild-detection Area: rebuild detection and fingerprinting A-unstable Area: nightly unstable support S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 25, 2024
@rustbot rustbot added the A-infrastructure Area: infrastructure around the cargo repo, ci, releases, etc. label Jul 13, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 17, 2024
Add unstable support for outputting file checksums for use in cargo

Adds an unstable option that appends file checksums and expected lengths to the end of the dep-info file such that `cargo` can read and use these values as an alternative to file mtimes.

This PR powers the changes made in this cargo PR rust-lang/cargo#14137

Here's the tracking issue for the cargo feature rust-lang/cargo#14136.
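
For illustration only, here is a sketch of how a consumer might read such trailing entries. The line layout assumed here, `# checksum:<algo>=<hex> file_len:<bytes> <path>`, is an assumption made for this example and is not specified in this thread; `ChecksumEntry` and `parse_checksum_line` are hypothetical names.

```rust
/// One checksum entry from the end of a dep-info file.
/// (Hypothetical type; the line layout is an assumption, see above.)
#[derive(Debug, PartialEq, Eq)]
pub struct ChecksumEntry {
    pub algo: String,
    pub hex: String,
    pub file_len: u64,
    pub path: String,
}

/// Parse a line shaped like
/// `# checksum:<algo>=<hex> file_len:<bytes> <path>`.
/// Returns `None` for any line that does not match that shape.
pub fn parse_checksum_line(line: &str) -> Option<ChecksumEntry> {
    let rest = line.strip_prefix("# checksum:")?;
    let (checksum, rest) = rest.split_once(' ')?;
    let (algo, hex) = checksum.split_once('=')?;
    let rest = rest.strip_prefix("file_len:")?;
    // The path comes last, so it may itself contain spaces.
    let (len, path) = rest.split_once(' ')?;
    Some(ChecksumEntry {
        algo: algo.to_string(),
        hex: hex.to_string(),
        file_len: len.parse().ok()?,
        path: path.to_string(),
    })
}
```

Keeping the expected length next to the checksum lets a reader reject a changed file from its metadata alone, without hashing it.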
@bors
Collaborator

bors commented Jul 26, 2024

☔ The latest upstream changes (presumably #13947) made this pull request unmergeable. Please resolve the merge conflicts.

@Xaeroxe
Author

Xaeroxe commented Jul 26, 2024

Merge conflicts resolved.

weihanglo and others added 12 commits October 2, 2024 13:59
This commit is not really necessary to be in the commit history.
It is just for me to start over the test porting.
* cargo_env_changes
* fingerprint_cleaner_does_not_rebuild
* modify_only_some_files
* rebuild_if_build_artifacts_move_forward_in_time
* simulated_docker_deps_stay_cached
* update_dependency_mtime_does_not_rebuild
These tests are modified or renamed to reflect the switch to
checksum fingerprint:

* bust_patched_dep
* modifying_and_moving
* rebuild_on_mid_build_file_modification
* rebuild_sub_package_then_while_package
* skip_mtime_check_in_selected_cargo_home_subdirs
* use_mtime_cache_in_cargo_home
We don't rely on mtime anymore for checksum-based fingerprint
Two new tests:

* checksum_actually_uses_checksum: checksum works when mtime moves forward
* same_size_different_content: checksum does check content
workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Oct 3, 2024
…yukang

rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Oct 3, 2024
Rollup merge of rust-lang#126930 - Xaeroxe:file-checksum-hint, r=chenyukang

github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Oct 4, 2024
@weihanglo
Member

@bors try

Let's see how it really goes.

@bors
Collaborator

bors commented Oct 4, 2024

⌛ Trying commit 4ddd017 with merge 5b99bb7...

bors added a commit that referenced this pull request Oct 4, 2024
initial version of checksum based freshness

@bors
Collaborator

bors commented Oct 4, 2024

💔 Test failed - checks-actions

@bors bors added S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 4, 2024
@weihanglo
Member

All tests have passed. I will do another round of review today.

Member

@weihanglo weihanglo left a comment

@Xanewok

Would you be willing to rearrange commits and make them easier to review/track for future readers?

Some intermediate commits, such as a45d663 and 5b36d64, could be squashed, while others, like the ones I added for the tests, are worth keeping because they show that the tests were copied from the other file.

If that sounds like too much of a burden, no worries. I can do a simple cleanup before this PR merges. Thank you :)

Comment on lines +20 to +27
let mut cmd = p.cargo("build").arg("-Zchecksum-freshness").build_command();
let output = cmd.output().unwrap();
assert!(
    String::from_utf8(output.stderr)
        .unwrap()
        .contains("error: the `-Z` flag is only accepted on the nightly channel of Cargo, but this is the `stable` channel")
);
assert!(!output.status.success());
Member

We usually write tests in Cargo's idiomatic style, such as this one.

Could you tweak it a bit to something like this?

Suggested change
- let mut cmd = p.cargo("build").arg("-Zchecksum-freshness").build_command();
- let output = cmd.output().unwrap();
- assert!(
-     String::from_utf8(output.stderr)
-         .unwrap()
-         .contains("error: the `-Z` flag is only accepted on the nightly channel of Cargo, but this is the `stable` channel")
- );
- assert!(!output.status.success());
+ p.cargo("build -Zchecksum-freshness")
+     .with_stderr_data(str![[
+ r#"[ERROR] the `-Z` flag is only accepted on the nightly channel of Cargo, but this is the `stable` channel
+ See https://doc.rust-lang.org/book/appendix-07-nightly-rust.html for more information about Rust release channels.
+ "#]])
+     .with_status(101)
+     .run();

Comment on lines +2095 to +2096
#[derive(Debug, Eq, PartialEq, Hash, Copy, Clone)]
pub enum DepInfoPathType {
Member

This isn't needed, right?

Suggested change
- #[derive(Debug, Eq, PartialEq, Hash, Copy, Clone)]
- pub enum DepInfoPathType {
+ enum DepInfoPathType {

Comment on lines +2474 to +2483
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub struct InvalidChecksumAlgo {}

impl Display for InvalidChecksumAlgo {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "expected `sha256`, or `blake3`")
    }
}

impl std::error::Error for InvalidChecksumAlgo {}
Member

This type doesn't seem needed. We can inline the string message into InvalidChecksum::InvalidChecksumAlgo.
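
A minimal sketch of what inlining the message could look like, written with only the standard library so it stands alone. The PR itself derives `thiserror::Error`, where the message would presumably go in an `#[error(...)]` attribute on the variant instead. `ChecksumAlgo` and `parse_algo` are hypothetical names for this example, and the real `InvalidChecksum` enum may have other variants that are not shown in this thread.

```rust
use std::error::Error;
use std::fmt;

/// The two algorithm names mentioned in the error message.
/// (Hypothetical enum, for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ChecksumAlgo {
    Sha256,
    Blake3,
}

/// Error type with the message carried by the variant itself,
/// so no separate `InvalidChecksumAlgo` struct is needed.
#[derive(Debug)]
pub enum InvalidChecksum {
    InvalidChecksumAlgo,
}

impl fmt::Display for InvalidChecksum {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            InvalidChecksum::InvalidChecksumAlgo => {
                f.write_str("expected `sha256`, or `blake3`")
            }
        }
    }
}

impl Error for InvalidChecksum {}

/// Map an algorithm name to `ChecksumAlgo`, or report the inlined error.
pub fn parse_algo(name: &str) -> Result<ChecksumAlgo, InvalidChecksum> {
    match name {
        "sha256" => Ok(ChecksumAlgo::Sha256),
        "blake3" => Ok(ChecksumAlgo::Blake3),
        _ => Err(InvalidChecksum::InvalidChecksumAlgo),
    }
}
```

With this shape, callers deal with a single error type and the message text lives next to the variant it describes.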

}
}

#[derive(Clone, Copy, Debug, Eq, PartialEq, thiserror::Error)]
Member

Should be sufficient.

Suggested change
- #[derive(Clone, Copy, Debug, Eq, PartialEq, thiserror::Error)]
+ #[derive(Debug, thiserror::Error)]

Member

Could we have a test that switches -Zchecksum-freshness on and off and checks that the builds still succeed?

Comment on lines +2512 to +2515
#[cfg_attr(
    not(all(target_arch = "x86_64", target_os = "windows", target_env = "msvc")),
    ignore
)]
Member

This should be sufficient

Suggested change
- #[cfg_attr(
-     not(all(target_arch = "x86_64", target_os = "windows", target_env = "msvc")),
-     ignore
- )]
+ #[cfg(all(target_arch = "x86_64", target_os = "windows", target_env = "msvc"))]

Member

Could we have a test verifying that the Cargo-flavored dep-info under the fingerprint folder is correctly encoded? Something like relative_depinfo_paths_ws, but simpler.

Labels
A-build-execution Area: anything dealing with executing the compiler A-cli Area: Command-line interface, option parsing, etc. A-configuration Area: cargo config files and env vars A-dep-info Area: dep-info, .d files A-documenting-cargo-itself Area: Cargo's documentation A-infrastructure Area: infrastructure around the cargo repo, ci, releases, etc. A-rebuild-detection Area: rebuild detection and fingerprinting A-unstable Area: nightly unstable support S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author.
Development

Successfully merging this pull request may close these issues.

(Option to) Fingerprint by file contents instead of mtime
9 participants