Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[6.2.0]Automatically retry the build if encountered remote cache eviction error #17953

Closed

Conversation

ShreeM01
Copy link
Contributor

@ShreeM01 ShreeM01 commented Apr 3, 2023

With #17358, Bazel will exit with code 39 if remote cache evicts blobs during the build. With #17462 and #17747, Bazel is able to continue the build without bazel clean or bazel shutdown.

However, even with #17639 and following changes to extend the lease, remote cache can still evict blobs in some rare cases.

Based on above changes, this PR makes bazel retry the invocation if it encountered the remote cache eviction error during previous invocation if --experimental_remote_cache_eviction_retries is set, or build rewinding.

$ bazel build --experimental_remote_cache_eviction_retries=5 ...
INFO: Invocation ID: b7348bfa-9446-4c72-a888-0a0ad012f225
Loading:
Loading:
Loading: 0 packages loaded
Analyzing: target //a:bar (0 packages loaded, 0 targets configured)
INFO: Analyzed target //a:bar (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
[0 / 2] [Prepa] BazelWorkspaceStatusAction stable-status.txt
ERROR: .../workspace/a/BUILD:8:8: Executing genrule //a:bar failed: Failed to fetch blobs because they do not exist remotely: Missing digest: b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c/4
Target //a:bar failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.447s, Critical Path: 0.05s
INFO: 2 processes: 2 internal.
ERROR: Build did NOT complete successfully
Found remote cache eviction error, retrying the build...
INFO: Invocation ID: 983f60dc-8bb9-4b82-aa33-a378469ce140
Loading:
Loading:
Loading: 0 packages loaded
Analyzing: target //a:bar (0 packages loaded, 0 targets configured)
INFO: Analyzed target //a:bar (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
[0 / 2] [Prepa] BazelWorkspaceStatusAction stable-status.txt
Target //a:bar up-to-date:
  bazel-bin/a/bar.out
INFO: Elapsed time: 0.866s, Critical Path: 0.35s
INFO: 3 processes: 1 internal, 1 processwrapper-sandbox, 1 remote.
INFO: Build completed successfully, 3 total actions
$

Part of #16660.

Closes #17711.
Commit: 24b458

PiperOrigin-RevId: 520610524
Change-Id: I20d43d1968767a03250b9c8f8a6dda4e056d4f52

coeuvre and others added 2 commits March 31, 2023 12:06
With bazelbuild#17358, Bazel will exit with code 39 if remote cache evicts blobs during the build. With bazelbuild#17462 and bazelbuild#17747, Bazel is able to continue the build without bazel clean or bazel shutdown.

However, even with bazelbuild#17639 and following changes to extend the lease, remote cache can still evict blobs in some rare cases.

Based on above changes, this PR makes bazel retry the invocation if it encountered the remote cache eviction error during previous invocation if `--experimental_remote_cache_eviction_retries` is set, or **build rewinding**.

```
$ bazel build --experimental_remote_cache_eviction_retries=5 ...
INFO: Invocation ID: b7348bfa-9446-4c72-a888-0a0ad012f225
Loading:
Loading:
Loading: 0 packages loaded
Analyzing: target //a:bar (0 packages loaded, 0 targets configured)
INFO: Analyzed target //a:bar (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
[0 / 2] [Prepa] BazelWorkspaceStatusAction stable-status.txt
ERROR: .../workspace/a/BUILD:8:8: Executing genrule //a:bar failed: Failed to fetch blobs because they do not exist remotely: Missing digest: b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c/4
Target //a:bar failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.447s, Critical Path: 0.05s
INFO: 2 processes: 2 internal.
ERROR: Build did NOT complete successfully
Found remote cache eviction error, retrying the build...
INFO: Invocation ID: 983f60dc-8bb9-4b82-aa33-a378469ce140
Loading:
Loading:
Loading: 0 packages loaded
Analyzing: target //a:bar (0 packages loaded, 0 targets configured)
INFO: Analyzed target //a:bar (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
[0 / 2] [Prepa] BazelWorkspaceStatusAction stable-status.txt
Target //a:bar up-to-date:
  bazel-bin/a/bar.out
INFO: Elapsed time: 0.866s, Critical Path: 0.35s
INFO: 3 processes: 1 internal, 1 processwrapper-sandbox, 1 remote.
INFO: Build completed successfully, 3 total actions
$
```

Part of bazelbuild#16660.

Closes bazelbuild#17711.

PiperOrigin-RevId: 520610524
Change-Id: I20d43d1968767a03250b9c8f8a6dda4e056d4f52
@sgowroji sgowroji added the team-Remote-Exec Issues and PRs for the Execution (Remote) team label Apr 3, 2023
@ShreeM01 ShreeM01 requested a review from keertk April 3, 2023 15:12
@coeuvre
Copy link
Member

coeuvre commented Apr 4, 2023

@kshyanashree I think there are some deps missing from 6.2 branch. Let me do the cherry-picks.

@ShreeM01 ShreeM01 closed this Apr 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team-Remote-Exec Issues and PRs for the Execution (Remote) team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants