Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce macro usage for lints #104863

Merged
merged 9 commits into from
Dec 2, 2022
Merged

Conversation

nnethercote
Copy link
Contributor

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Nov 25, 2022
@nnethercote
Copy link
Contributor Author

Commit 2 removes the -Zno-interleave-lints flag. I'm not sure if this is a good idea, but I was curious about how much simplification could occur with its removal. The answer is: quite a bit.

@Noratrieb
Copy link
Member

The option was added in #57726 to enable better benchmarking of lints.

@nnethercote
Copy link
Contributor Author

Sure... the question is whether it's useful. Is anyone benchmarking lints with this option?

@jyn514
Copy link
Member

jyn514 commented Nov 29, 2022

r=me for the first commit; if you want to keep the others I think you should make an MCP for removing the flag.

Btw, IMO it would be nice to remove the 'tcx macro argument and just hard-code tcx everywhere it's used, I don't see the advantage of splitting the two.

Copy link
Contributor

@cjgillot cjgillot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really concerned about -Zno-interleave-lints.
It has quite a significant complexity cost for a profiling option, and I'm not sure anybody used it besides who introduced it.
tracing is proably enough for timings, and profiling tools do not need this.

compiler/rustc_lint/src/late.rs Outdated Show resolved Hide resolved
compiler/rustc_lint/src/late.rs Outdated Show resolved Hide resolved
compiler/rustc_lint/src/late.rs Outdated Show resolved Hide resolved
compiler/rustc_lint/src/late.rs Outdated Show resolved Hide resolved
compiler/rustc_lint/src/early.rs Outdated Show resolved Hide resolved
@nnethercote
Copy link
Contributor Author

I have added five new commits that put builtins in with other passes, which then allows some functions to be inlined and removed.

I feel like this doesn't need an MCP. There was no MCP for #102725 which was similar to this -- removing an obscure unstable flag that is only used for compiling rustc itself.

@rust-log-analyzer

This comment has been minimized.

@nnethercote
Copy link
Contributor Author

nnethercote commented Dec 1, 2022

Odd, I don't get these errors when I compile locally:

error[E0277]: `impl EarlyLintPass + 'static` cannot be sent between threads safely
    |
378 |     passes.push(Box::new(builtin_lints));
378 |     passes.push(Box::new(builtin_lints));
    |                 ^^^^^^^^^^^^^^^^^^^^^^^ `impl EarlyLintPass + 'static` cannot be sent between threads safely
    |
    = note: required for the cast from `impl EarlyLintPass + 'static` to the object type `dyn EarlyLintPass + Send`
    |
    |
372 |     builtin_lints: impl EarlyLintPass + 'static + std::marker::Send,

@nnethercote
Copy link
Contributor Author

Looks like the failure was in a parallel-compiler = true build. I've added another commit to fix that.

@rust-log-analyzer

This comment has been minimized.

@nnethercote
Copy link
Contributor Author

Some tests that only run with --stage=2 were failing. Turns out we can't merge the early lints into a single pass, presumably because the second argument to early_lint_node differs in the current two call sites.

@cjgillot
Copy link
Contributor

cjgillot commented Dec 1, 2022

The second argument to early_lint_node just says whether to warn unknown lints. It should be true exactly once in the runtime, so keeping !pre_expansion should be it.

@nnethercote
Copy link
Contributor Author

Well, I got those ui-fulldeps failures with the early lint changes.

The lint definitions use macros heavily. This commit merges some of them
that are split unnecessarily. I find the reduced indirection makes it
easier to imagine what the generated code will look like.
Because it complicates lint implementation greatly.
These were enabled by the removal of `-Zno-interleave-lints`.
This avoids calling the `late_lint_{mod_pass,pass_crate}` twice.
It has a single call site.
It has a single call site.
Required to get the parallel compiler building again.
@nnethercote
Copy link
Contributor Author

I tried the early lint change again. Most of the ui-fulldeps failures are resolvable, because the change causes some duplicate warnings to be removed, which is fine.

But there is one remaining bizarre failure:

---- [ui] src/test/ui-fulldeps/mod_dir_path_canonicalized.rs stdout ----

error: test run failed!
status: exit status: 101
command: "/home/njn/dev/rust1/build/x86_64-unknown-linux-gnu/test/ui-fulldeps/mod_dir_path_canonicalized/a"
stdout: none
--- stderr -------------------------------
thread 'main' panicked at 'WorkerLocal can only be used on the thread pool it was created on', /home/njn/.cargo/registry/src/github.com-1ecc6299db9ec823/rustc-rayon-core-0.4.1/src/worker_local.rs:49:17
stack backtrace:
   0: std::panicking::begin_panic::<&str>
   1: rustc_ast::attr::mk_attr_from_item
   2: <rustc_parse::parser::Parser>::collect_tokens_trailing_token::<rustc_ast::ast::Attribute, <rustc_parse::parser::Parser>::collect_tokens_no_attrs<rustc_ast::ast::Attribute, <rustc_parse::parser::Parser>::parse_attribute::{closure#0}>::{closure#0}>
   3: <rustc_parse::parser::Parser>::parse_inner_attributes
   4: <rustc_parse::parser::Parser>::parse_mod
   5: <rustc_parse::parser::Parser>::parse_crate_mod
   6: mod_dir_path_canonicalized::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
------------------------------------------

failures:
    [ui] src/test/ui-fulldeps/mod_dir_path_canonicalized.rs

@nnethercote
Copy link
Contributor Author

Oh, I was accidentally using a parallel compiler build, sigh.

This avoids calling `early_lint_node` twice.

Note: one `early_lint_node` call had `!pre_expansion` for the second
argument and the other had `false`. The new single call just has
`!pre_expansion`. This results in a reduction of duplicate error
messages in some `ui-fulldeps` tests. The order of some `ui-fulldeps`
output also changes, but that doesn't matter.
It has a single call site.
@nnethercote
Copy link
Contributor Author

Ok, the commits to merge early_lint_node are now working and ready for review.

@cjgillot
Copy link
Contributor

cjgillot commented Dec 2, 2022

@bors r+ rollup=iffy

@bors
Copy link
Contributor

bors commented Dec 2, 2022

📌 Commit 406dace has been approved by cjgillot

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 2, 2022
@bors
Copy link
Contributor

bors commented Dec 2, 2022

⌛ Testing commit 406dace with merge e960b5e...

@bors
Copy link
Contributor

bors commented Dec 2, 2022

☀️ Test successful - checks-actions
Approved by: cjgillot
Pushing e960b5e to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Dec 2, 2022
@bors bors merged commit e960b5e into rust-lang:master Dec 2, 2022
@rustbot rustbot added this to the 1.67.0 milestone Dec 2, 2022
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (e960b5e): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.5% [0.2%, 1.4%] 116
Regressions ❌
(secondary)
0.9% [0.1%, 2.3%] 83
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.5% [0.2%, 1.4%] 116

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
3.2% [3.2%, 3.2%] 1
Improvements ✅
(primary)
-2.7% [-2.7%, -2.7%] 1
Improvements ✅
(secondary)
-2.5% [-3.6%, -1.4%] 2
All ❌✅ (primary) -2.7% [-2.7%, -2.7%] 1

Cycles

This benchmark run did not return any relevant results for this metric.

@rustbot rustbot added the perf-regression Performance regression. label Dec 2, 2022
@nnethercote
Copy link
Contributor Author

Ugh, that's unexpected. Lots of instruction count regressions, but cycles and wall times didn't change. I'll take a look on Monday.

@nnethercote nnethercote deleted the reduce-lint-macros branch December 4, 2022 21:33
@nnethercote
Copy link
Contributor Author

These were the two commits that caused the regressions:

890c5ead204 Merge `builtins` into `LateLintPassObjects`.
a9b02af62b0 Merge `builtins` into `EarlyLintPassObjects`.

The cachegrind diffs indicate that it's not because any extra work was being done. It just appears to be due to minor changes in code generation, mostly around enter_lint_attrs/exit_lint_attrs.

@nnethercote
Copy link
Contributor Author

#105291 improved things slightly, though less than I had hoped based on local measurements.

I have another idea, an idea which I had before I wrote this PR, and and idea for which this PR is a prerequisite. Currently when processing a node such as an expression, we call check_expr for every lint pass, even those which have an empty check_expr methods. And because we use dynamic dispatch for this, the cost of a call to an empty method isn't zero. And we have over 30 check_* methods, and a couple of dozen lint passes, and most of them only define a small number of the check_* methods. Which means that there are a lot of calls to empty check_* methods. It may be possible to avoid these.

@nnethercote
Copy link
Contributor Author

I have another idea, an idea which I had before I wrote this PR, and and idea for which this PR is a prerequisite. Currently when processing a node such as an expression, we call check_expr for every lint pass, even those which have an empty check_expr methods. And because we use dynamic dispatch for this, the cost of a call to an empty method isn't zero. And we have over 30 check_* methods, and a couple of dozen lint passes, and most of them only define a small number of the check_* methods. Which means that there are a lot of calls to empty check_* methods. It may be possible to avoid these.

I was wrong about this. Well, it's relevant for clippy, but for rustc all the builtin lints get merged into a small number of "combined" lints via a macro-based mechanism, which lets the empty check_* methods be inlined and thus disappear.

I also tried tweaking some inlining in #105416. It helped locally, but didn't make much difference for a CI perf run.

@nnethercote
Copy link
Contributor Author

#105485 should fix the perf regression properly, by reverting part of this PR's changes.

@nnethercote
Copy link
Contributor Author

@rustbot label: +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Dec 9, 2022
nnethercote added a commit to nnethercote/rust that referenced this pull request Dec 11, 2022
This commit partly undoes rust-lang#104863, which combined the builtin lints pass
with other lints. This caused a slowdown, because often there are no
other lints, and it's faster to do a pass with a single lint directly
than it is to do a combined pass with a `passes` vector containing a
single lint.
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 12, 2022
…s, r=cjgillot

Fix lint perf regressions

rust-lang#104863 caused small but widespread regressions in lint performance. I tried to improve things in rust-lang#105291 and rust-lang#105416 with minimal success, before fully understanding what caused the regression. This PR effectively reverts all of rust-lang#105291 and part of rust-lang#104863 to fix the perf regression.

r? `@cjgillot`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants