Rust: Improve lines-of-code counts. #17588

Open
wants to merge 3 commits into
base: main

geoffw0 (Contributor) commented Sep 25, 2024

Improve lines-of-code counts:

  • remove Module from the count, which was spuriously adding +1 LOC in main.rs.
  • add back calls to println! in the tests, since these should be deterministically (if not correctly) extracted now.
  • coincidentally, the total LOC count changes after the first commit but changes back to exactly the same value after the second. These changes are not doing nothing!

@aibaars - also note that we have a spuriously located MacroRules (on my_macro.rs line 1) and Struct (on my_struct.rs line 2), which I discovered while debugging this query. I haven't done anything about them; I believe the right fix would be in the extractor.

Obviously types are still not extracted, which is also affecting the counts.

geoffw0 added the Rust label (Pull requests that update Rust code) Sep 25, 2024
@@ -44,7 +45,9 @@ class File extends Container, Impl::File {
result =
count(int line |
exists(AstNode node, Location loc |
not node instanceof SourceFile and loc = node.getLocation()
not node instanceof Module and
Contributor

Please revert this change. The module isn't the problem. The problem is the /* */ comments, which syntactically belong to the item (mod, struct, macrorule, etc.) that follows them. Just verify for yourself using the AST viewer.
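To illustrate the point about attached comments, here is a minimal sketch in Rust. It assumes lines of code are counted as the distinct start and end lines of AST node locations; the `Node` type, `count_lines`, and `toy_nodes` are hypothetical names for this illustration and are not part of the CodeQL library or the extractor.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for an extracted AST node: only its line span matters here.
struct Node {
    start_line: u32,
    end_line: u32,
}

/// Counts the distinct lines on which some node starts or ends.
fn count_lines(nodes: &[Node]) -> usize {
    let mut lines = BTreeSet::new();
    for node in nodes {
        lines.insert(node.start_line);
        lines.insert(node.end_line);
    }
    lines.len()
}

/// Toy nodes for this four-line file:
///   1: /* a comment */
///   2: struct S {
///   3:     x: i32,
///   4: }
/// `struct_start` is the line on which the struct node's location begins.
fn toy_nodes(struct_start: u32) -> Vec<Node> {
    vec![
        Node { start_line: struct_start, end_line: 4 }, // the struct item
        Node { start_line: 2, end_line: 2 },            // the name `S`
        Node { start_line: 3, end_line: 3 },            // the field `x: i32`
    ]
}

fn main() {
    // Struct location starts at the `struct` keyword: lines 2, 3 and 4 are counted.
    println!("without attached comment: {}", count_lines(&toy_nodes(2)));
    // Struct location starts at the attached comment: line 1 is counted as well.
    println!("with attached comment: {}", count_lines(&toy_nodes(1)));
}
```

In this toy model the comment line is counted only because the struct's location starts there, which matches the +1 seen in the counts; excluding a whole node class would not address that.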

Contributor Author

Hmm, I see what you mean now: Module is a module declaration, not a module. I'm going to have to get used to that.

Contributor Author

Done.

Is there an easy way to use the AST viewer on a test?

Is there an issue for the location problems (or should I create one)?

Contributor

I think the AST viewer works fine on test databases these days. I just built a database by running codeql test run --keep-databases, loaded it in VS Code, ran the query from File f select f, clicked on a file in the results, and right-clicked "View AST" in that file.

Contributor Author

That method for tests is a bit of a fiddle unfortunately, but good to know the AST viewer now works for Rust. I had investigated using an ad hoc query to print all the AstNode.toString()s associated with each line of each file, which worked - though I didn't get a hierarchy.

Anyway, I've created an issue for the location problems.

aibaars (Contributor) commented Sep 27, 2024

In general, using the start and end lines of AstNodes to count lines will never work 100%. Currently we're overcounting a bit for items with attached comments. However, we may also change the location of top-level items to be the span of their names only. That gives a much better user experience (we highlight the name of the function instead of the entire function body). Another tricky thing to deal with is multi-line string literals. The only reliable way to calculate LOC is to look at tokens, which we do not extract (though we might at some point).

A pretty simple way to deal with this problem is to ignore the start line for top-level items. That would undercount in cases like

fn
my_function() {...}

but that would be extremely rare.
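The heuristic of ignoring the start line of top-level items can be sketched as follows. This is a toy model in Rust under stated assumptions, not the query itself: the `Node` type, its `top_level` flag, and the helper functions are hypothetical, and lines are counted as the distinct start and end lines of node locations.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for an extracted AST node.
struct Node {
    start_line: u32,
    end_line: u32,
    /// True for top-level items (functions, structs, and so on).
    top_level: bool,
}

/// Counts distinct start/end lines, skipping the start line of top-level items.
fn count_lines_skipping_item_starts(nodes: &[Node]) -> usize {
    let mut lines = BTreeSet::new();
    for node in nodes {
        if !node.top_level {
            lines.insert(node.start_line);
        }
        lines.insert(node.end_line);
    }
    lines.len()
}

/// 1: /* a comment */
/// 2: struct S {
/// 3:     x: i32,
/// 4: }
fn commented_struct() -> Vec<Node> {
    vec![
        Node { start_line: 1, end_line: 4, top_level: true },  // struct, comment attached
        Node { start_line: 2, end_line: 2, top_level: false }, // the name `S`
        Node { start_line: 3, end_line: 3, top_level: false }, // the field `x: i32`
    ]
}

/// 1: fn
/// 2: my_function() {...}
fn split_fn() -> Vec<Node> {
    vec![
        Node { start_line: 1, end_line: 2, top_level: true },  // the function item
        Node { start_line: 2, end_line: 2, top_level: false }, // the name
        Node { start_line: 2, end_line: 2, top_level: false }, // the body
    ]
}

fn main() {
    // The comment line is no longer counted: lines 2, 3 and 4 remain.
    println!("commented struct: {}", count_lines_skipping_item_starts(&commented_struct()));
    // Only line 2 is counted, although the file has two lines of code.
    println!("split fn: {}", count_lines_skipping_item_starts(&split_fn()));
}
```

The first case shows the overcount disappearing; the second reproduces the undercount described above, which only occurs when nothing else starts or ends on the item's first line.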

aibaars (Contributor) left a comment

Changes look good to me. However, you could implement special handling of top-level items to avoid counting doc comments as a line.
