Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent bumping the parser past the EOF. #32479

Merged
merged 2 commits into from
Mar 29, 2016

Conversation

eddyb
Copy link
Member

@eddyb eddyb commented Mar 25, 2016

Makes Parser::bump after EOF into an ICE, forcing callers to avoid repeated EOF bumps.
This ICE is intended to break infinite loops where EOF wasn't stopping the loop.

For example, the handling of EOF in parse_trait_items' recovery loop fixes #32446.
But even without this specific fix, the ICE is triggered, which helps diagnosis and UX.

This is a [breaking-change] for plugins authors who eagerly eat multiple EOFs.
See docopt/docopt.rs#171 for such an example and the necessary fix.

@nikomatsakis
Copy link
Contributor

@bors r+

@bors
Copy link
Contributor

bors commented Mar 25, 2016

📌 Commit 4b00d0c has been approved by nikomatsakis

@nikomatsakis nikomatsakis added beta-nominated Nominated for backporting to the compiler in the beta channel. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Mar 25, 2016
@nikomatsakis
Copy link
Contributor

@bors r-

@nikomatsakis
Copy link
Contributor

starting a crater run

if p.token == token::Semi {

// Don't bump past the EOF.
if p.token == token::Eof {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me it feels like at this point converting this chain to a match would be better here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish loop-match was a bit more first-class, but I agree.

@nikomatsakis
Copy link
Contributor

Crater report: https://gist.github.com/nikomatsakis/416a719bf29fb0fa8028

4431 crates tested: 2819 working / 1508 broken / 7 regressed / 1 fixed / 96 unknown.

haven't investigated yet.

@eddyb
Copy link
Member Author

eddyb commented Mar 25, 2016

Looks like this breaks syntax extensions (regex, peg and docopt, at least).

@eddyb eddyb force-pushed the eof-not-even-twice branch 2 times, most recently from 88c6a99 to 27a3e24 Compare March 26, 2016 01:06
@nikomatsakis
Copy link
Contributor

cc @BurntSushi @kevinmehall -- curious to get your take here. In response to various bugs where the parser was looping infinitely, @eddyb implemented a change that caused the bump method to panic if you attempt to bump past EOF. This breaks some of your syntax extension crates. We're debating whether to land it and under what conditions. (One option might be to modify your crates.)

@eddyb, I know you were attempting to modify the PR to only abort if we bumped past EOF multiple times, what was the outcome of those experiments?

@eddyb
Copy link
Member Author

eddyb commented Mar 26, 2016

@nikomatsakis Actually, that's the current state, but parse_unspanned_seq is giving me trouble because it's unconditionally bumping a token, even if it's not the ket and the travis failure was because that changes how errors are reported (presumably due to the "expected tokens" list which check, used by eat, pushes onto).

I realize right now that I can just handle the error case by not bumping, and I just pushed that change.

EDIT: Hah, it's not that easy, you can have both errors and a } to bump.

@eddyb eddyb force-pushed the eof-not-even-twice branch 2 times, most recently from 8012a49 to 64936b9 Compare March 26, 2016 13:58
@BurntSushi
Copy link
Member

@nikomatsakis I'm totally fine with breakage of plugins if there's a path to fixing them.

With that said, what part of the parsing code would break? Regex's interaction with the parser is quite small: https://github.com/rust-lang-nursery/regex/blob/master/regex_macros/src/lib.rs#L567

I guess it's clearer where it breaks in docopt: https://github.com/docopt/docopt.rs/blob/master/docopt_macros/src/macro.rs#L234

@eddyb
Copy link
Member Author

eddyb commented Mar 26, 2016

@BurntSushi The original changes would make parser.eat(&token::Eof) always ICE - but that seemed impractical so I'm trying something which will only trigger on the second EOF bump, not the first one.

@kevinmehall
Copy link
Contributor

peg uses parser.eat to obtain the string literal from a peg!("grammar goes here") invocation and check that no additional arguments are passed. If there's a better way to do that, I'm happy to change it. Unstable interfaces are unstable, after all.

@eddyb
Copy link
Member Author

eddyb commented Mar 26, 2016

@kevinmehall Changing that shouldn't be necessary with my latest attempt - just waiting for travis to confirm everything's good with the tests before starting another crater run.

@eddyb
Copy link
Member Author

eddyb commented Mar 26, 2016

@nrc See my changes to compile-fail tests. I believe that if we switch away from parsing everything as TTs first we can handle ; closing (, for example:

{ option.map(|some| 42; }

It currently parses as the following TTs:

{ option.map(|some| 42;) }

It accidentally gave relatively sane errors because parse_unspanned_seq was bumping the ; as if it was ) for the method call, resulting in the same effect as this:

{ option.map(|some| 42)) }

I fixed that so it behaves as it should, given what TTs we end up with, but if we didn't parse it to TTs, we could recover a statement based on ; and parse as:

{ option.map(|some| 42); }

Which is arguably what the intention was, and save for a single parse error, the compilation could continue unhindered.

@nikomatsakis
Copy link
Contributor

Sounds good

On Sat, Mar 26, 2016 at 05:31:54AM -0700, Eduard-Mihai Burtescu wrote:

@nikomatsakis Actually, that's the current state, but parse_unspanned_seq is giving me trouble because it's unconditionally bumping a token, even if it's not the ket and the travis failure was because that changes how errors are reported (presumably due to the "expected tokens" list which check, used by eat, pushes onto).

I realize right now that I can just handle the error case by not bumping, and I just pushed that change.


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
#32479 (comment)

@nrc
Copy link
Member

nrc commented Mar 28, 2016

lgtm

@eddyb
Copy link
Member Author

eddyb commented Mar 28, 2016

@bors r=nikomatsakis

@bors
Copy link
Contributor

bors commented Mar 28, 2016

📌 Commit 221d0fb has been approved by nikomatsakis

@bors
Copy link
Contributor

bors commented Mar 29, 2016

⌛ Testing commit 221d0fb with merge a111297...

bors added a commit that referenced this pull request Mar 29, 2016
Prevent bumping the parser past the EOF.

Makes `Parser::bump` after EOF into an ICE, forcing callers to avoid repeated EOF bumps.
This ICE is intended to break infinite loops where EOF wasn't stopping the loop.

For example, the handling of EOF in `parse_trait_items`' recovery loop fixes #32446.
But even without this specific fix, the ICE is triggered, which helps diagnosis and UX.

This is a `[breaking-change]` for plugins authors who eagerly eat multiple EOFs.
See docopt/docopt.rs#171 for such an example and the necessary fix.
@bors bors merged commit 221d0fb into rust-lang:master Mar 29, 2016
@eddyb eddyb deleted the eof-not-even-twice branch March 29, 2016 06:07
@nrc nrc added the beta-accepted Accepted for backporting to the compiler in the beta channel. label Mar 31, 2016
@alexcrichton alexcrichton removed the beta-nominated Nominated for backporting to the compiler in the beta channel. label Apr 7, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
beta-accepted Accepted for backporting to the compiler in the beta channel. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Regression on beta: three dots in a trait leads to an infinite loop(?)
8 participants