Report InputMismatchException with original context information #1969

sharwell · 2017-07-26T12:24:43Z

Fixes #1922

📝 This needs more test cases, particularly for LL(k) failures where k>1. However, my current understanding is these cases have been impacted by this bug for much longer (since 4.2.2) than the issue being addressed for the k=1 case.

sharwell · 2017-07-26T12:33:17Z

@Et999 @michaelpj @PayneSun @marcospassos It should be possible to apply this change to your code by extending DefaultErrorStrategy and overriding sync and reportInputMismatch. Can one of you try this and let me know if it resolves the issue in your test suites? Since the InputMismatchException constructor used in this code is not available, you'll have to do a bit of trickery to make this work right:

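// nextTokensState and nextTokensContext are the values recorded by the overridden sync.
int priorState = recognizer.getState();
ParserRuleContext priorContext = recognizer.getContext();
try {
  // Temporarily restore the state and context of the original failure point,
  // because InputMismatchException(recognizer) reads both from the recognizer.
  recognizer.setState(nextTokensState);
  recognizer.setContext(nextTokensContext); // _ctx is protected, so use the public setter
  throw new InputMismatchException(recognizer);
} finally {
  // Put the parser back the way it was while the exception propagates.
  recognizer.setState(priorState);
  recognizer.setContext(priorContext);
}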
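
For readers without the ANTLR runtime at hand, the swap-and-restore idea in the snippet above can be sketched in isolation. This is only an illustration under stated assumptions: `FakeRecognizer` and `FakeMismatchException` are hypothetical stand-ins, not ANTLR classes, and they model just one property, namely that the exception snapshots the recognizer's state and context in its constructor.

```java
public class ContextSwapSketch {
    /** Hypothetical stand-in for a parser: a mutable state number and a context name. */
    static final class FakeRecognizer {
        int state;
        String context;

        FakeRecognizer(int state, String context) {
            this.state = state;
            this.context = context;
        }
    }

    /** Hypothetical stand-in for an exception that snapshots the recognizer when constructed. */
    static final class FakeMismatchException extends RuntimeException {
        final int offendingState;
        final String offendingContext;

        FakeMismatchException(FakeRecognizer r) {
            super("mismatched input at state " + r.state + " in " + r.context);
            this.offendingState = r.state;
            this.offendingContext = r.context;
        }
    }

    /**
     * Throws an exception that reports savedState/savedContext, then restores
     * the recognizer's current values in the finally block.
     */
    static void throwWithOriginalContext(FakeRecognizer r, int savedState, String savedContext) {
        int priorState = r.state;
        String priorContext = r.context;
        try {
            r.state = savedState;
            r.context = savedContext;
            throw new FakeMismatchException(r); // snapshot taken here
        } finally {
            r.state = priorState;
            r.context = priorContext;
        }
    }

    public static void main(String[] args) {
        FakeRecognizer r = new FakeRecognizer(42, "expression");
        try {
            throwWithOriginalContext(r, 7, "variableDeclarator");
        } catch (FakeMismatchException e) {
            // The exception carries the saved values; the recognizer keeps its own.
            System.out.println(e.getMessage());
            System.out.println(r.state + " " + r.context);
        }
    }
}
```

The `finally` block is what keeps the workaround safe: the exception is built from the saved values, but the recognizer is back in its current state by the time any handler runs.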

⚠️ Note that the message does still change in some cases relative to 4.6 - you get "mismatched input" instead of "extraneous token". However, the expected set should be reporting good information again.

michaelpj · 2017-07-26T17:57:39Z

Thanks enormously for looking into this! I'll give this a try and see how it looks.

michaelpj · 2017-07-27T14:25:40Z

Built locally, tested. Fixes the regressions in 4.7, and actually improves the errors in several other cases. Big 👍 from me.

Fixes antlr#1922

parrt · 2017-07-29T21:59:55Z

@Et999 and @marcospassos can you build locally and confirm this works for you?

sharwell · 2017-07-30T03:29:33Z

@parrt After confirmation, I'll start the work to implement this for other targets.

marcospassos · 2017-07-30T14:13:33Z

I'll test it by tomorrow :)

parrt · 2017-07-30T15:51:23Z

@antlr/antlr-targets please note Sam's improvements to the error reporting in this PR.

parrt · 2017-07-30T18:06:25Z

ugh. we appear to be getting random travis fluctuations

marcospassos · 2017-08-01T00:30:57Z

@sharwell we caught 160+ tests failing due to the BC break introduced in 4.7, even applying the patch you provided.

The errors can be grouped into the following cases:

No viable alternative at <EOF> is now Mismatched input <EOF> (no big deal)
Missing token ) is now Mismatched input <EOF> (worse error reporting)
Mismatched input <EOF> is now Extraneous token <EOF> (really weird)

Besides this, we found several cases where error recovery got worse. Given the following grammar and invalid input, notice how the error recovery produces worse guesses:

root: variableDeclarationList? expression;

variableDeclarationList
    : variableDeclaration + ;

variableDeclaration
    : 'let' variableDeclarator (',' variableDeclarator)* ';' ;

variableDeclarator
    : identifier '=' expression ;

expression: ...;

Invalid input:

let a = , b = 2; true

In 4.6, the input above was recognized as:

let a = b; true

While in 4.7 the same input is recognized as:

let a = b;  2

This difference is probably related to the same issue, right?

sharwell · 2017-08-01T00:33:55Z

@marcospassos If you are testing error messages, I would not expect this change to reduce the number of "regressing tests", since the messages will still change. Is your project open source so I can help triage the differences?

Also, before we go too far we should make sure the change was properly applied since the result discrepancy between @michaelpj and @marcospassos is so large.

💭 I notice that your examples all mention EOF. Do all of your failing tests involve EOF handling?

marcospassos · 2017-08-01T00:41:54Z

@sharwell the project is not open source. However, we're not testing only error messages, but also the generated parse tree. Why don't you consider the error message a BC break? This fix not only changed the error messages but also made the parser produce different parse trees, which is a serious BC break.

Btw, is "Extraneous token <EOF>" supposed to be a valid error message?

sharwell · 2017-08-01T00:49:19Z

@marcospassos I sent you an email 👍

sharwell · 2017-08-01T01:58:39Z

After talking with @marcospassos, it appears the negative outcomes from this experiment were likely influenced by uncommon characteristics of the language being parsed. In particular:

Input text is typically short, so error handling strategies that favor error locality over error recovery work better
Input is closer to natural language than most, so distinctive tokens that tend to yield good error recovery in many languages (e.g. the braces in Java) don't help for this one

For this case, I suggested creating a type derived from DefaultErrorStrategy using the implementation of sync prior to my change, since it yields better results for this language.

marcospassos · 2017-08-01T02:07:19Z

Thank you @sharwell!

parrt · 2017-08-01T20:36:04Z

Ok, sounds like we have no objections to this improvement.

parrt · 2017-08-01T20:36:20Z

Thanks, @sharwell !!

sharwell force-pushed the propagate-error-sets branch from 278983f to 9a04d09 (July 26, 2017 18:19)

Report InputMismatchException with original context information

0803c74

Fixes antlr#1922

sharwell force-pushed the propagate-error-sets branch from 9a04d09 to 0803c74 (July 27, 2017 23:34)

parrt added this to the 4.7.1 milestone Jul 30, 2017

parrt added error-handling target:java type:improvement labels Jul 30, 2017

parrt merged commit a042180 into antlr:master Aug 1, 2017

sharwell deleted the propagate-error-sets branch August 1, 2017 23:25

michaelpj mentioned this pull request Nov 9, 2017

ANTLR 4.7.1? #2109

Open
