Incorrect childs generation with array of element label #1163

KvanTTT · 2016-04-07T13:30:32Z

I have the following rule in grammar:

multiply_expression
    : expression (op=('+' | '-') expression)*
    ;

But after parser generation I got the following children in Multiply_expressionContext (C# runtime):

public partial class Multiply_expressionContext : ParserRuleContext {
    public IToken op;   // Incorrect - should be array of IToken.
    ...
    public ExpressionContext[] expression() {    // Correct - array of ExpressionContext.
        return GetRuleContexts<ExpressionContext>();
    }
}

beardlybread · 2016-04-08T20:07:19Z

Is multiply_expression defined the same way as additive_expression in the grammar?

KvanTTT · 2016-04-08T23:41:37Z

Sorry, it's my fault. I replaced additive_expression with multiply_expression.

beardlybread · 2016-04-09T00:45:50Z

I fed this to the Java runtime, and I get analogous results:

grammar Agrammar;
additiveExpression : expression (op=OP expr=expression)* ;
expression : Number ;
Number : [0-9]+ ;
OP : '+' | '-' ;

Looking at the code, though, I think this is the way it's intended to work. Both symbols (op and expr) are only referenced inside what appears to be the loop that is giving the behavior of the Kleene star:

    public final AdditiveExpressionContext additiveExpression() throws RecognitionException {
        ...
        try {
            enterOuterAlt(_localctx, 1);
            {
            setState(4);
            expression();  // match the first expression
            setState(9);
            _errHandler.sync(this);
            _la = _input.LA(1);  // look ahead (I'm guessing) for an operation
            while (_la==OP) {
                {
                {
                setState(5);
                ((AdditiveExpressionContext)_localctx).op = match(OP); // grab op
                setState(6);
                ((AdditiveExpressionContext)_localctx).expr = expression();  // grab expr
                }
                }
                setState(11);
                _errHandler.sync(this);
                _la = _input.LA(1);  // fall out of loop if we don't see another OP
            }
            }
        }

So in a sense they behave like the loop variable thing in

for (Object thing: things) {
...
}

I'll play with this some more and see, but that's what my gut is telling me.
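
To make the loop-variable analogy concrete, here is a minimal plain-Java sketch that models the generated loop without the ANTLR runtime. The class LabelSketch and the method matchAll are hypothetical names invented for illustration, and strings stand in for Token objects. It only assumes what the generated code above shows: a plain label is assigned inside the loop, so each iteration overwrites the previous value, whereas a list that is appended to keeps every match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the generated context class; not ANTLR output.
final class LabelSketch {
    String op;                                  // models a plain label (op=OP)
    final List<String> ops = new ArrayList<>(); // models an accumulating label

    // Mimics the while (_la==OP) loop: one assignment per matched operator.
    void matchAll(List<String> operators) {
        for (String matched : operators) {
            op = matched;      // overwritten on every iteration
            ops.add(matched);  // accumulated across iterations
        }
    }

    public static void main(String[] args) {
        LabelSketch ctx = new LabelSketch();
        ctx.matchAll(Arrays.asList("+", "-"));
        System.out.println(ctx.op);   // only the last operator survives
        System.out.println(ctx.ops);  // every operator, in match order
    }
}
```

After the loop finishes, the plain field holds only the final operator, which matches the single IToken/Token field reported at the top of the thread.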

EDIT:

Yeah, the variables are being masked from use in the listeners because there aren't any callbacks exposed for the * inside the rule. Modifying Antlr to make it work how you want would probably be a bit tricky, though.

EDIT AGAIN:

If you do

additiveExpression : expression (opExpression)* ;
opExpression : op=OP expr=expression ;
...

you get the behavior and the listeners.

KvanTTT · 2016-04-09T07:49:17Z

I know about this workaround. But it's yet another rule, and this is what I would like to avoid.

ericvergnaud · 2016-04-09T08:12:15Z

I believe it is indeed the way it's supposed to work.
If you think about it, OP being a token, it is an invariant from the parser's point of view, so there's no need to distinguish between '+' and '-' since that is what was written in the lexer (the 2 are equivalent).
If you need to distinguish between them, then you might want to try using 2 different tokens, or a grammar rule.
(I haven't tried it, so I might be wrong)

KvanTTT · 2016-04-09T09:21:38Z

There are cases in which tokens should be distinguished. In this case I want to use the value of the operator in a Visitor:

for (int i = 1; i < context.expression().Length; i++)
{
    string opValue = context.op(i - 1).GetText();
}

Without the op label it would be tricky.
Anyway, such behaviour should be considered a bug.

ericvergnaud · 2016-04-09T12:57:03Z

Have you tried the proposed alternatives?

beardlybread · 2016-04-09T14:21:31Z

I did one more thing and tried to assign a label to the star expression directly:

grammar Agrammar;

additiveExpression : expression star=(opExpression)* ;
opExpression : op=OP expr=expression ;
expression : Number ;
Number : [0-9]+ ;
OP : '+' | '-' ;

which gives the error "label star assigned to a block which is not a set". This seems to imply that SetAST and BlockAST would need to be semantically identical from the perspective of code generation. Even if we grant that this should be considered a bug, it looks like it would require the code to fundamentally change, so I doubt it will happen. Maybe it could be considered as a major feature addition?

ericvergnaud · 2016-04-09T15:32:52Z

In the above grammar, if you visit opExpression you'll find that op contains the actual token text, so there is neither a bug nor a need for a feature.

beardlybread · 2016-04-09T17:19:29Z

@ericvergnaud Just to clarify, I was speaking hypothetically. I wasn't actually suggesting a fundamental restructuring of the Antlr AST classes would be a good "feature" to add. 😄

parrt · 2016-12-10T22:15:39Z

intended behavior. += or repeated references (not parses) give arrays.
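
As a rough illustration of the += route, assume the rule is rewritten with a list label, for example additiveExpression : expression (ops+=OP expression)* ; (the label name ops is an assumption, not something from this thread). ANTLR then collects every matched OP token in a list on the context instead of a single field. The plain-Java sketch below models how a visitor could pair the operator at index i - 1 with operand i, as in the C# loop earlier in the thread. The class AdditiveSketch and the method evaluate are hypothetical, and integers and strings stand in for ExpressionContext and Token objects.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for visitor logic; it does not use the ANTLR runtime.
final class AdditiveSketch {
    // Shape mirrors: expression (ops+=OP expression)*
    // so there is always exactly one more operand than operators.
    static int evaluate(List<Integer> operands, List<String> ops) {
        if (operands.size() != ops.size() + 1) {
            throw new IllegalArgumentException("expected one more operand than operators");
        }
        int result = operands.get(0);
        for (int i = 1; i < operands.size(); i++) {
            String op = ops.get(i - 1); // operator that precedes operand i
            switch (op) {
                case "+": result += operands.get(i); break;
                case "-": result -= operands.get(i); break;
                default: throw new IllegalArgumentException("unexpected operator: " + op);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Models the input 7 - 2 + 10
        System.out.println(evaluate(Arrays.asList(7, 2, 10), Arrays.asList("-", "+")));
    }
}
```

In a real visitor the two lists would come from the generated context (the expression() accessor and the list-label field), and the operator text from each token's getText().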

KvanTTT mentioned this issue Dec 4, 2016

Add support to label rules of mixed types #1409

Closed

parrt closed this as completed Dec 10, 2016

parrt added grammars status:invalid labels Dec 10, 2016

parrt added this to the 4.6 milestone Dec 10, 2016
