Escape issue for characters #2885

Closed
rogerfar opened this issue Aug 20, 2020 · 2 comments

Comments

@rogerfar

My grammar contains the following lexer rules for individual characters:

WS                      : (' '|'\r'|'\t'|'\u000C'|'\n') -> skip;
SQ                      : ['];
DQ                      : '\u0022';
SP                      : '\u0020';
HTAB                    : '\u0009';
CR                      : '\u000D';
LF                      : '\u000A';

When I target C#, the generated code compiles fine, with no errors from C# or ANTLR.

But when I target Java and compile the generated code, I get:

error: unclosed character literal "'false'", "'null'", null, null, "'\u0022'", "'\u0020'", "'\u0009'",

When I look in the generated Java file I see:

private static String[] makeLiteralNames() {
	return new String[] {
		null, null, null, "'datetimeoffset'", "'datetime'", "'guid'", "'true'", 
		"'false'", "'null'", null, null, "'\u0022'", "'\u0020'", "'\u0009'", 
		"'\u000D'", "'\u000A'", null, null, null, "'$'", null, null, null, null, 
		"'('", "')'", "'['", "']'", "'{'", "'}'", "'~'", null, null, "'/'", "'.'", 
		"':'", "'%'", "'@'", "'!'", "'?'", "'_'", null, null, null, null, null, 
		null, "'asc'", "'desc'", "'and'", "'or'", "'eq'", "'ne'", "'lt'", "'le'", 
		"'gt'", "'ge'", "'in'", "'nin'", "'nlike'", "'like'"
	};
}

The issue is that "'\u0022'" is not valid Java; this should be '\u0022'.
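
For context on why javac rejects that line: Java translates Unicode escapes in source text before it tokenizes anything, so the escape for U+0022 becomes a real double quote in the middle of the string literal. Below is a minimal sketch that simulates that translation step on the offending array entry. The class and method names are hypothetical, and the regex is a simplification that ignores Java's rule about escapes preceded by another backslash.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodePreprocess {
    // Matches a backslash, one or more 'u' characters, and four hex digits.
    // Simplification: does not check whether the backslash is itself escaped.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u+([0-9a-fA-F]{4})");

    // Replaces every Unicode escape with the character it denotes,
    // roughly what the compiler does before lexing.
    public static String translate(String source) {
        Matcher m = ESCAPE.matcher(source);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(source, last, m.start());
            out.append((char) Integer.parseInt(m.group(1), 16));
            last = m.end();
        }
        out.append(source, last, source.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // The source text of one generated array entry, written here with
        // escaped quotes and a doubled backslash so that this file compiles.
        String generated = "\"'\\u0022'\"";
        System.out.println(translate(generated)); // prints "'"'"
    }
}
```

After translation the compiler sees a complete string literal containing one apostrophe, followed by an apostrophe and a double quote that never get a closing apostrophe, which matches the "unclosed character literal" error.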

Now this can be mitigated by changing the grammar to:

WS                      : (' '|'\r'|'\t'|'\u000C'|'\n') -> skip;
SQ                      : ['];
DQ                      : [\u0022];
SP                      : [\u0020];
HTAB                    : [\u0009];
CR                      : [\u000D];
LF                      : [\u000A];

But is this the recommended approach or a bug?

@KvanTTT
Member

KvanTTT commented Nov 13, 2021

This looks like a duplicate of #2281; it's a bug.

> The issue is that "'\u0022'" is not valid Java; this should be '\u0022'.

No, it should be "'\\u0022'". Your literal also throws compiler errors.
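
To illustrate the corrected form: doubling the backslash stops the compiler from treating the sequence as a Unicode escape, so the string holds the six visible characters of the escape between apostrophes. The sketch below is not ANTLR's code; `LiteralNameDemo` and `displayName` are hypothetical names used only for this example.

```java
public class LiteralNameDemo {
    // With the backslash doubled, the compiler keeps the escape as plain text
    // inside the string instead of turning it into a double quote.
    public static final String DQ_NAME = "'\\u0022'";

    // Hypothetical helper: builds a display name for a single character
    // as a four-digit hex escape wrapped in apostrophes.
    public static String displayName(char c) {
        return String.format("'\\u%04X'", (int) c);
    }

    public static void main(String[] args) {
        System.out.println(DQ_NAME.length());                   // prints 8
        System.out.println(DQ_NAME.indexOf('"'));               // prints -1
        System.out.println(displayName('"').equals(DQ_NAME));   // prints true
    }
}
```

The string has eight characters (two apostrophes plus the six-character escape text) and contains no double quote, so the surrounding array initializer stays well formed.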

KvanTTT added commits to KvanTTT/antlr4 that referenced this issue on Nov 13, 2021, Dec 23, 2021, Dec 24, 2021 and Dec 27, 2021. The last three share this commit message:

… (display \r\n instead of empty line)

fixes antlr#2281, antlr#2885

Restored missed test PredFromAltTestedInLoopBack_1

@KvanTTT
Member

KvanTTT commented Dec 28, 2021

@parrt this one can also be closed.

@parrt parrt closed this as completed Dec 28, 2021
@parrt parrt added this to the 4.9.4 milestone Dec 28, 2021