Lexer and Parser in different maven modules #1779

Open
rslemos opened this issue Mar 21, 2017 · 4 comments

rslemos commented Mar 21, 2017

For a sufficiently complex lexer and/or parser, with lots of supporting code or automated test cases, it would be reasonable to keep them in separate Maven modules and have the parser-module depend on the lexer-module at compile time.

ANTLR4 (and its Maven plugin) mostly supports this, with the exception of the tokenVocab file: it is written to lexer-module/target/generated-sources/antlr4, but when the parser is generated it is searched for in parser-module/target/generated-sources/antlr4.

To complicate matters a bit, a Maven build can be invoked in two ways:

  • build only a single module (the other modules should be installed at the local repo);
  • through the reactor, to build the whole project.

Maven handles both cases nicely, adding either ~/.m2/repository/..../lexer-module.jar or ../lexer-module/target/classes to the parser-module compilation classpath.

I wonder whether it would be reasonable to also look for .tokens files on the classpath. Conceptually, it would mean that a lexer package could provide the full information needed both at runtime (the generated and compiled class files) and at compile time (the tokens file, consumed by antlr4 when generating a parser).

The only changes needed would be:

  • in antlr4 itself, the code that searches for the tokens file should also look on the classpath;
  • the antlr4 Maven plugin should make the module's dependencies available on the tool's classpath.
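
To make the two points above concrete, here is a minimal sketch of the idea. It is not the code from the patch: the class TokenVocabLocator, its method names and the "library directory first, classpath second" order are all assumptions made for illustration. It only uses standard JDK calls (URLClassLoader and ClassLoader.getResourceAsStream).

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper for illustration; not part of ANTLR4 or its Maven plugin.
final class TokenVocabLocator {

    // Plugin side (assumed): build a class loader over the module's dependencies.
    // Each entry is either a jar or an existing directory such as
    // ../lexer-module/target/classes. Path.toUri() adds the trailing slash that
    // URLClassLoader needs for directories, as long as the directory exists.
    static ClassLoader loaderFor(List<Path> classpathEntries) throws IOException {
        URL[] urls = new URL[classpathEntries.size()];
        for (int i = 0; i < urls.length; i++) {
            urls[i] = classpathEntries.get(i).toUri().toURL();
        }
        // Null parent: only the given entries (plus the JDK itself) are visible.
        return new URLClassLoader(urls, null);
    }

    // Tool side (assumed): look in the usual directory first, then fall back to
    // the classpath. Returns null when the tokens file is found in neither place.
    static InputStream open(String vocabName, Path libDirectory, ClassLoader loader)
            throws IOException {
        String fileName = vocabName + ".tokens";
        Path local = libDirectory.resolve(fileName);
        if (Files.isRegularFile(local)) {
            return Files.newInputStream(local);
        }
        return loader.getResourceAsStream(fileName);
    }
}
```

With this shape, the same lookup works whether Maven hands the plugin the installed lexer-module jar or the reactor's target/classes directory, since both are ordinary classpath entries.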

I have made those changes in rslemos@b698019.

I would like to hear from you about both the concept and the proposed patch.

rslemos commented Mar 21, 2017

In addition to incorporating the changes I've proposed, users would need to add the following to their lexer-module's pom.xml:

        <build>
                <resources>
                        <resource>
                                <directory>target/generated-sources/antlr4</directory>
                                <includes>
                                        <include>*.tokens</include>
                                </includes>
                        </resource>
                </resources>
        </build>

This way the generated .tokens file gets copied to target/classes.

Everything else (packaging, classpath handling and so on) would be handled by Maven itself.

rslemos commented Mar 21, 2017

Personally, I think a more reasonable place to store the .tokens file would be inside META-INF/antlr4.

Although changing the antlr4 code to look in that folder would be easy, actually moving the file there is harder in Maven (not that it is really hard, but it takes a lot more lines than a single, simple <resource> element).
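
As a rough sketch of what "look in that folder" could mean on the antlr4 side: try the META-INF/antlr4 location first and fall back to the classpath root. The class TokenVocabResourceNames, its methods and the lookup order are hypothetical and only illustrate the idea; they are not taken from the patch.

```java
import java.io.InputStream;
import java.util.List;

// Hypothetical helper for illustration; not part of ANTLR4.
final class TokenVocabResourceNames {

    // Candidate resource names, in lookup order (an assumed order):
    // the dedicated META-INF/antlr4 folder first, then the classpath root.
    static List<String> candidates(String vocabName) {
        String fileName = vocabName + ".tokens";
        return List.of("META-INF/antlr4/" + fileName, fileName);
    }

    // Returns a stream for the first candidate found, or null if none exists.
    static InputStream openFirst(String vocabName, ClassLoader loader) {
        for (String name : candidates(vocabName)) {
            InputStream in = loader.getResourceAsStream(name);
            if (in != null) {
                return in;
            }
        }
        return null;
    }
}
```

Keeping the root fallback means a lexer-module that only uses the simple <resource> element would still be found.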

rslemos commented Mar 21, 2017

If you want to try it on a pet project, rslemos/pet-grammars@640118e contains the modules l (for the lexer) and g (for the grammars). The build should fail without my proposed patch. (Please only build that pet project and do not run its tests: some tests fail on purpose and may confuse the reader.)

rslemos commented Apr 19, 2017

Perhaps linked to #638.
