Native executable with GraalVM #70

Closed
tani opened this issue May 5, 2021 · 3 comments
Labels
1-feature-request ✨ (Issue type: Request for a desirable, nice-to-have feature)
3-out-of-scope (Issue resolution: Issue is out of LTeX's feature scope, or fixing this would be too complicated)

Comments

tani commented May 5, 2021

Is your feature request related to a problem? Please describe.

Thank you for the great software. I like this project because the Language Server Protocol is more stable and solid than the LanguageTool protocol. However, I have one suggestion for a faster startup. We currently have two issues:

  • A Java VM has to be downloaded.
  • Java VM startup is slow.

Describe the solution you'd like

A few years ago, Oracle released a new Java VM called GraalVM, which provides an AOT compiler that compiles a .jar file to a native executable. Native compilation makes the binary start faster and also removes the largest dependency, i.e., the Java VM.

Describe alternatives you've considered

Another native compiler is known in the Java community, but it is not as actively developed. I guess GraalVM is more stable since it is developed by Oracle.

Additional context

I am also considering using this language server in a local service, so I'd like to reduce external dependencies. For a real-world example, see clj-kondo.

tani added the 1-feature-request ✨ (Issue type: Request for a desirable, nice-to-have feature) label May 5, 2021
valentjn (Owner) commented May 8, 2021

Thanks for the feature request. This is basically a duplicate of valentjn/vscode-ltex#5, where I tried native-image a year ago.

I just tried it again and the situation definitely got better, due to fewer missing features in native-image and due to better support from the community. However, it's still pretty complicated. The main problem is not the code of LTeX LS, but that native-image has to compile all transitive dependencies, and there are a lot of them for LanguageTool.

Some of the problems I see include:

  • native-image crashed with an out of memory error for me, so I needed to increase Java's maximum heap size to 8GB. However, that's only possible if you disable the native-image server (that's not really documented), and that slows the compilation process down to 50mins. So, every time I try to fix something, I have to wait an hour. This might also be a problem for the resources of our GitHub Actions runners.
  • native-image is a mess with class initialization. Unfortunately, the dependencies make use of old logging libraries that heavily rely on that.
  • The resulting binaries were at least 300MB in size in my tests, which is 50% larger than the JAR version.
  • The transitive dependencies don't only consist of Java code, but also external stuff like resources and even platform-dependent binaries (e.g., Hunspell), all of which is loaded dynamically at run-time. So even if it seemed to work for some simple setup, there would be no guarantee it would work in general, as changing the language or the set of enabled LanguageTool rules might load some obscure, not properly embedded external file, and it breaks.
  • The same holds for reflection, which is still a weak point of native-image.
  • Little help is available on the internet, as the user base of native-image is still small.
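
The heap-size workaround from the first bullet can be sketched as a dry-run helper that only prints the command line it would run. This is a sketch under assumptions, not the build used for LTeX LS: `ltex-ls.jar` is a placeholder JAR name, `native_image_cmd` is a hypothetical helper, and `--no-server` is the flag that GraalVM releases of that period accepted for disabling the build server (newer releases may ignore or reject it, so check `native-image --help`). `-J-Xmx8g` forwards the heap limit to the JVM that runs the image builder.

```shell
#!/bin/sh
# Dry-run sketch: prints a native-image command line instead of running it.
# Nothing here is taken from the actual LTeX LS build configuration.
native_image_cmd() {
  jar=$1          # path to the application JAR (placeholder)
  heap=${2:-8g}   # maximum heap for the image builder, 8 GB by default
  # --no-server : build without the native-image build server
  # -J-Xmx...   : pass the heap limit to the JVM running the builder
  printf '%s\n' "native-image --no-server -J-Xmx$heap -jar $jar"
}

native_image_cmd ltex-ls.jar
```

Printing the command first makes it easy to review the flags before committing to a build that, as described above, can take close to an hour.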

Right now, I'm stuck with this error at run-time (despite trying the fix from here):

Exception in thread "main" java.lang.NoClassDefFoundError: org.apache.commons.logging.LogFactory
        at org.apache.commons.logging.LogFactory.class$(LogFactory.java:1021)
        at org.apache.commons.logging.LogFactory.<clinit>(LogFactory.java:1674)
        at com.oracle.svm.core.hub.ClassInitializationInfo.invokeClassInitializer(ClassInitializationInfo.java:350)
        at com.oracle.svm.core.hub.ClassInitializationInfo.initialize(ClassInitializationInfo.java:270)
        at java.lang.Class.ensureInitialized(DynamicHub.java:496)
        at net.loomchild.segment.srx.io.Srx2SaxParser.<clinit>(Srx2SaxParser.java:53)
        at com.oracle.svm.core.hub.ClassInitializationInfo.invokeClassInitializer(ClassInitializationInfo.java:350)
        at com.oracle.svm.core.hub.ClassInitializationInfo.initialize(ClassInitializationInfo.java:270)
        at java.lang.Class.ensureInitialized(DynamicHub.java:496)
        at org.languagetool.tokenizers.SrxTools.createSrxDocument(SrxTools.java:52)
        at org.languagetool.tokenizers.SRXSentenceTokenizer.<init>(SRXSentenceTokenizer.java:53)
        at org.languagetool.tokenizers.SimpleSentenceTokenizer.<init>(SimpleSentenceTokenizer.java:38)
        at org.languagetool.Language.<clinit>(Language.java:61)
        at com.oracle.svm.core.hub.ClassInitializationInfo.invokeClassInitializer(ClassInitializationInfo.java:350)
        at com.oracle.svm.core.hub.ClassInitializationInfo.initialize(ClassInitializationInfo.java:270)
        at java.lang.Class.ensureInitialized(DynamicHub.java:496)
        at com.oracle.svm.core.hub.ClassInitializationInfo.initialize(ClassInitializationInfo.java:235)
        at java.lang.Class.ensureInitialized(DynamicHub.java:496)
        at org.languagetool.Languages.<clinit>(Languages.java:42)
        at com.oracle.svm.core.hub.ClassInitializationInfo.invokeClassInitializer(ClassInitializationInfo.java:350)
        at com.oracle.svm.core.hub.ClassInitializationInfo.initialize(ClassInitializationInfo.java:270)
        at java.lang.Class.ensureInitialized(DynamicHub.java:496)
        at org.bsplines.ltexls.languagetool.LanguageToolJavaInterface.<init>(LanguageToolJavaInterface.java:53)
        at org.bsplines.ltexls.settings.SettingsManager.reinitializeLanguageToolInterface(SettingsManager.java:52)
        at org.bsplines.ltexls.settings.SettingsManager.<init>(SettingsManager.java:36)
        at org.bsplines.ltexls.settings.SettingsManager.<init>(SettingsManager.java:31)
        at org.bsplines.ltexls.server.LtexLanguageServer.<init>(LtexLanguageServer.java:53)
        at LtexLanguageServerLauncher.launch(LtexLanguageServerLauncher.java:125)
        at LtexLanguageServerLauncher.internalCall(LtexLanguageServerLauncher.java:114)
        at LtexLanguageServerLauncher.call(LtexLanguageServerLauncher.java:88)
        at LtexLanguageServerLauncher.call(LtexLanguageServerLauncher.java:1)
        at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
        at picocli.CommandLine.access$1300(CommandLine.java:145)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
        at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
        at picocli.CommandLine.execute(CommandLine.java:2078)
        at LtexLanguageServerLauncher.main(LtexLanguageServerLauncher.java:140)
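
The top frames show commons-logging resolving its own class by name (`LogFactory.class$`), which is the kind of reflective lookup that native-image only supports for classes registered at build time. One thing that might be worth trying, sketched below, is a reflection configuration that registers that class. This is an untested assumption, not the fix referenced above: `write_reflect_config` and the output directory `native-config` are hypothetical, and only the class name is taken from the trace.

```shell
#!/bin/sh
# Hypothetical helper: writes a minimal reflect-config.json into a directory.
write_reflect_config() {
  outdir=$1
  mkdir -p "$outdir" || return 1
  # Register the class that the trace fails to look up by name.
  cat > "$outdir/reflect-config.json" <<'EOF'
[
  { "name": "org.apache.commons.logging.LogFactory" }
]
EOF
}

# Generate the file; native-image can then be pointed at the directory with
#   -H:ConfigurationFileDirectories=native-config
write_reflect_config native-config
```

Whether this gets past the error is unknown; other classes loaded the same way would need their own entries.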

Compared to these downsides, the benefits you mention are quite small IMO:

  • Downloading Java can be avoided if we bundle Java with LTeX LS. I can create platform-dependent archives that include Java (only 40MB extra; still smaller than the binaries created by native-image), so that the bin/ltex-ls scripts automatically use the bundled Java. As mentioned in valentjn/vscode-ltex#5 ("Remove Java dependency"; vscode-ltex automatically downloads Java if no Java is found on the system), I don't really see the difference between a platform-dependent binary and a platform-dependent archive, when both work out-of-the-box. From the user's perspective, they're the same.
  • Java's startup is not that slow. bin/ltex-ls --version takes approximately 1s for me. As the server has to be started only once per session, I think that's acceptable. The run-time (checking) speed is also acceptable.
  • The largest dependency is not Java, as you suggest. Currently, the largest dependency is LanguageTool at 110MB when only counting LanguageTool itself, and 191MB when including all of its dependencies. Compared to that, Java is pretty small with its 40MB. (LTeX LS itself and the remaining dependencies are only 5MB.)
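
The size figures quoted in this comment can be put side by side with a little arithmetic. The sketch assumes that the "JAR version" is LanguageTool with its dependencies (191 MB) plus LTeX LS and the remaining dependencies (5 MB); that reading is an assumption, and the variable names are illustrative.

```shell
#!/bin/sh
# Sizes in MB, as quoted in the comment above.
languagetool_with_deps=191
ltex_ls_and_rest=5
bundled_java=40
native_image_min=300

# Assumed composition of the JAR distribution (see lead-in).
jar_total=$((languagetool_with_deps + ltex_ls_and_rest))
with_java=$((jar_total + bundled_java))
# Integer percentage by which the native binary exceeds the JAR distribution.
overhead_pct=$(( (native_image_min - jar_total) * 100 / jar_total ))

echo "JAR distribution:        ${jar_total} MB"
echo "JAR distribution + Java: ${with_java} MB"
echo "native-image binary:     >= ${native_image_min} MB (+${overhead_pct}% vs. JAR)"
```

Under that assumption the JAR distribution comes to 196 MB, or 236 MB with the bundled Java, and a 300 MB native binary is about 53% larger than the JAR distribution, consistent with the rough 50% mentioned earlier.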

tani (Author) commented May 8, 2021

Thank you for the explanation. I understand the context now, and I am sorry for the duplicate issue.

> native-image crashed with an out of memory error for me, so I needed to increase Java's maximum heap size to 8GB. However, that's only possible if you disable the native-image server (that's not really documented), and that slows the compilation process down to 50mins. So, every time I try to fix something, I have to wait an hour. This might also be a problem for the resources of our GitHub Actions runners.

OMG, that is a serious problem for your development environment.

> native-image is a mess with class initialization. Unfortunately, the dependencies make use of old logging libraries that heavily rely on that.
> ...
> The transitive dependencies don't only consist of Java code, but also external stuff like resources and even platform-dependent binaries (e.g., Hunspell), all of which is loaded dynamically at run-time. So even if it seemed to work for some simple setup, there would be no guarantee it would work in general, as changing the language or the set of enabled LanguageTool rules might load some obscure, not properly embedded external file, and it breaks.
> The same holds for reflection, which is still a weak point of native-image.

I see, that is a problem not only for this repository but also for LanguageTool. It looks hard to fix.

Is the source code of the script here?

ltex-ls/ltexls/pom.xml

Lines 305 to 354 in 8dd2120

<chmod file="target/appassembler/bin/ltex-ls" perm="755"/>
<replace file="target/appassembler/bin/ltex-ls.bat">
<replacetoken>if "%JAVACMD%"=="" set JAVACMD=java</replacetoken>
<replacevalue><![CDATA[if "%JAVACMD%" NEQ "" goto init
@rem Find java.exe
if defined JAVA_HOME goto findJavaFromJavaHome
set JAVACMD=java.exe
%JAVACMD% -version >NUL 2>&1
if "%ERRORLEVEL%" == "0" goto init
echo.
echo ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.
echo.
echo Please set the JAVA_HOME variable in your environment to match the
echo location of your Java installation.
goto error
:findJavaFromJavaHome
set JAVA_HOME=%JAVA_HOME:"=%
set JAVACMD=%JAVA_HOME%/bin/java.exe
if exist "%JAVACMD%" goto init
echo.
echo ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%
echo.
echo Please set the JAVA_HOME variable in your environment to match the
echo location of your Java installation.
goto error
:init]]></replacevalue>
</replace>
<replace file="target/appassembler/bin/ltex-ls.bat">
<replacetoken>"%REPO%"\*</replacetoken>
<replacevalue>"%REPO%"\ltexls-languagetool-patch-${project.version}.jar;"%REPO%"\*</replacevalue>
</replace>
<replace file="target/appassembler/bin/ltex-ls.bat">
<replacetoken>%JAVACMD% %JAVA_OPTS% -classpath %CLASSPATH%</replacetoken>
<replacevalue>"%JAVACMD%" %JAVA_OPTS% -classpath %CLASSPATH%</replacevalue>
</replace>
<replace file="target/appassembler/bin/ltex-ls.bat">
<replacetoken>set ERROR_CODE=%ERRORLEVEL%</replacetoken>
<replacevalue>set ERROR_CODE=%ERRORLEVEL%
if %ERROR_CODE% EQU 0 set ERROR_CODE=1</replacevalue>
</replace>
</target>

I'd like to read the Linux shell script. Can I read it online? In any case, your answer gave me enough explanation. Thanks!

valentjn added a commit that referenced this issue May 8, 2021
valentjn (Owner) commented May 8, 2021

Thank you for your response. I have added support for platform-dependent archives (which include LTeX LS and Java via AdoptOpenJDK) for all future releases. You can test it with the new 12.2.0-alpha.1 pre-release, which is feature-wise equivalent to 12.1.0. Just download the version for your platform and run the bin/ltex-ls or the bin\ltex-ls.bat script. It automatically uses the bundled Java (except if you have JAVA_HOME set to a different Java installation), so no Java installation is needed anymore. I hope this is sufficient for your purposes. I'll close this issue as currently out-of-scope. Thanks anyway for the feature request.

The bin/ scripts are dynamically created by a Maven plugin called AppAssembler (using this template) during the build process of new releases (via GitHub Actions), so the shell script is not included in the repository itself. You need to download a release to read the actual script.

The code you mentioned modifies/patches the Windows BAT script generated by AppAssembler to add support for JAVA_HOME (the Windows BAT script generated by AppAssembler doesn't support JAVA_HOME). The modification is done via a search and replace operation in the script, so the code you cite is only a part of the script, and it's the Windows script, not the Linux script.
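
Since the generated Linux script is not in the repository, the lookup order described above (JAVA_HOME first, then the bundled Java) can be illustrated with a minimal sketch. This is not the script produced by AppAssembler: `resolve_java` is a hypothetical function, `jdk` is an assumed name for the bundled runtime directory, and the final fallback to `java` on the PATH is an assumption as well.

```shell
#!/bin/sh
# Hypothetical sketch; not the script generated by AppAssembler.
# resolve_java BASEDIR prints the path of the java executable to use.
resolve_java() {
  basedir=$1
  if [ -n "${JAVA_HOME:-}" ]; then
    # An explicitly set JAVA_HOME takes precedence over the bundled runtime.
    printf '%s\n' "$JAVA_HOME/bin/java"
  elif [ -x "$basedir/jdk/bin/java" ]; then
    # "jdk" is an assumed directory name for the bundled runtime.
    printf '%s\n' "$basedir/jdk/bin/java"
  else
    # Assumed fallback: whatever "java" resolves to on the PATH.
    printf '%s\n' java
  fi
}
```

A launcher built this way would call something like `resolve_java "$BASEDIR"` once and then start the server with the returned executable.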

valentjn closed this as completed May 8, 2021
valentjn added the 3-out-of-scope (Issue resolution: Issue is out of LTeX's feature scope, or fixing this would be too complicated) label May 8, 2021