Performance issues with version 11.0.19 #332

Open
Mateusz-Krzyszpien opened this issue Jun 13, 2023 · 8 comments
Labels
bug Something isn't working

Comments

@Mateusz-Krzyszpien

Thank you for taking the time to help improve OpenJDK and Corretto 11.

If your request concerns a security vulnerability then please report it by email to aws-security@amazon.com instead of here.
(You can find more information regarding security issues at https://aws.amazon.com/security/vulnerability-reporting/.)

Otherwise, if your issue concerns OpenJDK 11 and is not specific to Corretto 11 we ask that you raise it to the OpenJDK community. Depending on your contributor status for OpenJDK, please use the JDK bug system or
the appropriate mailing list for the given problem area or update project.

If your issue is specific to Corretto 11, then you are in the right place. Please proceed with the following.

Describe the bug

After upgrading from version 11.0.13 to 11.0.19, we observed that our production WildFly servers started to use up to 100% CPU. Such high utilization made the application servers unresponsive, which in turn made the application unavailable.

After rolling back from 11.0.19 to 11.0.13, CPU utilization went back to the normal 20-30%.

To Reproduce

Unfortunately, we were not able to reproduce this issue in lower environments.

Expected behavior

I would like you to check if there are any recent changes to Java code that could create performance issues.

Platform information

OS: Windows Server 2022 Datacenter
Version: java -version

openjdk version "11.0.19" 2023-04-18 LTS
OpenJDK Runtime Environment Corretto-11.0.19.7.1 (build 11.0.19+7-LTS)
OpenJDK 64-Bit Server VM Corretto-11.0.19.7.1 (build 11.0.19+7-LTS, mixed mode)

@Mateusz-Krzyszpien Mateusz-Krzyszpien added the bug Something isn't working label Jun 13, 2023
@caojoshua
Contributor

> I would like you to check if there are any recent changes to Java code that could create performance issues.

About 1.5 years separate the releases of 11.0.13 and 11.0.19, and I'm not aware of a specific change that would cause such a significant increase in CPU utilization.


I can suggest some options which can help us narrow down the issue

  1. Check top -H to see which threads have high usage.
  2. Upgrade to an intermediate Corretto 11 release. First try upgrading from 11.0.13 to 11.0.16 and monitor for increased CPU utilization. If we can find the exact Corretto 11 version that introduces the increased CPU utilization, it will be easier to find the patch that is causing the issue.
  3. Use async profiler and look for differences in data between 11.0.13 and 11.0.19.
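
As a complement to step 1, per-thread CPU time can also be read from inside the JVM with the standard `ThreadMXBean`, which may be useful where `top -H` is not available. The following is a minimal sketch, not a Corretto tool: the class name `ThreadCpuSnapshot` and its `Row` type are hypothetical, and it only sees Java threads of the JVM it runs in, so it would have to be deployed inside the affected server (or adapted to a remote JMX connection) to be meaningful.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration; not part of Corretto or WildFly.
class ThreadCpuSnapshot {

    /** One row: thread name plus cumulative CPU time in milliseconds. */
    static final class Row {
        final String name;
        final long cpuMillis;

        Row(String name, long cpuMillis) {
            this.name = name;
            this.cpuMillis = cpuMillis;
        }
    }

    /** Returns live Java threads sorted by cumulative CPU time, busiest first. */
    static List<Row> busiestThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<Row> rows = new ArrayList<>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return rows; // CPU time measurement is unavailable on this JVM
        }
        for (long id : mx.getAllThreadIds()) {
            long nanos = mx.getThreadCpuTime(id);   // -1 if the thread is no longer alive
            ThreadInfo info = mx.getThreadInfo(id); // null if the thread is no longer alive
            if (nanos >= 0 && info != null) {
                rows.add(new Row(info.getThreadName(), nanos / 1_000_000));
            }
        }
        rows.sort(Comparator.comparingLong((Row r) -> r.cpuMillis).reversed());
        return rows;
    }

    public static void main(String[] args) {
        for (Row r : busiestThreads()) {
            System.out.printf("%8d ms  %s%n", r.cpuMillis, r.name);
        }
    }
}
```

Taking two snapshots a few seconds apart and comparing them shows which threads are currently burning CPU, rather than which have accumulated the most since startup.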

@mikebell90

@Mateusz-Krzyszpien I bet you run on a container?

11.0.17 added this change

CPU Shares Ignored When Computing Active Processor Count (JDK-8281181)
Previous JDK releases used an incorrect interpretation of the Linux cgroups parameter "cpu.shares". This might cause the JVM to use fewer CPUs than available, leading to an under utilization of CPU resources when the JVM is used inside a container.
Starting from this JDK release, by default, the JVM no longer considers "cpu.shares" when deciding the number of threads to be used by the various thread pools. The -XX:+UseContainerCpuShares command-line option can be used to revert to the previous behavior. This option is deprecated and may be removed in a future JDK release.
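
One way to check whether this change is in play is to compare the processor count the JVM reports under each version, or with and without `-XX:+UseContainerCpuShares`. The following is a minimal sketch under that assumption; the class name `ProcessorCountReport` is hypothetical, and it only prints values the standard library already exposes.

```java
import java.util.concurrent.ForkJoinPool;

// Hypothetical diagnostic for illustration; run it under each JDK version
// and flag combination being compared.
class ProcessorCountReport {

    /** Processor count the JVM uses when sizing its default thread pools. */
    static int activeProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** Parallelism of the common ForkJoinPool, which is derived from that count by default. */
    static int commonPoolParallelism() {
        return ForkJoinPool.commonPool().getParallelism();
    }

    public static void main(String[] args) {
        System.out.println("java.version            = " + System.getProperty("java.version"));
        System.out.println("availableProcessors     = " + activeProcessors());
        System.out.println("common pool parallelism = " + commonPoolParallelism());
    }
}
```

If the reported count is the same on 11.0.13 and 11.0.19 in the affected environment, the `cpu.shares` change is probably not the cause there.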

@benty-amzn
Contributor

Since there's been no response here, I'm assuming you were able to resolve your regression. Please feel free to re-open or cut a new issue if you need any additional assistance.

@mnunna-broadcom

@benty-amzn
We see the same issue and ended up reverting to 11.0.16. We can try using -XX:+UseContainerCpuShares, but it doesn't feel right to depend on a deprecated option to tune this. Can you please re-open this?

@benty-amzn benty-amzn reopened this Aug 30, 2023
@benty-amzn
Contributor

Sure, happy to reopen.

  • Are you seeing the same behavior where CPU usage grows to 100% on the newer version?
  • Have you confirmed that 11.0.16 is the latest version which does not trigger the behavior?
  • Can you confirm whether enabling the -XX:+UseContainerCpuShares restores the expected behavior on newer versions?

@mikebell90

  • Another choice is of course to use -XX:ActiveProcessorCount=<n> and pre-choose the count. This bypasses the automatic selection.
  • The OpenJDK tickets go into detail about why they chose to do this, though I found it did indeed cause issues.
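
To confirm whether a running JVM actually has the processor count pinned, the startup arguments can be inspected through the standard `RuntimeMXBean`. This is a minimal sketch: the class name `ActiveProcessorCountCheck` and the method `pinnedCount` are hypothetical, and keeping the last occurrence of the flag is an assumption about how repeated flags are resolved.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.OptionalInt;

// Hypothetical helper for illustration; not part of the JDK.
class ActiveProcessorCountCheck {

    private static final String PREFIX = "-XX:ActiveProcessorCount=";

    /**
     * Returns the value of -XX:ActiveProcessorCount if present in the given
     * arguments. Assumption: when the flag is repeated, the last one wins.
     */
    static OptionalInt pinnedCount(List<String> jvmArgs) {
        OptionalInt result = OptionalInt.empty();
        for (String arg : jvmArgs) {
            if (arg.startsWith(PREFIX)) {
                try {
                    result = OptionalInt.of(Integer.parseInt(arg.substring(PREFIX.length())));
                } catch (NumberFormatException e) {
                    // Malformed value: skip it rather than guess.
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        OptionalInt pinned = pinnedCount(jvmArgs);
        System.out.println(pinned.isPresent()
                ? "Processor count pinned to " + pinned.getAsInt()
                : "Processor count is auto-detected");
        System.out.println("availableProcessors = " + Runtime.getRuntime().availableProcessors());
    }
}
```

Starting the JVM with, for example, `-XX:ActiveProcessorCount=4` should make both printed values agree on 4.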

@mnunna-broadcom

We will run these tests and get back to you.

@stetze

stetze commented Sep 4, 2024

I have the same issue on a Windows Server 2022, with 11.0.23.
