Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ISSUE #10020] add MemorySafeLinkedBlockingQueue #10021

Merged
merged 3 commits into from
Jun 20, 2022
Merged

[ISSUE #10020] add MemorySafeLinkedBlockingQueue #10021

merged 3 commits into from
Jun 20, 2022

Conversation

loongs-zhang
Copy link
Member

@loongs-zhang loongs-zhang commented May 9, 2022

What is the purpose of the change

Can completely solve the OOM problem caused by {@link java.util.concurrent.LinkedBlockingQueue}, does not depend on {@link java.lang.instrument.Instrumentation} and is easier to use than {@link MemoryLimitedLinkedBlockingQueue}.

Brief changelog

Verifying this change

Checklist

  • Make sure there is a GitHub_issue field for the change (usually before you start working on it). Trivial changes like typos do not require a GitHub issue. Your pull request should address just this issue, without pulling in other changes - one PR resolves one issue.
  • Each commit in the pull request should have a meaningful subject line and body.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Check if is necessary to patch to Dubbo 3 if you are work on Dubbo 2.7
  • Write necessary unit-test to verify your logic correction, more mock a little better when cross module dependency exist. If the new feature or significant change is committed, please remember to add sample in dubbo samples project.
  • Add some description to dubbo-website project if you are requesting to add a feature.
  • GitHub Actions works fine on your own branch.
  • If this contribution is large, please follow the Software Donation Guide.

@codecov-commenter
Copy link

codecov-commenter commented May 9, 2022

Codecov Report

Attention: Patch coverage is 40.54054% with 22 lines in your changes missing coverage. Please review.

Project coverage is 65.71%. Comparing base (ec98ff3) to head (c4207a5).
Report is 174 commits behind head on 3.0.

Files Patch % Lines
...mmon/threadpool/MemorySafeLinkedBlockingQueue.java 38.88% 10 Missing and 1 partial ⚠️
...dubbo/common/threadpool/MemoryLimitCalculator.java 61.53% 5 Missing ⚠️
.../apache/dubbo/common/threadpool/MemoryLimiter.java 0.00% 2 Missing and 1 partial ⚠️
...on/threadpool/support/cached/CachedThreadPool.java 0.00% 0 Missing and 1 partial ⚠️
...mmon/threadpool/support/fixed/FixedThreadPool.java 0.00% 0 Missing and 1 partial ⚠️
.../threadpool/support/limited/LimitedThreadPool.java 0.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##                3.0   #10021      +/-   ##
============================================
- Coverage     65.75%   65.71%   -0.04%     
  Complexity      320      320              
============================================
  Files          1218     1222       +4     
  Lines         53052    53174     +122     
  Branches       8037     8057      +20     
============================================
+ Hits          34883    34943      +60     
- Misses        14368    14417      +49     
- Partials       3801     3814      +13     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@wangchengming666
Copy link
Member

@dragon-zhang That's a good idea. Can you provide some results of memory analysis, such as jmap

@loongs-zhang
Copy link
Member Author

@dragon-zhang That's a good idea. Can you provide some results of memory analysis, such as jmap

Hi, I believe org.apache.dubbo.common.threadpool.MemorySafeLinkedBlockingQueueTest has proved enough.

Copy link
Member

@wangchengming666 wangchengming666 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@chickenlj chickenlj added this to the 3.0.9 milestone May 11, 2022
@loongs-zhang
Copy link
Member Author

:)

@loongs-zhang
Copy link
Member Author

Let me ask a question, where should I add description to dubbo-website?

@lingfengcoder
Copy link

nice idea

@Guiqu1aixi
Copy link

我很困惑,为什么MemorySafeLinkedBlockingQueue#hasRemainedMemory()中要使用usage.getCommitted()与maxFreeMemory相比较。
MemoryLimitCalculator后面又提供了两个方法分别是calculate 和 defaultLimit都是和当前jvm commited内存的比例值相比较,这样可能更加科学一点,
而hasRemainedMemory()直接使用当前commited相比较(第78行),感觉并不能预防OOM。

@HashJang
Copy link

HashJang commented Jun 6, 2022

嗯,思路挺好的~不过,我这里有些疑问以及思考,仅供交流:MemorySafeLBQ 是限制每个队列对象的使用内存大小,但是使用 Instrumetation.getObjectSize() 只能检查这个对象的占用,这个对象引用的对象的内存占用米有算进去。实际业务使用引用非常复杂, 放入队列的一般是门面对象,主要内存占用不在这里。然后是 MemoryLimitedLBQ 通过 MemoryMXBean 查看当前堆内存实际占用,但是没有考虑 GC 的因素,当前内存占用也许大部分都是可以回收的的,也许 YoungGC 之后就有那么多内存可以使用了。总体来看,MemorySafeLBQ 限制并没有限制住,MemoryLimitedLBQ 限制的可能太死。这个可能还是需要底层 JDK + JNI 去实现,查看队列中的对象的存活时间,结合整个堆对象的存活时间以及占用情况,判断是否真的内存紧缺了。或者是限制一个队列用的内存大小,但是检查里面放入对象的本身内存以及所有引用内存。

不过这些,其实引入 Project Valhalla 以及 Project Loom 之后就都好说了,不需要线程池了。但是这个限制内存占用的队列还是有意义的,我觉得可以向 Java 社区提一下这个建议

@lingfengcoder
Copy link

我很困惑,为什么MemorySafeLinkedBlockingQueue#hasRemainedMemory()中要使用usage.getCommitted()与maxFreeMemory相比较。 MemoryLimitCalculator后面又提供了两个方法分别是calculate 和 defaultLimit都是和当前jvm commited内存的比例值相比较,这样可能更加科学一点, 而hasRemainedMemory()直接使用当前commited相比较(第78行),感觉并不能预防OOM。

我觉得也是有点问题,可用内存应该是 long available = usage.getMax() - usage.getUsed(); 吧 根据这个进行限制 而不是commit

@loongs-zhang
Copy link
Member Author

CN
请试着用逆向思维来思考,MemorySafeLinkedBlockingQueue做限制的判断是基于JVM可用内存的,比如JVM一共有1GB的内存,MSLBQ使用默认构造方法。MSLBQ随心所欲地使用内存,但是会尽可能地保证JVM剩余的内存>=256MB;当发现JVM剩余内存<256MB时,MSLBQ拒绝写入
基于剩余内存判断的好处是,一方面能够自适应不同的硬件(1GB有756MB可用,2GB有1780MB),另一方面是能够以较低成本兼容大多数情况(不关心到底发生了多少次GC,不关心到底是YoungGC还是FullGC,因为这些最终都会导致JVM可用内存增加)。

EN
Please try to think in a reverse way. The judgment of MemorySafeLinkedBlockingQueue is based on JVM available memory. For example, the JVM has a total of 1GB of memory. MSLBQ uses the default construction method. MSLBQ will use memory at will, but will try to ensure that the remaining memory in the JVM is >=256MB; When the JVM's remaining memory is found to be <256MB, MSLBQ will reject writing.
The advantage of judging based on the remaining memory is that, on the one hand, it can adapt to different hardware(756MB available for 1GB and 1780mb available for 2GB), and on the other hand, it can be compatible with most cases at a lower cost (it doesn't care how many GCs occur, or whether it is YoungGC or FullGC, because these will eventually lead to an increase in JVM available memory).

@Guiqu1aixi
Copy link

上文提到:"当发现JVM剩余内存<256MB时,MSLBQ会拒绝写入"

个人肯定通过可用内存来判断MSLBQ是否可写这一方案的价值。
但根据代码来看,MSLBQ是否会拒绝写入,在于MemoryLimitCalculator.maxAvailable() > maxFreeMemory,
这里并非是使用剩余可用内存来进行判断的,而是使用commited的内存大小进行判断的,
当前剩余可用内存大小应该等于commited - used 或者 max - used;
不知道我是否表达的清楚,😄😄😄

@loongs-zhang
Copy link
Member Author

loongs-zhang commented Jun 6, 2022

Wait a minute, I'll make sure. Maybe I should use java.lang.Runtime#freeMemory

@Guiqu1aixi
Copy link

MemoryUsage#getMax() - MemoryUsage#getUsed()即可
😄😄😄😄😄😄

@loongs-zhang
Copy link
Member Author

loongs-zhang commented Jun 6, 2022

MemoryUsage#getMax() - MemoryUsage#getUsed()即可 😄😄😄😄😄😄

I've test in local machine, Runtime#freeMemory equals MemoryUsage#getMax() - MemoryUsage#getUsed().

Thanks for pointing out my mistake, I'll fix it.

@cybbeta
Copy link

cybbeta commented Jun 7, 2022

-Xms -Xmx配置不相同的时候,可用内存=Runtime.maxMemory() - Runtime.totalMemory() + Runtime.freeMemory() 这样可用内存应该更准确

@loongs-zhang
Copy link
Member Author

-Xms -Xmx配置不相同的时候,可用内存=Runtime.maxMemory() - Runtime.totalMemory() + Runtime.freeMemory() 这样可用内存应该更准确

CN
我认为不应该把JVM自身使用的内存认为可用内存。

EN
I don't think the memory used by the JVM itself should be regarded as available memory.

Copy link
Contributor

@chickenlj chickenlj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@jingzhouzhao
Copy link

MemoryUsage#getMax() - MemoryUsage#getUsed()即可 😄😄😄😄😄😄

MemoryUsage#getMax() 表示可用于内存管理的最大内存量(以字节为单位)。它的值可能是未定义的。如果已定义,最大内存量可能会随时间变化。如果定义了max ,则已使用和已提交的内存量将始终小于或等于max 。如果尝试增加已使用的内存,即使used <= max仍然为 true(例如,当系统的虚拟内存不足时),内存分配可能会失败。

max定义的内存可能会由于系统内存不足分配失败,是不是commited - used 更妥😆😆😆😆😆

@Guiqu1aixi
Copy link

MemoryUsage#getMax() - MemoryUsage#getUsed()即可 😄😄😄😄😄😄

MemoryUsage#getMax() 表示可用于内存管理的最大内存量(以字节为单位)。它的值可能是未定义的。如果已定义,最大内存量可能会随时间变化。如果定义了max ,则已使用和已提交的内存量将始终小于或等于max 。如果尝试增加已使用的内存,即使used <= max仍然为 true(例如,当系统的虚拟内存不足时),内存分配可能会失败。

max定义的内存可能会由于系统内存不足分配失败,是不是commited - used 更妥😆😆😆😆😆

问题的起源在于剩余内存如何较为科学与准确的计算,从而引申出一系列的内存概念,一共有四个分别是:init、used、commited、max,如果jvm启动之时没有将xmx、mxs两者配置为同值,运行之时堆大小是会随着实际情况根据高低水位线进行占用内存的扩缩容,commit此时代表的是此时jvm堆内存总量,max - used带来的计算可能会导致内存再分配,你的担心在于硬件条件不允许。
这个担心不无道理,事实上我在第一次跟作者交流之际原话:“当前剩余可用内存大小应该等于commited - used 或者选择 max - used”。但是个人依然没认为使用commited - used会更加妥善,理由如下:
1.绝大多数使用者都会把xmx、xms配置为明确的同样大小的值;
2.jvm xmx参数的配置人员应该是清醒的,明确的知道当前服务需要占用的内存最大量,以及物理机器实际的承载能力;


private static volatile long maxAvailable;

private static final ScheduledExecutorService SCHEDULER = Executors.newSingleThreadScheduledExecutor();
Copy link
Member

@AlbumenJ AlbumenJ Jun 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This scheduledExecutorService can be managed by Dubbo Framework. I will patch it later. #10178

@loongs-zhang
Copy link
Member Author

loongs-zhang commented Jun 20, 2022 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants