Server memory overflow due to stream receivers retaining payloads regardless of requestN #887

Closed · mostroverkhov opened this issue Jul 7, 2020 · 3 comments · Fixed by #932
Labels: enhancement, superseded (issue is superseded by another)
Comments

@mostroverkhov (Member)

Rogue clients may not respect requestN from the server and instead send a firehose of frames, which are queued by the receivers (UnicastProcessors backed by an unbounded queue).

This affects the server Responder channel, and also the Requester stream & channel if the server sends requests to the client.

Reproducible with the test below. It roughly models a system that processes batches of messages asynchronously and requests the next batch with requestN once the current one is completed.

  @Test
  void serverOom() {
    EmitterProcessor<Void> completed = EmitterProcessor.create();

    TestServerTransport testTransport = new TestServerTransport();
    Closeable server =
        RSocketServer.create()
            .acceptor(
                (setup, sendingSocket) ->
                    Mono.just(
                        new RSocket() {
                          @Override
                          public Flux<Payload> requestChannel(Publisher<Payload> payloads) {
                            Flux.from(payloads)
                                .doFinally(s -> completed.onComplete())
                                .subscribe(new BatchProcessingSubscriber());
                            return completed.cast(Payload.class);
                          }
                        }))
            .bind(testTransport)
            .block();

    byte[] bytes = bytes(1_000_000);
    TestDuplexConnection testDuplexConnection = testTransport.connect();

    /*setup*/
    testDuplexConnection.addToReceivedBuffer(
        SetupFrameCodec.encode(
            ByteBufAllocator.DEFAULT,
            false,
            1111111,
            1111111,
            Unpooled.EMPTY_BUFFER,
            "application/binary",
            "application/binary",
            EmptyPayload.INSTANCE));

    /*request-channel which sends many frames regardless received requestN*/
    testDuplexConnection.addToReceivedBuffer(
        RequestChannelFrameCodec.encode(
            ByteBufAllocator.DEFAULT,
            1,
            false,
            false,
            1111111,
            Unpooled.EMPTY_BUFFER,
            Unpooled.EMPTY_BUFFER));

    int framesCount = 8000;
    for (int i = 0; i < framesCount; i++) {
      boolean complete = i == framesCount - 1;
      testDuplexConnection.addToReceivedBuffer(
          PayloadFrameCodec.encode(
              ByteBufAllocator.DEFAULT,
              1,
              false,
              complete,
              true,
              Unpooled.EMPTY_BUFFER,
              Unpooled.wrappedBuffer(bytes)));
    }

    completed
        .mergeWith(testDuplexConnection.onClose())
        .as(StepVerifier::create)
        .expectComplete()
        .verify(Duration.ofSeconds(30));
  }

  private static final Random random = new Random();

  static byte[] bytes(int size) {
    byte[] data = new byte[size];
    random.nextBytes(data);
    return data;
  }

  private static class BatchProcessingSubscriber implements Subscriber<Payload>, Runnable {
    static final int BATCH_SIZE = 8;
    final ScheduledExecutorService executorService = Executors.newSingleThreadScheduledExecutor();
    volatile Subscription s;
    int received;

    @Override
    public void onSubscribe(Subscription s) {
      this.s = s;
      /*request initial batch*/
      s.request(BATCH_SIZE);
    }

    @Override
    public void onNext(Payload payload) {
      /*emulate batch processing: request next batch once current is finished*/
      payload.release();
      if (++received == BATCH_SIZE) {
        received = 0;
        executorService.schedule(this, 1_000, TimeUnit.MILLISECONDS);
      }
    }

    @Override
    public void onError(Throwable t) {}

    @Override
    public void onComplete() {}

    @Override
    public void run() {
      s.request(BATCH_SIZE);
    }
  }

The test fails with the following error:

Caused by: java.lang.OutOfMemoryError: Direct buffer memory
	at java.base/java.nio.Bits.reserveMemory(Bits.java:175)
	at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:118)
	at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:317)
	at io.rsocket.frame.decoder.DefaultPayloadDecoder.apply(DefaultPayloadDecoder.java:55)
	at io.rsocket.frame.decoder.DefaultPayloadDecoder.apply(DefaultPayloadDecoder.java:18)
	at io.rsocket.core.RSocketResponder.handleFrame(RSocketResponder.java:332)
	at reactor.core.publisher.LambdaSubscriber.onNext(LambdaSubscriber.java:160)
	at reactor.core.publisher.MonoFlatMapMany$FlatMapManyInner.onNext(MonoFlatMapMany.java:242)
	at reactor.core.publisher.FluxGroupBy$UnicastGroupedFlux.drainRegular(FluxGroupBy.java:554)
	at reactor.core.publisher.FluxGroupBy$UnicastGroupedFlux.drain(FluxGroupBy.java:630)
	at reactor.core.publisher.FluxGroupBy$UnicastGroupedFlux.onNext(FluxGroupBy.java:670)
	at reactor.core.publisher.FluxGroupBy$GroupByMain.onNext(FluxGroupBy.java:205)
	at reactor.core.publisher.FluxHandle$HandleSubscriber.onNext(FluxHandle.java:112)
	at reactor.core.publisher.DirectProcessor$DirectInner.onNext(DirectProcessor.java:333)
	at reactor.core.publisher.DirectProcessor.onNext(DirectProcessor.java:142)
	at reactor.core.publisher.FluxCreate$IgnoreSink.next(FluxCreate.java:618)
	at reactor.core.publisher.FluxCreate$SerializedSink.next(FluxCreate.java:153)
	at io.rsocket.test.util.TestDuplexConnection.addToReceivedBuffer(TestDuplexConnection.java:132)
	at io.rsocket.ServerOomTest.serverOom(ServerOomTest.java:86)
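
The accumulation can be reproduced in plain Reactor as well. The following standalone sketch (my own illustration, not rsocket-java code) shows how a UnicastProcessor with its default unbounded queue retains every signal the subscriber has not requested, which is the same behavior as the per-stream receivers here:

  import org.reactivestreams.Subscription;
  import reactor.core.publisher.BaseSubscriber;
  import reactor.core.publisher.UnicastProcessor;

  // Illustrative sketch only: an unbounded UnicastProcessor keeps every onNext
  // signal that the downstream subscriber has not requested yet.
  public class UnboundedReceiverSketch {

    public static void main(String[] args) {
      UnicastProcessor<byte[]> receiver = UnicastProcessor.create(); // unbounded queue

      receiver.subscribe(
          new BaseSubscriber<byte[]>() {
            @Override
            protected void hookOnSubscribe(Subscription s) {
              s.request(8); // downstream demand: a single batch of 8
            }
            // default hookOnNext never requests more, mimicking a stalled consumer
          });

      // A "rogue" producer ignores the demand of 8 and keeps pushing; every extra
      // element is retained in the processor's internal queue until memory runs out.
      for (int i = 0; i < 10_000; i++) {
        receiver.onNext(new byte[1_000_000]);
      }
    }
  }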
@OlegDokuka OlegDokuka added this to the 1.0.2 milestone Jul 27, 2020
@OlegDokuka (Member)

This fix should be applied only to 1.0.x. In 1.1 we are addressing it (#761) with dedicated request operators that do not store anything in memory.
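
To illustrate the idea of "not storing anything in memory" (a conceptual sketch only, not the actual 1.1 operators; the class and method names below are hypothetical), the receiver tracks outstanding requestN credit per stream and refuses frames that arrive without credit instead of queueing them:

  import java.util.concurrent.atomic.AtomicLong;
  import io.rsocket.Payload;

  // Hypothetical sketch: reject unrequested frames rather than buffering them.
  final class DemandTrackingReceiver {

    private final AtomicLong credit = new AtomicLong();

    // Local subscriber signalled request(n); overflow protection omitted for brevity.
    void onRequest(long n) {
      credit.addAndGet(n);
    }

    // Called for each incoming PAYLOAD frame; returns true if it may be delivered.
    boolean onPayloadFrame(Payload payload) {
      if (credit.get() <= 0) {
        payload.release(); // no credit granted -> drop; caller may terminate the stream
        return false;
      }
      credit.decrementAndGet();
      return true; // deliver downstream immediately, nothing is buffered
    }
  }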

@OlegDokuka OlegDokuka self-assigned this Jul 27, 2020
@OlegDokuka OlegDokuka modified the milestones: 1.0.2, 1.0.3 Aug 12, 2020
rstoyanchev added commits to rstoyanchev/rsocket-java that referenced this issue Sep 22, 2020 (Closes rsocketgh-887; Signed-off-by: Rossen Stoyanchev <rstoyanchev@vmware.com>)
@rstoyanchev rstoyanchev linked a pull request Sep 22, 2020 that will close this issue
@rstoyanchev rstoyanchev added the superseded Issue is superseded by another label Sep 23, 2020
@rstoyanchev rstoyanchev removed this from the 1.0.3 milestone Sep 23, 2020
@lkolisko

I recently started to run into the following exception for a requestStream call on the client. It seems somehow related to the changes made to address this issue, but I am struggling to determine whether the root cause is incorrect usage on my side or a framework issue. Is there any guidance on where to look?

reactor.core.Exceptions$OverflowException: The receiver is overrun by more signals than expected (bounded queue...)
at reactor.core.Exceptions.failWithOverflow(Exceptions.java:221)
at reactor.core.publisher.UnicastProcessor.onNext(UnicastProcessor.java:394)
at io.rsocket.core.RSocketRequester.handleFrame(RSocketRequester.java:647)
at io.rsocket.core.RSocketRequester.handleIncomingFrames(RSocketRequester.java:609)
at reactor.core.publisher.LambdaSubscriber.onNext(LambdaSubscriber.java:160)
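
A possible client-side mitigation, sketched under the assumption that the overrun comes from the peer emitting more than the locally requested demand (the method and the "query" payload below are placeholders, not a confirmed fix), is to decouple the slow consumer from the wire-level receiver:

  import io.rsocket.Payload;
  import io.rsocket.RSocket;
  import io.rsocket.util.DefaultPayload;
  import reactor.core.publisher.Flux;

  class RequestStreamWorkaroundSketch {

    // rsocket is an already-connected client; "query" is a placeholder request payload.
    Flux<Payload> stream(RSocket rsocket) {
      return rsocket
          .requestStream(DefaultPayload.create("query"))
          // Buffer locally so downstream slowness does not starve the receiver
          // (alternatively, limitRate(n) keeps requestN flowing in steady batches).
          .onBackpressureBuffer();
    }
  }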

@OlegDokuka (Member) commented Nov 18, 2020

@lkolisko in your case it is a slightly different issue and not directly related to this one (although it might be the result of the commit that landed to fix this one).
I would appreciate it if you could open a new ticket and explain how we can reproduce your problem.

Your issue might be the result of the recent changes we landed in 1.0.3, but to determine whether it is a real bug or a misuse of the framework, we have to see the code.

Cheers,
Oleh
