
Memory leak when not using connection pooling #1139

Closed · perlun opened this issue Feb 4, 2016 · 4 comments

@perlun (Contributor) commented Feb 4, 2016

Hi,

We've just concluded a debugging session for one of our apps, which has been running in production for a few years. The application server (using JRuby 1.7) tends to run out of memory over time, so the customer has put scheduled reboots of the machine in place as a workaround.

Today, we debugged this a bit more and found out the following very interesting facts:

  • If connection pooling is enabled, all is fine; the memory usage over time does not increase drastically. After 100 000 connect/disconnect cycles, the memory usage for the Java process was still only around 30 MiB which is completely fine.
  • If connection pooling is disabled (as in the script below), the process leaks a huge amount of memory. 100 000 connect/disconnect cycles brought it up to around 750 MiB on my machine. Note: this was specifically tested with JRuby and the AS400 driver (which is the one used in production also). We tested with the Postgres driver first, which also seemed to leak memory, but we didn't investigate that as carefully, since the AS400 driver is the "real" use case for us.

Here are the results when using unpooled connections (the full script below):

[screenshot: memory usage over 100 000 unpooled connect/disconnect cycles]

...and when using the connect_pooled method instead:

[screenshot: memory usage over 100 000 cycles using connect_pooled]


Running Sequel 4.31.0, JRuby 1.7.22, Java 1.7.0_80.

#!/usr/bin/env java -XX:+UseConcMarkSweepGC -XX:MaxPermSize=128m -XX:+CMSClassUnloadingEnabled -Djruby.jit.max=-1 -Djruby.thread.pooling=true -Djruby.compile.mode=FORCE -Dfile.encoding=UTF-8 -Djava.net.preferIPv4Stack=true -jar jruby-complete-1.7.22.jar

require 'pry'
require 'sequel'

require 'java'
require './jt400.jar'

def connection_string
  'jdbc:as400://127.0.0.1;prompt=false;user=foo;password=bar;libraries=baz;query timeout mechanism=cancel;'
end

# Unpooled: each call creates (and globally registers) a new Sequel::Database.
def connect
  Sequel.connect(connection_string).tap { |db|
    db.identifier_output_method = nil
    db.identifier_input_method = nil
  }
end

# Pooled: memoize a single Sequel::Database with a connection pool.
def connect_pooled
  @connection ||= Sequel.connect(connection_string, max_connections: 10, pool_timeout: 60).tap { |db|
    db.extension(:connection_validator)
    db.identifier_output_method = nil
    db.identifier_input_method = nil
  }
end

100_000.times do
  db = connect
  db.disconnect
  print '.'
end

binding.pry

I certainly know that connection pooling should be used in production-level systems. 😄 And we will "solve" this by enabling it in this specific use case. Still, I think it's a bad sign that "something" leaks memory this way when pooling is not enabled. Any ideas on how to proceed with trying to pinpoint this?

@jeremyevans (Owner)

You are leaking memory in connect because you aren't using @connection ||= there. References to all the Sequel::Database objects created using connect are stored in Sequel::DATABASES, which is why they aren't garbage collected.

Basically, you are using the API wrong. You should be storing the Sequel::Database instance in a constant and referencing it when you need it, not creating a new Sequel::Database object every time you need to access the database. If you only need a Sequel::Database object for a short time, you can pass a block to Sequel.connect (which removes the reference from Sequel::DATABASES before returning), or manually clear the connection out of Sequel::DATABASES.
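
For example (a minimal sketch of the two patterns described above, reusing the connection_string helper from the script in this issue):

# Recommended: create the Sequel::Database once, store it in a constant,
# and reuse it for the lifetime of the process.
DB = Sequel.connect(connection_string)

# For short-lived use, the block form disconnects and removes the
# reference from Sequel::DATABASES before Sequel.connect returns.
Sequel.connect(connection_string) do |db|
  # use db here; it is cleaned up when the block exits
end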

@perlun (Contributor, Author) commented Feb 4, 2016

Hi Jeremy,

Thanks for the prompt reply. I feel like a true fool now... 😄

You're absolutely right: I checked, and if I pass a block to the connect method, there is no leakage whatsoever, even after connecting 100 000 times (granted, I didn't actually make any database calls in the connect block, but still).
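
For reference, the leak-free variant of the benchmark loop looks roughly like this (a sketch; same connection_string helper as in the original script):

100_000.times do
  Sequel.connect(connection_string) do |db|
    # no actual query here; the Sequel::Database is disconnected and
    # dropped from Sequel::DATABASES when the block returns
  end
  print '.'
end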

I looked at the opening database page in the documentation now, and it doesn't seem to be very clear that this happens. It says "it is recommended that you store the database object in a constant named DB", but perhaps we should make it clearer that there are real dangers in not storing it in a constant (obviously, we missed this).

Anyway, our specific use case is within an application server, and we can change this code fairly easily to use the block form instead. But removing it from Sequel::DATABASES is actually even simpler. How would you do that? Is there any "reasonably sane" way of doing it without mucking too much with the Sequel internals?

Thanks a lot for your help.

@jeremyevans (Owner)

I agree, this could be better documented. It is documented, but it probably should be made more prominent in the guide. Could you please send a documentation pull request?

Sequel::DATABASES.delete(db) should work for removal in single-threaded code. In multi-threaded code, use Sequel.synchronize{Sequel::DATABASES.delete(db)}.
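
Put together, a manual connect/disconnect cycle that also deregisters the database might look like this (a sketch based on the snippet above; with_short_lived_db is a hypothetical helper name, not part of Sequel):

# Hypothetical helper: connect, yield, then disconnect and deregister
# so the Sequel::Database instance can be garbage collected.
def with_short_lived_db(url)
  db = Sequel.connect(url)
  yield db
ensure
  if db
    db.disconnect
    Sequel.synchronize { Sequel::DATABASES.delete(db) }
  end
end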

@perlun (Contributor, Author) commented Feb 5, 2016

Thanks a lot for the help, will try that. The use case here is definitely multi-threaded (a Rack-based web server), so it was great to get a proper example that works in that use case as well.

pedrobranco pushed a commit to pedrobranco/sequel that referenced this issue Feb 10, 2016
This is basically a follow-up to jeremyevans#1139, trying to help other people avoid making the same mistake that I did. 😄