Some time after deployment, the delayed jobs process takes all the memory of the development machine #823

Closed
vitulicny opened this issue Jun 15, 2015 · 25 comments

@vitulicny

Deployment via capistrano.

using:
delayed_job (4.0.6)
capistrano3-delayed-job (1.4.0)
delayed_job_active_record (4.0.3)

server:
Ubuntu 12.04.5 LTS (GNU/Linux 3.2.0-24-virtual x86_64)

Any idea what we could check or how to debug this issue?

Everything is fine on staging and production with the production Rails env.

@albus522
Member

Probably something is getting lost in code reloading. My guess is that your web server does the same thing, but you restart it more often.
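
For context, this comes from the standard development config. A Rails 4 era config/environments/development.rb ships with:

# application code is not cached, so classes get reloaded continuously
config.cache_classes = false

A long-running worker keeps doing that reloading, while the web server usually gets restarted (and its memory reclaimed) much more often.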

@glaszig

glaszig commented Aug 9, 2015

same here. two systems: staging and production. memory on staging is being filled up by delayed_job over time.

production

  • Ubuntu 12.04.2 LTS
  • 16gb ram
  • ruby 2.2.2 (via rbenv)
  • delayed_job (4.0.6)
  • delayed_job_active_record (4.0.3)

staging

  • Ubuntu 14.04.2 LTS
  • 4gb ram
  • ruby 2.2.2 (via rbenv)
  • delayed_job (4.0.6)
  • delayed_job_active_record (4.0.3)

i'm at a loss as to where to look. the only significant differences are the ubuntu release and the ram size, and i'd be damned if either of those were the reason rbenv compiles ruby in a way that introduces memory leaks.

suggestions anyone?

@glaszig

glaszig commented Aug 9, 2015

update: i tamed delayed_job. enabling rails' class cache prevents the leakage as in #776.

# config/environments/staging.rb
config.cache_classes = true

@grexican

for what it's worth, config.cache_classes was my issue, too. Setting that to true solved my problems.

@dgobaud

dgobaud commented Mar 2, 2016

I'm still seeing this problem and I have cache_classes set to true.

ruby '2.2.2'
gem 'rails', '4.2.5'
delayed_job (4.0.6)
delayed_job_active_record (4.0.3)

It's just a steady march up...

[two memory-usage graphs showing steady growth over time]

@albus522
Member

albus522 commented Mar 2, 2016

There are many kinds of memory usage Ruby does not clean up well. If you have any jobs that involve a lot of data, Ruby's memory footprint will naturally grow, and not because of DJ. You would see the same thing if the jobs were run inline in the server. There is no easy answer, but you can search around for ways to identify what types of Ruby objects are building up over time.
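
For example, a rough way to check this from a Rails console is to diff object counts around a batch of jobs. This is only a sketch: ObjectSpace and GC are plain Ruby, and Delayed::Worker#work_off runs up to the given number of pending jobs inline in the current process.

GC.start
before = ObjectSpace.count_objects

Delayed::Worker.new.work_off(100)  # run up to 100 pending jobs in this process

GC.start
after = ObjectSpace.count_objects
growth = after.map { |type, count| [type, count - before.fetch(type, 0)] }
              .sort_by { |_, delta| -delta }
puts growth.first(10).inspect      # which internal object types grew the most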

@dgobaud

dgobaud commented Mar 2, 2016

I don't think my jobs consume a lot of data (or use a lot of memory I guess is what you mean?) but I have a lot of jobs... I don't think I had this problem until the number of jobs became high.

I have at least 2,400 jobs that run every 15 minutes.

@albus522
Member

albus522 commented Mar 2, 2016

That in and of itself will generate a lot of objects. How Ruby handles the cleanup is out of our hands, and that graph is very common to many/most Ruby apps. The most common "memory leak" is references to objects that survive the normal work loop, whether that be a single job, a web request, or something else. There are ways to get Ruby to tell you what type of objects are stacking up. Search for object count reports.
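
A minimal "object count report" needs nothing beyond Ruby itself; something roughly like this, run from a console attached to the worker process (or periodically from a job) and diffed over time, will show which classes are accumulating:

GC.start
counts = Hash.new(0)
# group live objects by class; the rescue skips objects (e.g. BasicObject instances) without #class
ObjectSpace.each_object { |obj| counts[obj.class] += 1 rescue nil }
counts.sort_by { |_, n| -n }.first(20).each do |klass, n|
  puts format('%-40s %d', klass, n)
end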

@krtschmr

i have the same issue, but with around 160,000 jobs per hour (!). each job does an http request to scrape data.
how can i avoid this?

  • throw 100 objects into 1 job instead of creating 100 jobs (rough sketch below)
  • kill sidekiq every 30 minutes and start a new instance

i haven't found any other ideas. which one is best?
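
for reference, a rough sketch of the batching option (BatchScrapeJob, scrape and all_urls are made-up names; delayed_job will enqueue any object that responds to #perform):

class BatchScrapeJob < Struct.new(:urls)
  def perform
    urls.each { |url| scrape(url) }  # scrape stands in for your own http/scraping code
  end
end

# 100 urls per job instead of 100 jobs
all_urls.each_slice(100) do |batch|
  Delayed::Job.enqueue BatchScrapeJob.new(batch)
end

fewer job rows also means less per-job locking and polling overhead for the workers.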

@glaszig

glaszig commented Apr 21, 2016

kill sidekiq

this is about delayed_job. or so i thought.

@1c7

1c7 commented Dec 9, 2016

Same here.
My delayed_job also takes up all the memory in about 1 day.
It brings the server down; I'm not even able to ssh into the system and have to manually restart the server on Azure.

ENV

Server: Ubuntu 14.04
Ruby 2.3.1
gem 'delayed_job', "~> 4.1.1"
gem 'delayed_job_active_record', "~> 4.1.0"

@gregblass

@dgobaud What are you using there to monitor your memory usage?

@gregblass

gregblass commented Apr 26, 2017

I think I may be experiencing this too. I've got two workers running delayed jobs on my production server, and after about a day of low/moderate usage I run out of memory on a 1GB EC2 instance. Then my capistrano deploys fail on asset precompile.

@krtschmr

@gregblass nothing you can do about it. i actually restart all my processes via cronjob every 6 hours to prevent bloating

@gregblass

So I don't have a lot of processes running, maybe 20-30 a day I'd think. But in theory isn't this still not OK? How can processes be spawned and then the memory just be lost into the void?

I added a swap of 1GB and it fixed my Capistrano issues.

But in the long run, if delayed job is going to slowly eat away at my server's memory, I will avoid it and use something that doesn't. Does Sidekiq have this issue?

@gregblass

Oh nevermind, you're talking about 500K+ processes per day! That's crazy. Congrats on whatever you're working on. There's no way I'm experiencing the same kind of effects as you. I think it may be the two workers I'm spawning and the fact that I have only 1GB of real memory allocated, plus a ton of JS/CSS to precompile?

@djdarkbeat

Taking a moment to share how I resolved this:

Set config/environments/development.rb to use this -
config.cache_classes = ENV['CACHE_CLASSES'] == 'true' # ENV values are strings, so compare explicitly rather than rely on truthiness

Then we ran delayed job with -
CACHE_CLASSES=true bundle exec rake jobs:work

Clearly, if you are developing a job you want to run in the background, you will need to just omit the ENV variable; then it will use reloading, albeit with the memory creep. This solution is helpful if you just need to run it for a long time in dev, and it keeps your laptop from blowing up if you leave everything running in tmux or something.

@ghiculescu

cache_classes = false solves the issue but isn't very satisfying. We dug into it a bit more and ended up with this solution: #1115 (comment)

Now reloading only runs when files are changed - the problem before was that your whole app was reloaded every 5 seconds.

@AhMohsen46

@ghiculescu I'm sorry, I got confused: should cache_classes be set to true or false in the discussion above?

Second, the plugin you created: should it be loading only the files needed by the worker instead of loading the whole env?

Should one limit cache_classes=false to the worker, since the plugin will handle reloading the needed files, and set cache_classes=true everywhere else?

thanks

@ghiculescu

ooops, I meant cache_classes = true solves the issue but isn't very satisfying. It's not satisfying because it makes the memory leak issue go away, but your code won't reload while in development mode, so if you change a file you'd need to restart your job worker.

#1115 (comment) reloads the entire app, but only does so when:

  1. a job is about to be run, and
  2. you've changed files in your autoload paths (typically your app dir) since the last time you ran a job

In other words it behaves exactly the same as your Rails server.

You should always set cache_classes = false in development so you get code reloading. In production you want cache_classes = true; with that, the plugin will do nothing (which is fine - you don't want code reloading in production).
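
If you can't pull in the linked code directly, a minimal sketch of the same idea looks roughly like this (not the exact code from #1115; it assumes Rails 5+ for Rails.application.reloader and uses delayed_job's standard plugin/lifecycle hooks). Drop it in an initializer:

class ReloaderPlugin < Delayed::Plugin
  callbacks do |lifecycle|
    lifecycle.around(:perform) do |worker, job, *args, &block|
      # reloader.wrap checks whether files in the autoload paths have changed,
      # reloads the app if so, and runs the job inside the Rails executor
      Rails.application.reloader.wrap do
        block.call(worker, job, *args)
      end
    end
  end
end

Delayed::Worker.plugins << ReloaderPlugin

With cache_classes = true (production) there is nothing to reload, so the wrapper is effectively a no-op there.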

@bubbaspaarx

I am having a related issue but need some help.
I have two laptops running the same code, and on one of them, even if I run 'rake jobs:clear', the ruby process opens up and never closes. On the other laptop, the same code opens the ruby process but it closes within a few seconds. I can't figure out why two very similar laptops (MacBook Pros) with the same code base are acting completely differently. Any help is greatly appreciated.

@chiperific

chiperific commented Apr 22, 2022

I had a stair-stepping memory leak on Heroku running Delayed Job, and my app was hitting the swap memory line daily.
[chart: Heroku worker memory with DelayedJob]

I really didn't want to change config.cache_classes to false so it was recommended I switch to Sidekiq and see if the issue remained.
[chart: Heroku worker memory with Sidekiq]

I think you can pinpoint the moment when my Sidekiq code was deployed.

I just swapped in Sidekiq in place of DelayedJob; I didn't make any other changes. I've really enjoyed using DelayedJob over the last 7 years, but I think these charts say it all.

As a disclaimer, I'm a pretty junior dev, so it's definitely possible other solutions might have fixed my issue and allowed me to keep using DelayedJob, but it's hard to argue for DelayedJob at this point.

@albus522
Member

Closing as stale.
