
Update for new flyctl, add instructions, fix media volume #2

Merged · 35 commits · Nov 21, 2022

Conversation

@indirect (Collaborator) commented Nov 17, 2022

This set of changes updates the steps to reflect the changes in flyctl options since the guide was originally written.

It also uses Overmind to run both Rails and Sidekiq inside the same VM, so they can share the volume with cached media. Without a shared volume, network errors during inline fetches enqueue retry jobs, and Sidekiq then downloads the media into a container the web server can't reach, causing random broken image tags.
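As a sketch, the Procfile that Overmind runs inside the VM could look like this (the exact commands are assumptions based on a standard Mastodon checkout, not copied from this PR):

```
web: bundle exec puma -C config/puma.rb
sidekiq: bundle exec sidekiq
```

Because both processes live in the same VM, they see the same mounted media volume.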

It also adds notes on setting up SMTP, upgrading Mastodon, and using custom domains with SSL.

Fixes #1.

- This lets us update to newer versions of Mastodon by updating the Docker image and deploying.
- With binstubs you can run e.g. `bin/production status` or `bin/production-redis status` without having to remember or type `-c fly.whatever.toml`.
- `db:setup` actually generates an error, since Fly has already created the database when we ran `fly pg create`. We just need to load the schema before the first deploy, so that we don't have to migrate from zero during the release command.
- Including how to run migrations before the new code is live via a second, temporary app.
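The binstub itself can be a tiny wrapper; here's a sketch (the filename `bin/production` and the config name `fly.production.toml` are assumptions, not necessarily what this repo uses):

```shell
# Sketch: generate bin/production, a flyctl wrapper that always appends the
# production config flag, so `bin/production status` runs
# `flyctl status -c fly.production.toml`.
mkdir -p bin
cat > bin/production <<'EOF'
#!/usr/bin/env bash
set -euo pipefail
exec flyctl "$@" -c fly.production.toml
EOF
chmod +x bin/production
```

A matching `bin/production-redis` would do the same thing with the Redis app's config file.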
@tmm1 (Owner) commented Nov 18, 2022

> Without sharing a volume, network errors generate background jobs to try again later, and Sidekiq downloads the media into a container that the web server can't get to, causing random broken image tags.

Ah, interesting. This is when you're not using S3 etc for attachments?

@indirect (Collaborator, Author)

Yes. If you use Fly volumes to hold fetched remote media, requests for files downloaded by Sidekiq become random 404s, because on Fly, volumes are 1-1 with VMs.

@tmm1 (Owner) commented Nov 18, 2022

Gotcha. Thanks for working on this!

I just remembered there's one more piece missing: the streaming Node.js server.

IIRC, you have to run nginx inside Docker too and tell it to send /api/v1/streaming requests to the Node.js server. We'll have to add the Node.js streaming entry to the Procfile as well.

See jesseplusplus/decodon#7

Puma uses less memory when it is completely out of cluster mode, which
we actively want while running in a tiny Fly VM. This does not reduce
the number of Puma processes, since we were already only running one.
This is the default on Heroku, to save RAM: https://devcenter.heroku.com/changelog-items/1683

Also recommended by Nate Berkopec, noted Rails performance optimizer, at https://www.speedshop.co/2017/12/04/malloc-doubles-ruby-memory.html.

Also recommended by Mike Perham, author of Sidekiq, at https://www.mikeperham.com/2018/04/25/taming-rails-memory-bloat/.
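As a sketch, "completely out of cluster mode" corresponds to something like this in config/puma.rb (the thread counts here are assumptions, not values from this repo):

```ruby
# config/puma.rb (sketch): `workers 0` puts Puma in single mode, so there is no
# cluster master process at all; we were already running only one worker anyway.
workers 0
threads 5, 5

# The linked articles additionally recommend setting MALLOC_ARENA_MAX=2 in the
# environment to curb glibc malloc arena growth.
```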
@indirect (Collaborator, Author)

Thanks for calling that out. I added the Node.js streaming server to the Procfile, and added Caddy as a reverse proxy for just that one URL.
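For reference, a minimal sketch of that Caddy setup (the ports are assumptions: Mastodon's streaming server conventionally listens on 4000 and Puma on 3000, and the listen port is a placeholder):

```
:8080 {
    reverse_proxy /api/v1/streaming* localhost:4000
    reverse_proxy localhost:3000
}
```

Caddy orders reverse_proxy directives by path-matcher specificity, so only streaming traffic reaches the Node.js server and everything else falls through to Puma.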

@tmm1 (Owner) commented Nov 19, 2022

Awesome, this is looking great.

Re: #1, since the issue there is specific to local assets, it'd be nice to still allow separate Sidekiq workers that can be scaled up independently of the web app. Maybe we can document the choices in the README? So if you choose to store assets locally, you have to run in the combined mode; otherwise you can separate them via env vars or something.

I would guess most Mastodon instances are using S3 or other cloud storage. It might make sense to make that the default here too, so you're not stuck with a single Puma/Sidekiq VM if you need to scale.

@indirect (Collaborator, Author)

Yeah, I think that's probably true. Part of the reason I removed the [processes] directive is that it's a little half-baked: you can never remove process groups once they've been created, and VMs seem to stop correctly associating with regions. In my original testing, Sidekiq kept starting up in a random far-away region, even though it was supposedly only allowed to use the same region as the Redis and web servers.

Let me think about how to implement an option or transition plan and get back to you later today or tomorrow.

@indirect (Collaborator, Author)

Okay, I think I've figured out a reasonable plan that includes a way to transition from volumes to cloud storage, and from all-in-one to "web" and "sidekiq" apps that can scale their count separately. You can even keep image URLs identical via Caddy, if you want that. I've tested this out on my own instance, migrating from a Fly Volume to Wasabi (which is API-identical to S3), and then migrating the sidekiq worker out into a separate app so it can scale on its own.
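For anyone following along, Mastodon's S3-compatible storage is configured entirely through env vars; a hedged sketch of the Wasabi cutover via Fly secrets (bucket name, endpoint, and keys are placeholders, not values from this PR):

```
fly secrets set \
  S3_ENABLED=true \
  S3_BUCKET=your-bucket \
  S3_ENDPOINT=https://s3.wasabisys.com \
  AWS_ACCESS_KEY_ID=your-key \
  AWS_SECRET_ACCESS_KEY=your-secret
```

Setting S3_ALIAS_HOST to the Caddy host is one way to keep the image URLs identical after the migration.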

However, Mastodon seems very shouty about how you must never run more than one sidekiq process with the scheduler job, and it's not at all clear to me how to scale a sidekiq process or app while keeping that true. I would love to hear how you have solved this elsewhere.

@indirect (Collaborator, Author)

After a bunch more fiddling, I think [processes] actually is the way to go for separately scalable worker processes that share env vars and secrets; we just have to be sure only one process ever runs the scheduler queue. I've updated the README and the fly.toml file with pre-written sections that you can uncomment to get horizontally scalable Rails and non-scheduler-queue Sidekiq VMs.
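A hedged sketch of what such a [processes] split could look like in fly.toml (queue names follow Mastodon's standard set; the exact commands are assumptions, not copied from this repo):

```toml
[processes]
  # the all-in-one group keeps the ONLY Sidekiq that runs the scheduler queue
  app = "overmind start -f Procfile"
  # horizontally scalable workers handle every queue except scheduler
  worker = "bundle exec sidekiq -q default -q push -q pull -q mailers -q ingress"
```

You could then scale with something like `fly scale count worker=2` without ever duplicating the scheduler.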

@tmm1 (Owner) commented Nov 20, 2022

cc @jsierles @tqbf

@tmm1 tmm1 merged commit df9ee64 into tmm1:main Nov 21, 2022
Merging this pull request closed: Sidekiq and Puma can't run in separate processes/VMs (#1)