The first 7 months and the last 72 hours

Since April of this year I've been running my own Mastodon server, and 3 days ago we hit 100 users, which was a huge milestone for my tiny little server. Then all of a sudden the other Mastodon servers started to fill up and new users were looking for homes. Less than 72 hours after being excited about hitting 100 users, we hit 10,000.

10,007 active users

Awesome, right? The more users the better when it comes to social media! But the server was not ready for that kind of load. 2-3 new users every few seconds, plus all of the new users being active at once, was putting some strain on the server, and to make things worse my e-mail provider had our account capped at 7200 e-mails per day, which was more than I ever expected to need. For the first 48 hours I kept having to open and close signups to let the server catch up, even after upgrading the server hardware. It was pretty insane.

My server was just falling over from all of the Sidekiq jobs queueing up after 2500 new users, and I needed to optimize the server to handle them or we would have been stuck at 2500 forever. If you're running a Mastodon server, the Sidekiq dashboard is super important and basically tells you when your server is about to throw in the towel. Prior to October 27th Sidekiq was handling less than 10,000 jobs per day. From October 27th to November 3rd it was averaging just under 100,000 jobs per day (this was due to heavier usage on the Fediverse as a whole), from November 4th to the 6th it rose to 300,000 jobs per day, and then on November 7th we hit 3,048,305 jobs for that day. On November 8th we did just under 3,000,000 jobs because we had to disable new signups every few hours and disable them overnight to ensure stability. We turned signups back on November 9th and hit 3,492,541 jobs for the day. Looking at all the data together, the numbers for those busy days can be attributed to A LOT of retries because the server could not keep up with the demand at all. The number of jobs for today, after doing all of the tweaks, is much more reasonable even after doubling our user count today.

Sidekiq graph showing number of jobs


Crash Course Server Admining

The rest of this post will mostly be about the technical side of things, I will do other posts later about non-technical stuff, but I felt this was the most important information I can share now for anybody who is running or wants to run their own Mastodon server.

I've been a server admin since college (many, many moons ago) so managing a server isn't new to me, but PostgreSQL, Puma, and Ruby/Sidekiq are all fairly new to me so this whole situation forced me to do research for all 3 to figure out what I could do better instead of just throwing hardware resources at the problem. There are some good tuning guides for Mastodon out there, but they were a little too basic to be helpful for me. They did point me in the right direction though and I was able to read more guides (especially a Mastodon blog post from 2017 that was very helpful).

I'm going to make this simple so you don't have to do all of the research I did (although it's still recommended because I might actually be lucky and not right), these are the top things you need to do to optimize your Mastodon server:

  1. Break out your Sidekiq services from a single service.
  2. Adjust your WEB_CONCURRENCY and MAX_THREADS for your mastodon-web service.
  3. Install Pgbouncer for your PostgreSQL server.

Break out your Sidekiq services

This is the absolute easiest thing you can do for the biggest performance benefit, thanks to Stux over at mstdn.social, who put the files on GitHub basically ready to go: https://github.com/mstdn/Mastodon/tree/main/dist

The process to set this up is super simple:

  1. SSH into your server.
  2. Navigate to /etc/systemd/system/

    cd /etc/systemd/system/

  3. Now download each of the files named "mastodon-sidekiq-*" to that directory (note I had to remove the @ in the filename for it to work for me for some reason).

    wget https://raw.githubusercontent.com/mstdn/Mastodon/main/dist/mastodon-sidekiq-default-%40.service
    wget https://raw.githubusercontent.com/mstdn/Mastodon/main/dist/mastodon-sidekiq-mailers-%40.service
    wget https://raw.githubusercontent.com/mstdn/Mastodon/main/dist/mastodon-sidekiq-pull-%40.service
    wget https://raw.githubusercontent.com/mstdn/Mastodon/main/dist/mastodon-sidekiq-push-%40.service

    mv mastodon-sidekiq-default-@.service mastodon-sidekiq-default.service
    mv mastodon-sidekiq-pull-@.service mastodon-sidekiq-pull.service
    mv mastodon-sidekiq-push-@.service mastodon-sidekiq-push.service
    mv mastodon-sidekiq-mailers-@.service mastodon-sidekiq-mailers.service

IMPORTANT: Update the working directory in each of these files to your Mastodon's live directory.

  4. Now reload systemctl for the server to see the new files.

    systemctl daemon-reload

  5. Now enable and start those new services (keep the main service running still, it has the scheduler queue in it).

    systemctl enable mastodon-sidekiq-default.service
    systemctl enable mastodon-sidekiq-pull.service
    systemctl enable mastodon-sidekiq-push.service
    systemctl enable mastodon-sidekiq-mailers.service
    systemctl start mastodon-sidekiq-default.service
    systemctl start mastodon-sidekiq-pull.service
    systemctl start mastodon-sidekiq-push.service
    systemctl start mastodon-sidekiq-mailers.service

  6. Now navigate to your Sidekiq dashboard and check under the "Busy" tab to confirm the new processes have started up.

    Sidekiq process list

And that's it, you've just increased your performance substantially in just a handful of steps!
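For the working-directory update mentioned in the steps above, here is a minimal sketch of one way to do it with sed. The helper name set_workdir and the path /home/mastodon/live are assumptions for illustration only; substitute wherever your Mastodon install actually lives, and look over each unit file for any other paths that may need the same change.

```shell
# Hypothetical helper: rewrite the WorkingDirectory= line in one unit file.
# Relies on GNU sed's in-place flag (-i), which most Linux servers have.
set_workdir() {
    unit_file="$1"   # path to a .service file
    live_dir="$2"    # your Mastodon live directory
    sed -i "s|^WorkingDirectory=.*|WorkingDirectory=${live_dir}|" "$unit_file"
}

# Example (assumed path), run before "systemctl daemon-reload":
# for f in /etc/systemd/system/mastodon-sidekiq-*.service; do
#     set_workdir "$f" /home/mastodon/live
# done
```

Editing each file by hand in vi works just as well; the loop only saves some typing across the four files.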


Adjusting WEB_CONCURRENCY and MAX_THREADS

Now this tweak is also super simple and could potentially offer a nice performance boost, depending on your server's configuration and how you set these fields (this is the part I'm not 100% sure about, so I copied the settings from the 2017 blog post).

I recommend you go read the blog post here to better understand the specifics. The tl;dr is basically that each extra WEB_CONCURRENCY process costs you more RAM, while each extra MAX_THREADS thread costs you more CPU.

  1. SSH into your server.
  2. Navigate to /etc/systemd/system/

    cd /etc/systemd/system/

  3. Edit the mastodon-web.service file.

    vi mastodon-web.service

  4. Now add the following lines above the "ExecStart=" line while replacing the number symbol (#) with the number you want to use (for my quad-core server I did 8 processes and 2 threads each because I have more RAM to spare than CPU cores):

    Environment="WEB_CONCURRENCY=#"
    Environment="MAX_THREADS=#"

  5. Now save the file.

    :wq

  6. Now reload systemctl for the server to pick up the change.

    systemctl daemon-reload

  7. Lastly restart the service.

    systemctl restart mastodon-web

  8. As long as you don't see an error you can now check that you have the new number of workers and threads.

    systemctl status mastodon-web

    SSH output showing mastodon-web with 4 workers

    This example shows 4 workers with 5 threads, for your specific server you may need to do some trial and error or reach out to somebody who understands Puma to get a better idea of what you should set them to.
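One thing worth keeping in mind when picking these numbers: each Puma thread can hold its own database connection, so the web service alone may need up to WEB_CONCURRENCY x MAX_THREADS connections. Here is a rough sketch of that arithmetic using the 8 processes and 2 threads mentioned above (the variable name puma_db_connections is just for illustration):

```shell
# Values from the quad-core example above; swap in your own.
WEB_CONCURRENCY=8
MAX_THREADS=2

# Upper bound on database connections the web service may open.
puma_db_connections=$((WEB_CONCURRENCY * MAX_THREADS))
echo "Puma may open up to ${puma_db_connections} database connections"
```

Your Sidekiq processes need database connections on top of that, so the total can creep toward PostgreSQL's connection limit faster than you might expect.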


Installing Pgbouncer for PostgreSQL

This was the hardest tweak for me because none of the tutorials I found worked for me, I had to figure this out on my own after watching way more tutorials than I had planned on for apps that weren't Mastodon. The good news is I figured it out so you don't have to.

Ultimately this guide at Masto.host was the best one out there and it's 100% correct aside from 2 things it's missing so instead of re-writing the guide I'm just going to explain the two issues I had with it so you can follow the guide and then fix those two things:

  1. I could not get the MD5 hashed password to work for the userlist.txt file, I opted to use the password in plaintext because who cares since the password is in plaintext in the .env.production file for Mastodon.
  2. In the pgbouncer.ini file make sure the "max_client_conn" number is higher than the "max_connections" value in your PostgreSQL config file (you can find this in the PgHero page in your Mastodon Admin dashboard under "Tune" or find the postgresql.conf file for your server). The default for both is 100 so I set the "max_client_conn" number to 1000 because I have the extra RAM for it.
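If you want to double-check the second point from the command line, here is a small sketch that compares the two settings. The function names are hypothetical and the file paths in the example call are assumptions that vary by distro and PostgreSQL version; it also only looks at the config files themselves, so settings applied some other way won't be seen.

```shell
# Hypothetical helper: print the first numeric "key = value" setting in a file.
# Commented-out lines (starting with #) are ignored.
numeric_setting() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p" "$1" | head -n 1
}

# Succeeds only when Pgbouncer accepts more clients than PostgreSQL allows
# connections. Falls back to 100 (the default for both, as noted above)
# when a setting is missing or commented out.
pgbouncer_limit_ok() {
    client=$(numeric_setting "$1" max_client_conn)
    server=$(numeric_setting "$2" max_connections)
    [ "${client:-100}" -gt "${server:-100}" ]
}

# Example call (both paths are assumptions):
# pgbouncer_limit_ok /etc/pgbouncer/pgbouncer.ini \
#     /etc/postgresql/14/main/postgresql.conf && echo "limits look fine"
```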

In the event that guide and this info still don't work for you, please reach out to me and I'll do either a video tutorial or a step-by-step write-up with screenshots, because I do remember it was a tough process for somebody unfamiliar with PostgreSQL.

Parting words

I'd like to end this post with some other random observations I've made these past few days:

  • Backups are important, but run them too frequently and they can kill your server. I was doing hourly backups of the database and it did not make the server happy during those few minutes.
  • Disk space can get out of hand fast if you're not using object storage like AWS S3 or, my personal favorite, Backblaze B2. The vast majority of the storage will be media you cache from other servers on the Fediverse. You can clean this cache up by running the following command in your /live directory: RAILS_ENV=production bin/tootctl media remove --days 30 and this will remove any cached files older than 30 days (this number can be changed). If your users access any of those files in the future it will automatically re-download them, so there's no harm in doing this frequently if space is a concern.
  • Don't be afraid to disable signups to give your server a break. Your current users' experience is more important than your user count, don't give them a bad experience if you can avoid it.
  • The Sidekiq dashboard is your friend, I recommend becoming familiar with it and understanding the numbers. Also, during busy times you'll see the queued jobs jump up, it's okay as long as it eventually starts to go down and doesn't keep climbing.
  • When users upload an image or video your server will compress/transcode those files using ImageMagick or ffmpeg and these can be killer on your CPU for a minute so be sure to have extra CPU headroom for these processes, running your CPU at or near 100% is a bad idea because of these 2 processes specifically.
  • Set up a method for accepting donations even if you aren't accepting them yet. I was adamant about not accepting financial support for my project, but that quickly changed as I found myself upgrading the server faster than I had planned. While I was ready to shell out the money, I had no clue how much I would need to pay to make things stable (before all of the tweaks of course). I set up PayPal, Liberapay, and Ko-Fi accounts that same night and a few people have already pitched in to help cover the costs for a few months, which is super awesome of them and I'm very appreciative.
  • If you get your server listed on joinmastodon.org, be prepared for an influx of users at a rate you're not expecting. Right now there are so few servers accepting new signups on that list so users will jump on any that are listed with open signups (meaning no manual approval required). Good luck if you get on this list. ;)

Wanna take a guess what day we got listed? Graph of the total hits to the site in the past 30 days
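Going back to the disk space point above, here is a minimal sketch of how the media cleanup could be tied to how full the disk is. The function names, the 80% threshold, and the idea of passing your live directory as an argument are all assumptions for illustration; the tootctl command itself is the one from the list above.

```shell
# Print how full the filesystem holding a directory is, as a whole number.
disk_used_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical wrapper: clear cached remote media once usage passes a
# threshold (default 80, an arbitrary pick; adjust to taste).
prune_media_if_full() {
    live_dir="$1"
    threshold="${2:-80}"
    if [ "$(disk_used_pct "$live_dir")" -ge "$threshold" ]; then
        (cd "$live_dir" && RAILS_ENV=production bin/tootctl media remove --days 30)
    fi
}

# Example (assumed path): prune_media_if_full /home/mastodon/live 80
```

This only makes sense when the media lives on the server's own disk; with object storage the local disk usage tells you nothing about the cache size.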

Lastly, I figured I should post about my actual server setup for those interested.

The server is a CPX31 plan from Hetzner.Cloud*, which costs only about $16.25 per month including automatic daily backups.

I am using Backblaze B2 for the Mastodon file storage for 3 reasons:

  1. Cost - Backblaze B2 is the cheapest object storage out there at around $5 per TB and I only pay for what I use.
  2. Scalable - I can use as much or as little storage as I need and I can easily move the storage to another server if I want without having to copy all of the files over (currently over 600,000 files totaling over 136GB).
  3. DIY CDN - Backblaze has a deal with Cloudflare which includes unlimited and free bandwidth if you put Cloudflare in front of the object storage. Here's a great guide to building your own CDN: https://www.backblaze.com/blog/free-image-hosting-with-cloudflare-transform-rules-and-backblaze-b2/

As previously mentioned, I'm also using Cloudflare for my DNS/CDN. I'm currently using the free plan which has been great and thanks to the caching I've saved quite a bit on bandwidth (which also means better performance for the users). I've gone from using about 100MB of bandwidth per day to 217GB per day with most of it being cached so according to Hetzner, my server has only used about 100GB of bandwidth for the past 10 days which is impressive. Bandwidth graph for the past 30 days

I'm going to wrap things up here and say I'd like to wish you luck with your endeavor. If you're interested in starting your own Mastodon server, hopefully this information helps you. If you're just planning to host a small Mastodon server for you and maybe a few friends, you don't need anything fancy or expensive. Heck, the Raspberry Pi team is hosting their own Mastodon server on a Raspberry Pi 4 and there's even a person hosting theirs on a Chromebook! The possibilities are nearly endless.

Go out and do good things! 👍 -KuJoe

*Affiliate link