@lazyatom
Last active September 3, 2024 11:39
Cleaning up Ruby.social

Hey folks.

Firstly, thanks for being here.

Secondly, while you read this, I want to be totally clear that I am not asking you for money, just your thoughts.

Ruby.social has been running for 6 years, and for that whole time we've been hosted by Hugo at https://masto.host. This was one of the best decisions I made in the early life of the instance; we've had almost no issues at all, and any time we've needed help, Hugo has been right there.

All that great service doesn't come for free though (nor should it!); right now the instance hosting costs about $300/month. Even though their numbers are gradually decreasing, our existing patrons and sponsors easily cover this (thank you!) so there's no financial cliff looming.

But there is a trend, and I want to get ahead of it if we can.

The way Mastodon works results in every instance storing copies of accounts and posts from all across the Fediverse. Every person you follow, we store copies of all of their posts, even if they're on other instances. Even if we shut the doors to the instance and didn't let anyone new join, the database would continue to slowly grow, as copies of remote statuses percolate across the network.

The end result of this is that the database gradually but steadily grows and grows.

Right now our database is around 100GB. Every couple of months I check in on the size and upgrade our Masto.host account if required, to make sure that we're still within our plan. And that's all fine! But at some point in the future, the lines of patron funding vs hosting costs are going to cross, unless we either continually grow our patron pool, or figure out a way to keep on top of the database size.

Like I said at the start, I'm not writing this to ask you for money. What I want to ask about is one of the options we have for staying on top of the database size.

There's a backend command we could run:

tootctl statuses remove --days 365

which would clear out all the old remote statuses that have not been interacted with (no boosts, no favourites, no bookmarks etc), by accounts that nobody follows, that are more than a year old. Running this command is not trivial, but it would result in basically no impact to your experience of the instance. (The only thing you might notice is that statuses from accounts that you used to follow are no longer locally cached.)

BUT: there's another variant of this we could run:

tootctl statuses remove --days 365 --clean-followed

which does the same, but also removes statuses for accounts that you do follow.

Basically, right now we host an ever-growing archive of statuses from the people that you follow, even though the "source of truth" for those statuses is whatever instance that account is on. Running that command would remove all remote statuses more than a year old that have not been interacted with (boosted, bookmarked, replied to and so on), meaning that if you wanted to go read an old status from @Gargron@mastodon.social, you'd need to go to https://mastodon.social/@Gargron rather than https://ruby.social/@Gargron@mastodon.social. Which is pretty much the same as what you'd need to do to find any older statuses that someone posted before you started following them.
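If it helps to see the difference between the two variants spelled out, here's a rough sketch in Python of the selection rule as I've described it above. To be clear, this is not Mastodon's actual implementation: the `Status` fields and the `would_remove` function are names I've made up purely for illustration, and the real command works directly against the database.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Status:
    # Hypothetical, simplified model of a cached status; not Mastodon's schema.
    created_at: datetime
    is_local: bool               # posted by an account on this instance
    has_local_interaction: bool  # boosted, favourited, bookmarked, replied to, etc. by someone here
    author_is_followed: bool     # at least one person on this instance follows the author


def would_remove(status, now, days=365, clean_followed=False):
    """Return True if the status would be pruned under the rule described above."""
    if status.is_local:
        return False  # only remote copies are candidates
    if status.has_local_interaction:
        return False  # anything interacted with locally is kept
    if status.created_at >= now - timedelta(days=days):
        return False  # newer than the cutoff
    if status.author_is_followed and not clean_followed:
        return False  # the first variant keeps statuses from followed accounts
    return True


now = datetime(2024, 9, 3, tzinfo=timezone.utc)
old_followed = Status(
    created_at=now - timedelta(days=400),
    is_local=False,
    has_local_interaction=False,
    author_is_followed=True,
)

print(would_remove(old_followed, now))                       # False
print(would_remove(old_followed, now, clean_followed=True))  # True
```

The only thing that changes between the two variants is that last check: with `clean_followed` set, an old, untouched status from someone you follow becomes a candidate for removal too.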

So here's the question: is it better that Ruby.social contains a more-complete archive of statuses for people that we are collectively following? Or does it not matter, and we can try to more thoroughly prune the data and then use our patron funds for other things?

I'm conscious that even asking the question is somewhat leading -- you'll probably implicitly lean towards what you think I want you to permit me to do. But I'm really happy either way, I just want to gauge the overall leaning of the members of our instance. (And on the flip-side, I also don't guarantee to bind myself to any action implied by the results of this poll!)

Here's a link to the status where you can share what you think:

https://ruby.social/@james/113072897248579619

Like I said -- this is not a plea for money. Just curious what you think our policy ought to be.
