Looking to maybe self host my own instance, I’m still learning about the fediverse. If a different instance that I federate with hosts something illegal are there risks to me? Is anything from other instances hosted on my server like a copy of it? Or would I only end up hosting things my users post? I’m paranoid and sorry if this is a silly question.
How much disk space would someone need to plan for a small Lemmy instance?
Well, here’s my first post on the fediverse!
Background in IT and server administration here. I however do not know much about the intricacies of the fediverse, but am interested in learning. Here’s my two cents based on a background of LAMP stacks for web hosting.
The required space would likely scale and vary greatly depending on how much content is hosted locally. Assuming minimum space similar to a basic LAMP server it’d likely have starting space requirements of less than 1GB. If local content is primarily text/links to content hosted elsewhere it would take a lot to drastically change that space requirement. Image hosting can vary greatly depending on size, quality, and number of images. Video hosting is an absolute space hog even at fairly low resolutions by today’s standards.
Bandwidth requirements would scale similarly to storage requirements.
Other specs would also start very low if fediverse requirements are similar to a LAMP stack. Core count is typically more important than core speed in web server hosting, as each request will try to use a separate core but doesn’t need much processing power to serve, since the server isn’t actually rendering anything.
Likewise, you shouldn’t need much memory on a web host. Memory use will scale with the number of scripts running on the host, but I suspect that shouldn’t be many unless you’re also running moderation bots, and those should ideally be run on a different server instance.
That said, I’d also be curious to hear from other people who have experience with the fediverse, and what specs they’d recommend for hosting an instance.
If anyone has other questions I’m happy to try to help :)
I’m running it on the smallest Vultr VPS, with 25GB of disk.
This instance only has 3 users, with me being the only active one. It says it’s been up for almost a month and I’ve only used 3GB.
Here are the docker volumes which have the actual data of your instance, and from inside the DB the biggest table is the one called activity, which the devs said is only sometimes used to validate the data, but could be truncated if needed (there’s a scheduled task which only keeps up to 6 months).
Also, the thing to keep in mind is to properly configure the logs for whichever installation guide you follow.
After that I’ve seen other admins say the next biggest is the media uploaded (from bigger instances).
$ du -h --max-depth=1
640K    ./pictrs
3.2G    ./postgres
3.2G    .

lemmy=# select table_name, pg_size_pretty(pg_relation_size(quote_ident(table_name))), pg_relation_size(quote_ident(table_name))
        from information_schema.tables
        where table_schema = 'public'
        order by 3 desc;
         table_name         | pg_size_pretty | pg_relation_size
----------------------------+----------------+------------------
 activity                   | 2187 MB        |       2292867072
 comment                    | 56 MB          |         58212352
 person                     | 48 MB          |         50307072
 comment_like               | 45 MB          |         47161344
 post_like                  | 22 MB          |         22781952
 comment_aggregates         | 14 MB          |         14811136
 post                       | 13 MB          |         13623296
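As a rough back-of-the-envelope check on the figures quoted above, the sketch below works out how much of the used space the activity table accounts for and how long a 25GB disk would last at the same pace. It assumes growth stays linear and rounds "almost a month" up to one month, neither of which is guaranteed; the function names are made up for illustration.

```python
# Figures taken from the post above; the linear-growth model is an assumption.
DISK_GB = 25.0                 # VPS disk size
USED_GB = 3.2                  # total reported by `du -h`
UPTIME_MONTHS = 1.0            # "almost a month", rounded up (assumption)
ACTIVITY_BYTES = 2292867072    # pg_relation_size of the activity table


def activity_share(activity_bytes, used_gb):
    """Fraction of the used space taken up by the activity table."""
    return activity_bytes / (used_gb * 1024 ** 3)


def months_until_full(disk_gb, used_gb, uptime_months):
    """Months of headroom left if usage keeps growing at the same rate."""
    rate_gb_per_month = used_gb / uptime_months
    return (disk_gb - used_gb) / rate_gb_per_month


if __name__ == "__main__":
    share = activity_share(ACTIVITY_BYTES, USED_GB)
    months = months_until_full(DISK_GB, USED_GB, UPTIME_MONTHS)
    print(f"activity table share of used space: {share:.0%}")
    print(f"months until the disk is full: {months:.1f}")
```

With these inputs the activity table is roughly two thirds of the used space, which is why trimming it matters more than anything else on a small instance.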
The activity table is also used to deduplicate incoming federation data, so instead of truncating it, I’d suggest deleting rows after a certain amount of time.
For my personal instance, I set up a cron job to delete entries older than 3 days, and my db is only ~500MB with a few weeks of content! I also haven’t seen any duplicated posts or comments. Even with Lemmy’s retries, 3 days seems to be long enough before dropping rows from that table.
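For anyone wanting to try the same kind of cleanup, here is a minimal sketch of what such a job might look like. It is not the command the poster runs: it assumes the activity table has a published timestamp column, that psql is installed, and that the password is supplied through PGPASSWORD or ~/.pgpass. The environment variable names LEMMY_DB, LEMMY_DB_USER and LEMMY_DB_HOST are hypothetical. By default it only prints the command; pass --execute to actually run it.

```python
import os
import subprocess
import sys

RETENTION_DAYS = 3  # retention period mentioned in the comment above


def build_cleanup_sql(days):
    """Build a DELETE for activity rows older than `days` days.

    Assumption: the activity table has a `published` timestamp column.
    """
    if days < 1:
        raise ValueError("days must be a positive integer")
    return (
        "DELETE FROM activity "
        f"WHERE published < NOW() - INTERVAL '{int(days)} days';"
    )


def build_psql_command(database, username, host, sql):
    """Return the psql argument list that would run `sql` once."""
    return ["psql", "-h", host, "-U", username, "-d", database, "-c", sql]


if __name__ == "__main__":
    # Hypothetical environment variable names; replace with your own values.
    cmd = build_psql_command(
        database=os.environ.get("LEMMY_DB", "lemmy"),
        username=os.environ.get("LEMMY_DB_USER", "lemmy"),
        host=os.environ.get("LEMMY_DB_HOST", "localhost"),
        sql=build_cleanup_sql(RETENTION_DAYS),
    )
    if "--execute" in sys.argv:
        # psql picks up the password from PGPASSWORD or ~/.pgpass.
        subprocess.run(cmd, check=True)
    else:
        print("dry run:", " ".join(cmd))
```

Defaulting to a dry run makes it easy to check the statement before scheduling the script from cron or a similar scheduler.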
Could you share the cron/script you use to do this? I’m interested in hosting my own Lemmy at some point, and having a script for that cleanup would be hugely helpful for me.
Definitely! I’m hosting in Kubernetes so I won’t post the full thing, but here’s the actual command that I run hourly. Make sure to replace the values for database, username, and password.
Awesome, that was just as straightforward as I was hoping it was, thanks! I am more familiar with MySQL as I haven’t used Postgres a ton but SQL is SQL after all lol
You’re welcome! Makes sense. They’re somehow so similar yet so different lol
How are you keeping your pictrs directory so small?
Mine is at about 5GB after two weeks with just a single user. 😬
I also have around 3GB used for pictrs and I’m not really sure the best way to see what all content is in there.
Yeah I haven’t uploaded any images on my instance myself. So none of those images are mine. Might do some reading tomorrow and see if there’s any mention of this in the past on other communities. It’s not an emergency but I’m curious.
That’s strange. Please let me know what you find out.
I had found an old post which indicates that post thumbnails are cached. So I guess there’s that.
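To get at least a rough picture of what is taking up space in a pictrs directory, something like the sketch below could help. It only walks the directory and reports file counts and total bytes per top-level entry, so it cannot tell which posts the images belong to. The default path ./pictrs is an assumption based on the du output earlier in the thread, and summarize_dir is a made-up helper name.

```python
import os
import sys
from collections import defaultdict


def summarize_dir(root):
    """Map each top-level entry under `root` to (file_count, total_bytes).

    Files directly inside `root` are grouped under ".".
    """
    totals = defaultdict(lambda: [0, 0])
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        top = "." if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks so nothing is counted twice
            totals[top][0] += 1
            totals[top][1] += os.path.getsize(path)
    return {key: tuple(value) for key, value in totals.items()}


if __name__ == "__main__":
    # Assumed default location; pass the real volume path as an argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "./pictrs"
    summary = summarize_dir(root)
    for entry, (count, size) in sorted(
        summary.items(), key=lambda item: item[1][1], reverse=True
    ):
        print(f"{size / 1024 ** 2:10.1f} MB  {count:7d} files  {entry}")
```

Running it against the volume a couple of days apart would show whether the directory keeps growing even when nobody uploads anything, which would fit the cached-thumbnail explanation.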
In case you didn’t see it, the OP of this thread realized they didn’t set up their pictrs API key… so I guess it’s possible to omit that and Lemmy should still work. Not sure about the downsides.
Haha, I don’t know xP.
Just checked and it has only one image.
Did you configure the pictrs API keys for Lemmy and for pictrs?
If they’re not configured then I could see Lemmy not even using pictrs.
Ohh!!
That’s what’s happening. I haven’t uploaded any pictures so I didn’t notice. Aside from that, I’m not sure what the other use cases of pictrs are.
Don’t quote me on it but I think it, besides handling image uploads, caches thumbnails for link posts.