this post was submitted on 11 Jun 2023
142 points (97.3% liked)

Asklemmy

A loosely moderated place to ask open-ended questions


I'm really enjoying Lemmy. I think we've got some growing pains in UI/UX, and we're missing some key features (like community migration and actual redundancy). But how are we going to collectively pay for this? I saw an (unverified) post that Reddit received 400M dollars from ads last year. Lemmy isn't going to be free. Can someone with actual server experience chime in with some back-of-the-napkin math on how expensive it would be if everyone migrated from Reddit?

Sal@mander.xyz 7 points 1 year ago (2 children)

This is what I think, but if anyone understands it differently please correct me.

Vertical scalability refers to scaling within a single instance. As more users join and post more content, the instance needs more disk space to store that content, more network bandwidth to handle many users downloading comments and images at once, and more processing power.

Horizontal scaling refers to the lemmyverse growing through the addition of new instances. The problem with this form of scaling comes from the resources an instance has to spend on its interactions with other instances. You may create a small instance without a lot of users, but it might still need a lot of resources if it retrieves a lot of information (posts, comments, user information, etc.) from other, larger instances. For example, at some point a community on lemmy.ml might be so popular that subscribing to it from a small instance would be too much of a burden, because of the amount of storage required to save the constant stream of new posts. Horizontal scaling becomes a problem when the lemmyverse grows so large that a machine with only a small amount of resources can no longer take part, because its storage fills up in a few hours or days.
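To put the "fills up in a few hours or days" worry into napkin-math form, here is a minimal sketch. The function name and every number in it are my own assumptions for illustration, not measurements from any real instance:

```python
def days_until_full(free_disk_bytes: float,
                    items_per_day: float,
                    avg_bytes_per_item: float) -> float:
    """Estimate how many days of federated content fit in the free disk space.

    All three inputs are hypothetical knobs; plug in your own numbers.
    """
    daily_bytes = items_per_day * avg_bytes_per_item
    if daily_bytes <= 0:
        # Nothing is coming in, so the disk never fills up.
        return float("inf")
    return free_disk_bytes / daily_bytes


# Made-up example: 20 GB free, 50,000 incoming posts/comments per day,
# about 2 KB stored per item (text only, no images).
print(days_until_full(20e9, 50_000, 2_000))  # 200.0
```

The same function gives a much smaller number once you raise the average size per item to account for images or thumbnails, which is where the pressure on a small machine would come from.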

jeremy_sylvis@midwest.social 6 points 1 year ago (1 children)

You can summarize by thinking of vertical scaling as "make the machine bigger / more powerful" and horizontal scaling as "make more machines".

Kind of like building a very large/tall building vs having multiple buildings!

honk@feddit.de 2 points 1 year ago (2 children)

I don't believe this is how it works though.

Let's say your tiny 3-person instance is connected to a big one. I believe it only pulls in content from the communities somebody from the small instance is subscribed to. Correct me if I'm wrong.

panoptic@fedia.io 4 points 1 year ago (1 children)

That's what they're saying.

Essentially - if someone from the small instance subscribes to a community that has a ton of data (huge post volume, images, whatever), the small instance needs to pull that data over from the larger instance. At some point there may be communities so large that small instances can't pull them in without tanking.
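A rough way to picture this: the small instance's incoming volume depends on the set of communities that at least one local user follows, not on how many users it has. Here is a minimal sketch of that idea. The function, the user names, the second community name, and the post rates are all made up for illustration, and this is not how Lemmy itself is implemented:

```python
def inbound_items_per_day(subscriptions: dict[str, set[str]],
                          community_rates: dict[str, float]) -> float:
    """Sum the daily volume of every community followed by at least one local user.

    subscriptions: local user -> set of communities they follow (hypothetical data)
    community_rates: community -> posts/comments per day (hypothetical data)
    """
    # A community is pulled in once, no matter how many local users follow it.
    followed = set().union(*subscriptions.values())
    return sum(community_rates.get(name, 0) for name in followed)


# Hypothetical 3-person instance: one subscription to a huge community
# dominates the total.
rates = {"asklemmy@lemmy.ml": 5000, "tinyhobby@example.social": 20}
subs = {
    "alice": {"tinyhobby@example.social"},
    "bob": {"tinyhobby@example.social"},
    "carol": {"tinyhobby@example.social", "asklemmy@lemmy.ml"},
}
print(inbound_items_per_day(subs, rates))  # 5020
```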

ShadowAether@sh.itjust.works 1 points 1 year ago (1 children)

Could that be solved by caching? Can't the smaller instance avoid some duplication?

panoptic@fedia.io 1 points 1 year ago

If I'm reading the protocol right, it's probably larger instances that will avoid more duplication, since:

  1. There's a higher chance that communities are shared among their users. (For really tiny instances you're probably going to get a lot of overlap, since those people likely have interconnected interests. I expect that overlap would fall off quickly as an instance grows, then converge again at scale.)
  2. The larger number of users will mean they 'use' more of the content they're pulling down (I can't read all of a highly active community in a day, but 1000 people together checking through the day might 'use' it all).

I'm not sure I see where you see caching fitting in.
I am surprised I don't see some kind of lower-resolution digest concept in the protocol (which might be what you're looking for).
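The overlap point in the list above can be turned into a simple number: how many user subscriptions each fetched community serves on average. This is only a sketch with invented data and a hypothetical function name, assuming each community is fetched once per instance regardless of how many local users follow it:

```python
def sharing_factor(subscriptions: dict[str, set[str]]) -> float:
    """Average number of local subscribers per fetched community.

    1.0 means no overlap at all (every fetched community serves one user);
    higher values mean the same fetched content is shared by more users.
    """
    total_subscriptions = sum(len(followed) for followed in subscriptions.values())
    unique_communities = len(set().union(*subscriptions.values()))
    if unique_communities == 0:
        return 0.0
    return total_subscriptions / unique_communities


# Invented example: three users, six subscriptions, three distinct communities.
subs = {
    "alice": {"a", "b"},
    "bob": {"a", "c"},
    "carol": {"a", "b"},
}
print(sharing_factor(subs))  # 2.0
```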

flambonkscious@sh.itjust.works 1 points 1 year ago (1 children)

That's what I've gathered, but I don't believe there's a way for instance owners to limit what's fetched - a user crafts the query and the server does the needful.

I imagine this could amount to a denial of service attack of sorts, if some high-churn communities are imported into tiny instances. How bad that could be, I have no idea - I'm speaking pretty theoretically, here. Text is tiny, after all, so it's probably not much of a concern, since most of the media is actually handled elsewhere...

honk@feddit.de 3 points 1 year ago

I'm not a web developer. I'm sort of a sysadmin, so I have some experience maintaining machines for web apps for other people. And you are right... text will not create massive amounts of data. But a lot of tiny transactions can bring down machines surprisingly fast, even if the total amount of data is relatively small.
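As a back-of-the-napkin illustration of why the count of transactions can matter more than the bytes, here is a small sketch. The fixed per-request overhead, the bandwidth, and the request count are assumptions picked to make the arithmetic easy, not numbers from a real server:

```python
def cost_breakdown_seconds(num_requests: int,
                           payload_bytes_each: int,
                           overhead_ms_per_request: int,
                           bytes_per_second: int) -> tuple[float, float]:
    """Split total handling time into fixed per-request overhead and data transfer.

    Returns (overhead_seconds, transfer_seconds). All inputs are hypothetical.
    """
    overhead_seconds = num_requests * overhead_ms_per_request / 1000
    transfer_seconds = num_requests * payload_bytes_each / bytes_per_second
    return overhead_seconds, transfer_seconds


# Assumed: one million 1 KB requests, 5 ms of fixed work per request
# (parsing, database writes, and so on), 100 MB/s of bandwidth.
overhead, transfer = cost_breakdown_seconds(1_000_000, 1_000, 5, 100_000_000)
print(overhead, transfer)  # 5000.0 10.0
```

Under these made-up numbers the data itself is only about 1 GB and moves in seconds, while the fixed per-request work adds up to far more time.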

I guess we are here to experience it firsthand. I don't think anybody, not even the developers, has a clear idea of how well this will scale. There is only one way to find out lol