this post was submitted on 06 Aug 2023
Fediverse
I'm sure the search problem will be solved somehow. All the content is on each instance, so it's just a case of it being accessible and indexed by Google, I guess?
I'm sure it's already being indexed by Google. But people like to add site filters like site:Reddit.com or site:stackoverflow.com to prevent Google from barfing up a bunch of garbage results on the front page, when they know that's probably where the results they want will be. There is no way to add a Lemmy-wide filter to a Google search, because Lemmy instances are all different sites.
Does it actually matter, though? Lemmy content is replicated by federated servers, so big Lemmy instances such as lemmy.world might have content from smaller federated instances as well. Try using site:lemmy.world next time and see if it improves the search results, though lemmy.world is just 2 months old, so maybe Google hasn't indexed it all.
That's a good point. If you filter by a major site, then it'll have content from all the major communities.
That won't help if you're looking for niche content, but that's not as important.
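As a partial workaround for the missing Lemmy-wide filter, several site: filters can be joined with OR in one query. The sketch below only builds the query string; the function name and the instance list are made up for illustration (the hostnames are just the ones that come up in this thread), and it assumes the search engine accepts OR between site: operators, as Google does.

```python
from typing import Iterable


def build_site_query(terms: str, instances: Iterable[str]) -> str:
    """Append an OR-joined list of site: filters to a search query.

    Hypothetical helper: it only formats a string and does not
    contact any search engine.
    """
    filters = " OR ".join(f"site:{host}" for host in instances)
    return f"{terms} {filters}" if filters else terms


# Illustrative instance list, not a complete or authoritative one.
INSTANCES = ["lemmy.world", "sh.itjust.works", "lemmy.institute"]

print(build_site_query("fediverse search", INSTANCES))
```

This still only covers the instances you list by hand, so it does not help with niche content that lives on instances you have not thought of.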
I wonder how replicated data shows up to the indexer. I don't know enough about search engine indexing or SEO. Will Google index replicated data? Presumably it won't index feeds or searches, only the actual posts, and I wonder whether replicated posts count as posts for the purposes of indexing or whether the indexer will only look at local posts.
Google isn't thrilled with duplicate content. Following this thread here, it sounds like identical content might be hosted on multiple servers? If that is so, it's not going to be high value in Google's eyes.
If it's indexed, you'll be able to search it with Boolean modifiers, but it might not get priority in organic searches.
Yes, contents are replicated across federated instances. For example, here is the link to this thread on my instance: https://lemmy.institute/post/49173
If you check the HTML source there, there is a canonical link in the header that points to https://sh.itjust.works/post/2334723 , which is on the OP's instance. I think Google will respect canonical links when indexing duplicated content, so maybe SEO isn't affected too much?
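For anyone who wants to check the canonical link without reading the page source by hand, here is a minimal sketch using only Python's standard library. The class and function names are made up for illustration, and the sample HTML is a trimmed-down stand-in, not the actual markup a Lemmy instance serves.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalLinkParser(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical one.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated values.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical(html_text: str) -> Optional[str]:
    """Return the canonical URL declared in html_text, or None."""
    parser = CanonicalLinkParser()
    parser.feed(html_text)
    return parser.canonical


# Hypothetical page head for illustration; real Lemmy markup will differ.
SAMPLE_HTML = """
<html><head>
<title>Example thread</title>
<link rel="canonical" href="https://sh.itjust.works/post/2334723">
</head><body></body></html>
"""

print(find_canonical(SAMPLE_HTML))
```

To try it against a live page you would fetch the HTML first (for example with urllib.request) and pass the decoded text to find_canonical. Whether a search engine actually honors the link is a separate question that this check cannot answer.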
Presumably how it should work is that even if content is duplicated, the crawlers would only index the "local" content on Mastodon/Lemmy/etc. servers, so they wouldn't see the duplication.
But idk how it actually works, and we're right back with my original concern of site filters.