this post was submitted on 17 Feb 2024
1088 points (98.7% liked)

Technology

[–] Fake4000@lemmy.world 200 points 9 months ago (8 children)

Shit move from Reddit. Glad I jumped ship to lemmy.

Honestly, lemmy has fewer users than Reddit, yet you still get more engagement.

[–] Boozilla@lemmy.world 159 points 9 months ago (5 children)

I don't miss the dipshits, pun spammers, and smug power mods of reddit at all. I do miss their niche subs and smarter users. Like it or not, they do have some brainy folks peppered among the shit posters.

We have some good folks here, too. Just need more of them.

It's a shame reddit has been dialing up the shit faucet slowly enough that most of their users don't notice how awful it is now. They've grown accustomed to the poor quality of the content and weaponized greed of the owners.

[–] Fake4000@lemmy.world 78 points 9 months ago (2 children)

In all honesty, when I joined Reddit right after Digg went to shit, it was amazing. Reddit was great, 3rd party apps were welcome, their interface was straightforward, and they had none of that NFT gold shit.

It just went downhill.

[–] NotSteve_@lemmy.ca 31 points 9 months ago

At that point, they were also open source which was super cool. I always wanted that profile badge you got for submitting a merged PR.

Reddit really went downhill fast after ~2015. I think Lemmy will get there eventually. I remember reddit being a lot smaller back then as well. It took a while to get to the point where niche communities could thrive and I do believe we'll see that happen here as well (even if it takes a decade or so)

[–] deweydecibel@lemmy.world 24 points 9 months ago* (last edited 9 months ago) (1 children)

"smug power mods of reddit at all."

Oh they're here too. They're not causing too much drama because there's not enough going on, but they're here. Some of them are admins of certain instances.

The ones that aren't here yet will eventually find their way here when Lemmy continues to grow. And the most concerning thing about that is how many more tools Lemmy is providing them to fuck with users.

At least on Reddit, mods couldn't see votes. Lemmy actually just made it easier for them.

[–] Ragnarok314159@sopuli.xyz 21 points 9 months ago (1 children)

I left Reddit. Had over 600k Karma after a few years answering all kinds of questions from Veteran help to complex engineering.

Fuck Reddit. Will never go back. It’s a shell of what it was only a few years ago.

[–] pixxelkick@lemmy.world 75 points 9 months ago* (last edited 9 months ago) (14 children)
  1. Called this a while back, this is why Reddit has such a high valuation.

  2. Poisoning your data won't do anything but give them more data. Do you seriously think reddit servers don't track every edit you make to posts? You'd literally just be providing training data of original human vs poisoned. They'd still have your original post, and they have a copy of every time you edit it.

  3. Whoever buys reddit will have sole access to one of the larger (I don't think largest though) pools of text training data on the internet, with full licensed usage of it. I expect someone like Google, FB, MS, OpenAI, etc would pay big $$$ for that.

"But can't people already scrape it?"

  1. Well yes, but it's at best legally dubious in some places

  2. Scraping data off reddit only gets you current versions of posts (which means you can get poisoned data, and can't see deleted content), and is extremely slow... if you own the server you have first class access to all posts in a database, including the originals and diffs of every time someone edited a post, and all the deleted posts too.

Think about if you perhaps wanted to train an AI to detect posts that require flagging for moderation, if you scrape reddit data, you can't find deleted posts that got moderated...

But, if you have the raw original data, you 100% would have a list of every post that got deleted by mods and even the mod message on why it was deleted

You surely can see the value of such data, that only owners of reddit are currently privy to atm...
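
To make the point above concrete, here is a minimal sketch of what an append-only edit history would give the server owner. Everything in it is an assumption for illustration: the `Post` class, its fields, and the helper names are hypothetical and say nothing about how Reddit actually stores data.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Optional, Tuple


@dataclass
class Post:
    """Hypothetical server-side record: edits are appended, never overwritten."""
    post_id: str
    revisions: List[str] = field(default_factory=list)
    deleted: bool = False
    removal_reason: Optional[str] = None  # mod message, if a moderator removed it

    def edit(self, new_text: str) -> None:
        # The old text stays in the list; only the public view changes.
        self.revisions.append(new_text)

    def public_view(self) -> Optional[str]:
        """What a scraper sees: the latest revision, or nothing if deleted."""
        return None if self.deleted else self.revisions[-1]


def poison_pairs(posts: Iterable[Post]) -> Iterator[Tuple[str, str]]:
    """Yield (original, latest) text for every post whose content changed."""
    for p in posts:
        if len(p.revisions) > 1 and p.revisions[0] != p.revisions[-1]:
            yield p.revisions[0], p.revisions[-1]


def moderation_examples(posts: Iterable[Post]) -> Iterator[Tuple[str, str]]:
    """Yield (original text, mod reason) for posts that moderators removed."""
    for p in posts:
        if p.removal_reason is not None:
            yield p.revisions[0], p.removal_reason


# Tiny demo with made-up posts.
edited = Post("p1", ["Use a torque wrench here."])
edited.edit("banana lamp seventeen")
removed = Post("p2", ["spam spam spam"], deleted=True, removal_reason="Rule 2: spam")
posts = [edited, removed]

print([p.public_view() for p in posts])   # the scraper's view
print(list(poison_pairs(posts)))          # the owner's view of edits
print(list(moderation_examples(posts)))   # the owner's view of removals
```

In this toy model a scraper only ever gets `public_view()`, while whoever holds the database can rebuild both the original-vs-poisoned pairs and the removed posts with their mod reasons.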

[–] DAMunzy@lemmy.dbzer0.com 19 points 9 months ago (2 children)

Poison it by randomly posting copyrighted materials by big corps like Disney?

[–] Buddahriffic@lemmy.world 16 points 9 months ago (9 children)

They've also got vote counts and breakdowns of who is making those votes. This data will be worth more for AI training than any similar volume of data other than maybe the contents of Wikipedia. Assuming they didn't have it set up to delete the vote breakdowns when they archived threads.

Why are those breakdowns worth so much? Because they can be used to build profiles on each voter (including those who only had lurker accounts to vote with), so they can build AIs that know how to speak with the MAGA cult, Republicans who aren't MAGA, liberals, moderates, centrists, socialists, communists, anarchists. Not only that, they'll be able to look at how sentiments about various things changed over time with each of these groups, watch people move from one to another as their opinions evolved, see how someone pretends to be a member of whatever group (assuming they voted honestly and posted under their fake persona).

Oh and also, all of that data is available through the fediverse too, except there it's free to train on for anyone who sets up a server. Which makes me question whether the fediverse is a good thing, because even changing federation to opt-in instead of opt-out just covers whether your server accepts data from another. It's always shared.

Open and private are on opposite sides of a spectrum. You can't have both, best you can do is settle for something in the middle.

[–] prex@aussie.zone 60 points 9 months ago (3 children)

I assume AI is training off the content here for free.

[–] Bishma@discuss.tchncs.de 38 points 9 months ago (3 children)

Yes, but there's no contract to give them legal cover if anyone ever does anything about all the content they steal.

[–] deweydecibel@lemmy.world 27 points 9 months ago* (last edited 9 months ago) (3 children)

And ya know what? Frankly, if AI is going to harvest all this shit, I'd rather fuckers like spez couldn't get rich off it in the process. Granted I'm not happy the tech bros running these AI companies are getting rich with these fucking things, but I can at least take solace that, for Lemmy at least, there isn't some asshole middle man making bank off the work and words of users they never paid a dime to.

Genuinely, why do spez and Reddit deserve to make money off anything we posted? Why does any social media site? They make the site, pay for the servers, maintain the apps, sure, and they can get compensation for that, I don't see a problem there. But why does any social media company deserve to get rich when the only thing that makes their platform valuable is the people that post to it? Reddit didn't even have paid mods, the community did all the work on the content of that site, why in the fuck do we tolerate these assholes making profit off it like this?

[–] OmanMkII@aussie.zone 14 points 9 months ago (4 children)

I was curious if a robots.txt equivalent exists for AI training data, and there were some solid points here:

If I go to your writing, I read it & learn from it. Your writing influences my future writing. We've been okay with this as long as it's not a blatant forgery.

If a computer goes to your writing, it reads it & learns from it. Your writing influences its future writing. It seems we are not okay with this, even if it isn't blatant forgery.

[AI at the moment is] different because the company is re-using your material to create a product they are going to sell. I'm not sure if I believe that is so different than a human employee doing the same thing.

https://news.ycombinator.com/item?id=34324208

I still think we should have the ability to opt out like we do with search engines and webcrawlers, but if the algorithm works ideally and learns but does not recycle content, is it truly any different from a factory of workers pumping out clones of popular series on Amazon? I honestly don't know the answer to that.
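
For reference, this is roughly how the existing search-crawler opt-out works, sketched with Python's standard `urllib.robotparser`. The crawler name `ExampleAIBot` and the URL are made up for illustration, and the whole thing only helps if the crawler chooses to check the file before fetching.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that blocks one (hypothetical) AI crawler and allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/post/123"  # placeholder URL

# A well-behaved crawler asks before fetching; nothing forces it to.
ai_allowed = parser.can_fetch("ExampleAIBot", url)
other_allowed = parser.can_fetch("SomeSearchBot", url)

print("ExampleAIBot may fetch:", ai_allowed)
print("SomeSearchBot may fetch:", other_allowed)
```

It is a voluntary convention rather than a legal or technical barrier, which is why it only partly answers the opt-out question.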

[–] axo@lemmy.world 55 points 9 months ago (2 children)

I barely post on reddit, just lurk but this made me finally sign up for an account here.

[–] Fake4000@lemmy.world 34 points 9 months ago

Welcome to lemmy.

[–] red_pigeon@lemm.ee 45 points 9 months ago* (last edited 9 months ago) (2 children)

I stopped using reddit after they dropped the bomb on the devs and I'm not a fan of the company.

I understand the hatred towards them, but this is definitely expected from a company like reddit, and any other social media for that matter. As users we must be aware that we don't own the content in their platform.

I wouldn't be surprised if the same story comes from Instagram tomorrow, though I suppose there will be a bigger outcry then.

[–] jivandabeast@lemmy.browntown.dev 17 points 9 months ago (3 children)

Honestly, over the last year since the great migration, the discussions on lemmy have really grown and matured to the point where I don't really see the value of reddit anymore

[–] crimroy@sopuli.xyz 13 points 9 months ago

The real value of reddit for me lies in its cache of information contained in answers to questions from over the years. Whenever I'm looking online for a solution to a problem I'm trying to solve I'll eventually add "reddit" to the search and I almost always find the answer that way.

[–] Embarrassingskidmark@lemmy.world 44 points 9 months ago (4 children)

If they build an AI based on reddit content it will be the devil incarnate.

[–] Pinecone@lemmy.world 34 points 9 months ago

If you thought gpt4 was confidently incorrect wait until you see this next ai.

[–] SpaceCowboy@lemmy.ca 14 points 9 months ago

A devil incarnate that makes a lot of puns.

[–] mtchristo@lemm.ee 35 points 9 months ago (6 children)

I bet they can scrape Lemmy content for free then. There are no legal mechanisms to prevent them from doing so.

[–] FiskFisk33@startrek.website 25 points 9 months ago (3 children)

I'd rather the data I've chosen to make public be free and accessible to all than have it sold to the highest bidder.

[–] Trollception@lemmy.world 23 points 9 months ago (1 children)

Yes, but I think reddit is many times more valuable than Lemmy. I just haven't found the same level of very specific subreddits that have lots and lots of activity. Most of the traffic here is memes, politics, news and Linux lovin. On reddit if I needed to find a community about my local town it's no problem and there are tens or hundreds of daily posts. The same community does exist on Lemmy but the last post was 6 months ago.

[–] mellowheat@suppo.fi 30 points 9 months ago* (last edited 9 months ago) (1 children)

Well of course, that's the #1 reason why everyone stopped providing free-to-use APIs last year. Because AI companies were getting all that data for free via those APIs.

[–] JigglypuffSeenFromAbove@lemmy.world 28 points 9 months ago (6 children)

Slightly unrelated question, but is there an easy way to delete all my Reddit posts and comments? I used the Nuke add-on in the past, but it doesn't work anymore.

I wanna delete my Reddit account, but I'd prefer to erase my history before doing that.

[–] FeelThePower@lemmy.dbzer0.com 28 points 9 months ago* (last edited 9 months ago)

Back when I made my Lemmy account I used a tool called redact to mass-edit my Reddit comments into gibberish, and then after a few days of making sure it got them all, I deleted them all and then my account.

[–] Morcyphr@lemmy.one 27 points 9 months ago (1 children)

Who cares? Fuck reddit. Half the content is bots anyway. So, bots stealing content to train AI to make content, which the bots will steal and repost. Circle of death for reddit. Good luck with that IPO.

[–] COASTER1921@lemmy.ml 27 points 9 months ago (1 children)

If they hadn't applied the same charges to legitimate 3rd party applications they could still do this and have avoided the massive community backlash.

Considering their horrible track record with advertising and selling Reddit premium this should be the single best way for them to finally monetize their platform. They didn't need to destroy what little credibility they had remaining to their users to get to this point, but for whatever reason they did.

[–] Fake4000@lemmy.world 13 points 9 months ago (1 children)

What I don’t understand is that they had the option of providing a free service to all third party apps provided there was no commercial use.

They could have easily asked for a cut from any AI company using their data for training.

[–] v4ld1z@lemmy.zip 26 points 9 months ago* (last edited 9 months ago) (1 children)

I just Googled my reddit handle and it's appalling that I found websites on the internet that archived a bunch of my posts on there, including pictures I posted. I'm not sure what I expected, but it's still kinda annoying, even though I edited and then deleted my comments and deleted my entire account.

[–] Lojcs@lemm.ee 14 points 9 months ago

That's been an issue for a long time. Fake "blogs" made of scraped reddit posts.

[–] darko8472 26 points 9 months ago (1 children)

Glad I deleted all of my content over there, then.

[–] echo64@lemmy.world 29 points 9 months ago (1 children)

This may shock you, but it's not deleted.

[–] Fake4000@lemmy.world 17 points 9 months ago (5 children)

Yeah. There was this guy who deleted his account but Reddit restored it. Apparently he was going to take them to court based on some GDPR article.

[–] xantoxis@lemmy.world 21 points 9 months ago (2 children)

Damn. I keep meaning to use one of those things that deletes all your reddit data. I doubt it'll actually do anything (reddit has no ethical framework so they won't think twice about indexing "deleted" data) but I still need to do that.

[–] ipkpjersi@lemmy.ml 23 points 9 months ago (2 children)

I'd bet a year of my salary that it only deletes it from public view so people can no longer get help from Reddit's Google search results, but a copy (or more than one copy) is still retained on their internal servers.

[–] Dettweiler42@lemm.ee 31 points 9 months ago (5 children)

The trick is to turn everything into randomized garbage and then delete it later. A lot of those purge services offer that feature. It just swaps the words with others; so on the surface it looks like proper written text, but it makes absolutely no sense.

Aside from removing your content that they're profiting from, it also feeds AI scrapers pure garbage in the event that your content is restored.
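
A rough sketch of the word-swap idea, for anyone curious what such a tool might do under the hood. This is a guess at the general technique, not the code of any actual purge service: the `scramble` function and the filler vocabulary are made up, and unlike the tools described it makes no attempt to keep the result grammatical.

```python
import random
from typing import Optional, Sequence

# Made-up filler vocabulary; a real tool would presumably use a larger word list.
FILLER_WORDS = ["lamp", "seventeen", "quietly", "harbor", "purple", "running", "cheese"]


def scramble(text: str, vocabulary: Sequence[str] = FILLER_WORDS,
             seed: Optional[int] = None) -> str:
    """Replace every whitespace-separated word with a random vocabulary word.

    The word count is preserved so the result has the same rough shape as
    the original comment, but none of the meaning.
    """
    rng = random.Random(seed)
    return " ".join(rng.choice(vocabulary) for _ in text.split())


original = "The fix is to update the driver and reboot."
garbled = scramble(original, seed=42)
print(garbled)
```

As the other comments point out, this only changes the public version; it does nothing about any revision history kept server-side.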

[–] HonorIsDead@lemmy.world 13 points 9 months ago (2 children)

Maybe I'm misremembering, but weren't they restoring stuff users deleted during the API protest?

[–] philodendron@lemdro.id 17 points 9 months ago

They were. One user got so upset he live-streamed himself individually deleting every post and comment he’d ever made. Reddit restored it all right after.

[–] Alpha71@lemmy.world 20 points 9 months ago (1 children)

Yeah, I deleted a banned account only to find the posts I made still up. So I went in and manually deleted EVERY. SINGLE. ONE.

Guess what. They still show up.

[–] bbkpr@lemmy.world 18 points 9 months ago

Good, so let's train crappy AI on posts by crappier AI, which was trained by posts from even crappier AI before it.

[–] 13esq@lemmy.world 18 points 9 months ago (1 children)

If you're not paying for the product, you are the product.

[–] ME5SENGER_24@lemmy.world 15 points 9 months ago (2 children)

FUCK REDDIT! FUCK U/SPEZ! The Red-exit shall endure, VIVA LA LEMMY!!

[–] bobs_monkey@lemm.ee 28 points 9 months ago* (last edited 9 months ago) (3 children)

Just because the coffee is free doesn't mean you have to drink the entire carafe

[–] dual_sport_dork@lemmy.world 19 points 9 months ago (9 children)

Yes it does. I'll get bullet-time superpowers eventually, just watch...

[–] db2@lemmy.world 13 points 9 months ago

Greedy little pigboy Steve couldn't resist. Every day they seem to do something that reaffirms leaving was the best plan.
