Feels like AI creators can only get away with using pre-2022 data for so long. At some point the information will be outdated and they'll have to train on newer data, and it'll be interesting to see if this is a problem that can be solved without harming the dataset's quality.
My guess is they'd need an AI that tries to find blatantly AI-generated data and take it out of the dataset. It won't be 100% accurate, but it'll be better than nothing. A rough sketch of that filtering pass is below.
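As a minimal sketch of what that filtering step could look like: the detector itself is the hard part, so here it's just an assumed callable that returns the probability a document is machine-generated, with a trivial placeholder standing in for a real trained classifier.

```python
from typing import Callable

def filter_corpus(
    documents: list[str],
    ai_score: Callable[[str], float],  # assumed detector: P(text is AI-generated)
    threshold: float = 0.9,
) -> list[str]:
    """Drop documents the detector flags as likely AI-generated.

    A high threshold removes only the blatant cases, matching the
    point above that detection won't be 100% accurate.
    """
    return [doc for doc in documents if ai_score(doc) < threshold]

# Usage with a trivial placeholder detector (a real one would be a
# trained classifier, which is exactly the hard part):
corpus = ["a human-written paragraph", "As an AI language model, I cannot..."]
dummy_detector = lambda text: 1.0 if "As an AI language model" in text else 0.0
print(filter_corpus(corpus, dummy_detector))
```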
I'm surprised these models don't have something like a "ground truth layer" by now.
Given that ChatGPT, for example, is completely unspecialized, I would have expected there to be a relatively straightforward way to hand-encode axiomatic knowledge, like specialized domain knowledge or even just basic math. Even tiered data (i.e. more/less trusted sources) seems not to be part of the design; one way that could look is sketched below.
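One common-sense reading of "tiered data" is biasing how often each source is sampled during training by how much it's trusted. The tier names and weights here are invented for illustration; nothing public confirms any model is built this way.

```python
import random

# Hypothetical trust tiers -- names and weights are made up for illustration.
TIER_WEIGHTS = {
    "peer_reviewed": 5.0,     # most trusted, oversampled
    "reference_work": 3.0,
    "general_web": 1.0,
    "unverified_forum": 0.2,  # least trusted, undersampled
}

def sample_document(documents: list[tuple[str, str]]) -> str:
    """Pick a training document, biased toward higher-trust tiers.

    `documents` is a list of (text, tier) pairs.
    """
    weights = [TIER_WEIGHTS[tier] for _, tier in documents]
    text, _ = random.choices(documents, weights=weights, k=1)[0]
    return text

docs = [("E = mc^2 ...", "peer_reviewed"), ("lol trust me bro", "unverified_forum")]
print(sample_document(docs))
```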
I think this is something that's easier said than done. Maybe at our current level, but as these AIs get more advanced... what is truth? Sure, mathematics seems like an easy target, until we consider that one of the best use cases for AI could be theory. An AI could have a fresh take on our interpretation of mathematics, and those base-level assumptions would actually be a hindrance there.
Because it's not designed to be a knowledge base; it's designed to imitate human communication. It's the same reason ChatGPT can't do maths: it doesn't "know" anything, it just predicts the most likely word (or bit of a word) to come next. ChatGPT being as good as it is at, say, writing code from a natural-language prompt is sort of just a happy accident, but people now expect that to be its primary function.
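That "predict the next bit of a word" loop is easy to see directly. A minimal sketch using GPT-2 via the Hugging Face transformers library, with greedy decoding (always take the single most likely token; production chat models sample instead, but the mechanism is the same):

```python
# Greedy next-token loop: at each step the model scores every token in
# its vocabulary and we take the argmax -- there is no "knowing", only
# a conditional probability distribution over the next token.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tokenizer("The capital of France is", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(5):
        logits = model(ids).logits          # a score for every vocab token
        next_id = logits[0, -1].argmax()    # greedy: the single most likely token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(ids[0]))
```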