this post was submitted on 28 Jul 2023
People still tap into the real world, while AI does not do that yet. Once AI is able to actively learn from real-world sensors, the problem might disappear, no?
They already do. Where do you think the training corpus comes from? The real world. It's curated by humans and then fed to the ML system.
Problem is that the real world now has a bunch of text generated by AI. And it has been well studied that feeding that back into training degrades the model (because the networks would then effectively be trained to predict their own output, which just doesn't make sense).
So humans still need to filter that stuff out of the training corpus. But we can't reliably tell which texts are human-written and which are AI-generated, and neither can a machine. So there's no way to do this properly.
The data almost always comes from the real world, except now the real world also contains "harmful" (to AI) data that we can't figure out how to find and remove.
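To make the "trained to predict their own output" point concrete, here is a toy sketch. It is not how an LLM is trained, just a stand-in: the "model" is a Gaussian fitted to some samples, and each new generation is fitted only to samples drawn from the previous generation's fit. The function name `simulate_self_training` and its parameters are made up for this illustration.

```python
import random
import statistics


def simulate_self_training(generations=300, sample_size=10, seed=0):
    """Toy feedback loop: refit a Gaussian to its own samples, repeatedly.

    Hypothetical illustration only. Returns the fitted standard deviation
    after each generation, starting with the "real world" value.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # assumed "real world" distribution
    history = [sigma]
    for _ in range(generations):
        # Generate a training set from the current model only.
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        # "Train" the next model on that synthetic data.
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)
        history.append(sigma)
    return history


if __name__ == "__main__":
    history = simulate_self_training()
    for generation in (0, 50, 100, 200, 300):
        print(generation, history[generation])
```

In runs of this toy the fitted spread typically shrinks toward zero over the generations: every refit loses a little of the original variety and nothing ever puts it back, since no fresh real-world data enters the loop. Mixing real samples back in at each step slows that down, which is why being unable to tell real from generated text is the actual problem.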