this post was submitted on 21 Nov 2024
153 points (97.5% liked)


Today, a prominent child safety organization, Thorn, in partnership with a leading cloud-based AI solutions provider, Hive, announced the release of an AI model designed to flag unknown CSAM at upload. It is the first AI technology aiming to expose unreported CSAM at scale.

[–] floofloof@lemmy.ca 13 points 2 days ago (5 children)

This seems like a potentially genuine good use of AI. Can't have been much fun to train it, though.

And is there any risk of people turning these kinds of models around and using them to generate images?

[–] Jimbabwe@lemmy.world 24 points 2 days ago (3 children)

If AI were reliable, maybe. MAYBE. But guess what? It turns out that "advanced autocomplete" does a shitty job at most things, and I bet the false positives will be numerous.

[–] Chozo@fedia.io 11 points 2 days ago

This is not that kind of AI.

It's possible to build a good AI system, but it takes millions of dollars and thousands of man-hours, and most companies won't put in the effort.

But there should always be a human in the loop.

[–] AwesomeLowlander@sh.itjust.works 1 points 2 days ago (2 children)

"detect new or previously unreported CSAM and child sexual exploitation behavior (CSE), generating a risk score to make human decisions easier and faster."

False positives don't matter if they stick to the stated intended purpose of making manual CSAM detection easier and faster.
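
For what it's worth, here's a minimal sketch of what that kind of triage could look like (all names here are hypothetical; `score_image` is a stand-in, not Thorn's or Hive's actual API). The point is that the score only decides what a human reviews first, never the outcome:

```python
# Hypothetical sketch of a risk-score triage queue: the classifier never
# makes the final call, it only prioritizes what a human reviews first.
from dataclasses import dataclass, field
import heapq

def score_image(image_bytes: bytes) -> float:
    """Stand-in for the real classifier: returns a risk score in [0, 1]."""
    return 0.0  # placeholder; a real model would run inference here

@dataclass(order=True)
class ReviewItem:
    priority: float                       # negated score: highest risk pops first
    upload_id: str = field(compare=False)

review_queue: list[ReviewItem] = []

def handle_upload(upload_id: str, image_bytes: bytes) -> None:
    risk = score_image(image_bytes)
    # Every flagged upload still goes to a person; the score only orders it.
    heapq.heappush(review_queue, ReviewItem(-risk, upload_id))

def next_for_human_review():
    return heapq.heappop(review_queue).upload_id if review_queue else None
```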

[–] spankmonkey@lemmy.world 10 points 2 days ago

"if they stick to the stated intended purpose"

They never do.

[–] Voroxpete@sh.itjust.works 11 points 2 days ago* (last edited 2 days ago) (1 children)

The problem is that they won't.

Yes, AI tools, in the hands of skilled people, can be very helpful.

But "AI" in capitalism doesn't mean "more effective workers", it means "fewer workers." The issue isn't technological so much as cultural. You fundamentally cannot convince an MBA not to try to automate away jobs.

(It's not even a money thing; it's about getting rid of all those pesky "workers' rights" that workers like to bring with us.)

Here's the thing: this technology is unequivocally one of the things AI would be genuinely useful for. It can potentially do a lot of good. Yes, MBAs could screw it up like they screw up everything else in society, but that doesn't mean we shouldn't be glad we've created this new tech.

[–] Hoimo@ani.social 3 points 1 day ago

Available image generators are already capable of generating those images, and they weren't even trained on such material. Once a neural network can detect or generate two separate concepts, it can detect or generate the overlap. It won't be as fine-tuned, obviously, but the results can still turn out scarily accurate.

[–] FaceDeer@fedia.io 14 points 2 days ago

"And is there any risk of people turning these kinds of models around and using them to generate images?"

There isn't really much fundamental difference between an image detector and an image generator. Image generators like Stable Diffusion work essentially by producing a starting image that's nothing but random static and telling the generator, "find the cat that's hidden in this noise."

It would probably take some work to rig this CSAM detector up to generate images, but I can definitely imagine it happening. It's going to make an already complicated philosophical debate even more complicated.
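
A toy sketch of that denoising loop (this is not Stable Diffusion's actual code; `TinyDenoiser` is an untrained stand-in for the real noise-prediction network, which is what would make the "hidden cat" actually appear):

```python
# Toy reverse-diffusion loop: start from pure static and repeatedly ask a
# denoiser "what noise is hiding the image?", peeling a little away each step.
import torch
import torch.nn as nn

class TinyDenoiser(nn.Module):
    """Untrained stand-in; a real model is a U-Net trained on huge datasets."""
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, x):
        return self.net(x)  # predicts the noise present in x

denoiser = TinyDenoiser()
steps = 50
x = torch.randn(1, 3, 64, 64)  # "nothing but random static"

with torch.no_grad():
    for t in range(steps):
        predicted_noise = denoiser(x)
        x = x - (1.0 / steps) * predicted_noise  # subtract a bit of noise

# With a *trained* model, x converges toward an image the network
# "believes" was hidden in the original static.
```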

[–] mspencer712@programming.dev 8 points 2 days ago

I think image generators in general work by iteratively changing random noise and checking it with a classifier, until the classifier reports a stronger and stronger match for "cat" or "best quality" or "realistic".

If this classifier provides fine-grained descriptive attributes, that's a nightmare. If it just detects yes or no, that's probably fine.
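
A rough sketch of what "turning the classifier around" could look like via gradient ascent (toy code: the untrained classifier here is a stand-in, so the output is meaningless; it only shows the mechanism, and a real attack would need far more machinery):

```python
# Gradient-ascent sketch: hold the classifier fixed and optimize the *image*
# so the classifier's "match" score climbs.
import torch
import torch.nn as nn

classifier = nn.Sequential(            # stand-in for the real detector
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(8, 1),                   # single logit: "match" vs "no match"
)

x = torch.randn(1, 3, 64, 64, requires_grad=True)  # start from noise
optimizer = torch.optim.Adam([x], lr=0.05)

for step in range(200):
    optimizer.zero_grad()
    score = classifier(x).squeeze()
    (-score).backward()   # ascend the score by descending its negative
    optimizer.step()

# A fine-grained classifier (many descriptive attributes) gives the optimizer
# far more signal to steer with than a single yes/no logit does.
```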

[–] catloaf@lemm.ee 7 points 2 days ago

Nobody would have been looking directly at the source data. The FBI or whoever provides the dataset to approved groups; after that, you just say "use all the images in this folder" and it goes. But I don't even know whether they actually provide real full-resolution images, perceptual hashes, or downsampled images.

And while it's possible to use the dataset to generate new images, assuming the training data had full-resolution images, like I said, they investigate the people making the request before allowing access. And access is probably supervised and audited.
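
For reference, matching by perceptual hash looks roughly like this (a sketch using the open-source imagehash library; the filenames are placeholders):

```python
# Perceptual-hash matching sketch, the kind of thing PhotoDNA-style systems
# do: hashes of visually similar images differ in only a few bits, so you can
# match against known-bad hashes without ever distributing the original
# full-resolution images.
import imagehash
from PIL import Image

THRESHOLD = 8  # max Hamming distance to call two images "the same"

# Hypothetical filename; in practice only the *hashes* would be shared.
known_bad = [imagehash.phash(Image.open("known_image.png"))]

def looks_known(path: str) -> bool:
    h = imagehash.phash(Image.open(path))
    return any(h - bad <= THRESHOLD for bad in known_bad)

print(looks_known("upload.png"))  # hypothetical upload
```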