With the endless ethical and legal issues around GenAI, I would very much hope that Valve continues being cautious (even if it's evidently just to cover their own arses). Once we have models and datasets for AI generated game assets that are trained from entirely ethical sources (artist permissions, licences, etc.) and not just the "scrape everything and train our models from that" approach that is currently used, then maybe it could be a good thing for games. Even still, the generated assets will likely have no copyright (as is the case now), so we'll surely end up at "AI generated content flip games" flooding Steam.
Honestly I don't understand how there could be a copyright issue. "Training" your brain on copyrighted works of art and making similar art does not violate copyright, so why should an AI doing the same be considered copyright infringement?
Because what you refer to as "training your brain" is very different from training an AI, which is basically photobashing, to the point of including watermarks from images they stole while scraping content they don't have the license to use.
I'm not sure about the photobashing thing.
If I remember correctly, a generated image starts from noise and the AI refines that noise to form shapes. When watermarks show up, it's not because it's bashing the original images together, but because it learnt to put a watermark on an image.
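To make the "starts from noise" idea a bit more concrete, here's a toy sketch in Python. It is not how a real diffusion model works (those use a neural network to predict and remove noise); the 4-pixel "images", the `learn_pixel_means` and `generate` functions, and all the numbers are made up for illustration. The last pixel stands in for a watermark that appears in every training image.

```python
import random

# Hypothetical toy "training set": each image is just 4 pixel values.
# The last pixel stands in for a watermark present in every image.
TRAINING_IMAGES = [
    [0.1, 0.9, 0.3, 1.0],
    [0.8, 0.2, 0.5, 1.0],
    [0.4, 0.6, 0.7, 1.0],
]

def learn_pixel_means(images):
    """'Training': keep only per-pixel averages, not the images themselves."""
    count = len(images)
    return [sum(column) / count for column in zip(*images)]

def generate(learned, steps=40, step_size=0.2, seed=0):
    """Start from random noise and repeatedly nudge it toward what was learned."""
    rng = random.Random(seed)
    image = [rng.gauss(0.0, 1.0) for _ in learned]
    for _ in range(steps):
        image = [px + step_size * (target - px)
                 for px, target in zip(image, learned)]
    return image

learned = learn_pixel_means(TRAINING_IMAGES)
result = generate(learned)
print(result)
```

In this toy, the output ends up with the "watermark" pixel switched on because every training image had it, even though the output is not a copy of any single training image. Whether real models behave this cleanly is exactly what's being argued about here.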
I wish it was like that. While it's "true" that they "should" be starting purely from noise and forming the image from what they have learnt, the reality is that they end up using big chunks of pieces that are regularly found online.
A properly trained AI should not be able to do that. The data AI stores is not images or fragments of images. It's a set of weights for various attributes for each term. Like the concept of "cat" would be stored as the set of most common values for various attributes it analyzed and found to match on all training images labeled as "cat". With thousands of cat images as input, having proper variations between them, the result will always be unique.
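As a rough illustration of "weights per term, not stored images", here's a minimal Python sketch. The attribute names, the example values, and the `train` function are all hypothetical, and a real model has millions of learned weights rather than one average per label, so treat this as an analogy only.

```python
# Hypothetical labelled examples:
# (label, [ear_pointiness, whisker_length, tail_length])
EXAMPLES = [
    ("cat", [0.9, 0.8, 0.6]),
    ("cat", [0.7, 0.6, 0.8]),
    ("cat", [0.8, 0.7, 0.4]),
    ("dog", [0.3, 0.2, 0.5]),
    ("dog", [0.5, 0.4, 0.7]),
]

def train(examples):
    """Reduce all examples for a label to one vector of average attribute values."""
    sums = {}
    counts = {}
    for label, attrs in examples:
        if label not in sums:
            sums[label] = [0.0] * len(attrs)
            counts[label] = 0
        sums[label] = [s + a for s, a in zip(sums[label], attrs)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}

weights = train(EXAMPLES)
print(weights["cat"])
```

After "training", only one vector per label is kept, and the "cat" vector does not match any individual cat example. This sketch says nothing about whether a real, much larger model can still memorise specific images.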
It's the same as a child learning to draw. If they see a drawing of a cat, they might try to copy that as best they can. But if they see many different representations of a cat then they will also learn to express themselves creatively and make up their own variation. And nobody is going to sue the kid for having looked at copyrighted pictures of cats.
Pretty sure you could use your brain to put a Getty Images watermark on an image. You're probably already imagining it now. That argument doesn't hold water.
If the training data contains a lot of copyrighted material, then the AI could end up trained to favor results that include parts that resemble copyrighted material.
How the copyrighted material was acquired could matter. If scraped with a valid API from social media, odds are the social media company claims a license to redistribute any content uploaded to it. But what if a lot of that content was uploaded illegally to begin with? The AI company could be unwittingly paying for stolen art. Or what if the AI company buys a curated collection of training data a different group put together? Odds are that may include copyrighted work, and the company selling it is unlikely to be licensed to do so…
A third issue is the intent of copyright. There is nothing natural or real about copyright. It is a concept we invented to aid creators so they can profit from their work and continue to produce more content which then benefits society. If AI art threatens that system, a court might decide protected art cannot be included in training data in order to maintain that intent.