The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates

FatCat@lemmy.world · 3 months ago


vrighter@discuss.tchncs.de · 3 months ago

except that it can, and regularly does, regurgitate copyrighted works verbatim.

Cyyy@lemmy.world · 3 months ago

No, it doesn’t. I tried to achieve this multiple times myself and it never worked. And in the cases where journalists say it did, they needed to ask many times, in a highly specific way, until they got a short snippet. ChatGPT doesn’t spit out the exact same phrases over and over again if you ask the same thing. It has a parameter (usually called temperature) defining how “random” and “far away from the perfect next predicted text” the output is, and by default it is set so that answers to the same prompt generally differ. Otherwise it wouldn’t be chat-like, but more like a simple database always spitting out the same answer to the same question. That’s not how ChatGPT works.
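To make the “randomness variable” concrete, here is a minimal sketch of temperature sampling over next-token scores. It is a toy illustration, not ChatGPT’s actual decoding code: the function name `sample_with_temperature`, the three-word vocabulary, and the scores are all made up for the example.

```python
import math
import random


def sample_with_temperature(logits, temperature, rng):
    """Pick one token index from raw scores (logits).

    Hypothetical helper for illustration only.
    temperature <= 0 is treated as greedy decoding (always the top score).
    """
    if temperature <= 0:
        # Greedy: deterministic, same input always gives the same token.
        return max(range(len(logits)), key=lambda i: logits[i])

    # Divide scores by the temperature: a higher temperature flattens the
    # distribution, a lower one sharpens it around the best token.
    scaled = [score / temperature for score in logits]

    # Softmax weights, shifted by the max for numerical stability.
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]

    # Draw one index in proportion to its weight.
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


# Assumed toy vocabulary and scores for the next token.
vocab = ["cat", "dog", "car"]
logits = [2.0, 1.0, 0.5]
rng = random.Random(0)

greedy = {vocab[sample_with_temperature(logits, 0.0, rng)] for _ in range(200)}
sampled = {vocab[sample_with_temperature(logits, 1.0, rng)] for _ in range(200)}
print(greedy)   # a single token, every time
print(sampled)  # several different tokens across runs
```

With the temperature at zero the same prompt always yields the same token, which is the “database-like” behaviour described above. With a positive temperature the output varies, though the most likely token is still picked most often.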

sugar_in_your_tea@sh.itjust.works · 3 months ago

The problem isn’t that it does it regularly, but that it can do it at all, meaning that the copyrighted works are reproducible, regardless of how much the interface tries to hide that. That means the model isn’t really “learning” the same way a human would in any capacity (that should be obvious). Instead, it’s storing data in a way that would fall outside fair use, and it could generate copyright-violating portions of works.

Humans read and don’t retain the originals. The argument is that LLMs retain the originals, and that’s where the issue lies.