@TechLich

TechLich@lemmy.world · 2 months ago

Not entirely true. You don’t need your own personal data centre, you can use GPU cloud instances for a lot of that stuff. It’s expensive but not so expensive that it would be impossible without being a huge tech company (only 1000s of dollars, not billions). This can be done by anyone with a credit card and some cash to burn. Also, you don’t need to train a model from scratch, you can build on existing models that others have published to cut down on training.

However, to impersonate someone’s voice you don’t need any of that. You only need about 5-10 seconds of audio for a zero-shot impersonation with a pre-trained model. A minute or so for few-shot. This runs on consumer hardware and in some cases even in real time.

Even to build your own model from scratch for high quality voice audio, there doesn’t need to be a huge amount of initial training data. Something like xtts was trained with about 10-15K hours of English audio which is actually pretty easy to come by in the public domain. There are a lot of open and public research datasets specifically for this kind of thing, no copyright infringements necessary. If a big tech company wants more audio data than what’s publicly available, they just pay people to record audio, no need to steal it or risk copyright claims and breaking surveillance laws, they have a budget to exploit people to record whatever they want.

This tech wasn’t invented by some evil giant tech company stealing everybody’s data, it was mostly geeky computer scientists presenting things at computer speech synthesis conferences. That’s not to say there aren’t a bunch of huge evil tech companies profiting from this or contributing to this kind of tech, but in the context of audio deepfakes being accessible to scammers, it’s not on them and I don’t think that some kind of extra copyright regulation on data centres would do anything about it.

The current industry leader in this space in terms of companies trying to monetize speech synthesis is elevenlabs which is a private start-up with only a few dozen employees.

The current tech is not perfect but definitely good enough to fool someone who isn’t thinking too hard over a noisy phone call and a scammer doesn’t need server time or access to a data centre to do it.

TechLich@lemmy.world · edit-2 2 months ago

One thing you gotta remember when dealing with that kind of situation is that Claude and Chat etc. are often misaligned with what your goals are.

They aren’t really chat bots, they’re just pretending to be. LLMs are fundamentally completion engines. So it’s not really a chat with an ai that can help solve your problem, instead, the LLM is given the equivalent of “here is a chat log between a helpful ai assistant and a user. What do you think the assistant would say next?”

That means that context is everything and if you tell the ai that it’s wrong, it might correct itself the first couple of times but, after a few mistakes, the most likely response will be another wrong answer that needs another correction. Not because the ai doesn’t know the correct answer or how to write good code, but because it’s completing a chat log between a user and a foolish ai that makes mistakes.
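To make the “completing a chat log” idea concrete, here’s a rough Python sketch of how a conversation might get flattened into one block of text before the model sees it. The `render_chat` helper and the plain `User:` / `Assistant:` layout are made up for illustration; real models each use their own templates and special tokens.

```python
# Hypothetical chat template, for illustration only.
# Real chat models use their own formats and special tokens.
def render_chat(messages, system="Here is a chat log between a helpful AI assistant and a user."):
    lines = [system, ""]
    for role, text in messages:
        lines.append(f"{role.capitalize()}: {text}")
    # The prompt ends on an open assistant turn: this is what gets "completed".
    lines.append("Assistant:")
    return "\n".join(lines)


history = [
    ("user", "Write a function that reverses a string."),
    ("assistant", "def rev(s): return s.sort()"),  # an earlier wrong answer
    ("user", "That's wrong, try again."),
]

prompt = render_chat(history)
print(prompt)
```

Every earlier wrong answer stays in `prompt`, so the text the model is asked to continue really is a log of an assistant that keeps getting things wrong.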

It’s easy to get into a degenerate state where the code gets progressively dumber as the conversation goes on. The best solution is to rewrite the assistant’s answers directly but chat doesn’t let you do that for safety reasons. It’s too easy to jailbreak if you can control the full context.

The next best thing is to kill the context and ask about the same thing again in a fresh one. When the ai gets it right, praise it and tell it that it’s an excellent professional programmer that is doing a great job. It’ll then be more likely to give correct answers because now it’s completing a conversation with a pro.

There’s a kind of weird art to prompt engineering because open ai and the like have sunk billions of dollars into trying to make them act as much like a “helpful ai assistant” as they can. So sometimes you have to sorta lean into that to get the best results.

It’s really easy to get tricked into treating it like a normal conversation with a person when it’s actually really… not normal.

TechLich@lemmy.world · 2 months ago

It’s a bit late for that. This particular nuclear reactor is open source, free to download and runs on consumer hardware. Can’t really unfry that egg and the quality is getting better all the time. Identity fraud is already illegal in most places so not sure exactly what regulation would be appropriate here.

TechLich@lemmy.world · 2 months ago

“No worries!” means “Yes, that’s fine, there is nothing to worry about.”

He thought it meant “No! You should worry about that!”

TechLich@lemmy.world · 2 months ago

This is a really interesting cultural one that always kinda surprises me.

Where I am, cooking has always been a very masculine thing. Cutting up meat with sharp knives, setting things on fire, etc. The chef industry here is very male dominated and men cook together as a social thing when hanging out. In most families I encounter, the dad does most of the cooking with the exception of maybe baking? It’s weird to hear that it would ever be thought of as insufficiently masculine.

In fact I think it would be seen as maybe a bit embarrassing/weak if you were a man who couldn’t cook.

TechLich@lemmy.world · 4 months ago

Friendship drive charging…

TechLich@lemmy.world · 4 months ago

I feel like this is a cultural thing because that sounds wild to me.

The penalty for burglary where I am is not death, nor am I a judge or executioner.

We’ve been broken into a lot and it’s usually just some poor asshole who wants to steal things to buy meth. It’s horrible and scary and feels like a massive violation but shooting someone in that scenario just feels like straight up murder.

TechLich@lemmy.world · 4 months ago

Not American, or really knowledgeable about it but from the outside, I think this looks like ordinary politicking.

IVF is a proxy war for abortion. Dems want the talking point that abortion bans hurt/block IVF. Republicans/Trump want to remove that talking point by saying they love IVF “we want more babies right?” and will support laws to protect it as a separate and unrelated issue to abortion.

Dems put forward a bill that not only protects it but makes insurance companies pay for it. Trump is fine with that because it benefits him but Republicans in Congress get big money from insurance lobbyists and so they can’t vote for it. They also have fears that they’ll piss off their homophobic supporters by making them pay for something the gays might use (“insurance costs will go up to help someone who isn’t me!”).

Republicans put forward another bill that protects IVF without hurting their insurance company buddies but the Dems block it. Republicans then have to vote against the IVF bill and the Dems can now say “see! They really don’t care about reproductive rights at all!”

Feels a bit like nobody involved actually cares about IVF at all and just wants votes and lobbyist money.

In case this take comes across too centrist: Republicans and Trump are really quite shit.

TechLich@lemmy.world · 5 months ago

You forgot interrobang‽ The most important and incredulous reason for a compose key.

TechLich@lemmy.world · edit-2 5 months ago

Hmmm…

That looks pretty paywally to me. That said, I’m all for people supporting independent media.

TechLich@lemmy.world · 5 months ago

This is called “semantic satiation” which are both pleasingly weird words now that I think about it…

TechLich@lemmy.world · edit-2 5 months ago

I don’t think that’s how it works? It’s the client application that has the key for the end to end encryption, not the server. I don’t think you need to trust the Matrix server you use? I could be wrong, I don’t know Matrix particularly well.

TechLich@lemmy.world · 5 months ago

Yeah, that’s fair enough, though I’m not sure it’s very different from malicious instances creating normal user accounts?

You can see when users from an instance are all suspiciously voting the same way at the same time regardless of whether they are usernames or IDs.
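As a rough idea of what spotting that could look like, here’s a small Python sketch that buckets votes by instance, target, direction and time window, then flags buckets with several distinct voters. The tuple layout, the `lockstep_groups` name and the thresholds are all assumptions for the example, not anything Lemmy actually does. Note that it only ever looks at an opaque voter ID, so it works the same with usernames or pseudonymous IDs.

```python
from collections import defaultdict


def lockstep_groups(votes, window_seconds=60, min_voters=3):
    """Flag groups of voters from one instance voting the same way at the same time.

    votes: iterable of (voter_id, instance, target, direction, unix_time).
    voter_id can be a username or any opaque ID; it is only compared for equality.
    """
    buckets = defaultdict(set)
    for voter_id, instance, target, direction, ts in votes:
        # Fixed time buckets: simple, but a burst straddling a boundary gets split.
        key = (instance, target, direction, ts // window_seconds)
        buckets[key].add(voter_id)
    return {key: voters for key, voters in buckets.items() if len(voters) >= min_voters}


votes = [
    ("a1", "spam.example", "post/1", -1, 1000),
    ("a2", "spam.example", "post/1", -1, 1010),
    ("a3", "spam.example", "post/1", -1, 1015),
    ("b1", "lemmy.world", "post/1", +1, 1012),
]

flagged = lockstep_groups(votes)
print(flagged)
```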

There’s lots of legitimate users that only vote but never post so doing it based on that doesn’t seem very effective?

The second problem is solved using public key cryptography, the same way that you can’t impersonate someone else’s username to post comments. Votes and comments are digitally signed (There would need to be a different public key for voting to maintain pseudonymity though).

TechLich@lemmy.world · 5 months ago

How about pseudonymous as a compromise? Votes could be publicly federated but tied to some uuid instead of the username. That way you still have the same anti spam ability (can see that a user upvoted these things from this instance at this time) but can’t tie it directly to comments or actual user accounts without some extra osint.

It might be theoretically possible to correlate the uuids with an account’s activity and dox the user in some cases, especially with some instances having a single user, but it would be very difficult or impossible to do on larger instances and would add an extra layer. Single user instances would be kind of impossible to make totally private anyway because they can be identified by instance.
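One way the uuid idea could work, as a sketch only: the home instance derives a stable voting ID from the username with a keyed hash, using a secret that never leaves the instance. The `vote_pseudonym` function and the secret below are hypothetical, and this is not how Lemmy federates votes today.

```python
import hashlib
import hmac
import uuid


def vote_pseudonym(instance_secret: bytes, username: str) -> uuid.UUID:
    """Derive a stable, opaque voting ID for a user (hypothetical scheme).

    Same secret + same username always gives the same ID, so other instances
    can still spot patterns, but they cannot reverse it without the secret.
    """
    digest = hmac.new(instance_secret, username.encode("utf-8"), hashlib.sha256).digest()
    # A UUID is 16 bytes, so keep the first 16 bytes of the 32-byte digest.
    return uuid.UUID(bytes=digest[:16])


secret = b"example-only-secret"  # assumption: kept private by the home instance
alice = vote_pseudonym(secret, "alice")
bob = vote_pseudonym(secret, "bob")
print(alice, bob)
```

The ID is only as private as the secret and the instance size. On a single user instance, the instance name gives the user away regardless.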

TechLich@lemmy.world · 6 months ago

It does sound very strange. What kind of anti-China content would ever help a student’s application process? Most of the application documents are about things like English language competency, visa requirements and prior qualifications, not political opinions.

TechLich@lemmy.world · 6 months ago

The most popular brand of matches in Denmark is called Tordenskjold. In the late 1800s, Sweden had a large export production of matches, so a Danish manufacturer put Tordenskiold’s portrait on his matchbox in 1882, in the hope he could once more strike at the Swedish (Danish: give de svenske stryg).[13] The Tordenskjold brand was bought by a Swedish company in 1972.[14]

Ouch.

TechLich@lemmy.world · 6 months ago

Haha, the internet did not fit on a 1.44 MB floppy in 1998. Curious to know what was on this‽

1998 was well into the CD-ROM era and the internet was full of .mp3s and .isos by then.

TechLich@lemmy.world · 7 months ago

It’s not that it’s on the 172.16.0.0/12 range. That’s totally normal and used for all kinds of stuff.

It’s that it’s in 172.16.42.0/24, which is the default DHCP range for a WiFi Pineapple. It’s the /24 mask on the .42 subnet that’s a little suspicious, because that’s not a common range for anything else.

Being assigned one of those specific 253 hosts with that subnet mask would definitely make me think twice.
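If you want to check a lease yourself, here’s a quick sketch with Python’s standard `ipaddress` module. The `looks_like_pineapple` helper is just a name made up for this example, and matching this range is only a hint, not proof of anything.

```python
import ipaddress

PRIVATE_BLOCK = ipaddress.ip_network("172.16.0.0/12")       # ordinary private range
PINEAPPLE_DEFAULT = ipaddress.ip_network("172.16.42.0/24")  # the range discussed above


def looks_like_pineapple(address: str, prefix_len: int) -> bool:
    """True only if the address AND the mask match 172.16.42.0/24 exactly."""
    iface = ipaddress.ip_interface(f"{address}/{prefix_len}")
    return iface.network == PINEAPPLE_DEFAULT


print(looks_like_pineapple("172.16.42.57", 24))  # address and mask both match
print(looks_like_pineapple("172.16.42.57", 12))  # same address, normal /12 mask
print(looks_like_pineapple("172.16.99.5", 24))   # /24 mask, different subnet

# A /24 has 254 usable addresses; if one is taken by the gateway (an
# assumption here), that leaves 253 for clients.
usable = list(PINEAPPLE_DEFAULT.hosts())
print(len(usable))
```

The point of comparing the whole network rather than just the address is that the same address handed out with a /12 mask would be unremarkable.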

TechLich@lemmy.world · edit-2 7 months ago

Pretty sure it’s an autocomplete (like copilot or something)

They were typing

progress != “Hold”

And the ai autocomplete suggested

progress != “Hold onto your butts!”

Hence why the completion part is in grey (it’s a suggestion)

TechLich@lemmy.world · 10 months ago

Looks more /usr/bucket/cat