Artificial intelligence makes voice cloning easy and ‘the monster is already on the loose’

In a video from a Jan. 25 news report, President Joe Biden talks about tanks. But a doctored version of the video has amassed hundreds of thousands of views this week on social media, making it appear he gave a speech that attacks transgender people.

Digital forensics experts say the video was created using a new generation of artificial intelligence tools, which allow anyone to quickly generate audio simulating a person’s voice with a few clicks of a button. And while the Biden clip on social media may have failed to fool most users this time, the clip shows how easy it now is for people to generate hateful and disinformation-filled “deepfake” videos that could do real-world harm.

“Tools like this are going to basically add more fuel to fire,” said Hafiz Malik, a professor of electrical and computer engineering at the University of Michigan who focuses on multimedia forensics. “The monster is already on the loose.”

It arrived last month with the beta phase of ElevenLabs’ voice synthesis platform, which allowed users to generate realistic audio of any person’s voice by uploading a few minutes of audio samples and typing in any text for it to say.

The startup says the technology was developed to dub audio in different languages for movies, audiobooks and gaming to preserve the speaker’s voice and emotions.

Social media users quickly began sharing an AI-generated audio sample of Hillary Clinton reading the same transphobic text featured in the Biden clip, along with fake audio clips of Bill Gates supposedly saying that the COVID-19 vaccine causes AIDS and actress Emma Watson purportedly reading Hitler’s manifesto “Mein Kampf.”

Shortly after, ElevenLabs tweeted that it was seeing “an increasing number of voice cloning misuse cases,” and announced that it was now exploring safeguards to tamp down on abuse. One of the first steps was to make the feature available only to those who provide payment information. Initially, anonymous users were able to access the voice cloning tool for free. The company also claims that if there are issues, it can trace any generated audio back to the creator.

But even the ability to track creators won’t mitigate the tool’s harm, said Hany Farid, a professor at the University of California, Berkeley, who focuses on digital forensics and misinformation.

“The damage is done,” he said.

As an example, Farid said bad actors could move the stock market with fake audio of a top CEO saying profits are down. And already there’s a clip on YouTube that used the tool to alter a video to make it appear Biden said the U.S. was launching a nuclear attack against Russia.

Free and open-source software with the same capabilities has also emerged online, meaning paywalls on commercial tools aren’t an impediment. Using one free online model, the AP generated audio samples to sound like actors Daniel Craig and Jennifer Lawrence in just a few minutes.

“The question is where to point the finger and how to put the genie back in the bottle?” Malik said. “We can’t do it.”

When deepfakes first made headlines about five years ago, they were easy enough to detect since the subject didn’t blink and audio sounded robotic. That’s no longer the case as the tools become more sophisticated.

The altered video of Biden making derogatory comments about transgender people, for instance, combined the AI-generated audio with a real clip of the president, taken from a Jan. 25 CNN live broadcast announcing the U.S. dispatch of tanks to Ukraine. Biden’s mouth was manipulated in the video to match the audio. While most Twitter users recognized that the content was not something Biden was likely to say, they were nevertheless shocked at how realistic it appeared. Others appeared to believe it was real – or at least didn’t know what to believe.

Hollywood studios have long been able to distort reality, but access to that technology has been democratized without considering the implications, said Farid.

“It’s a combination of the very, very powerful AI-based technology, the ease of use, and then the fact that the model seems to be: let’s put it on the internet and see what happens next,” Farid said.

Audio is just one area where AI-generated misinformation poses a threat.

Free online AI image generators like Midjourney and DALL-E can churn out photorealistic images of war and natural disasters in the style of legacy media outlets with a simple text prompt. Last month, some school districts in the U.S. began blocking ChatGPT, which can produce readable text – like student term papers – on demand.

ElevenLabs did not respond to a request for comment.
