Last week, some voters in New Hampshire received an AI-generated robocall impersonating President Biden, telling them not to vote in the state's primary election. It's not clear who was responsible for the call, but two separate teams of audio experts tell WIRED it was likely created using technology from voice-cloning startup ElevenLabs.
ElevenLabs markets its AI tools for uses like audiobooks and video games; it recently achieved "unicorn" status by raising $80 million at a $1.1 billion valuation in a new funding round co-led by venture firm Andreessen Horowitz. Anyone can sign up for the company's paid service and clone a voice from an audio sample. The company's safety policy says you should obtain a person's permission before cloning their voice, but that permissionless cloning can be acceptable for a variety of noncommercial purposes, including "political speech contributing to public debates." ElevenLabs did not respond to multiple requests for comment.
Pindrop, a security company that develops tools to identify synthetic audio, claimed in a blog post on Thursday that its analysis of audio from the call pointed to ElevenLabs' technology or a "system using similar components." The Pindrop research team checked patterns in the audio clip against more than 120 different voice synthesis engines looking for a match, but wasn't expecting to find one, because identifying the provenance of AI-generated audio can be difficult. The results were surprisingly clear, says Pindrop CEO Vijay Balasubramaniyan. "It came back well north of 99 percent that it was ElevenLabs," he says.
The Pindrop team worked with a 39-second clip the company obtained of one of the AI-generated robocalls. It sought to verify its results by also analyzing audio samples known to have been created with ElevenLabs' technology, as well as with another voice synthesis tool, to check its methodology.
ElevenLabs offers its own AI speech detector on its website that it says can tell whether an audio clip was created using the company's technology. When Pindrop ran its sample of the suspect robocall through that system, it came back as 84 percent likely to have been generated using ElevenLabs tools. WIRED independently got the same result when checking Pindrop's audio sample with the ElevenLabs detector.
Hany Farid, a digital forensics specialist at the UC Berkeley School of Information, was initially skeptical of claims that the Biden robocall came from ElevenLabs. "When you hear the audio from a cloned voice from ElevenLabs, it's really good," he says. "The version of the Biden call that I heard was not particularly good, and the cadence was really funky. It just didn't sound of the quality that I would have expected from ElevenLabs."
But when Farid had his team at Berkeley conduct its own, independent analysis of the audio sample obtained by Pindrop, it reached the same conclusion. "Our model says with high confidence that it is AI-generated and likely to be ElevenLabs," he says.
This isn't the first time researchers have suspected ElevenLabs tools were used for political propaganda. Last September, NewsGuard, a company that tracks online misinformation, claimed that TikTok accounts sharing conspiracy theories with AI-generated voices, including a clone of Barack Obama's voice, used ElevenLabs' technology. "Over 99 percent of users on our platform are creating interesting, innovative, useful content," ElevenLabs said in an emailed statement to The New York Times at the time, "but we recognize that there are instances of misuse, and we've been continually developing and releasing safeguards to curb them."