
AI Clones Unique Voiceprints in Under 5 Minutes

Everyone has a voiceprint, just like a fingerprint. Intonation, pacing, accent, and other traits make every voice one of a kind. Over the last decade, voiceprints have become a form of biometric security used in place of passwords and two-factor authentication. However, AI may make voice an obsolete form of security.

WTF? Identity Theft by Voice

Centrelink and the Australian Taxation Office (ATO) both give people the option of using a “voiceprint”, along with other information, to verify their identity over the phone, allowing them to then access sensitive information from their accounts.

Guardian Australia has confirmed that the voiceprint system can also be fooled by an AI-generated voice.

Using just four minutes of audio, a Guardian Australia journalist was able to generate a clone of their own voice and was then able to use this, combined with their customer reference number, to gain access to their own Centrelink self-service account.

The self-service phone system allows people to access sensitive material such as information on their payment of benefits and to request documents to be sent by mail, including replacement concession or healthcare cards. – The Guardian

This vulnerability puts at risk the roughly 7 million residents who have verified their voice with the ATO. Although the ATO phone system doesn’t directly give access to funds or to actions that could seriously harm someone, at least one bank, Bank Australia, is known to use the same kind of phone verification system.

A handful of high-profile crimes have already been committed with voice clones. But what crime will voice clones enable on a mass scale, akin to 419 scams (the “Nigerian prince” scam) or the business email compromises that have been so notoriously effective and lucrative?

The episode puts into perspective how quickly AI can overtake contemporary tech practices. It’s also a reminder that although OpenAI has made AI seem like the center of the universe, AI is still far from standard in most of the things we use.

On a related note, Waze allows you to record your own voice directions to guide you on the road. The feature has been around on Android for a month or so, but is now available on iOS as well under “Sound & Voice” > “Voice Directions.” It’s a cumbersome process since the system has you manually record every possible direction prompt. It doesn’t use AI to synthesize your voice, not that it really needs to.

I cloned my voice a couple of years ago. The recording process took about half a day. I’m curious whether it’s gotten much better with recent advances in language models. One service says it can do it with 3 minutes of recording, while another says it needs 3 hours of recording time.

Anyway, that’s why Waze has celebrity voices. Much more streamlined. I believe there’s still plenty of room to offer customization and bring more personality into directions. But I can see why Waze is a little more stringent about keeping prompts within the lines.

Lastly, I like that Waze allows you to share your voice pack with a family member or friend. Personally, I think it would make for a nice memento of someone you have lost. It’s not as comprehensive as a total voice clone, but hearing my grandma occasionally give me directions would be a pleasant way to remember her.