A single “aahhh” contains enough vocal DNA to be an identity cue. I recently read that in ExpMag and it blew my mind. There is so much undiscussed innovation in voice synthesis technology. And I believe it is one of the most useful applications of AI we can pursue.
Voice Clones for Crime
Voice cloning technology has had a rocky beginning:
- In early 2019, thieves used an artificially generated voice of a company’s CEO to convince an employee to wire $243,000 to the criminals’ account.
- In a case reported in 2021, someone upped that same crime to a $35 million heist.
- Another frightening instance was in 2019 when a site called NotJordanPeterson.com allowed anyone to generate deepfake audio clips of Jordan Peterson.
Certainly, voice cloning technology can and will be abused. Let’s look beyond the crimes, though.
Voice Clones for Good
- Descript’s Overdub feature allows podcasters to synthesize their voice, thus making text edits that change the audio output.
- Synthesia combines voice synthesis tech with AI video editing, allowing people to generate “talking head” videos simply by typing text.
- VocaliD is a company that creates customized synthetic voices for people who have lost their speech or were born with speech disorders.
- Modulate makes voice recognition software to help identify vocal abuse in communities, in video games, and in other public forums.
- Sonantic cloned Val Kilmer’s voice to give his character, Iceman, the ability to speak in Top Gun: Maverick (since Kilmer lost his voice due to cancer).
I have personally synthesized my voice on two different platforms, and it helps my team produce content far more quickly.
There are also implications of this tech beyond productivity and work. Something much more meaningful.
I often think about how technology will be used to help us memorialize one another. Eterneva will turn your loved one’s ashes into diamonds. Capsula Mundi makes organic burial pods from which a tree will grow. And, of course, the invention of the camera gave us visual mementos.
But I think that voice clones are a truly unique memorial of a loved one.
This past summer, Amazon shocked attendees of the re:MARS conference by showcasing how Amazon Alexa can produce a high-quality voice from less than a minute of recorded speech. The result is that Alexa can read a story aloud in a grandmother’s voice (or anyone’s, for that matter) if someone requests it.
The idea of being able to synthesize a loved one’s voice and hear it after they’ve passed is controversial. It’s your prerogative to believe in its power or despise the idea of it.
Personally, I think it could be one of the most powerful creations for coping with death.
“Every year on my birthday, my mom sends me the same voicemail she’s saved of my grandparents sending me birthday wishes. Without fail, it provokes a flood of emotions in me. And all I want is to hear a little more of their voices.” – Ryan
Anyone who has gone through the painful loss of a loved one knows the deep desire to hear their voice just one more time. Now we have the technology to make that happen.
Is there a possibility that people will abuse the ability to type what they want to hear from the deceased? Yes. Is there a possibility that it will help people cope with their losses? Absolutely.
In my opinion, the positive change this could bring far outweighs the negatives. Some of us need to hear our grandparents give us one last bit of motivation. Some of us need to hear our parents assure us that everything will be okay. Some of us may want to wake up to the sound of our friend telling us to get up and get after it.
That is why I cloned my voice. And it is why I believe you should clone your voice too.
Lastly, the only thing this is missing is a good dose of branding. Because “voice clone” will not make people feel warm and fuzzy about this idea.