As explained in a blog post, the Dessa team managed this feat by developing a deep learning system called RealTalk that uses text inputs to produce life-like speech in the style of a real person.
It’s perhaps the best example of an audio deepfake yet. Even listeners well-acquainted with Rogan’s voice will likely have a hard time telling the fake audio apart from things the comedian has actually said — and that ability to fool listeners could have terrifying implications for the future.
In the demonstration shared by Dessa, faux Rogan keeps the conversation light, showing off its ability to recite tongue-twisters and contemplating whether chimpanzees could beat humans at hockey.
However, the company is well aware of the potential dangers of such a system. It even provides a bullet-pointed list of the ways the technology could go wrong, noting that someone could impersonate a government official to enter a high-security facility, or a politician to manipulate an election.
These potentially nefarious uses are why Dessa says it won’t publicly release its research, model, or datasets for the project.
“[It’s] one of the coolest, but scariest, things I’ve seen yet in artificial intelligence,” Alex Krizhevsky, Dessa’s principal machine learning architect, said in the post. “Unlike The Singularity, which is this theoretical thing that could happen in 40, 100 years, speech synthesis is soon going to be a reality everywhere.”
READ MORE: This AI-generated Joe Rogan fake has to be heard to be believed [The Verge]