Fake America great again
Technology Review, 17 Aug 2018 at 04:04
Guess what? I just got hold of some embarrassing video footage of Texas senator Ted Cruz singing and gyrating to Tina Turner. His political enemies will have great fun showing it during the midterms. Donald Trump will call him “Dancin’ Ted.”
Okay, I’ll admit it—I created the video myself. But here’s the troubling thing: making it required very little video-editing skill. I downloaded and configured software that uses machine learning to perform a convincing digital face-swap. The resulting video, known as a deepfake, shows Cruz’s distinctively droopy eyes stitched onto the features of actor Paul Rudd doing lip-sync karaoke. It isn’t perfect—there’s something a little off—but it might fool some people.
Photo fakery is far from new, but artificial intelligence will completely change the game. Until recently only a big-budget movie studio could carry out a video face-swap, and it would probably have cost millions of dollars. AI now makes it possible for anyone with a decent computer and a few hours to spare to do the same thing. Further machine-learning advances will make even more complex deception possible—and make fakery harder to spot.
These advances threaten to further blur the line between truth and fiction in politics. Already the internet accelerates and reinforces the dissemination of disinformation through fake social-media accounts. “Alternative facts” and conspiracy theories are common and widely believed. Fake news stories, aside from their possible influence on the last US presidential election, have sparked ethnic violence in Myanmar and Sri Lanka over the past year. Now imagine throwing new kinds of real-looking fake videos into the mix: politicians mouthing nonsense or ethnic insults, or getting caught behaving inappropriately on video—except it never really happened.
The final video shows Ted Cruz’s face stitched almost seamlessly onto Paul Rudd.
The Tonight Show with Jimmy Fallon/CNN
“Deepfakes have the potential to derail political discourse,” says Charles Seife, a professor at New York University and the author of Virtual Unreality: Just Because the Internet Told You, How Do You Know It’s True? Seife confesses to astonishment at how quickly things have progressed since his book was published, in 2014. “Technology is altering our perception of reality at an alarming rate,” he says.
Are we about to enter an era when we can’t trust anything, even authentic-looking videos that seem to capture real “news”? How do we decide what is credible? Whom do we trust?
These still images of Ted Cruz and Paul Rudd were taken from the footage that was fed to a face-swapping program.
CNN/The Tonight Show with Jimmy Fallon
Several technologies have converged to make fakery easier, and they’re readily accessible: smartphones let anyone capture video footage, and powerful computer graphics tools have become much cheaper. Add artificial-intelligence software, which allows things to be distorted, remixed, and synthesized in mind-bending new ways. AI isn’t just a better version of Photoshop or iMovie. It lets a computer learn how the world looks and sounds so it can conjure up convincing simulacra.
I created the clip of Cruz using OpenFaceSwap, one of several face-switching programs that you can download for free. You need a computer with an advanced graphics chip, and this can set you back a few thousand bucks. But you can also rent access to a virtual machine for a few cents per minute using a cloud machine-learning platform like Paperspace. Then you simply feed in two video clips and sit back for a few hours as an algorithm figures out how each face looks and moves so that it can map one onto the other. Getting things to work is a bit of an art: if you choose clips that are too different, the result can be a nightmarish mishmash of noses, ears, and chins. But the process is easy enough.
The reverse face-swap, showing Paul Rudd’s face pasted onto Ted Cruz, is less convincing—and creepier.
The Tonight Show with Jimmy Fallon/CNN
Face-swapping was, predictably, first adopted for making porn. In 2017, an anonymous Reddit user known as Deepfakes used machine learning to swap famous actresses’ faces into scenes featuring adult-movie stars, and then posted the results to a subreddit dedicated to leaked celebrity porn. Another Reddit user then released an easy-to-use interface, which led to a proliferation of deepfake porn as well as, for some odd reason, endless clips of the actor Nicolas Cage in movies he wasn’t really in. Even Reddit, a notoriously freewheeling hangout, banned such nonconsensual pornography. But the phenomenon persists in the darker corners of the internet.
OpenFaceSwap uses an artificial neural network, by now the go-to tool in AI. Very large, or “deep,” neural networks that are fed enormous amounts of training data can do all sorts of useful things, including finding a person’s face among millions of images. They can also be used to manipulate and synthesize images.
OpenFaceSwap trains a deep network to “encode” a face (a process similar to data compression), thereby creating a representation that can be decoded to reconstruct the full face. The trick is to feed the encoded data for one face into the decoder for the other. The neural network will then conjure, often with surprising accuracy, one face mimicking the other’s expressions and movements. The resulting video can seem wonky, but OpenFaceSwap will automatically blur the edges and adjust the coloring of the newly transplanted face to make things look more genuine.
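For readers who want to see the encode-and-decode trick in miniature, here is a rough sketch in Python. It is not OpenFaceSwap's code: the function names are invented for illustration, the "faces" are random vectors rather than video frames, and the network is a single linear layer on each side instead of a deep one. What it does show is the structure described above, with one shared encoder, one decoder per person, and a swap made by routing person A's encoding through person B's decoder.

```python
import numpy as np

def init_model(dim, code, rng):
    """One shared encoder plus one decoder per identity (linear, for brevity)."""
    return {
        "enc": rng.normal(0, 0.1, (dim, code)),
        "dec_a": rng.normal(0, 0.1, (code, dim)),
        "dec_b": rng.normal(0, 0.1, (code, dim)),
    }

def reconstruct(model, faces, who):
    """Encode with the shared encoder, then decode with the decoder for `who`."""
    return faces @ model["enc"] @ model["dec_" + who]

def loss(model, faces_a, faces_b):
    """Mean squared reconstruction error, summed over both identities."""
    err_a = reconstruct(model, faces_a, "a") - faces_a
    err_b = reconstruct(model, faces_b, "b") - faces_b
    return float(np.mean(err_a ** 2) + np.mean(err_b ** 2))

def train_step(model, faces_a, faces_b, lr=0.01):
    """One gradient-descent step (gradient of the squared error, up to a constant)."""
    grad_enc = np.zeros_like(model["enc"])
    for who, faces in (("a", faces_a), ("b", faces_b)):
        dec = model["dec_" + who]
        codes = faces @ model["enc"]
        err = (codes @ dec - faces) / len(faces)
        grad_enc += faces.T @ (err @ dec.T)          # both identities update the encoder
        model["dec_" + who] = dec - lr * (codes.T @ err)  # each decoder sees one identity
    model["enc"] -= lr * grad_enc

# Toy stand-ins for cropped face images: 64 samples of 16 "pixels" per identity.
rng = np.random.default_rng(0)
dim, code, n = 16, 4, 64
faces_a = rng.normal(size=(n, code)) @ rng.normal(0, 0.5, (code, dim))
faces_b = rng.normal(size=(n, code)) @ rng.normal(0, 0.5, (code, dim))

model = init_model(dim, code, rng)
initial_loss = loss(model, faces_a, faces_b)
for _ in range(300):
    train_step(model, faces_a, faces_b)
final_loss = loss(model, faces_a, faces_b)

# The swap: encode identity A, but decode with identity B's decoder.
swapped = reconstruct(model, faces_a, "b")
```

Because both decoders read from the same encoder, the encoding has to capture what the two faces have in common, such as pose and expression, while each decoder supplies the look of one person. That division of labor is what makes the swap possible.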
Left: OpenFaceSwap previews attempted face swaps during training. Early tries can often be a bit weird and grotesque.
Right: The software takes several hours to produce a good face swap. The more training data, the better the end result.
Similar technology can be used to re-create someone’s voice, too. A startup called Lyrebird has posted convincing demos of Barack Obama and Donald Trump saying entirely made-up things. Lyrebird says that in the future it will limit its voice duplications to people who have given their permission—but surely not everyone will be so scrupulous.
There are well-established methods for identifying doctored images and video. One option is to search the web for images that might have been mashed together. A more technical solution is to look for telltale changes to a digital file, or to the pixels in an image or a video frame. An expert can search for visual inconsistencies—a shadow that shouldn’t be there, or an object that’s the wrong size.
Dartmouth College’s Hany Farid, one of the world’s foremost experts, has shown how a scene can be reconstructed in 3-D in order to discover physical oddities. He has also demonstrated that subtle changes in pixel intensity in a video, indicating a person’s pulse rate, can be used to spot the difference between a real person and a computer-generated one. Recently one of Farid’s former students, now a professor at the State University of New York at Albany, has shown that irregular eye blinking can give away a face that’s been manipulated by AI.
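To give a feel for the pulse idea, here is a toy sketch in Python. It is not Farid's method, and the function name and parameters are made up for illustration. It assumes someone has already measured the average brightness of a patch of skin in every frame, and it simply looks for the strongest rhythm in that signal within a plausible heart-rate range. Real footage would also need face tracking and much more careful filtering.

```python
import math
import random

def estimate_pulse_bpm(intensity, fps, lo_hz=0.7, hi_hz=4.0, step_hz=0.01):
    """Return the dominant frequency, in beats per minute, of a per-frame
    mean-intensity signal, searching only a plausible heart-rate band."""
    n = len(intensity)
    mean = sum(intensity) / n
    centered = [v - mean for v in intensity]  # remove the constant brightness level
    best_hz, best_power = lo_hz, -1.0
    steps = int(round((hi_hz - lo_hz) / step_hz))
    for i in range(steps + 1):
        hz = lo_hz + i * step_hz
        # Correlate the signal with a cosine and a sine at this frequency.
        re = sum(v * math.cos(2 * math.pi * hz * t / fps) for t, v in enumerate(centered))
        im = sum(v * math.sin(2 * math.pi * hz * t / fps) for t, v in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_hz, best_power = hz, power
    return best_hz * 60.0

# Synthetic example: 10 seconds at 30 frames per second, with a faint
# 1.2 Hz (72 beats per minute) flicker buried in a little noise.
noise = random.Random(0)
fps = 30
signal = [
    100.0 + 0.5 * math.sin(2 * math.pi * 1.2 * t / fps) + noise.gauss(0, 0.1)
    for t in range(10 * fps)
]
bpm = estimate_pulse_bpm(signal, fps)
```

A computer-generated face would not necessarily carry any such rhythm, which is the intuition behind using a pulse as a test of authenticity.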