
According to recently surfaced patent documents, Apple's technology uses machine learning to create synthetic images of human faces based on a reference image provided by the user. Once the tech creates a synthetic face, it can manipulate that face to change expressions. Given a reference image or "target shape" that depicts an entire person (not just a face), the image generator can also create synthetic images in which the reference person is posed differently.
The generator's neural network is trained with constraints that keep the synthetic image looking convincingly like the reference person, rather than an entirely new or merely "inspired" creation. These constraints are combined in a generative adversarial network (GAN): the generator produces multiple synthetic images, and a discriminator then tries to determine which images are real and which are synthetic. The discriminator's verdicts are used to further train both the generator and the discriminator.
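The patent itself contains no code, but the adversarial setup it describes is the standard GAN recipe. Here is a minimal sketch of that recipe in PyTorch: a generator produces synthetic images, a discriminator tries to tell them apart from real ones, and both networks are trained from the discriminator's verdicts. The network sizes, image dimensions, and the reference_embedding input are assumptions made purely for illustration, not details from Apple's filing.

```python
# Minimal, generic GAN training sketch (PyTorch). Illustrative only;
# all sizes and the reference embedding are assumptions, not Apple's design.
import torch
import torch.nn as nn

IMG_DIM = 64 * 64     # assumed flattened image size
LATENT_DIM = 128      # assumed size of the reference embedding

# Generator: maps a reference embedding to a synthetic image.
generator = nn.Sequential(
    nn.Linear(LATENT_DIM, 512), nn.ReLU(),
    nn.Linear(512, IMG_DIM), nn.Tanh(),
)

# Discriminator: predicts whether an image is real (1) or synthetic (0).
discriminator = nn.Sequential(
    nn.Linear(IMG_DIM, 512), nn.LeakyReLU(0.2),
    nn.Linear(512, 1), nn.Sigmoid(),
)

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(real_images: torch.Tensor, reference_embedding: torch.Tensor):
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # 1) Train the discriminator to separate real images from the generator's fakes.
    fake_images = generator(reference_embedding)
    d_loss = bce(discriminator(real_images), real_labels) + \
             bce(discriminator(fake_images.detach()), fake_labels)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Train the generator to fool the discriminator, i.e. to have its
    #    synthetic images classified as real.
    g_loss = bce(discriminator(fake_images), real_labels)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

# Toy usage with random tensors standing in for real training data.
d_loss, g_loss = train_step(torch.randn(8, IMG_DIM), torch.randn(8, LATENT_DIM))
```

In this toy loop the two losses pull against each other: as the discriminator gets better at spotting fakes, the generator is pushed to produce images it can no longer distinguish from the real reference photos, which is the "convincingly like the reference person" constraint the patent describes.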
Apple is well aware that its technology is closely associated with deepfakes (although not as ridiculous as the one above). "The resulting image is a simulated image that appears to depict the subject of the reference image, but is not actually a real image," the patent reads. So much for working together to rid the internet of photos and videos showing people doing things they never actually did.
Some believe Apple's motivation for the patent comes from a desire to lock up facial motion capture and related technologies, certainly not out of the goodness of its heart, but to ensure it has no competitors when it comes to things like Memoji. (Apple reportedly spent years buying Faceshift, PrimeSense, Perceptio, and Metaio for this exact reason.) If that's the case, Apple may never use the patent; after all, companies patent new technologies all the time without actually using them.
Others think Apple is working toward an app or feature that puts a "fun" or "convenient" twist on deepfakes. If that happens, at what point does the Biden administration's proposed AI Bill of Rights come into play?