
Imagine this: a beautiful, balmy afternoon at Dolores Park in San Francisco. I’m singing “Happy Birthday” to a prehistoric dinosaur, and as I finish my serenade, a cupcake with a pink candle magically appears in my hand. When I blow out the flame, a calm look of contentment washes over the CGI-esque creature. While the man in this AI video looks and sounds just like me, the clip was actually generated using one of the new features available in Google’s Gemini app: avatars.
These digital recreations are essentially clones of you that can be inserted into AI videos. The avatars, which work much like the core features of OpenAI’s now-defunct Sora app, are powered by Google’s new Omni video model. For now, the feature is available only to subscribers of Google’s AI Pro plan.
Meeting My Digital Twin
As a paying subscriber to Google’s AI Pro plan, which costs $20 a month, I was eager to try out this new avatar feature. However, I quickly hit Gemini’s usage limits, which reset every five hours. After asking a few questions and generating just two 10-second clips featuring my avatar, I was told to wait until later to create more.
My first two glimpses of what Omni could do with my likeness were at once impressive and a little freaky. The first clip showed me singing to a dinosaur in San Francisco, and the second featured me surfing under the Golden Gate Bridge. While the content was cringeworthy, with some jumbled moments and nonsensical outfits, the man in the video was undeniably me.
I zoomed in on the face, watching the mouth move; the teeth were slightly off, but otherwise, it was Reece, right down to the chin fat. Unlike OpenAI, which previously let users decide whether others could generate AI videos using their likeness, Google permits adult users to create videos only with their own avatar, which keeps control of each likeness with the person it belongs to.
Creating Reece 2.0
Setting up my avatar through the Gemini app was surprisingly simple, taking about five minutes. The process involved sitting in a well-lit room with my phone’s camera pointed at my face and reading a string of two-digit numbers. Then, I slowly looked to the right and swiveled my head to the left, and just like that, it was done. Reece 2.0 was born, ready to be my deepfake star. (A quick tip: be mindful of what you’re wearing during this process, as your chosen outfit will likely show up in the AI generations!)
Let’s dive into the birthday clip frame by frame to truly unpack my feelings about it. The full prompt I used was: “Generate a video of me singing the happy birthday song to an aging dinosaur at the top of the hill at Dolores Park.” The video starts with a slight “millennial pause,” because apparently, even AI Reece has some ingrained habits.
What’s most striking initially is the photorealistic setting. Instead of placing my avatar on some generic hill, the background is remarkably similar to the actual Dolores Park, from the palm-tree-lined sidewalks to the Salesforce Tower in the distance. It makes perfect sense that a company known for mapping the planet could achieve such a convincing backdrop, even if it wasn’t absolutely perfect.
As AI me began to sing in a less pitchy baritone than I can actually manage, the first few bars seemed natural; I bounced my hands to the beat like a mini conductor. Then, I stuttered on the word “to,” and Gemini cut to a wider-angle shot where the real chaos began. A vanilla cupcake randomly appeared, and I exhaled a cloud of smoke to blow out the celebration candle. Honestly, it was a bit rude of AI Reece, as it wasn’t even his special day!
Beyond the Birthday Song: Surfing and Safety
The other AI clip I generated using the avatar feature also blended chaotic moments with surprisingly lifelike shots. For this one, my prompt was: “Generate a video of me surfing beneath the Golden Gate Bridge.” Instead of putting me in a wetsuit, the AI decided I should wear head-to-toe denim, although thankfully I was barefoot on the surfboard. This generation even included shots that looked as if they had been captured on a GoPro attached directly to the board.
Here’s a breakdown of the key elements:
- Photorealistic Environments: Google’s Omni model excels at creating highly recognizable real-world locations.
- Avatar Accuracy: Despite minor glitches, the digital likeness is remarkably accurate to the individual.
- Creative Freedom: Users can generate imaginative scenarios, even if the AI’s clothing choices are sometimes questionable.
- Safety Protocols: Google states it prioritizes preventing harm, allowing only adults to create avatars of themselves.
As more people use generative AI, concerns about misuse, particularly nonconsensual deepfakes targeting women, are growing. Google claims to have safety at the forefront as it rolls out this new feature. “We try to prevent harm,” says Nicole Brichtova, who leads the product team working on Omni at Google DeepMind. “And, we try to do it in a way where we’re not blocking benign things.”
Despite the occasional stuttering and errors in the clips of AI Reece, these hyper-realized versions of myself felt more tangible than listening to a voicemail or rewatching a casual weekend video. The avatar didn’t necessarily look like a “hotter” version of myself; no, it was something eerier. My digital clone was seamlessly Reece. Always ready to be anywhere, to do anything, to be me.
Source: Wired – AI