
Google launches versatile anything-to-anything AI model

Summary

– The author deepfaked her child’s stuffed animal to test Google’s AI video tools, finding Omni Flash improved character consistency over the previous Veo model but still had glitches like objects changing shape.
– Omni Flash allows users to upload a video and use a text prompt to generate new content, with Google claiming better real-world knowledge and character consistency.
– Generating videos costs credits (15–40 per clip, 40 per edit), and the author’s $20/month Pro plan (1,000 credits) was nearly depleted after 20 clips and a few edits.
– The author deepfaked herself eating pasta and standing at the Eiffel Tower, finding the results convincing enough to potentially fool people on social media, though some clips had subtle AI tells.
– The tools are easy to use and produce realistic videos with minimal effort, but the author notes the results are still in the uncanny valley, not yet cinematic masterpieces.

I'm a senior reviewer with over a decade of experience covering consumer tech, with a particular focus on mobile photography and telecom, and I previously worked at DPReview. Last year, I ran an experiment: I deepfaked my son’s stuffed deer, Buddy, making it appear as though the plush toy was on vacation. This was inspired by a Gemini ad, and while I never showed the resulting videos to my four-year-old, the exercise revealed a lot about the fine line between harmless generative AI fun and outright slop. Perhaps that Venn diagram is a perfect circle; maybe not. What I know for certain is that the tools for creating realistic videos are surprisingly good, requiring remarkably little effort or expertise. That trend is accelerating as we enter Gemini’s Omni era.

Omni represents a new family of generative models that, in theory, will one day transform any input (photo, video, or text) into any other form of media. For now, it focuses on video creation. Omni Flash is the first model Google has released, available in its AI video generation and editing platform, Flow. You can still use the previous model, Veo, but Omni offers several improvements. With Omni, you upload a video and pair it with a text prompt as the starting point for your AI-generated content. Google also claims Omni incorporates more real-world knowledge during video production and maintains better character consistency throughout a clip. To test these claims, I brought back AI Buddy for another adventure.

The results were bafflingly inconsistent. Some clips were excellent: far more coherent and faithful to my prompts than what I saw testing Veo five months ago. Yet even the best Omni outputs included AI jump scares, like Buddy suddenly switching orientation mid-skydive. For another video, I gave Omni artistic freedom: “Create a montage of Buddy packing for a vacation and embarking on a cruise ship for a tropical vacation. The mood is cute and playful. Buddy packs something funny in his suitcase that comes into play later in the clip.” Omni had Buddy pack a jar of honey; later, he reaches for it as if it’s sunscreen, saying “Uh oh” as he squeezes honey onto his hoof. Not a bad bit, except the honey bottle constantly changes form, from a jar to a clear squirt bottle filled with water, then back to a honey-filled squeeze bottle. The final frame was nearly incomprehensible, as if the model just regurgitated random elements from the sequence.

Text-based prompts for video edits work better with Omni than with Veo 3, but the results remain hit-or-miss. When I asked Omni to emphasize Buddy’s facial reactions in vacation clips, the results looked strange. It also gave Buddy antlers from time to time; he does not have antlers, as he is a baby deer. When I prompted removal of the antlers in one scene, it complied but then added antlers to all other scenes. Generating videos costs credits, ranging from 15 to 40 based on length and starting ingredients. One edit round costs 40 credits. My $20-per-month AI Pro plan includes 1,000 credits each month. After about 20 clips and a few edits, I was down to 145 credits. If you have specific video ideas, you may face costly back-and-forth with the model.
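As a rough sanity check on those credit numbers, here is a back-of-the-envelope sketch in Python. The prices and the monthly allowance come from the figures above. The function names are hypothetical, and the "three edits" used in the example is only an assumed stand-in for "a few edits"; nothing here reflects an actual Flow or Gemini API.

```python
# Credit figures as reported in this article; everything else is illustrative.
MONTHLY_CREDITS = 1_000  # AI Pro plan allowance
CLIP_COST_MIN = 15       # cheapest clip generation
CLIP_COST_MAX = 40       # most expensive clip generation
EDIT_COST = 40           # one round of edits


def spend_range(clips: int, edits: int) -> tuple[int, int]:
    """Return the (lowest, highest) possible credit spend for a session."""
    edit_total = edits * EDIT_COST
    return (clips * CLIP_COST_MIN + edit_total,
            clips * CLIP_COST_MAX + edit_total)


def clips_affordable(credits: int, edits: int = 0,
                     clip_cost: int = CLIP_COST_MAX) -> int:
    """Return how many clips fit in `credits` after paying for `edits`."""
    remaining = credits - edits * EDIT_COST
    return max(remaining, 0) // clip_cost


if __name__ == "__main__":
    # Assumption: "a few edits" is modeled as 3.
    low, high = spend_range(clips=20, edits=3)
    spent = MONTHLY_CREDITS - 145  # credits actually used in the test above
    print(f"Possible spend: {low} to {high} credits; reported spend: {spent}")
    print(f"Worst-case clips per month: {clips_affordable(MONTHLY_CREDITS)}")
    print(f"Worst-case clips left with 145 credits: {clips_affordable(145)}")
```

Under those assumptions, the reported spend of 855 credits sits near the top of the possible range, which suggests most of the clips were priced toward the expensive end. At the top rate, a full month of credits covers only 25 clips before any edits.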

One of Omni’s claimed strengths is adding AI-generated content to real videos, so I gave Buddy a break and deepfaked myself. Starting with a selfie video of me with a neutral expression, I prompted Omni to generate videos of me eating spaghetti, sitting in an airplane seat, and standing before the Eiffel Tower taking a bite of a baguette. I was genuinely unprepared for what I saw. There are AI tells: the fork clinking against the pasta bowl sounds too manufactured, and a woman appears twice in the airplane background. But aside from these glitches and a vaguely uncanny feel, the videos are convincing as hell. I showed my husband the pasta clip; he knew I was testing an AI video tool but didn’t know what was AI-generated. He believed I was sitting in front of a camera eating pasta, his only clue being the unfamiliar bowl. The pasta-eating itself looked real enough to fool a man who has seen me daily for a decade.

My other deepfakes vary in quality, from “good enough to fool people on social media” to slightly cartoonish. One Eiffel Tower clip is convincing enough that you might need to rewatch it several times to spot the AI. I know it’s not me when the AI me turns her head and reveals her hair in a ponytail, but I’m not sure anyone else would notice, and that feels unsettling. We are deep in the uncanny valley. I’m exhausted by it all. I was shocked by Veo 3’s realism, and I’ve been repeatedly shocked by how easy it is to create fake people in fake photos over the past few years. Omni should shock me too, and it does, but the edge has worn off.

It’s still not quite as simple to create an AI-generated cinematic masterpiece as Google would like you to believe. But Omni does improve on Veo in recognizable ways. If you have a Google account and a credit card, you can take a video of yourself at home and make it look like you’re on a flight to Maui with trivial effort. We may not be at the “foothills of the singularity,” but we are definitely deep in the uncanny valley. All images and videos in this story were generated by Google Gemini.

(Source: The Verge)
