
Google’s AI Image Creator Removed My Shirt

Summary

– Google’s Nano Banana Pro is an AI-powered image generation tool for professionals that upgrades the company’s viral image editing capabilities.
– The tool allows users to create high-quality images, render legible text, blend multiple images, and perform precise edits like lighting and angle adjustments.
– It successfully generated accurate infographics with proper text and citations for real-time information such as weather forecasts.
– The model struggled with complex tasks like summarizing articles in comic format, and it sometimes altered requested text or made unexpected changes, such as removing clothing.
– Despite occasional flaws, Nano Banana Pro is a significant upgrade that produces intelligible text and more precise edits compared to basic generative AI tools.

Exploring the capabilities of Google’s new Nano Banana Pro reveals a powerful yet occasionally unpredictable tool for image creation and editing. Designed with professionals in mind, this advanced system builds upon the viral success of its predecessor, which transformed ordinary selfies into strikingly realistic 3D figures. According to Google, the platform enables users to generate high-quality, printable images, incorporate readable text, and seamlessly merge multiple photos into a single composition. A product manager from Google DeepMind noted that it’s also intended for individuals aspiring to “feel like professionals,” which sounded promising for someone like me, who lacks formal design training. My own experience yielded polished but somewhat awkward results: visually appealing, yet unmistakably amateur.

Getting started with Nano Banana Pro is straightforward. Within the Gemini app, you select the “create images” option and activate the “thinking” mode. From there, you input your text prompt, add a reference image if desired, and let the AI work its magic. The service is available for free, though usage is subject to quotas that expand for subscribers of Google’s AI Plus, Pro, and Ultra tiers.

Google makes impressive promises, including “studio-quality designs,” “flawless text rendering,” and a suite of creative editing features. To put these claims to the test, I uploaded a straightforward photo of myself near The Verge’s New York office, with the Brooklyn Bridge visible in the background. I instructed Gemini to shift the lighting from daytime to nighttime, and it performed admirably. The final image looked convincing, with details like vehicle direction handled correctly, a common stumbling block for many image generators. Adjusting the camera angle proved just as effortless; when I requested a higher vantage point from the right, the tool delivered without a hitch.

The platform also boasts the ability to produce infographics and diagrams for visualizing real-time data such as weather conditions or sports scores. As a British expatriate currently in New York, I asked for a four-day weather forecast covering both Washington, DC, and my current location. Visually, the infographic resembled what you might find on a standard weather website. Text and numbers appeared clean and legible, a significant improvement over the garbled output typical of many AI image tools. Gemini even provided a list of citations, allowing me to verify the accuracy of the information.

However, the model encountered difficulties with more complex assignments. I tasked it with summarizing a recent Verge article about Europe scaling back AI and privacy regulations, presented in a comic book style. While the images and text were rendered impeccably in a playful font, the comic failed to accurately summarize the piece, offering only a vague outline of the EU’s AI Act instead. This shortfall might have stemmed from my providing a link to the article rather than copying the text directly.

When I pasted the article’s content, the results improved. The comic-style summary captured the essence of the story, though someone unfamiliar with the original material might have found it challenging to follow. It also invented phrases that weren’t present in the actual article.

Eager to channel my inner professional designer, I decided to create holiday greeting cards. With Christmas approaching, I uploaded three selfies and was genuinely impressed by Gemini’s ability to generate full-body versions of me in various outfits and expressions. It constructed a realistic, snowy scene complete with Christmas trees, just as I requested, and placed “Merry Christmas!” at the top of the card.

Things took an unexpected turn when I asked Gemini to swap the wintry backdrop for a sunny beach, envisioning an Australian-style holiday. The AI took creative liberties, most notably removing the shirts from two of my digital clones. The result was bizarre, to say the least. Other oddities included conspicuously artificial feet and a cheerful sandman (built by my shirtless double) replacing the snowman from the original scene. A few inconsistencies stood out: the sandman lacked a shadow, unlike other elements in the image, and Christmas lights strung across palm trees glowed unnaturally in the bright sunlight. Testing its precision editing, I instructed the AI to add muscle definition to just one clone, which it accomplished in seconds, a feat far easier than achieving the same in reality. Overall, the image quality was exceptional and almost believable, aside from the conspicuous absence of a large chest tattoo that I actually have.

Not every aspect was flawless. The model failed to preserve the exact text I specified for the card, replacing “Merry Christmas!” with “Aussie Summer Christmas!” It also struggled with rendering animals convincingly: my sister’s cat appeared in the same stiff pose across every card variation, albeit adorned with a whimsical Santa hat.

In summary, Nano Banana Pro represents a significant leap forward from the basic model. It allows for more precise edits and produces intelligible text, overcoming a major obstacle that has limited the practical use of generative AI tools. Still, these enhancements weren’t quite enough to transform me into a skilled designer.

(Source: The Verge)
