Google has unveiled a playful new AI image generation platform, called Whisk, that lets you prompt it with pictures rather than lengthy text descriptions.
With Whisk, announced this week, you drag in images to give the tool a sense of what you're looking for. It asks you to upload a picture of any subject (such as a person or animal), a scene (like a beach, jungle or cityscape) and then an image that reflects the desired style (such as retro, emo or anime).
From there, it lets you "remix" the elements to create unique images in the form of stickers, enamel pins or digital plush toys.
Users can further tweak the final creations, which may evolve during the generation process, by adjusting prompts, for instance by asking the tool to change the subject's height, hairstyle or skin tone.
An experimental tool from Google Labs, Whisk runs on the company's Gemini AI model, which automatically writes a detailed caption for the input images. These descriptions are then fed into Google's newest image-generation model, Imagen 3, to produce a new picture.
Two years on from OpenAI's blockbuster debut of ChatGPT, tech companies are continuing to explore what generative AI can do and to roll out new features. This month alone, that has included OpenAI's Sora video generation tool and new Apple Intelligence features in iOS 18.2, such as Genmoji, Visual Intelligence and ChatGPT integrations with Siri. Google, meanwhile, has introduced Gemini 2.0 and a limited release of its Project Astra vision-assisted AI agent.
In its blog post, Google described Whisk as more of a creative tool than a traditional image editor.
"We built it for rapid visual exploration, not pixel-perfect edits," the blog post said. "It's about exploring ideas in new and creative ways, allowing you to work through dozens of options and download the ones you love."


