Google is set to unveil its latest image generation model, dubbed Imagen 4, promising “stunning quality” along with “superior typography” capabilities.
Eli Collins, VP of product at Google DeepMind, highlighted the model’s efficiency, saying in a blog post, “Our latest Imagen model combines speed with precision to create stunning images.” He emphasized the clarity of fine details such as intricate fabrics, water droplets, and animal fur, in both photorealistic and abstract styles. Sample images provided by Google show realistic detail in subjects such as a whale leaping from the water and a close-up of a chameleon.
The new AI model significantly improves spelling and typography, making it easier to create greeting cards, posters, and comics. While OpenAI’s integration of image generation in ChatGPT included enhancements in text rendering, it is still prone to errors.
Initial examples provided by Google showcase the model’s text capabilities: the text is clear and legible, even in a brief comic strip, and tiny fonts on mock stamps are also readable. However, how well the text rendering holds up in practical use remains to be seen once everyday users have tried it.
The launch of Imagen 4 is slated for May 20, with availability on platforms such as the Gemini app, Whisk, and Vertex AI, in addition to integration in Slides, Vids, Docs, and other Workspace applications. Collins further revealed plans for a “fast variant” of Imagen 4, projected to be “up to 10x faster than Imagen 3,” which will be introduced soon.