Google DeepMind introduced Genie 2 on Wednesday, an artificial intelligence model that succeeds Genie, which was limited to generating 2D game worlds. Genie 2 creates playable, action-controllable 3D environments from a single image prompt. Described by the company as an AI “world model,” the new system can generate a world that stays consistent for up to a minute, with objects and other elements persisting throughout.
Google DeepMind Unveils Genie 2 AI Model
DeepMind detailed Genie 2’s capabilities in a blog post. Unlike the original model, which was limited to 2D platformers, Genie 2 can produce interactive 3D worlds in which human players and AI agents can walk, run, swim, and climb.
Genie 2 can generate routes, buildings, and objects that do not appear in the input image, building them from scratch so that they fit the rest of the scene. According to DeepMind, the model also remembers parts of the world that are no longer in view, so a player who leaves an area and later returns should find it rendered as it was before.
Genie 2 can also render worlds from several perspectives, including first-person, isometric, and third-person views. Users can interact with objects in the generated worlds, for example by opening doors, bursting balloons, or climbing ladders. The model can simulate physical effects such as water ripples, smoke, and realistic lighting as well.
On the technical side, DeepMind describes Genie 2 as an autoregressive latent diffusion model trained on a large video dataset. Video frames are first compressed by an autoencoder, and the resulting latent frames are passed to a transformer-based dynamics model. At inference time, the model generates the world frame by frame, conditioning each new frame on the previous latent frames and the user's actions.
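The frame-by-frame loop described above can be illustrated with a toy sketch. This is not DeepMind's code, and nothing in it comes from the blog post: the function names (`encode`, `decode`, `denoise_next_latent`, `rollout`), the list-of-floats "frames", and the hand-written update rule are all assumptions made purely for illustration. A real system would use learned neural networks in place of each function.

```python
import random

def encode(frame):
    # Toy stand-in for an autoencoder's encoder (assumption):
    # average each pair of "pixels" into one latent value.
    return [(frame[i] + frame[i + 1]) / 2 for i in range(0, len(frame), 2)]

def decode(latent):
    # Toy decoder (assumption): repeat each latent value twice
    # to get back to the original frame size.
    return [v for v in latent for _ in range(2)]

def denoise_next_latent(history, action, steps=4, rng=None):
    # Stand-in for a learned denoiser (assumption): start from random noise
    # and move it, step by step, toward a target that depends on the most
    # recent latent frame and the action taken.
    rng = rng or random.Random(0)
    target = [v + action for v in history[-1]]
    x = [rng.gauss(0, 1) for _ in target]
    for step in range(steps):
        blend = 1 / (steps - step)  # the final step lands on the target
        x = [xi + blend * (ti - xi) for xi, ti in zip(x, target)]
    return x

def rollout(prompt_frame, actions):
    # Autoregressive generation: each new latent frame is produced from the
    # frames generated so far, then appended to the history.
    history = [encode(prompt_frame)]
    frames = []
    for action in actions:
        next_latent = denoise_next_latent(history, action)
        history.append(next_latent)
        frames.append(decode(next_latent))
    return frames

if __name__ == "__main__":
    for frame in rollout([0.0, 0.0, 2.0, 2.0], [1.0, 1.0, -0.5]):
        print([round(v, 3) for v in frame])
```

The point of the sketch is the shape of the loop: one prompt image is encoded once, and every later frame is generated in latent space from what came before plus an action, then decoded for display.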
The announcement follows DeepMind's earlier release of the Scalable Instructable Multiworld Agent (SIMA), an AI agent built to follow natural-language instructions in 3D virtual environments. DeepMind says Genie 2 can supply agents of this kind with a wide variety of new environments to train in, with the aim of preparing them for more varied, real-world tasks.
Because Genie 2 generates worlds that did not previously exist, Google expects it to reduce the risk of data contamination, where an agent is tested on environments it has already seen during training, and to let developers evaluate AI agents more reliably across diverse situations.