Several artificial intelligence companies are building interactive, AI-generated environments. Recent examples include an AI-generated version of Quake and an AI-generated Minecraft, and Google DeepMind is forming a team to build models that can “simulate the world.” Now Odyssey, a startup whose board includes Pixar co-founder Edwin Catmull, is exploring what it calls “interactive video,” which is currently available online as a research preview.
Odyssey defines interactive video on its website as “video that can be watched and interacted with, entirely created by AI in real-time.” This interactive approach allows users to engage with the video similarly to a first-person video game, but in settings that resemble the real world rather than a polygonal landscape. The startup describes it as an “initial version of the Holodeck,” although it cautions that “the experience currently resembles a glitchy dream — raw, unstable, yet distinctly novel.”
Navigating Odyssey’s interactive videos feels like moving through a blurry version of Google Street View. Users traverse the environments, which are generated in real time, using typical gaming controls. Available settings include a forest with a cabin, a shopping mall, and a parking lot by a large structure. Each visit looks slightly different because the system regenerates the visuals every time, though the image stays blurry throughout.
Currently, the preview allows for a brief exploration lasting just two and a half minutes before it’s necessary to reload the site to continue.
According to Odyssey, the platform runs on clusters of H100 GPUs located in the US and Europe to produce these interactive videos. The company says the model generates each new frame from the user’s input and the preceding history, and that frames can be streamed in real time in “as little as” 40 milliseconds.
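To make the frame-by-frame description concrete, here is a minimal Python sketch of a generation loop in which each new frame depends on the latest input and a bounded history. It is not Odyssey’s code: `toy_next_frame`, `run_session`, and the context length of 8 are hypothetical stand-ins, and a “frame” here is just a 2D position rather than an image. The helper functions also assume that the quoted 40 milliseconds is the interval between frames, which would work out to 25 frames per second and 3,750 frames in a two-and-a-half-minute session.

```python
from collections import deque

FRAME_INTERVAL_MS = 40  # "as little as" figure quoted by Odyssey; assumed to be per frame


def frames_per_second(frame_ms: int) -> float:
    """Frame rate implied by a fixed per-frame interval in milliseconds."""
    return 1000.0 / frame_ms


def frames_in_session(session_seconds: int, frame_ms: int) -> int:
    """Number of whole frames that fit in a session of the given length."""
    return (session_seconds * 1000) // frame_ms


# Hypothetical mapping from a control input to a movement on a 2D grid.
MOVES = {"forward": (0, 1), "back": (0, -1), "left": (-1, 0), "right": (1, 0)}


def toy_next_frame(history, action):
    """Stand-in for the model: derive the next 'frame' from the history and the input.

    A real model would condition on many past frames; this toy only reads the last one.
    """
    x, y = history[-1]
    dx, dy = MOVES.get(action, (0, 0))  # unknown or idle input leaves the position unchanged
    return (x + dx, y + dy)


def run_session(actions, context_len=8):
    """Generate one frame per input, feeding each output back in as history."""
    history = deque([(0, 0)], maxlen=context_len)  # bounded window of past frames
    frames = []
    for action in actions:
        frame = toy_next_frame(history, action)
        history.append(frame)  # the new frame becomes part of the context
        frames.append(frame)
    return frames


if __name__ == "__main__":
    print(frames_per_second(FRAME_INTERVAL_MS))
    print(frames_in_session(150, FRAME_INTERVAL_MS))
    print(run_session(["forward", "forward", "right"]))
```

Because every frame is fed back into the history, small errors can accumulate from one step to the next, which is one plausible reading of why the preview drifts and changes unexpectedly.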
However, the preview in its current form will not rival mainstream video games like Fortnite. Collision detection is inconsistent: a player may be stopped by a fence but can walk straight through a large house. Environments also change without warning, so a doorway can turn into a brick wall as the player approaches it. The system can behave erratically when left idle as well, with the view sometimes moving even though the user has given no input.
In a conversation with Technology News, Catmull, a board member of Odyssey, did not give a definite timeline for improvements in image quality. However, he expressed confidence that the company is at “the leading edge” of this field. Catmull said the company’s involvement in the broader community will contribute to ongoing advances. He acknowledged the noise in the current images and said that refining textures with neural network filters is one potential solution.
It’s no Holodeck yet
At this stage, Odyssey’s offering falls short of a polished gaming experience, although its imperfections have some entertainment value of their own. It also seems unlikely to replace cinema anytime soon, because the frequent and unpredictable environmental changes undermine the narrative immersion that defines good films. For now, the blend of interactivity and cinematic storytelling remains unrefined.
Still, exploring the preview reveals potential for future development in this area. Given how quickly AI technologies are evolving, it is not hard to envision a refined version that addresses the current shortcomings. The current offering is far from the envisioned Holodeck, though, and it shows how much work lies ahead before AI-generated interactive video is fully realized.