
Apple Unveils Matrix3D: Transforming 2D Images to 3D


Apple has introduced Matrix3D, an artificial intelligence (AI) model that generates 3D views from a small set of 2D images. Developed by Apple’s Machine Learning team in collaboration with researchers from Nanjing University and the Hong Kong University of Science and Technology (HKUST), the model has been publicly released and is available through Apple’s GitHub repository.

Apple’s Matrix3D Innovates Multi-Task Photogrammetry

A recent blog post from Apple describes the research behind Matrix3D. While other 3D reconstruction models exist, Matrix3D differs by unifying the 3D generation pipeline in one model, which reduces the risk of errors. Rather than relying on a separate model for each task, this single model handles several photogrammetry subtasks, including pose estimation, depth prediction, and novel view synthesis.

Photogrammetry derives precise measurements and 3D representations of physical objects and environments by analyzing images. It is commonly used to produce maps and 3D models from 2D images captured at different angles.

A research paper describing the model is also available on the preprint platform arXiv. Matrix3D is built on a multimodal diffusion transformer (DiT) architecture, which lets it combine data from several sources, including images, camera parameters, and depth maps.
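To show how images, camera parameters, and depth maps relate to one another, here is a minimal sketch of the standard pinhole camera model. It is not taken from Matrix3D's code. The function names and the intrinsics values (focal lengths `fx`, `fy` and principal point `cx`, `cy`) are illustrative assumptions.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with a known depth to a 3D point in the camera frame."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def project(point, fx, fy, cx, cy):
    """Project a 3D camera-frame point back onto the image plane."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


# Hypothetical intrinsics for a 640x480 image (assumed values, for illustration).
FX, FY, CX, CY = 500.0, 500.0, 320.0, 240.0

# A pixel 100 columns right of the image center, observed at a depth of 2 units.
point_3d = backproject(420, 240, 2.0, FX, FY, CX, CY)
pixel = project(point_3d, FX, FY, CX, CY)
```

A pixel plus its depth and the camera intrinsics fixes a 3D point, and projecting that point returns the original pixel. That round trip is why the three data types carry complementary information about the same scene.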

The paper highlights a masked learning strategy used during training: sections of the image are hidden, and the model must predict the pixels needed to fill the gaps. The researchers report that Matrix3D can produce a complete view of a 3D object or scene from just three images taken from different perspectives.
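The masking idea can be illustrated with a toy sketch: hide a random subset of image tiles, then score a prediction only on the hidden pixels. This is a generic masked-reconstruction setup, not Apple's training code. The tile size, masking ratio, and function names are assumptions chosen for illustration.

```python
import random


def mask_patches(image, patch, ratio, rng):
    """Hide a random subset of patch x patch tiles.

    Returns a masked copy of the image (hidden pixels set to None)
    and the set of top-left coordinates of the hidden tiles.
    """
    h, w = len(image), len(image[0])
    tiles = [(r, c) for r in range(0, h, patch) for c in range(0, w, patch)]
    hidden = set(rng.sample(tiles, int(len(tiles) * ratio)))
    masked = [row[:] for row in image]  # copy so the original stays intact
    for r, c in hidden:
        for i in range(r, min(r + patch, h)):
            for j in range(c, min(c + patch, w)):
                masked[i][j] = None
    return masked, hidden


def masked_loss(prediction, target, masked):
    """Mean squared error computed over the hidden pixels only."""
    total, count = 0.0, 0
    for i, row in enumerate(masked):
        for j, value in enumerate(row):
            if value is None:
                total += (prediction[i][j] - target[i][j]) ** 2
                count += 1
    return total / count if count else 0.0


# Toy 4x4 grayscale "image"; hide half of its 2x2 tiles.
image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
masked, hidden = mask_patches(image, patch=2, ratio=0.5, rng=random.Random(0))
loss = masked_loss(image, image, masked)  # a perfect prediction scores 0.0
```

Restricting the loss to hidden pixels means the model gets no credit for copying what it can already see, so it has to infer the missing content from the visible context.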

Although the training dataset has not been disclosed, the model is freely available to download, modify, and redistribute under a permissive license from Apple’s GitHub repository.
