ByteDance Unveils AI That Creates Realistic Human Videos!

ByteDance, the parent company of TikTok, has unveiled details about its latest artificial intelligence (AI) framework, named OmniHuman. This video-generation system can produce highly realistic human videos featuring full-body movements and synchronized lip movements. According to the developers, the framework requires an image of a person and an accompanying motion signal, which can be video or audio, to produce its output. The company has also released several example videos demonstrating the capabilities and realism of the generated content. ByteDance has indicated that the AI model will be made publicly accessible.

OmniHuman Can Generate Realistic Human Videos

The OmniHuman framework is described in detail on its official website as an end-to-end solution developed using a multimodality motion conditioning training strategy. While the researchers did not disclose specific benchmark metrics, they assert that OmniHuman significantly outperforms current methods in the field.

With OmniHuman, users can generate videos by inputting an image of an individual alongside a motion signal, which may consist of just audio, just video, or a combination of both. The model can also create realistic videos from text prompts. The resulting videos feature full-body movements in sync with facial expressions and lip movements, matched to the supplied audio or music. Users can also generate videos in various aspect ratios.

OmniHuman output example
Photo Credit: OmniHuman

The inclusion of motion signals represents an innovative approach that the company refers to as omni-conditions training. This technique allows the AI model to integrate various modalities such as text, images, audio, and video, enabling the system to learn mixed conditioning that mitigates the limitations posed by the lack of high-quality training data.

Notably, OmniHuman was trained on an extensive dataset comprising 18,700 hours of human video footage. The training methodology is documented in a paper published on the arXiv preprint repository.

Demonstrations of videos produced by the OmniHuman model showcase highly realistic outcomes, exhibiting natural body language, gestures, and lip synchronization. While such lifelike quality has sparked discussions surrounding the implications of deepfakes, ByteDance has clarified that the AI model is not currently available for download, nor is there a service through which users can access its functionalities.
