Tencent has unveiled a new artificial intelligence (AI) model, HunyuanPortrait, capable of animating still portrait images. The model uses a diffusion architecture to create realistic animated videos from a reference image and a guiding video. Researchers involved in the project emphasized that HunyuanPortrait captures both facial data and spatial movements, ensuring precise synchronization with the reference image. Tencent has made the HunyuanPortrait AI model publicly available for download and local execution through well-known repositories.
Reviving Portraits: Tencent’s HunyuanPortrait
In a recent announcement on X (previously known as Twitter), Tencent Hunyuan's official account revealed that the HunyuanPortrait model is now available to the open community. Users can find the AI model in Tencent's repositories on GitHub and Hugging Face, and a pre-print paper detailing the model's capabilities is available on arXiv. It is important to note that while the AI model can be used for academic and research applications, it is restricted from commercial use.
The HunyuanPortrait model is designed to generate incredibly lifelike animated videos from a still portrait paired with a guiding video. It captures facial data and head movements from the video input and transfers them to the static portrait image. Tencent asserts that the model provides accurate synchronization of movements and replicates subtle changes in facial expressions.
Architecture of HunyuanPortrait
Photo Credit: Tencent
On the model's web page, Tencent researchers provided insights into the architecture of HunyuanPortrait. It is based on Stable Diffusion models combined with a condition control encoder. These pre-trained encoders separate motion data from identity information in the guiding video. The extracted motion data is then used as control signals and injected into the still portrait via a denoising UNet, which improves both spatial accuracy and temporal consistency in the results.
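The data flow described above can be sketched at a very high level: encode identity from the portrait once, encode motion from each guiding-video frame, and use both as conditioning signals during iterative denoising. The following is a minimal illustrative sketch, not Tencent's actual implementation; all function names, feature dimensions, and the toy denoising rule are assumptions for demonstration only.

```python
import numpy as np

FEAT_DIM = 64  # illustrative feature size, not the model's real dimension

def identity_encoder(portrait: np.ndarray) -> np.ndarray:
    """Stand-in for the pre-trained encoder that captures appearance/identity."""
    return portrait.reshape(-1)[:FEAT_DIM]

def motion_encoder(frame: np.ndarray) -> np.ndarray:
    """Stand-in for the encoder that extracts expression and head pose
    from one frame of the guiding video, decoupled from identity."""
    return frame.reshape(-1)[:FEAT_DIM]

def denoise_step(latent: np.ndarray, identity: np.ndarray,
                 motion: np.ndarray, t: int) -> np.ndarray:
    """Mock denoising UNet step: the latent is pulled toward a target
    built from the identity features, modulated by the motion signal."""
    guidance = identity + 0.5 * motion
    return latent + (guidance - latent) / (t + 1)

def animate(portrait: np.ndarray, guiding_frames: list, steps: int = 4) -> np.ndarray:
    """Produce one output latent per guiding-video frame."""
    ident = identity_encoder(portrait)          # identity encoded once
    outputs = []
    for frame in guiding_frames:
        motion = motion_encoder(frame)          # per-frame motion signal
        latent = np.random.default_rng(0).normal(size=FEAT_DIM)  # noise init
        for t in reversed(range(steps)):        # iterative denoising
            latent = denoise_step(latent, ident, motion, t)
        outputs.append(latent)
    return np.stack(outputs)

portrait = np.random.default_rng(1).random((16, 16))
frames = [np.random.default_rng(i).random((16, 16)) for i in range(3)]
video_latents = animate(portrait, frames)
print(video_latents.shape)  # (3, 64): one latent per guiding frame
```

Because identity is encoded once while motion is re-extracted per frame, the sketch mirrors the paper's stated goal: the subject's appearance stays locked to the reference portrait while expressions and head movement track the guiding video.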
Tencent claims that its AI model surpasses existing open-source alternatives in temporal consistency and controllability, although these assertions have yet to be verified by independent sources.
Such technology could play a significant role in the filmmaking and animation sectors. Traditionally, animators rely on manual keyframing of facial expressions or costly motion capture technologies to achieve realistic character movements. With innovations like HunyuanPortrait, animators can simply input character designs along with desired movements and expressions, streamlining the animation process. This breakthrough also presents the opportunity for smaller studios and independent creators to access high-quality animation tools.