On Wednesday, Meta Platforms unveiled its new open-source AI tool, AudioCraft, designed to enable users to generate music and audio through text prompts.
The tool comprises three models: AudioGen, EnCodec, and MusicGen. According to the company, these models cover a range of audio needs, including music creation, sound generation, and audio compression.
MusicGen has been specifically trained using music owned by the company and licensed music, Meta noted.
Artists and industry experts have raised concerns about copyright infringement, since machine learning software often learns and reproduces patterns from data scraped from the internet.
As detailed in the company’s blog post, MusicGen generates music from text prompts, while AudioGen generates other audio from text, such as environmental sounds and sound effects like dog barks or vehicle sirens. Meta has also released an enhanced version of the EnCodec decoder, which it says produces higher-quality music with fewer audio artifacts.
The models will be accessible to researchers and practitioners, who will be able to train them on their own datasets. Meta says the models, developed internally over the past several years, can deliver high-quality audio that remains consistent over long durations.
Meta envisions AudioCraft models serving as valuable resources for musicians and sound designers going forward. The company is also committed to improving the existing models and incorporating user feedback to enhance their functionality.
The launch follows Alphabet’s release earlier this year of MusicLM, its own experimental audio-generating AI tool.