
Nvidia Unveils Fugatto: The Future of AI Audio Creation!

Nvidia has unveiled Fugatto, a new artificial intelligence (AI) model designed to generate a wide range of audio and to mix different types of sound. Fugatto, short for Foundational Generative Audio Transformer Opus 1, is a foundation model that is said to offer more granular control over audio generation than existing platforms such as Beatoven and Suno.

Nvidia Unveils AI Audio Model Fugatto

In a recent blog post, Nvidia detailed the features of its latest generative AI model. According to the company, Fugatto can produce music snippets, modify existing tracks by adding or removing instruments, change the accent or emotion of a voice, and generate entirely new sounds.

The model accepts both text and audio inputs, so users can combine the two for more precise requests. Fugatto builds on Nvidia’s earlier work in speech modeling, audio vocoding, and audio understanding. The full model has 2.5 billion parameters and was trained on Nvidia DGX systems.

Nvidia emphasized the international team behind Fugatto, which included members from Brazil, China, India, Jordan, and South Korea. According to the company, this collaboration strengthened the model’s ability to handle multiple accents and languages.

Nvidia also noted that the model can generate sounds that go beyond what it encountered in training. For instance, the company highlighted that “Fugatto can make a trumpet bark or a saxophone meow. Whatever users can describe, the model can create.”

Fugatto also supports a technique called ComposableART, which lets users combine audio attributes and control how strongly each one is expressed. For example, a user can request the sound of a person speaking French while conveying sadness, then adjust the intensity of the emotion and the strength of the accent.
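
To make the idea of adjustable attribute strength concrete, here is a minimal Python sketch of weighted attribute blending, one common way generative models expose this kind of control. It is not Nvidia's implementation and Fugatto has no public API; the function `compose`, the attribute names, and the toy two-number "vectors" are all hypothetical, chosen only for illustration.

```python
def compose(baseline, attribute_vectors, weights):
    """Shift a baseline vector toward each attribute by its weight.

    Hypothetical illustration only: this is NOT Fugatto's API.
    A weight of 0.0 ignores an attribute; 1.0 applies it fully.
    """
    result = list(baseline)
    for name, vector in attribute_vectors.items():
        w = weights.get(name, 0.0)
        for i, (target, base) in enumerate(zip(vector, baseline)):
            result[i] += w * (target - base)
    return result


# Toy stand-ins for what a model might associate with each attribute.
baseline = [0.0, 0.0]
attributes = {"sadness": [1.0, 0.0], "french_accent": [0.0, 2.0]}

# Half-strength sadness combined with a full-strength accent.
mild = compose(baseline, attributes, {"sadness": 0.5, "french_accent": 1.0})
print(mild)
```

Changing a single weight, for instance raising `sadness` to 1.0, changes only that attribute's contribution, which is the kind of independent control the article describes.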

The foundation model also supports temporal interpolation, meaning it can produce sounds that evolve over time, such as a rainstorm with thunder that gradually fades away. Users can experiment with different soundscapes, and the model can create new audio from scratch even for sounds it has not encountered before.
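
As a rough illustration of a sound attribute changing over time, the Python sketch below computes a linearly fading intensity schedule, like thunder dying away during a storm. This is a generic interpolation example under stated assumptions, not a description of how Fugatto works internally; the function name `fade_schedule` and the five-step timeline are hypothetical.

```python
def fade_schedule(start, end, steps):
    """Linearly interpolate an intensity from start to end over `steps` points.

    Hypothetical helper for illustration; not part of any Nvidia API.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    return [start + (end - start) * i / (steps - 1) for i in range(steps)]


# Thunder starts at full intensity and fades to silence over five time steps.
thunder = fade_schedule(1.0, 0.0, 5)
print(thunder)  # [1.0, 0.75, 0.5, 0.25, 0.0]
```

Each value can be read as how prominent the thunder should be at that moment, while the rain stays constant underneath.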

Nvidia has not yet announced whether or when Fugatto will be made available to general users or businesses.
