AudioCraft Is Meta’s New AI Music Toy for Musicians
Meta is releasing AudioCraft, an open source toolbox of three AI models built to generate “high-quality, realistic audio and music from text.” According to Meta, the toolbox “works for music and sound generation and compression — all in the same place.”
Meta markets AudioCraft as a new kind of instrument for musicians and sound designers, comparable to the synthesizer. “Imagine a professional musician being able to explore new compositions without having to play a single note on an instrument,” Meta says about AudioCraft. “Or a small business owner adding a soundtrack to their latest video ad on Instagram with ease.”
AudioCraft consists of three models, each with a distinct role: MusicGen (for music generation), AudioGen (for sound effect generation), and EnCodec (an AI-based audio compressor).
MusicGen was trained on roughly 400,000 recordings along with their text descriptions and metadata, amounting to 20,000 hours of music owned by Meta or licensed specifically for this purpose.
AudioGen was trained on “public sound effects” and can generate natural, environmental, and imagined sound effects such as wind blowing, a cat meowing, footsteps, or a wooden floor creaking.
EnCodec, on the other hand, was “trained specifically to compress any kind of audio and reconstruct the original signal with high fidelity.”
AudioCraft is also “open source,” meaning that anyone can access, modify, and study its source code. The company explains that the goal is to help researchers and the music community as a whole learn about the new technology and “help advance the field of AI-generated audio and music.”
There are a few limitations. The company admits that the AudioCraft models were trained mainly on “Western-style music” and therefore lack diversity. The tools are also limited to text and metadata written in English.