Stability AI announces Stable Audio

Plus: Musk to Capitol Hill: AI poses 'civilizational threat'

Welcome to The Dispatch! We are the newsletter that keeps you informed about AI. Each weekday, we scour the web to aggregate the many stories related to artificial intelligence; we pass along the news, useful resources, tools or services, technical analysis and exciting developments in open source. Even if you aren’t an engineer, we’ll keep you in touch with what’s going on under the hood in AI.

Good morning. Today in AI:

  • Tech leaders from most major companies went to Capitol Hill to talk AI; two stories below

  • Stability AI plans to open source models based on Stable Audio, its new text-to-audio music generator

  • IBM, Nvidia, Adobe, and 5 others join President Biden’s self-policing AI initiative

  • A shocking look into the amount of water required to run ChatGPT

  • Salesforce introduces Einstein Copilot Studio for customizing EinsteinGPT

  • MLCommons has new benchmarks for AI hardware - Nvidia H100 and Intel Gaudi2 take top honors

  • A neural network that can smell the way humans do, a crash course/state of affairs in Retrieval Augmented Generation, a Reddit thread highlights how good AI video generation has become & more

Stable Audio is a latent diffusion model, much like Stable Diffusion

The Story: Stability AI, primarily known for its text-to-image generator Stable Diffusion, has launched Stable Audio, its first AI product for generating music. Stable Audio lets users generate music tracks up to 90 seconds long from text prompts describing the desired style, instruments, mood, etc. The model was trained on a large dataset of music and metadata provided by stock music company AudioSparx.

More Details:

  • Stable Audio generates music tracks in response to text prompts. Stability's example prompt is "Post-Rock, Guitars, Drum Kit, Bass, Strings, Euphoric, Up-Lifting, Moody, Flowing, Raw, Epic, Sentimental, 125 BPM". Users can also specify the desired length.

  • It uses a latent diffusion architecture that enables high-quality 44.1 kHz outputs, unlike most previous text-to-audio models with lower quality sound.

  • There is a free version that generates 20 second clips, and a paid "Pro" version for 90 second tracks downloadable for commercial projects.

  • At an unspecified date, Stability will open source models based on Stable Audio and training code for training your own audio generation models.
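For a sense of scale, here is a small Python sketch of what 44.1 kHz means for the clip lengths mentioned above. The function name is hypothetical, and the count is per audio channel because the channel count is not stated here. One commonly cited motivation for latent diffusion is that the model works on a compressed representation rather than on every one of these raw samples.

```python
# 44.1 kHz means 44,100 audio samples per second (per channel).
SAMPLE_RATE_HZ = 44_100

def samples_per_channel(duration_seconds, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return how many raw samples one channel of audio contains."""
    return duration_seconds * sample_rate_hz

free_clip = samples_per_channel(20)  # 20-second free-tier clip
pro_track = samples_per_channel(90)  # 90-second Pro track

print(f"20 s clip:  {free_clip:,} samples per channel")
print(f"90 s track: {pro_track:,} samples per channel")
```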

Takeaways: While the samples might not seem terribly impressive, this is currently state of the art in text-to-music along with Meta’s AudioCraft. These tools have already improved quite a bit from the earliest models and will continue to get better. While image generation has received much attention, AI-generated music will enable many new creative possibilities for musicians and creators.

There are, naturally, ethical and other concerns, including the copyright status of AI-generated music and how musicians are compensated. As with other generative AI, striking the right balance between enabling creation and protecting rights will be an ongoing challenge. But overall, Stable Audio points to the rapid evolution of creative AI and the new frontiers it is opening up beyond language models and image generation.

Tech leaders including Elon Musk, Mark Zuckerberg, and Sam Altman met privately with senators on Wednesday for a forum on regulating artificial intelligence. During the 7-hour closed-door session, Musk warned of the "civilizational risk" posed by AI, while others advocated for various reforms and new standards at the National Institute of Standards and Technology.

The event sparked skepticism from some in Congress who argued the public should not be shut out of debates on critical technology issues as tech leaders lobby behind closed doors.

Additionally, some Senators refused to attend on principle. "I think the idea that it is some great breakthrough to hear from the biggest monopolists in the world - and that they are going to share with us their great wisdom - I just think the whole framework is wrong," said Sen. Josh Hawley, R-Mo.

"You got to take it with a grain of salt. You got to realize that they're interested parties, right? They stand to make a lot of money on this, which is fine," he continued, "but you got to know that I just think the whole framing that 'Oh, aren't we so graced by their presence?' - I mean, give me a break. These people are - they've done bad things for our country."

But they are asking to be regulated. "This is sort of an important and urgent and in some ways unprecedented moment," Altman told reporters. "And I think we really need the government to lead."

That’s what most Americans want too. And in related news, on Tuesday Sen. Pete Ricketts, R-Neb., introduced legislation requiring watermarks on all AI-generated content, including enforcement rules. The bill tasks four bodies with setting the guidelines: the Department of Homeland Security, the Department of Justice, the Federal Communications Commission and the Federal Trade Commission.

After months in beta, Adobe has officially unveiled its major AI updates. All Creative Cloud plans now include generative AI features like image generation and editing in Photoshop, Illustrator, and Adobe Express. Adobe is also pioneering a credit model where plans get a monthly allotment of credits for fast image generation. After that allotment is used up, users can generate at slower speeds or buy more credits.

“We’re committed to delivering the best creative tools, services and value to our members so you can keep pushing the bounds of creativity. To get started with all of the new value in Creative Cloud, you can update your apps to get access to all the new AI-powered features in Photoshop, Illustrator, Premiere Pro, and After Effects, or visit the new Adobe Firefly web app to start creating content with generative AI or Adobe Express to create content and use the new Text to Image and Text Effects features.”

Adobe is also increasing Creative Cloud subscription prices starting November 1st, citing expanded features and AI costs. Prices for students will not change.

From our sponsors:

Staying informed about the world doesn’t have to be boring.

International Intrigue is a free global affairs briefing created by former diplomats to help the next generation of leaders better understand how geopolitics, business and technology intersect. They deliver the most important geopolitical news and analysis in a <5-minute daily briefing that you’ll actually look forward to reading.

-

Artificial intelligence benchmarking group MLCommons released new results for tests that measure the speed of top hardware systems running AI models. In tests for an LLM that summarized news articles, Nvidia’s H100 was unsurprisingly the top performer. Intel’s Gaudi2 came in second. Other participants included Google, Qualcomm, Oracle, Dell, and Azure. Full results for Google’s TPU were not disclosed since it was a preview submission.

The benchmark from MLCommons simulates "inference," the stage where AI models generate predictions or results based on data. Nvidia and Intel declined to disclose the exact costs of their systems, though Intel claimed Gaudi2 is cheaper than even Nvidia's previous generation hardware. As AI expands, benchmarks like these from MLCommons will help guide development and purchasing decisions around AI hardware.

Arcus is innovating on implementing Retrieval Augmented Generation (RAG) at ‘planet-scale’. RAG combines information retrieval with large language models to provide relevant contextual information for more accurate responses. However, traditional RAG approaches face challenges with incomplete content representation and inaccurate similarity search at scale.

Here is an example of how RAG provides relevant contextual information to a large language model for a more accurate response:

Prompt: "What is the population of Paris?"

Without RAG: The language model may try to guess or fabricate an answer, since it does not have up-to-date factual information encoded. It may say something inaccurate like "The population of Paris is 5 million people."

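With RAG: The information retrieval component searches through an external knowledge source and retrieves the fact "The population of Paris is 2,140,526 as of January 2022." That fact is added to the prompt as context, so the language model can answer from the retrieved figure instead of guessing.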
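The retrieve-then-answer flow in the Paris example can be sketched in a few lines of Python. This is a toy illustration, not Arcus's method: the knowledge base, the function names, and the word-overlap scoring are all assumptions made for the example. Production systems typically use embedding-based similarity search, and the final prompt would be sent to a language model, a step left out here.

```python
import re

# Hypothetical knowledge source; the Paris figure is the one quoted above.
KNOWLEDGE_BASE = [
    "The population of Paris is 2,140,526 as of January 2022.",
    "Stable Audio generates music tracks from text prompts.",
    "MLCommons publishes benchmarks for AI hardware.",
]

def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, top_k=1):
    """Rank documents by word overlap with the query.

    Word overlap is a simple stand-in (an assumption for this sketch)
    for the similarity search a real retriever would perform.
    """
    query_tokens = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(query, documents):
    """Prepend the retrieved context to the user's question."""
    context = "\n".join(retrieve(query, documents))
    return (
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer using only the context above."
    )

prompt = build_prompt("What is the population of Paris?", KNOWLEDGE_BASE)
print(prompt)
```

The model never needs the population figure in its training data; it only has to read the figure from the context it is handed.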

Trending AI Tools & Services:

  • storly.ai: tell your stories with the help of AI interview prompts.

  • OpenArt: new AI image generator

  • JS2TS: JavaScript to TypeScript with ChatGPT

  • Healsens: discover health risks and insights with AI

  • Pocket Hansei: personal assistant that offers instant answers that connect to trusted, well-sourced topics such as renowned books, research papers, etc.

  • Glide AI: set of native building blocks that make it easy to create AI-powered apps

Social media/video/podcast:

  • The point of LangChain — with Harrison Chase of LangChain [Podcast]

  • (Discussion) I created this video entirely on my cell phone, in just a few minutes, using the new HeyGen video translator tool. I don't think people realize how quickly and profoundly our world is changing because of AI. [Reddit]

  • A neural network can smell like humans do for the first time! [X]

  • How people are using Open Interpreter, the hottest AI project on GitHub [YouTube]

Did you know? 

The field of biocomputing aims to build computers using actual biological neurons instead of the artificial neural networks that run on conventional hardware. This could allow computers to learn and think more like human brains while using far less energy. Companies like FinalSpark are working to develop biocomputers by growing neurons and training them to perform basic tasks, though the technology is still in early research stages. Proponents believe biocomputing could revolutionize AI by enabling more human-like reasoning and creativity while reducing the carbon footprint of AI models that require massive computing power. However, many challenges remain before biocomputers advance beyond simple demonstrations.

Most people are not aware of the resource usage underlying ChatGPT. If you’re not aware of the resource usage, then there’s no way that we can help conserve the resources.

Shaolei Ren, Researcher for UC Riverside, September 2023