
The week in AI: 'Canvas' now available for ChatGPT & the side project from Google that's quickly becoming an AI app favorite

Plus: Microsoft rolls out major Copilot updates that highlight precisely why they acquired Inflection AI


Welcome to The Dispatch! We are the newsletter that keeps you informed about AI. Each Thursday, we aggregate the major developments in artificial intelligence - we pass along the news, useful resources, tools and services; we highlight the top research in the field as well as exciting developments in open source. Even if you aren’t a machine learning engineer, we’ll keep you in touch with the most important developments in AI.

NEWS & OPINION

-------------------------

Google's NotebookLM is quickly evolving into a powerful and unique tool for information synthesis, blending AI smarts with a host of user-friendly features. Billed very simply by Google as “a tool for understanding”, it is effectively an end-user-customizable retrieval-augmented generation (RAG) product. NotebookLM has you gather your “sources” for whatever you might be studying or analyzing (documents like PDFs, pasted text, links to web pages, YouTube videos, audio files like podcasts - just about anything can be dropped in) into a single interface, where you can then interact with all of your accumulated data/sources simultaneously.

And you can do that in a number of ways. You can ask general questions through chat, have it output general summaries (or more precise ones based on your instructions), find discrepancies between your sources, and so on. The feature that seems to have people most excited so far, however, is “Audio Overview”. It generates roughly 10-minute AI-hosted podcasts about your uploaded content. These summaries are surprisingly natural, with two AI voices engaging in a structured yet conversational exploration of key points. The system even adds in the umms, ahhs, and likes that make human speech sound authentic. A new sharing feature in the latest update lets users easily distribute these AI podcasts via a public link - so if you’d rather listen to a podcast about this newsletter than read the rest of it, well… here you go.

Some other useful/interesting links about what’s going on with NotebookLM:

  • The New York Times Hard Fork podcast just interviewed NotebookLM’s editorial director Steven Johnson about what the system can do and some details of how it works

  • A Reddit thread went viral when someone figured out how to ‘trick’ the AI podcast hosts into discovering they were AI, not human, sending them into an existential crisis. It’s worth a listen - the hosts’ system prompt apparently instructs them to act human at all costs.

  • A TikTok went viral (Google responded on the TikTok) as young students are starting to discover the tool and how it can help them study

You can use NotebookLM to break down complex papers, create comprehensive study guides, and extract insights from long-form content. You could link it to websites for multiple insurance policies and ask for advice based on your needs. And although “Audio Overview” is still a new feature and not very customizable, it’s easy to imagine how powerful this kind of technology will be for learners who absorb spoken content better than written text. We hope you get a chance to test it out.
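For the curious, the core retrieval-augmented generation loop a product like NotebookLM is built around can be sketched in a few lines: score each source against the question, keep the best matches, and stuff them into the prompt as context. Here is a toy, hypothetical sketch - bag-of-words similarity stands in for a real embedding model, and NotebookLM’s actual pipeline is not public:

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would use a neural embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, sources: dict[str, str], k: int = 2) -> list[str]:
    # Rank sources by similarity to the question; keep the top k.
    q = embed(question)
    ranked = sorted(sources, key=lambda name: cosine(q, embed(sources[name])), reverse=True)
    return ranked[:k]

def build_prompt(question: str, sources: dict[str, str]) -> str:
    # Stuff the retrieved sources into the context, then ask the question.
    picked = retrieve(question, sources)
    context = "\n\n".join(f"[{name}]\n{sources[name]}" for name in picked)
    return f"Answer using only these sources:\n\n{context}\n\nQuestion: {question}"

sources = {
    "policy_a.pdf": "Policy A covers flood damage and theft with a $500 deductible.",
    "policy_b.pdf": "Policy B covers fire damage only, with no deductible.",
    "recipe.txt": "Whisk the eggs, then fold in the flour.",
}
print(retrieve("Which policy covers flood damage?", sources, k=1))
```

The insurance-policy example from above maps directly onto this pattern: each policy page is a source, and the model only ever sees the retrieved slices.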

Drama, funding, Dev Day, and ‘Canvas’ is announced for ChatGPT - all OpenAI, all the time

-------------------------

Or at least that’s how it seems in the news, sometimes. For better or worse, the frontier AI company has managed to be all over the headlines in the last week. Here’s a quick recap of the latest drama first:

  • CTO Mira Murati quit the company after six and a half years with OpenAI. She spearheaded ChatGPT’s development and also that of Codex, which is the engine behind GitHub Copilot. Research chief Bob McGrew and Barret Zoph, a research vice president, left the company with her. The icing on the cake? Days later, co-founder Durk Kingma announced he was also leaving - to join rival Anthropic. For anyone keeping track, of the 11 original OpenAI co-founders, only CEO Sam Altman and computer scientist Wojciech Zaremba remain.

  • None of those leaving the company stated as much, but given the timing it’s suspected their departures are directly related to the company announcing it was restructuring as a for-profit company, removing nonprofit control and giving Altman equity (something he previously said he had no interest in). There’s no word on what role the nonprofit arm of OpenAI will play going forward now that it no longer holds governance control.

  • The company then closed a long-awaited funding round, announcing they’d raised $6.6 billion at a $157 billion post-money valuation. Thrive Capital led the funding round, along with SoftBank, Nvidia and Microsoft. Apple did not invest.

  • OpenAI asked investors to avoid backing rivals like Anthropic and Elon Musk’s xAI. Musk promptly responded in Musk fashion, stating on X: “OpenAI is evil”.

Now that you’re up to speed on all that fun stuff, there are some interesting things going on with the company’s actual products. OpenAI introduced Canvas for ChatGPT today, a new interface for working on writing and coding projects. Canvas opens in a separate window (à la Artifacts in Anthropic’s Claude).

The default ChatGPT interface is a bit limiting, especially for projects where you want revisions or editing. Going back and forth and comparing changes is not easy, so that’s where Canvas steps in. You can directly edit text or code in the Canvas. You can also highlight specific sections to indicate exactly what you want ChatGPT to focus on, while it gives inline feedback and suggestions with the entire project in mind.

After some very limited testing, both of the above features are welcome improvements. Working in Canvas feels much more like having a copilot on a single project than trying to solicit output after output to get your project where you want it. ChatGPT Plus and Team users can use it now by selecting the Canvas model from the dropdown menu.

OpenAI also held their 2024 DevDay event. It was much more subdued than last year, but here are the major announcements for devs:

  • Realtime API: Allows developers to build low-latency, multimodal (speech-to-speech) experiences in their apps. Third-party developers have been building voice interfaces on top of OpenAI models for well over a year now, but those solutions typically involved chaining multiple software layers to handle speech-to-text and text-to-speech conversions. Under the hood, the Realtime API lets you open a persistent WebSocket connection to exchange speech-to-speech messages with GPT-4o.
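Over that socket, a client sends JSON events and streams server events back. A hypothetical sketch of the flow - the endpoint, header, event names, and model name follow OpenAI’s announced beta docs but may change; the third-party `websockets` package is assumed, and no connection is actually opened here:

```python
import json

# Assumed beta endpoint; check OpenAI's Realtime API docs for the current URL.
REALTIME_URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"

def user_audio_event(b64_audio: str) -> str:
    # Append a chunk of base64-encoded audio to the input buffer.
    return json.dumps({"type": "input_audio_buffer.append", "audio": b64_audio})

def response_event() -> str:
    # Ask the model to respond with both audio and a text transcript.
    return json.dumps({"type": "response.create",
                       "response": {"modalities": ["audio", "text"]}})

async def talk(api_key: str) -> None:
    # Connection sketch only - requires `pip install websockets`; not run here.
    import websockets
    headers = {"Authorization": f"Bearer {api_key}", "OpenAI-Beta": "realtime=v1"}
    async with websockets.connect(REALTIME_URL, extra_headers=headers) as ws:
        await ws.send(response_event())
        async for message in ws:
            print(json.loads(message)["type"])  # server events stream back
```

The point of the persistent connection is that audio chunks and model responses interleave on one socket, rather than round-tripping through separate transcription and synthesis services.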

  • Model distillation in the API: Streamlines the process of fine-tuning smaller, cost-efficient models using outputs from more advanced models like GPT-4o and o1-preview. The system allows devs to easily capture real-world examples, create custom evaluations, and iteratively fine-tune models, all within the OpenAI platform.
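In practice, the capture step is just flagging ordinary production calls for storage. A hypothetical sketch of such a request body - `store` and `metadata` are the fields OpenAI describes for the distillation workflow; no API call is made here:

```python
def distillation_request(prompt: str) -> dict:
    # Request body for a teacher-model call whose input/output pair is stored
    # server-side for later evaluation and fine-tuning of a smaller student model.
    return {
        "model": "gpt-4o",  # teacher model
        "messages": [{"role": "user", "content": prompt}],
        "store": True,  # persist the completion for the distillation workflow
        "metadata": {"use_case": "distill-demo"},  # tag for filtering stored examples
    }

req = distillation_request("Summarize RAG in one sentence.")
```

The stored completions can then be filtered by metadata, turned into an evaluation set, and used as training data for a model like GPT-4o mini.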

  • Prompt caching: Reduces costs by nearly 50% across models and speeds up responses by up to 80% when reusing recent input tokens in API calls. Especially valuable to devs reusing the same context/code repeatedly.
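Caching matches on exact prompt prefixes, so the practical move is to keep the large static context first and byte-identical across calls, varying only the tail. A hypothetical sketch of that structuring - the helper and variable names are ours:

```python
# Prompt caching matches on exact prompt prefixes, so keep the large static
# part first and byte-identical across requests; vary only the tail.
STATIC_CONTEXT = (
    "You are a code-review assistant.\n"
    "Project source:\n" + "def add(a, b): return a + b\n" * 200  # big shared context
)

def build_messages(user_question: str) -> list[dict]:
    return [
        {"role": "system", "content": STATIC_CONTEXT},  # cacheable prefix
        {"role": "user", "content": user_question},     # varying suffix
    ]

m1 = build_messages("Is add() safe for floats?")
m2 = build_messages("Rename add() to plus().")
# The shared prefix is identical across calls, so repeat requests can hit the cache.
assert m1[0]["content"] == m2[0]["content"]
```

Putting the variable part first would break the prefix match and forfeit the discount.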

  • New vision fine-tuning: Now models can be fine-tuned with both images and text, allowing developers to optimize tasks like image recognition and analysis. You can improve the performance of GPT-4o for vision tasks with as few as 100 images, and drive even higher performance with larger volumes of text and image data.
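Training data for vision fine-tuning uses the same chat-style JSONL as text fine-tuning, with images referenced inside the message content. A hypothetical single training example - the structure follows OpenAI’s documented chat format, and the URL is a placeholder:

```python
import json

example = {
    "messages": [
        {"role": "user", "content": [
            {"type": "text", "text": "What road sign is this?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/sign.jpg"}},  # placeholder URL
        ]},
        {"role": "assistant", "content": "A stop sign."},
    ]
}
# Each line of the uploaded .jsonl training file is one such example.
line = json.dumps(example)
```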

MORE IN AI THIS WEEK

Writer RAG tool: build production-ready RAG apps in minutes

RAG in just a few lines of code? We’ve launched a predefined RAG tool on our developer platform, making it easy to bring your data into a Knowledge Graph and interact with it using AI. With a single API call, Writer LLMs will intelligently call the RAG tool to chat with your data.

Integrated into Writer’s full-stack platform, it eliminates the need for complex vendor RAG setups, making it quick to build scalable, highly accurate AI workflows just by passing a graph ID of your data as a parameter to your RAG tool.

TRENDING AI TOOLS, APPS & SERVICES

  • Notion: updated AI to search and chat with all documents across Notion, Slack, Google Drive, PDFs, etc. simultaneously

  • Pika 1.5: new video generation model is live

  • Inbox Zero: an open-source, AI personal assistant for email

  • GoEnhance AI: AI-powered platform offering video and image transformations, including style changes, face swapping, and animation.

  • OpenMusic: a next-gen open source diffusion model designed to generate music audio from text descriptions

  • Neolocus: efficient and photorealistic interior and room design

  • Helicone: LLM-observability for developers - open-source platform for logging, monitoring, and debugging

  • Epsilla: all-in-one platform to develop and deploy AI agents powered by large language models and vector search technologies

VIDEOS, SOCIAL MEDIA & PODCASTS

  • Exclusive: The Verge tried Meta's AR glasses with Mark Zuckerberg [YouTube]

  • 10 ways to use NotebookLM, in less than 10 minutes [YouTube]

  • 10 wild examples of ChatGPT’s new enhanced voice mode [X]

  • Pika 1.5 can generate some pretty insane videos [X]

  • Open source AI platform Hugging Face has reached 1 million free public AI models, and hosts almost as many private, business-use-only models [X]

  • Machine Learning Street Talk - Ben Goertzel on “Superintelligence” [Podcast]

  • OpenAI’s Hunter Lightman says the new o1 AI model is already acting like a software engineer and authoring pull requests; Noam Brown says everyone will know AGI has been achieved internally when OpenAI takes down all its job listings [Reddit]

TECHNICAL NEWS, DEVELOPMENT, RESEARCH & OPEN SOURCE

  • Nvidia’s “open” NVLM competes with GPT-4o, but open-source it is not

  • Google DeepMind: How AlphaChip transformed computer chip design

  • Emu3: a new suite of state-of-the-art multimodal models trained solely with next-token prediction

  • Apple research on MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning

  • GPU MODE IRL 2024 Keynotes (Karpathy talks about moving from PyTorch to bare-metal C, stripping down language model training to its core)

  • Anthropic reduces RAG retrieval failure rates by up to 67% with a simple method called "Contextual Retrieval"

  • Beyond transformers, and beyond Mamba: Liquid Foundation Models (LFMs) – a new generation of generative AI models that achieve state-of-the-art performance at every scale, while maintaining a smaller memory footprint and more efficient inference

That’s all for this week! We’ll see you next Thursday.