WeeklyDispatch.AI
The week in AI: 'Project Strawberry' is out today - ChatGPT's new reasoning model for solving hard problems
Plus: The man who scammed his way to $10m with AI-generated music
Welcome to The Dispatch! We are the newsletter that keeps you informed about AI. Each Thursday, we aggregate the major developments in artificial intelligence - we pass along the news, useful resources, tools and services; we highlight the top research in the field as well as exciting developments in open source. Even if you aren’t a machine learning engineer, we’ll keep you in touch with the most important developments in AI.
NEWS & OPINION
-------------------------
Just two days ago, The Information was reporting that OpenAI’s much anticipated advanced reasoning model, Project Strawberry, would be out within two weeks. Well, it’s out today - and it’s called… OpenAI o1-preview. It doesn’t exactly roll off the tongue, but as OpenAI explains: “As an early model, it doesn't yet have many of the features that make ChatGPT useful, like browsing the web for information and uploading files and images.”
Here’s what you should know:
Advanced reasoning: If OpenAI o1-preview could be summed up in two words, it would be those. The AI now “thinks through” both your prompt and its own response before offering one. It also offers a drop-down menu to walk you through its thinking process along with the final output - a nice touch.
How effective is it? Just a few weeks ago, we were seeing articles that highlighted just how bizarrely wrong (tongue in cheek with the Strawberry shot) the best models would often get simple math or logic-related questions. o1-preview does not have this issue. According to OpenAI, it can now beat human PhD experts in solving extremely hard physics problems, and it scored 83% on a qualifying exam for the International Mathematics Olympiad (IMO). GPT-4o scored 13%.
It’s still powered by GPT-4o: It doesn’t write better, or more creatively. It can still hallucinate. o1-preview is more of a “procedural optimization” for GPT-4o around complex planning, problem solving or other tasks that GPT-4o struggles with. And it (mostly) excels at these.
It’s out now for ChatGPT Plus and Team users: … with extremely limited usage rates: “Both o1-preview and o1-mini can be selected manually in the model picker, and at launch, weekly rate limits will be 30 messages for o1-preview and 50 for o1-mini. We are working to increase those rates and enable ChatGPT to automatically choose the right model for a given prompt.”
Alongside o1-preview, OpenAI introduced o1-mini - a faster, more cost-effective model optimized for coding tasks. They’re planning to make o1-mini available to free ChatGPT users, but no word yet on when that might happen.
What does the upgrade signify, ultimately? We think Ethan Mollick sums it up nicely in his One Useful Thing blog post, with an accompanying video of o1 going to work on a coding task:
“Planning is a form of agency, where the AI arrives at conclusions about how to solve a problem on its own, without our help. You can see from the video above that the AI does so much thinking and heavy lifting, churning out complete results, that my role as a human partner feels diminished. It just does its thing and hands me an answer. Sure, I can sift through its pages of reasoning to spot mistakes, but I no longer feel as connected to the AI output, or that I am playing as large a role in shaping where the solution is going. This isn’t necessarily bad, but it is different.”
-------------------------
A North Carolina ‘musician’ has been arrested for orchestrating a sophisticated $10 million streaming scheme. Michael Smith allegedly used artificial intelligence to generate hundreds of thousands of fake songs by nonexistent bands, then employed bots to stream these tracks billions of times on platforms like Spotify, Apple Music, and Amazon Music. This elaborate deception - which ran for seven years - involved creating thousands of fake streaming accounts and even outsourcing some of the work to paid co-conspirators.
Smith was earning roughly $110,000 per month by 2019. He bragged in a February 2024 email that he had reached over 4 billion streams and $12m in royalties.
Spotify, Apple Music and other digital streaming platforms operate on a pro-rata or stream-share model - meaning they collect a fixed amount of revenue from subscriptions and ads, and then allocate a percentage (about 70%) of this pool to rights holders. The ‘top comment’ from the linked NYT article reads: “I may be missing something here, but I fail to see what would be illegal about Mr. Smith's ingenious scam.”
Smith's scheme is illegal because it fraudulently diverted royalty payments from legitimate artists, effectively stealing from the fixed revenue pool meant for real musicians. That violates not only platform terms, but also federal law. Smith faces up to 20 years in prison for each charge.
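To see why bot streams take money from real artists, here is a minimal sketch of a pro-rata split. The revenue figure, the stream counts, and the names `pro_rata_payouts`, `artist_a`, `artist_b` and `bot_farm` are all made up for illustration; only the roughly 70% payout share comes from the description above. Real platforms use more involved accounting, so treat this as a toy model rather than how any service actually calculates royalties.

```python
def pro_rata_payouts(revenue, streams_by_holder, payout_share=0.70):
    """Split a fixed royalty pool among rights holders in proportion to streams.

    payout_share is the rough 70% figure mentioned above; everything else
    passed in here is hypothetical.
    """
    pool = revenue * payout_share
    total_streams = sum(streams_by_holder.values())
    return {
        holder: pool * streams / total_streams
        for holder, streams in streams_by_holder.items()
    }


# Hypothetical month: the platform's revenue is the same in both scenarios.
revenue = 1_000_000

# Scenario 1: only real listeners.
honest = {"artist_a": 600_000, "artist_b": 400_000}

# Scenario 2: the same real listening, plus bot-generated streams.
with_bots = {**honest, "bot_farm": 1_000_000}

before = pro_rata_payouts(revenue, honest)
after = pro_rata_payouts(revenue, with_bots)

for holder in with_bots:
    print(holder, round(before.get(holder, 0.0)), "->", round(after[holder]))
```

The pool is the same size in both scenarios, so every dollar paid out for the bot streams is a dollar that would otherwise have gone to the two real artists.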
Also worth noting - Smith’s songs are AI-generated nonsense. Modern text-to-music AI tools aren’t great, but they’re getting better. Fast. Streaming platforms and regulators are going to have their hands full.
-------------------------
Apple's iPhone 16 reveal showcased a leap forward in the iPhone’s AI integration, introducing Apple Intelligence as a core feature across its ecosystem. The new suite of AI-powered tools for the iPhone is designed to enhance user productivity, creativity, and device interaction. While some features will be available at launch, others are scheduled for release in the following months, indicating a phased rollout of Apple's AI capabilities. Pre-orders start tomorrow, and the iPhone 16 will be available starting September 20th.
The Apple Intelligence suite encompasses a wide range of functionalities, including advanced writing tools, enhanced photo and video management, intelligent notification prioritization, and a revamped Siri experience. These features leverage on-device processing and Apple's new Private Cloud Compute to ensure user privacy. Apple is also integrating ChatGPT across a wide swathe of these updates, and Siri can optionally be powered by OpenAI’s powerful flagship model.
The full spectrum of Apple Intelligence features won't be immediately available. The initial release will include core functionalities like Writing Tools and some improved Siri interactions, while more advanced features such as Visual Intelligence and custom image generation are slated for later releases. The gradual rollout strategy will allow Apple to refine its AI offerings and ensure smooth integration into its ecosystem.
MORE IN AI THIS WEEK
OpenAI fundraising set to vault startup’s valuation to $150 billion
California legislature approves SB 1047, a bill with “sweeping” AI regulations
Human drivers are to blame for most serious Waymo collisions
TIME magazine’s 2024 list of the 100 most influential people in AI
Roblox announces AI tool for generating 3D game worlds from text
X permanently stops Grok AI from using EU citizens’ tweets after court action by Irish data watchdog
China’s AI models behind their U.S. counterparts by 6 to 9 months, says former head of Google China
BP extends use of AI in five-year deal with spy tech firm Palantir
MI6 and CIA using generative AI to combat tech-driven threat actors
Cops lure pedophiles with AI pics of teen girl. Ethical triumph or new disaster?
Indeed uses OpenAI to deliver contextual job matching to millions of job seekers
TRENDING AI TOOLS, APPS & SERVICES
Adobe Firefly: generative text-to-video AI coming soon
Google’s NotebookLM: AI app to work with large documents, can now generate expressive Audio Overviews for any text material you put in
Hoop: AI task management for busy professionals
Bricks: an AI-powered tool that generates reports, visuals, and presentations from your data
Earkick: AI self-care for anxiety relief, mood, habits, with memory
Conch Video: generates 720p short video clips from text prompts
Shortwave: new e-mail assistant with multi-step reasoning, smart search and deep integrations
Trupeer: turn short screen recordings into polished product videos and detailed guides
GUIDES, LISTS, PRODUCTS, UPDATES, INFORMATIVE
Salesforce announces Industries AI: a set of foundational, pre-built, and customizable AI capabilities that tackle industry-specific needs and challenges
Google tests its ‘Ask Photos’ AI assistant that understands what’s in your pictures, also launches Audio Overview feature that can turn documents, slides, charts and more into engaging discussions with one click
Adobe previews its upcoming text-to-video generative AI tools
Audible to start generating AI voice replicas of select audiobook narrators
AI image statistics in 2024: people are generating an average of 34 million images per day
Canva says its AI features are worth the 300 percent price increase
VIDEOS, SOCIAL MEDIA & PODCASTS
From OpenAI: coding with OpenAI o1 [YouTube]
From Anthropic: a deep-dive on prompt engineering [YouTube]
Pixtral-12B - Mistral’s first multi-modal VLLM is here [YouTube]
Elon Musk says Tesla has ‘no need’ to license xAI models [X]
Building OpenAI o1 [X]
Oracle to deploy a supercluster of ~130,000 NVIDIA Blackwell GPUs, alludes to a “gigawatt” capacity data center that will be powered by 3 nuclear reactors [Reddit]
Talking with Andrej Karpathy (OpenAI/Tesla) about the near and far future of AI [Podcast]
TECHNICAL, DEVELOPMENT, RESEARCH & OPEN SOURCE
Replit’s Agent: AI-powered tool designed to assist users in building software projects; it can understand natural language prompts and help create applications from scratch
Anthropic announces Workspaces in the Anthropic API Console
Google DeepMind’s AlphaProteo: generates novel proteins for accelerating drug design
French startup Mistral goes multimodal with Pixtral 12B
DeepSeek-V2.5: SOTA true open-source LLM; beating GPT-4 Turbo on multiple benchmarks
Can LLMs generate novel research ideas? A large-scale human study with 100+ NLP researchers (study found that AI-generated NLP research ideas were rated more novel than human expert ideas [but slightly less feasible] in large-scale evaluation)
Harrison.rad.1: the latest frontier in radiology-specific foundational models
That’s all for this week! We’ll see you next Thursday.