The week in AI: "The king is dead" - GPT-4 surpassed for the first time on crowdsourced AI leaderboard
Plus: Databricks' new state of the art open LLM, DBRX
Welcome to The Dispatch! We are the newsletter that keeps you informed about AI. Each Thursday, we aggregate the major developments in artificial intelligence; we pass along the news, useful resources, tools and services, and highlight the top research in the field as well as exciting developments in open source. Even if you aren’t an engineer, we’ll keep you in touch with what’s going on in AI.
NEWS & OPINION
-------------------------
Popular crowdsourced AI leaderboard Chatbot Arena has a new #1 for the first time since OpenAI’s GPT-4 was added to the leaderboard in May 2023. Anthropic’s newest and most capable model, Claude 3 Opus, recently gained the top spot and currently sits two points ahead of GPT-4 Turbo.
Unlike other forms of benchmarking for AI models, the LMSYS Chatbot Arena relies on human votes, with people blind-ranking the output of two different models to the same prompt.
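For the curious, here is a minimal sketch of how blind pairwise votes can be turned into a leaderboard score. It assumes a standard Elo-style update. The starting rating of 1000, the K-factor of 32, and the model names are illustrative assumptions, not LMSYS's actual settings or code.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, outcome_a, k=32.0):
    """Return new (rating_a, rating_b) after one vote.

    outcome_a is 1.0 if voters preferred A, 0.0 if they preferred B,
    and 0.5 for a tie.
    """
    delta = k * (outcome_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


def rate_from_votes(votes, initial=1000.0, k=32.0):
    """Replay (model_a, model_b, outcome_a) votes in order and return ratings."""
    ratings = {}
    for model_a, model_b, outcome_a in votes:
        ra = ratings.setdefault(model_a, initial)
        rb = ratings.setdefault(model_b, initial)
        ratings[model_a], ratings[model_b] = update_elo(ra, rb, outcome_a, k)
    return ratings


# Hypothetical votes: each tuple is one blind head-to-head comparison.
votes = [
    ("model_x", "model_y", 1.0),
    ("model_y", "model_z", 0.5),
    ("model_x", "model_z", 1.0),
]
ranking = sorted(rate_from_votes(votes).items(), key=lambda kv: kv[1], reverse=True)
for name, rating in ranking:
    print(f"{name}: {rating:.1f}")
```

Because each vote moves the two ratings by equal and opposite amounts, a small gap at the top of such a leaderboard can flip as more votes come in.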
It is worth noting that the displaced GPT-4 Turbo model has been out since November, with a “markedly different” GPT-5 expected at some point this year. It’s still an impressive achievement, and Claude 3 has been lauded in various internet circles as the “most human AI yet”. We’ve been using Claude extensively since the release of Claude 2 and highly recommend it as a potential ChatGPT alternative. So does Amazon, which just made good on its planned $4b investment in Anthropic by paying out the remaining $2.75b yesterday.
-------------------------
Resignations and poaching continue to rock the AI startup world. Last week, Inflection AI was assimilated into Microsoft; now Stability AI (maker of the popular open source image generator Stable Diffusion) appears to be on the verge of crumbling. CEO Emad Mostaque has stepped down from his role with the company, claiming a desire to pursue decentralized AI. His resignation comes on the heels of increased pressure from investors to commercialize and an exodus of several key AI developers.
Stability has appointed interim co-CEOs from within the company while it searches for a permanent CEO. Beyond the leadership changes, the company faces financial uncertainty following Mostaque’s departure. Stability had explored selling itself as investors pressured management over its financial position. The startup presented itself as an acquisition target in the fall and held early-stage deal conversations with multiple companies. Those did not pan out.
Mostaque, a larger-than-life presence in the AI world, co-founded the startup in 2019 after various roles at hedge funds and crypto projects. In the following years, he attracted considerable controversy as CEO, speaking about the company or his own track record in a manner that could stretch belief. Multiple people interviewed by Bloomberg News last year said the CEO told them that he once worked as a secret agent for the UK government.
From their official statement on the resignation, it appears Stability will be trudging forward despite the turmoil: “[W]e are committed to preserving the exceptional team, cutting-edge technology, and vibrant community that’s been cultivated over the years, ensuring Stability AI remains a leader in open multi-modal generative AI.”
-------------------------
For the past year, a political fight has been raging around the world, mostly in the shadows, over how (and whether) to control AI.
During the UK’s global summit on AI back in November, a host of global leaders, academics, and tech executives convened to discuss the burgeoning challenges and opportunities presented by artificial intelligence. This elite gathering aimed to chart a path forward for the safe and equitable development of AI technologies, amidst growing concerns over their potential to disrupt global democracies, economies, and societal norms.
The summit underscored the international political battle brewing over AI control, with different regions advocating for varied approaches to regulation. But despite the high stakes and the convergence of global viewpoints, the meeting concluded without a consensus, reflecting the difficulty of achieving international agreement on AI regulation.
MORE IN AI THIS WEEK
In one key AI metric, China pulls ahead of the US: Talent
Exclusive: Behind the plot to break Nvidia's grip on AI by targeting software
Elon Musk requires ‘FSD’ demo for every prospective Tesla buyer in North America
How AI could explode the economy (and how it could fizzle)
Here’s how Microsoft is providing a ‘good outcome’ for Inflection AI VCs, as Reid Hoffman promised
Pew Research Center on ChatGPT use in America
Is OpenAI about to take on Alexa and Siri? ChatGPT maker files trademark for Voice Engine
TRENDING AI TOOLS & SERVICES
Suno v3: make full, 2-minute songs in seconds - now available to all users
Airtable AI: transform operations with generative AI
LMStudio: discover, download, and run local LLMs
Kapa: instant AI answers to technical questions
JumpRun: AI-powered research on stunning, interactive canvases
Creatie: turn ideas into stunning designs in a breeze
Coverr’s AI Workflows: master the art of AI video generation by discovering AI-generated footage, the tools used for creation, and prompts used by a community of AI video experts
Private LLM: use chatbots on Apple devices without the internet, keeping your information completely on-device, safe and private
GUIDES, LISTS, PRODUCTS, UPDATES, INTERESTING
Adobe announces GenStudio
Microsoft Teams is getting smarter Copilot AI features
16 changes to the way enterprises are building and buying generative AI
How Google is using AI for reliable flood forecasting at a global scale
Microsoft’s first AI PCs are the Surface Pro 10 and Surface Laptop 6 for businesses
Google starts testing AI overviews from SGE in main Google search interface
VIDEOS, SOCIAL MEDIA & PODCASTS
ChatGPT now helps you backtest Simple Trading Strategies [X]
Open Interpreter announces 01 Light - a portable voice interface that controls your home computer. It can see your screen, use your apps, and learn new skills [X]
Introducing Tone, a $299 AI wearable that acts as your second brain [X]
(Discussion) Amazon spends $2.75 billion on AI startup Anthropic in its largest venture investment yet [Reddit]
Making AI accessible with OpenAI co-founder Andrej Karpathy and Sequoia Capital [YouTube]
The race for AI robots just got real (OpenAI, Nvidia and more) [YouTube]
Why Google failed to make GPT-3 + why Multimodal Agents are the path to AGI - with David Luan of Adept [Podcast]
TECHNICAL, RESEARCH & OPEN SOURCE
Databricks releases DBRX, an open-source LLM that could offer enterprises a leaner alternative to GPT-3.5
-------------------------
Databricks has just announced DBRX, a large language model (LLM) with advanced capabilities in language and code understanding. Built on a mixture-of-experts architecture, the model surpasses other open foundation LLMs and specialized coding models on most common benchmarks, according to Databricks.
DBRX is also designed to be efficient and fast, outputting text at up to 150 tokens per second. While it isn’t as capable as state-of-the-art models like Anthropic’s Claude or OpenAI’s GPT-4, many enterprises don’t require gigantic models for the kinds of applications they’re looking to carry out day-to-day.
Many LLMs still expend too much energy to tackle simple problems, which both uses up compute power and slows delivery of an answer to a user. With DBRX, when a specific type of calculation is requested, the model knows which “expert” to call on. The whole DBRX model contains 132 billion parameters, but because of that division of labor, it uses only 36 billion parameters at any given time. For businesses that want to use AI for day-to-day operations, this style of LLM architecture could lower the barrier to entry.
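To make that “division of labor” concrete, here is a toy sketch of top-k expert routing, the general idea behind mixture-of-experts models. The four experts, the hand-written router scores, and k=2 are all hypothetical. This is not Databricks' code, and in a real model the router is a learned layer that scores experts for each token.

```python
import math


def softmax(scores):
    """Turn raw router scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_scores, k):
    """Run only the chosen experts and blend their outputs by weight."""
    chosen = route_top_k(router_scores, k)
    return sum(weight * experts[i](x) for i, weight in chosen)


# Toy "experts": in a real model each would be a large feed-forward block.
experts = [
    lambda x: x,
    lambda x: 2 * x,
    lambda x: 3 * x,
    lambda x: 4 * x,
]

# Hypothetical router scores that favor experts 1 and 3 for this input.
router_scores = [0.0, 5.0, 0.0, 5.0]

print(route_top_k(router_scores, k=2))
print(moe_forward(1.0, experts, router_scores, k=2))
```

Only two of the four toy experts run for this input, and the other two cost nothing. By the numbers above, 36 billion of 132 billion parameters works out to roughly 27% of DBRX being active at a time.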
DBRX is accessible for both Databricks customers and the open community under a commercial-friendly license. The base and fine-tuned model weights are hosted on Hugging Face.
-------------------------
OpenAI CEO Sam Altman’s controversial Worldcoin Foundation announced that it has open-sourced the software running its iris-scanning Orbs. The release includes the code that runs on the Orb itself, which is crucial for capturing images and securely transferring them to the product's app. The core components of the Orb software can be accessed on GitHub under an MIT/Apache 2.0 dual license.
Additionally, Worldcoin revealed another privacy feature called “Personal Custody.” It lets individual users self-custody the data they hand over to Worldcoin: the data package is signed with the Orb’s private key, encrypted with a user-provided public key, and then transferred to the user's mobile phone. The feature would reduce the number of times users need to return to an Orb to verify their World ID.
The update to Worldcoin comes as the company faces scrutiny from global regulators over privacy concerns. On March 21, the Kenyan government denied a request from the United States government to revoke its suspension of the Worldcoin project. The government said it would ban Worldcoin activities in the country until it can be assured of the project's safety for the Kenyan people and of the integrity of financial details.
Additionally, the Spanish Agency for the Protection of Data demanded that Worldcoin stop collecting and processing data locally and issued a temporary ban on its operations.
MORE IN T/R/OS
Andrew Ng: GPT-3.5 in agent loops is 95.1% accurate on HumanEval - up from 48.1% with zero-shot GPT-3.5 and 67% with zero-shot GPT-4
Nvidia’s text-to-3D LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis
Codel: a fully autonomous AI Agent that can perform complicated tasks and projects using terminal, browser, and editor
MIT + Adobe Research: One-step Diffusion with Distribution Matching Distillation
Stability AI releases Stable Code Instruct 3B
That’s it for this week! We’ll see you next Thursday.