
The week in AI: Chinese tech companies are teaching the rest of the world about AI efficiency

Plus: Sesame's AI voice mode is going viral for sounding almost 'too human'


Welcome to The Dispatch! We are the newsletter that keeps you informed about AI. Each Thursday, we round up the major developments in artificial intelligence - the news, useful resources, tools, and services - and highlight the top research in the field as well as exciting developments in open source. Even if you aren’t a machine learning engineer, we’ll keep you in touch with the most important developments in AI.

NEWS & OPINION

-------------------------

Back in January, DeepSeek’s “reasoning” R1 model took the entire AI industry by surprise and upended the US stock market. Released by a small AI lab in China spun out from a hedge fund, it was fast, competitive with frontier models, cheap, and notably trained with far fewer resources than comparable leading models.

While OpenAI’s o1 was the first widely available reasoning model, R1 marked a major transition point in reasoning model research because DeepSeek published a technical report explaining how it was built. With a) large-scale reinforcement learning on reasoning problems, and b) new applications of known training techniques to fold that raw reasoning ability into a polished general-purpose model, DeepSeek had created a frontier language model at a fraction of the cost. The toy sketch below illustrates the core training signal.
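
For the curious: sample several answers to the same question, score them with a verifiable reward (did the answer check out?), and nudge the policy toward the winners. The snippet below is a GRPO-flavored REINFORCE toy, not DeepSeek’s actual implementation - every name and number in it is invented for illustration:

```python
import numpy as np

# Toy illustration of RL on verifiable reasoning rewards: a policy over a
# handful of candidate answers, updated with REINFORCE using a group-mean
# baseline (the flavor DeepSeek's GRPO uses). Purely a sketch - not
# DeepSeek's code; all values here are made up for the example.

rng = np.random.default_rng(0)
n_candidates = 4            # candidate answers the toy "model" can emit
correct = 2                 # index of the verifiably correct answer
logits = np.zeros(n_candidates)
lr, group_size = 0.5, 8     # learning rate, samples per question

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for step in range(50):
    probs = softmax(logits)
    # Sample a group of answers for the same question.
    samples = rng.choice(n_candidates, size=group_size, p=probs)
    rewards = (samples == correct).astype(float)  # verifiable reward: 1 if correct
    advantages = rewards - rewards.mean()         # group-mean baseline
    # REINFORCE: grad of log softmax prob of action a is onehot(a) - probs.
    grad = np.zeros(n_candidates)
    for a, adv in zip(samples, advantages):
        g = -probs
        g[a] += 1.0
        grad += adv * g
    logits += lr * grad / group_size

print("P(correct answer) after training:", softmax(logits)[correct])
```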

Fast forward a bit, and we can see two major AI companies separated by the Pacific Ocean going in completely different directions. Shares of Chinese e-commerce/tech giant Alibaba are soaring today (in China; not yet in the US) on the release of their QwQ-32B reasoning model. They leaned even more heavily into DeepSeek’s training methodology - the model is roughly 20x smaller than even R1, yet delivers comparable or superior performance across key benchmarks.

Benchmarks aren’t everything, and the model was just released today, but you can test it yourself (make sure you have QwQ-32B selected, as it might not be the default). In our limited testing it performed admirably, though it can get stuck in reasoning loops; no one is going to call QwQ-32B state-of-the-art, but that is one tiny model to be so effective - you can run a quantized version locally on a laptop, as sketched below. The API is priced at $0.20 per million tokens for both input and output (roughly a 90% reduction compared to R1, which is itself already cheaper than many American counterparts), and the model is open sourced under the Apache 2.0 license.
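
If you’d like to poke at it locally, here’s a minimal sketch using the openai Python client pointed at a local Ollama server; the base URL, the qwq model tag, and the quantization Ollama serves are assumptions about your setup, so adjust accordingly:

```python
# Minimal local smoke test of QwQ-32B through an OpenAI-compatible endpoint.
# Assumes an Ollama server is running with the model pulled ("ollama pull qwq");
# the base_url and "qwq" model tag are assumptions - adjust for your setup.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

resp = client.chat.completions.create(
    model="qwq",
    messages=[{"role": "user", "content": "How many primes are there between 90 and 100?"}],
    max_tokens=2048,  # reasoning models can loop; cap the output just in case
)
print(resp.choices[0].message.content)
```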

Meanwhile, we covered GPT-4.5’s underwhelming release last week and OpenAI seems to be going in a rather different direction in terms of efficient/cost-effective models:

  • They rolled out GPT-4.5 to Plus users today, but with a 50-message-per-week limit because, as one of their own model developers quips, “every gpt-4.5-token uses as much energy as Italy consumes in a year”. QwQ’s $0.20 per million input/output tokens? GPT-4.5’s API runs $75 per million input tokens and $150 per million output tokens (see the quick comparison after this list).

  • After admitting that they were “out of GPUs”, CEO Sam Altman is now wondering whether your $20/month subscription might be better spent as a credit system across their suite of models/products (including Sora and DALL-E), essentially paying per individual use/prompt. That post did not go over well, with one user responding: “That isn't customer freedom, it's psychological warfare. Users will constantly micro-calculate every feature use, creating decision fatigue and turning casual users into stressed accountants.”

  • But maybe you’re interested in their $20,000/mo agent? (you just can’t make this stuff up)
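
To put those prices side by side, here’s the back-of-the-envelope arithmetic for a hypothetical workload of one million input and one million output tokens, using the figures quoted above:

```python
# API cost comparison for a hypothetical workload of 1M input + 1M output
# tokens, using the per-million-token prices quoted above.
prices = {                    # (input, output) in USD per million tokens
    "QwQ-32B": (0.20, 0.20),
    "GPT-4.5": (75.00, 150.00),
}

for model, (p_in, p_out) in prices.items():
    cost = p_in + p_out       # 1M tokens on each side
    print(f"{model}: ${cost:,.2f}")

# QwQ-32B comes to $0.40 vs GPT-4.5's $225.00 - roughly 560x cheaper.
```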

OpenAI reportedly lost $5 billion in 2024 and is losing money even on their $200/mo Pro subscription. Maybe it’s time to take a page from China’s playbook. OpenAI has continually made their existing models more efficient post-release, but the trajectory they’re on compute-wise seems unsustainable, with GPT-5 slated for release in a couple of months. They now have 400 million weekly active users, but as of September 2024 reportedly only 11 million of those were paying subscribers.

-------------------------

Moving fast and breaking things.

Perplexity has been around for a couple of years now and gained traction as a popular AI search engine, but within the last few weeks they’ve truly begun to infiltrate just about every corner of the AI industry:

  • They’re partnering with Deutsche Telekom (T-Mobile's parent) on an "AI Phone," featuring the Perplexity Assistant accessible directly from the lock screen. CEO Aravind Srinivas says it's about turning Perplexity from an "answer machine" into an "action machine" for everyday tasks - the assistant has direct functionality in many popular apps (it’s also multimodal, so you can talk to it about what you’re looking at).

  • They added a bot (@AskPerplexity) to X/Twitter, which hit 12 million organic impressions in a week. Users can tag it into any Twitter conversation for instant answers - a simple idea that caught on quickly.

  • They launched a Deep Research feature and expanded it to enterprise. It’s not perfect (expect hallucinations), but it’s great for surfacing a ton of sources, and it’s completely free.

  • They added three major AI models (Gemini 2.0 Flash, Claude 3.7 Sonnet, GPT-4.5) to their search platform.

  • They announced Comet, an agent-focused browser (not many details yet, waitlist available)

  • They added voice mode for their iOS app

  • They open sourced R1 1776, a post-trained version of DeepSeek’s R1 that removes Chinese Communist Party censorship

  • They launched a $50 million AI startup fund

  • They gave away $1 million during the Super Bowl rather than spending it on an ad - which paid off, as the app got 45,000 extra downloads on Super Bowl Sunday

While it’s a given that the major AI providers scrape the web, we will point out that Perplexity has been particularly shameless about plagiarizing that content in the past. They’ve since launched a revenue-sharing program with select partners (notably not the ones who’ve accused them of plagiarism) in an attempt to make amends.

But that's an impressive pace of execution - they are shipping constantly right now.

MORE IN AI THIS WEEK

There’s a reason 400,000 professionals read this daily.

Join The AI Report, trusted by 400,000+ professionals at Google, Microsoft, and OpenAI. Get daily insights, tools, and strategies to master practical AI skills that drive results.

TRENDING AI TOOLS, APPS & SERVICES

  • Sesame: conversational voice chat that doesn’t feel robotic at all

  • Data Science Agent in Google Colab: creates complete, working notebooks to automate data analysis tasks

  • Microsoft Copilot: Microsoft’s AI assistant is now available on macOS (it’s free)

  • Opera: updated with a browser-native AI agent to “get stuff done for you”

  • Reach by Artificial Societies: test content in a simulation of your own LinkedIn audience

  • ExplainGithub: the modern way to browse and understand GitHub repositories

  • Luma Labs: added three new features to its Ray2 video generation model for seamless long-form AI video generation

  • Quadratic: AI spreadsheet with code and connections - chat with your data and get insights in seconds

VIDEOS, SOCIAL MEDIA & PODCASTS

  • Grok 3 is on top of the LMArena leaderboard, slightly ahead of GPT-4.5 [X]

  • SoftBank is allegedly seeking $16B in loans to fuel its investments in OpenAI - Musk reiterates claim that CEO Masayoshi Son is “already over-leveraged.” [X]

  • Swedish fintech company Klarna’s CEO posts a ‘tell-all’ about the company’s AI journey and ‘replacing’ Salesforce [X]

  • GPT-4.5 as Donald Trump explaining the creation of Earth [Reddit]

  • Elon Musk's AI chatbot says a 'Russian asset' delivered the State of the Union [Reddit]

  • Build anything with MCP servers - coding tutorial [YouTube]

  • Creative AI superpowers you aren’t using yet [YouTube]

  • Anthropic CEO Dario Amodei on hopes and fears for the future of AI [YouTube]

  • Sakana AI is building nature-inspired methods that could fundamentally transform how we develop AI systems [Podcast]

TECHNICAL NEWS, DEVELOPMENT, RESEARCH & OPEN SOURCE

  • DeepSeek reveals its inference optimization method - enabling a 545% profit margin while charging less than rivals

  • LangChain introduces LangGraph Swarm: a Python library for building multi-agent systems with dynamic collaboration (see the sketch after this list)

  • Atla releases Selene 1: the most accurate ‘LLM as a Judge’ model yet for evaluating AI responses

  • OpenAI’s ‘NextGenAI’ consortium commits $50M in research grants, compute funding, and API access to support students, educators, and researchers

  • Cohere’s nonprofit arm unveils Aya Vision, an open-weights vision model

  • LM Studio releases its first SDKs for Python and TypeScript under the MIT license (including an agent-oriented API: give the model a prompt and tools, and it runs autonomously for multiple execution "rounds")
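
As promised above, here’s a minimal sketch of what a two-agent LangGraph Swarm setup looks like, based on the library’s README at release; the model, the toy tools, and the exact signatures are assumptions, so verify against the current docs before relying on them:

```python
# A sketch of a two-agent swarm with langgraph-swarm. Function names follow
# the project's README; treat them as assumptions and verify.
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent
from langgraph_swarm import create_handoff_tool, create_swarm

def book_flight(destination: str) -> str:
    """Stand-in tool: pretend to book a flight."""
    return f"Booked a flight to {destination}."

def book_hotel(city: str) -> str:
    """Stand-in tool: pretend to book a hotel."""
    return f"Booked a hotel in {city}."

model = ChatOpenAI(model="gpt-4o")  # any chat model should work here

# Each agent gets its own tools plus a handoff tool so it can pass control
# to the other agent mid-conversation - the "dynamic collaboration" part.
flights = create_react_agent(
    model,
    [book_flight, create_handoff_tool(agent_name="hotels")],
    prompt="You book flights. Hand hotel questions to the hotels agent.",
    name="flights",
)
hotels = create_react_agent(
    model,
    [book_hotel, create_handoff_tool(agent_name="flights")],
    prompt="You book hotels. Hand flight questions to the flights agent.",
    name="hotels",
)

# The swarm remembers which agent was last active per conversation thread.
app = create_swarm([flights, hotels], default_active_agent="flights").compile(
    checkpointer=MemorySaver()
)

config = {"configurable": {"thread_id": "demo"}}
result = app.invoke(
    {"messages": [{"role": "user", "content": "Book a flight to Tokyo, then a hotel."}]},
    config,
)
print(result["messages"][-1].content)
```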

That’s all for this week! We’ll see you next Thursday.