The week in AI: Manus magic…?

Plus: From "vibe-coding" to "vibeworking"

Welcome to The Dispatch! We are the newsletter that keeps you informed about AI. Each Thursday, we aggregate the major developments in artificial intelligence - we pass along the news, useful resources, tools and services; we highlight the top research in the field as well as exciting developments in open source. Even if you aren’t a machine learning engineer, we’ll keep you in touch with the most important developments in AI.

NEWS & OPINION

-------------------------

China’s AI prowess is once again a hot topic after a startup called Butterfly Effect revealed “Manus”, a service it bills as a general AI agent that improves on Western tools like OpenAI’s Operator and Deep Research (side note: OpenAI itself had some big news around agents this week as well - we recommend listening to the Latent Space podcast).

Deep Research scours online sources and compiles what it finds into documents that OpenAI claims “create a comprehensive report at the level of a research analyst" within half an hour. Manus takes on the same tasks and more, but much, much faster. A launch video depicts it executing three tasks at super-speed:

  • Recommending the best candidate for a job after opening and reading the applications, ranking candidates in a separate document, and then - upon being prompted to do so - reformatting its recommendations as a spreadsheet;

  • Preparing a full report on available properties after a user provides a budget, requirements, and desired location. The report includes available listings, plus info on neighborhood amenities, etc.;

  • Conducting correlation analysis of different stocks, writing a report with conclusions and then creating an interactive website that lets users explore data scraped from the web.

While impressive AI demos are nothing new, Manus as a general AI agent is on a different level from anything we’ve seen so far. Recorded use cases show the service’s workstation writing its own commands, visiting websites galore, and then delivering a document along with the complete code used to produce it.

Developers are marveling at its coding ability, and that’s not particularly surprising: Manus is primarily powered by Claude 3.7 Sonnet, which we consider state of the art for coding. But while Manus offers the familiar chatbot interface of an empty text field awaiting a prompt, using it is akin to sitting with someone at a keyboard who turns your vague instructions into precise output - at lightning speed.

Both the agent under the hood and the UI are incredibly well done. A multi-agent design is one of Manus's key features: when you message Manus, you communicate only with an executor agent, which itself doesn't see the details of the knowledge, planner, or other agents. This keeps context length under control and genuinely feels like the AI is doing serious work for you behind the scenes.
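
Illustratively, that executor/planner split looks something like the toy sketch below. This is purely our own illustration of the generic pattern, not Manus's actual implementation - the agent roles and the llm() helper are hypothetical placeholders.

```python
# Toy sketch of an executor/planner split - illustrative only, NOT Manus's
# actual implementation. llm() is a hypothetical stand-in for a model call.
from dataclasses import dataclass, field

def llm(prompt: str) -> str:
    """Placeholder for a real model call (e.g. to Claude)."""
    return f"<model output for: {prompt[:50]}...>"

@dataclass
class Agent:
    role: str
    history: list[str] = field(default_factory=list)  # each agent keeps its OWN context

    def run(self, task: str) -> str:
        self.history.append(task)  # context grows per agent, not globally
        return llm(f"[{self.role}] {task}")

planner = Agent("planner")
executor = Agent("executor")

def handle_user_message(message: str) -> str:
    # The user only ever talks to the executor; the planner's context stays
    # hidden from both the user and the executor, keeping each window small.
    plan = planner.run(f"Break this request into steps: {message}")
    return executor.run(f"Carry out this plan: {plan}")

print(handle_user_message("Compare three stocks and write a report"))
```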

Manus is brand new and only in early-access preview, and while many have dismissed it for simply running on Claude, it’s becoming clear that the real value lies in packaging top models with the right tools, workflows, and interface. It’s not all roses for Manus so far, though: TechCrunch published a fairly critical piece on it, reporting slow performance, crashes, and sometimes unsatisfying outputs.

That’s probably to be expected at this stage. Still, it’s undeniably another impressive product out of China on the heels of DeepSeek and Alibaba’s QwQ (whose maker Manus just partnered with).

Oh, and if you do sign up for early access - there are only 2 million people ahead of you on the waitlist.

MCP: the translator helping AI models talk to your favorite apps

-------------------------

Anthropic's Model Context Protocol (MCP) was released last November but has only very recently gained real traction in the developer community. It offers a standardized way for AI models to access external tools, data sources, and APIs - and it has the potential to make AI agents much more useful.

MCP essentially functions as the missing link between agents and the massive API ecosystem. Following a client-server architecture (where a host application can connect to multiple servers), it allows AI models to connect with servers that provide access to services, like Google Drive, Slack, GitHub, etc. You could think of it as a meta-API or universal translator - a ‘USB-C cable’ for standardizing how AI interacts with the digital world around it.
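
To make that client-server shape concrete, here's a minimal sketch of an MCP server, assuming the official MCP Python SDK's FastMCP interface (pip install mcp); the server name and the tool are invented for illustration:

```python
# Minimal MCP server sketch, assuming the MCP Python SDK's FastMCP interface.
# The "demo-server" name and get_weather tool are invented for illustration.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")

@mcp.tool()
def get_weather(city: str) -> str:
    """Return a (canned) weather report for a city."""
    return f"It is sunny in {city}."  # a real server would call a weather API here

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default, so an MCP host can connect
```

Any MCP host (Claude Desktop, an IDE, an agent framework) can then discover and call get_weather with no bespoke integration code.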

OpenAPI is a widely adopted industry standard for defining APIs - and MCP builds on that layer. While OpenAPI provides static definitions of what an API can do, MCP creates a live, interactive experience in which AI can query servers in real time. This means an MCP server can dynamically respond to AI-generated requests, making APIs more accessible to agentic workflows. The jump from an OpenAPI spec to an MCP server is very small, but the difference is transformative for AI capabilities.
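
To show how small that jump is, here's a hedged sketch wrapping a single REST endpoint as an MCP tool; the api.example.com URL and /v1/reports path are hypothetical, while the mcp and httpx libraries are real:

```python
# Sketch: exposing one endpoint of an OpenAPI-described REST API as an MCP
# tool. The api.example.com URL and /v1/reports path are hypothetical.
import httpx
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("reports-server")

@mcp.tool()
def get_report(report_id: str) -> dict:
    """Fetch a report by ID from the (hypothetical) upstream REST API."""
    resp = httpx.get(f"https://api.example.com/v1/reports/{report_id}")
    resp.raise_for_status()
    return resp.json()  # the MCP host receives structured data it can reason over

if __name__ == "__main__":
    mcp.run()
```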

Real-world adoption is already happening, with companies like Vercel and Dub implementing MCP to enhance their workflows. Marketing teams at Dub ask their AI assistant to fetch their ‘most-clicked links’ from the past week, with the AI querying the MCP server to retrieve and then visualize this data - all without leaving the chat interface.
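
The querying step looks roughly like the sketch below, assuming the MCP Python SDK's stdio client; the "server.py" command and "top_links" tool are hypothetical stand-ins for a link-analytics server like Dub's. In practice the AI host performs these calls on the user's behalf - the user just types a request into chat.

```python
# Client-side sketch, assuming the MCP Python SDK's stdio transport. The
# "server.py" command and "top_links" tool are hypothetical stand-ins.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

params = StdioServerParameters(command="python", args=["server.py"])

async def main() -> None:
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()  # the model discovers available tools
            print("server offers:", [t.name for t in tools.tools])
            result = await session.call_tool("top_links", {"days": 7})
            print(result.content)  # structured results the model can summarize or chart

asyncio.run(main())
```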

The landscape is evolving quickly, though. We’re in an interstitial phase in which competing standards will probably emerge from other major AI players - and as AI agents grow more sophisticated, the need for MCP as a kind of ‘middle-man’ may shrink in certain contexts. OpenAI's function calling already provides similar functionality, though it lacks MCP's standardized, open approach. For developers, now is the time to experiment: look for APIs that already have MCP servers, or build one using tools like Speakeasy's MCP Server Generation. We covered some of what Perplexity was doing last week - they also launched a new MCP server for their Sonar model, allowing Claude to access real-time web search. Hopefully Anthropic brings that capability to Claude natively soon.

The barrier to extending AI capabilities has never been lower. Here’s a helpful 10-minute tutorial if you want all the details. Or, if you’re in a rush, try this three-minute alternative.

MORE IN AI THIS WEEK

Start learning AI in 2025

Everyone talks about AI, but no one has the time to learn it. So, we found the easiest way to learn AI in as little time as possible: The Rundown AI.

It's a free AI newsletter that keeps you up-to-date on the latest AI news, and teaches you how to apply it in just 5 minutes a day.

Plus, complete the quiz after signing up and they’ll recommend the best AI tools, guides, and courses – tailored to your needs.

TRENDING AI TOOLS, APPS & SERVICES

  • Browser Use: combines advanced AI capabilities with robust browser automation to make web interactions seamless for AI agents

  • Mistral: the popular AI lab adds what it bills as the world’s best OCR

  • Blooper: streamline the pre-production process with an all-in-one service

  • Auren: emotionally intelligent AI for improving human lives (iOS app)

  • Firecrawl: concatenate any website into a single text file for LLM ingestion

  • Colossal: global directory of AI agents that can perform API calls

  • Time Portal by Eggnog AI: travel through time and figure out which events you landed in

  • Muse: an AI model trained specifically for fiction writing

  • Wispr Flow for Windows: use voice to write 3x faster in every application

VIDEOS, SOCIAL MEDIA & PODCASTS

  • Manus - the all-in-one AI agent [YouTube]

  • Y-Combinator - Vibe coding is the future [YouTube]

  • OpenAI CEO Sam Altman seems really impressed with OpenAI’s new creative writing model (yet to be released) [X]

  • Tencent releases Hunyuan-TurboS, a new ultra-large hybrid architecture (Mamba + transformer) model - new model architectures are the key to unlocking current context window limitations [X]

  • The new OpenAI Agents platform [Podcast]

  • Trump signs executive order on developing artificial intelligence 'free from ideological bias' [Reddit]

  • Anthropic CEO Dario Amodei: within the next 3 to 6 months, AI will be writing 90% of code [Reddit]

  • AMA with Google DeepMind’s Gemma team [Reddit]

TECHNICAL NEWS, DEVELOPMENT, RESEARCH & OPEN SOURCE

  • Google went on a tear this week: it released Gemma 3, its open-weights LLM, in four different sizes, now multimodal and with a 128k context window; released a new text embedding model to power RAG apps; shipped native image generation for developers in Flash 2.0, letting the model combine text and images in one output (i.e. “tell me a story with pictures”); added the ability to share a YouTube link and ask questions about the video itself, not just the transcript; and announced Gemini Robotics, built on Gemini 2.0, specifically for training robots

  • Sakana AI (Japan): its AI Scientist has created the first peer-reviewed, AI-generated scientific publication

  • Alibaba open-sources R1-Omni: a new multimodal reasoning model that can ‘read’ emotions using visual and audio context

  • OpenAI’s new research on AI models’ chain-of-thought reasoning revealed that models can ‘reward hack’ or cheat on tasks - and attempts to stop them from thinking about cheating only make them hide their true intentions

  • Qodo (formerly Codium) introduces agent-driven workflows and MCP integration in its AI IDE plugin, Qodo Gen, streamlining code generation, testing, and chat

That’s all for this week! We’ll see you next Thursday.