- WeeklyDispatch.AI
- Posts
- The week in AI: Anthropic's Claude can now control your computer - what could possibly go wrong?
The week in AI: Anthropic's Claude can now control your computer - what could possibly go wrong?
Plus: Attracting AI talent and building compute are now officially US National Security issues
Welcome to The Dispatch! We are the newsletter that keeps you informed about AI. Each Thursday, we aggregate the major developments in artificial intelligence - we pass along the news, useful resources, tools and services; we highlight the top research in the field as well as exciting developments in open source. Even if you aren’t a machine learning engineer, we’ll keep you in touch with the most important developments in AI.
NEWS & OPINION
-------------------------
Astounding? Concerning? Or just another week in AI development: Anthropic has handed Claude the keys to your computer, amid other upgrades to their flagship model. With natural language prompts, you can now direct Claude to use your computer exactly the same way you do - by looking at a screen, moving a cursor, clicking buttons, and typing text. Although still in its earliest stages, this marks the first time a general-purpose AI can interface with a computer as seamlessly as a human, a frontier breakthrough that will likely ripple across AI development.
What makes Claude's computer use capabilities particularly noteworthy is its general-purpose nature. Instead of being trained for specific applications/tasks (like the AI agents we’re about to discuss below), it can theoretically interact with any software a human can use.
That means you can ask Claude to make calendar appointments, send e-mails, fill out forms, work within a spreadsheet, create an app, perform research and report back, and more - and the AI will execute, in real time, right on your computer screen.
To get a grasp on ‘computer use’ capabilities/potential, we recommend checking out this Twitter/X thread that contains a number of videos from Anthropic. Additionally, this satirical video from YouTube dev/creator Fireship and the 5 challenges for Claude computer use video from All About AI will give you a brief overview of capabilities, concerns and limitations.
This is a public beta feature - it’s going to be buggy. Anthropic themselves highlighted a case where Claude randomly decided to take a scenic detour from the task at hand. The model also struggles with basic actions like scrolling and zooming and task completion is notably slow - often taking minutes for small tasks humans could complete in seconds.
The potential for misuse here seems as significant as the development itself. Fireship’s YouTube video colorfully points out that there's nothing technically preventing the AI from accessing financial accounts or other sensitive things.
If you’re ready to give Claude your keys (maybe just for a test drive): the feature is so new that you’ll have to either dive into Anthropic’s API, or check out this GitHub repo - computer use hasn’t been streamlined yet for general use. It works best with the Firefox browser.
Anthropic also upgraded their outstanding Sonnet 3.5 model with wide-ranging benchmark improvements; even accounting for OpenAI’s o1 advanced reasoning model, Sonnet 3.5 sets the industry standard for real-wording coding tasks. You can try the upgraded Sonnet 3.5 now, for free, with limited use. Anthropic also announced Claude Haiku 3.5 - an upgrade to their powerful, most cost-efficient model. It will be available later this month across their first-party API, Amazon Bedrock, and Google Cloud’s Vertex AI.
-------------------------
The example of computer use from Anthropic above could be considered a general purpose AI agent: an AI system that performs tasks without human intervention.
But more ‘specific-use’ AI agents are popping up everywhere in business, and agents look like the future of business ops in AI - at least according to the tech industry. Just ask: Nvidia CEO Jensen Huang, Elon Musk’s xAI, Meta, Google and their corporate partnerships, IBM, Salesforce, Gartner, or the media here, there, and everywhere. If you want a better understanding of AI agents and what they represent, we highly recommend reading this breakdown from Every.
This week, Microsoft doubled down with new agent capabilities within Copilot Studio and Dynamics 365, pushing forward with more autonomous AI workflows. These agents can ostensibly handle anything from sales to finance, operating independently and reacting to signals without human input. They’re designed to streamline operations by acting on behalf of users, reducing manual tasks, and improving overall efficiency.
The most notable aspects here are ease of use and integration. With Copilot Studio, businesses can build agents that suit their specific needs while leveraging Microsoft’s low-code tools. These agents integrate seamlessly into the broader Microsoft 365 ecosystem - Copilot agents are designed to work within familiar apps like Excel, Word, and Teams, allowing users to automate tasks across the tools they already use daily.
With 300+ million current Microsoft 365 users, Copilot’s agents might have a leg up on the competition if they can hit the ground running. If you’re interested in an agentic future, learn more about Copilot Studio here and check out Microsoft’s recent video on pre-built agents. More info to come at Microsoft Ignite 2024 next month.
The AI copyright infringement battle is still heating up
-------------------------
There were a number of new developments this week in the AI vs. copyright law space. Here’s a quick recap to get you up to speed:
More than 10,500 actors, musicians, and authors, including Thom Yorke, Julianne Moore, and Kazuo Ishiguro, have signed an open letter protesting the unlicensed use of their creative works for AI development. The letter condemns AI companies for scraping text, images, and videos without consent, arguing that it threatens creators' livelihoods.
News Corp is suing AI search engine Perplexity for allegedly infringing on its copyrighted content, including articles from The Wall Street Journal and New York Post. The lawsuit claims Perplexity has been copying news articles and analyses on a large scale without permission, diverting traffic and revenue from the original publishers. Perplexity reportedly did not respond to a prior cease-and-desist letter. Earlier this summer, both Forbes and WIRED detailed how Perplexity appeared to have plagiarized stories.
A former OpenAI researcher (who helped gather internet data for training ChatGPT) now claims the company violated copyright law by using unlicensed content. Balaji, who left OpenAI in August, argues that training AI systems on copyrighted data without consent undermines the livelihoods of creators and harms the internet ecosystem. Balaji asserts that AI outputs are not sufficiently transformative to meet legal standards and compete directly with the content they mimic.
The world’s largest book publisher, Penguin Random House, changed the wording on their copyright pages to help protect authors’ intellectual property from being used to train AI.
Tracking all the open lawsuits against major AI companies for copyright infringement is becoming almost impossible. No major lawsuit has yet been closed - one of the first infringement cases (filed by artists against text-to image generators) has moved into discovery - but there is still a long road to go. In that case, the main issue of copyright infringement claim still stands, but most of the other allegations were dismissed.
MORE IN AI THIS WEEK
White House orders Pentagon and intel agencies to increase use of AI
Elon Musk’s X is changing its privacy policy to allow third parties to train AI on your posts
The mother of a 14-year-old Florida boy says he became obsessed with a chatbot on Character.AI before his death and is now suing the company
Chipotle turns to AI hiring platform ‘Ava Cado’ to screen/hire job applicants
AI helped the feds catch $1 billion of fraud in one year. And it’s just getting started
Former OpenAI CTO Mira Murati is starting her own AI company; more former OpenAI executives are still leaving the company
Microsoft and OpenAI’s partnership (“the best bromance in tech”) is beginning to show cracks
Apple internally believes that it’s at least two years behind in AI development - but all is not lost
Writer RAG tool: build production-ready RAG apps in minutes
Writer RAG Tool: build production-ready RAG apps in minutes with simple API calls.
Knowledge Graph integration for intelligent data retrieval and AI-powered interactions.
Streamlined full-stack platform eliminates complex setups for scalable, accurate AI workflows.
TRENDING AI TOOLS, APPS & SERVICES
Hero: use AI to scan, price, and list your stuff in seconds
Google Illuminate: transform your content into engaging AI‑generated audio discussions
WPS Office: a free AI-powered seamless MS office suite
Granola: the AI notepad for people in back-to-back meetings
Kick: automates bookkeeping for business owners
CapGo: the spreadsheet that fills itself - research company, people, and markets
CodeAnt: automatically detect and fix code quality issues, bugs, and security vulnerabilities in real time with every code commit
Haiper: AI video generator from former DeepMind researchers released version 2.0 with 1080p, free but limited current capabilities on video length
GUIDES, LISTS, PRODUCTS, UPDATES, INFORMATIVE
iOS 18.2 beta now available with many AI features - Genmoji, Image Playground, ChatGPT, iPhone 16 Visual Intelligence and more
Solving complex problems with OpenAI o1 models
Qualcomm brings AI and laptop-class CPU cores to phones with Snapdragon 8 Elite, also announces partnership with Google
Asana launches AI Studio: no-code platform to design and deploy AI agents across workflows
Canva adds an embedded text-to-image generator and other new AI features, now at over 200 million users worldwide
Midjourney to release an upgraded web tool that’ll let users edit any uploaded images from the web
New in NotebookLM: Customizing your Audio Overviews and introducing NotebookLM Business
VIDEOS, SOCIAL MEDIA & PODCASTS
(Discussion) The White House issued a National Security Memorandum declaring that 'AI is likely to affect almost all domains with national security significance' - attracting technical talent and building computational power are now official national security priorities [Reddit]
Claude has taken control of my computer... [YouTube]
How to install and use Claude’s new AI agent [YouTube]
Runway introduces Act-One - a new way to generate expressive character performances inside Gen-3 Alpha [X]
ChatGPT Plus, Enterprise, Team, and Edu users can start testing an early version of the Windows desktop app - with file and photo interactions, model improvements, and a companion window mode [X]
Ideogram introduces Canvas: an infinite creative board for organizing, generating, editing, and combining images [X]
Zoom CTO Xuedong Huang on how AI revolutionizes productivity [Podcast]
TECHNICAL NEWS, DEVELOPMENT, RESEARCH & OPEN SOURCE
Agent.exe: the easiest way to let Claude's new computer use capabilities take over your computer
Elon Musk’s xAI launches API, letting third-party developers build atop Grok
Stability AI releases Stable Diffusion 3.5, a suite of open text-to-image models
IBM introduces Granite 3.0: high performing, open source AI models built for business
Genmo’s Mochi 1: open state-of-the-art video generation model with high-fidelity motion and strong prompt adherence
That’s all for this week! We’ll see you next Thursday.