The week in AI: Gemini 2.0 stuns and OpenAI's '12 days of Shipmas' continues

Plus: xAI opens up Grok and Aurora for free users

The Dispatch
December 12, 2024

Welcome to The Dispatch, the newsletter that keeps you informed about AI. Each Thursday, we aggregate the major developments in artificial intelligence: we pass along the news, useful resources, tools and services, and we highlight the top research in the field as well as exciting developments in open source. Even if you aren’t a machine learning engineer, we’ll keep you in touch with the most important developments in AI.

NEWS & OPINION

Gemini enters the 2.0 era as Google’s latest offerings impress

-------------------------

It was a good week for Gemini.

First, an experimental new Gemini model popped up and claimed the top spot across all categories on Chatbot Arena. Released on Gemini's one-year anniversary, the model offers Google’s best-in-class 2M token context window and can process and understand uploaded video content (up to an hour long), as well as most other file types. It’s also completely free to use. If you are tired of top-tier AI models being paywalled or saddled with usage restrictions, head over to Google’s AI Studio and use the experimental Gemini model to your heart’s content - it’s very good. AI Studio layers a number of developer tools on top of a more traditional chatbot UI, so you can easily tinker with the model’s system instructions, temperature (creativity level), safety settings and more to optimize for your use case.
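
For readers wondering what the temperature setting actually does, here is a minimal sketch of the general idea rather than Google's implementation. It assumes the common approach of dividing a model's raw next-token scores by the temperature before turning them into probabilities. The function name and the example scores are hypothetical and chosen only for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens (illustrative only).
example_logits = [2.0, 1.0, 0.1]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(example_logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

A low temperature concentrates almost all of the probability on the highest-scoring token, which makes output more predictable. A higher temperature spreads probability across the alternatives, which is why the setting is often described as a creativity dial.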

Additionally, Google kicked off the Gemini 2.0 series with a number of major updates and announcements to Gemini and other major projects. Here are the highlights:

Gemini 2.0 Flash: This is the only 2.0 model released so far - Flash is the Gemini family’s lightweight, efficiency-focused model (the more powerful 2.0 Pro/Ultra models are likely coming in January). Still, 2.0 Flash outperforms 1.5 Pro on key benchmarks at twice the speed and is currently free for all users on Gemini web (app availability coming soon) and for developers via the Gemini API. What’s really noteworthy here, though, are 2.0 Flash’s multimodality upgrades - the live streaming capability with audio output in particular is an entirely new way to interact with AI that’s already turning heads. We strongly urge you to check out Simon Willison’s blog post to get an idea of just how impressive this model is.
Deep Research mode: ‘Deep Research’ explores complex topics on your behalf and then presents its findings in a comprehensive, easy-to-read report. The new mode can digest hundreds of websites at once and compile everything from market trends to technical analyses. After you enter your question, it creates a multi-step research plan for you to either revise or approve. Once you approve, it spends a few minutes analyzing relevant information from across the web, iterating and refining as it goes. Finally, it generates a comprehensive report of the key findings, which you can export into a Google Doc if desired. Deep Research is only available for Gemini Advanced users.
Agents: Like everyone else, Google sees AI agents as an intrinsic part of the next era - Project Astra is a research prototype exploring future capabilities of a universal AI assistant embedded into your phone/glasses; the new Project Mariner explores the future of human-agent interaction within your Chrome browser; and Jules is an AI-powered code agent that integrates directly into a GitHub workflow.

OpenAI’s ‘12 days of Shipmas’ continues - Sora finally released, Canvas available to all users

-------------------------

Gimmicky marketing campaign? Festive way to placate the masses until GPT-5 is released? Whatever you think of OpenAI’s ‘12 days of Shipmas’, here’s a brief recap of what has been announced this week:

Today: Although Google beat them to it by a day, ChatGPT’s Advanced Voice Mode now has screen-sharing/live streaming capability as well, meaning it can respond to what it sees, whether that comes from your phone camera or from what's on your screen. In the demo, the user gets directions from ChatGPT's Advanced Voice on how to make a cup of coffee. As the presenter goes through the steps, ChatGPT offers insights and directions.
Wednesday: Coinciding with the release of iOS 18.2, ChatGPT is now integrated across Siri, Writing Tools, and Visual Intelligence in Apple Intelligence. Siri can now recognize when you ask questions outside its scope that could benefit from being answered by ChatGPT instead. In those instances, it will ask if you'd like to process the query using ChatGPT.
Tuesday: Canvas, an extremely underrated feature for collaborative writing/coding, is now available to all ChatGPT users and can now also be used with custom GPTs. The Canvas interface is the same as what users saw in beta in ChatGPT Plus, with a tab on the left-hand side that shows the Q+A exchange and a right-hand tab that shows your project, displaying all of the edits as they happen, as well as shortcuts. Canvas can also run Python code directly, allowing ChatGPT to execute coding tasks from within the tool.
Monday: OpenAI’s much-hyped video generation model Sora is now available to all paying ChatGPT subscribers. Known as Sora Turbo, the video model is smarter than the model that was previewed in February. Sora supports text-to-video, video-to-video, and more. ChatGPT Plus users can generate up to 50 videos per month at 480p resolution or fewer videos at 720p. The Pro Plan offers 10x more usage. Sora features an exploration page where users can view each other's creations and click on any video to see how it was created. OpenAI also unveiled Storyboard, a tool that lets users specify inputs for each frame in a sequence. There is some brewing controversy over Sora’s ability to create videos of real people.

… and then there’s Grok/xAI/Elon

-------------------------

Not to be totally outdone this week, Elon Musk’s xAI has made its Grok chatbot free to use on X/Twitter, at least up to 10 messages every 2 hours. Additionally, the platform has switched from Black Forest Labs’ Flux Ultra image generation model to xAI’s in-house image generation model, codenamed Aurora. Both Grok and Aurora are less restrictive than some contemporary models, though they stop short of allowing explicit content. Aurora also has a meme-generator function, which should go over well with the X crowd.

MORE IN AI THIS WEEK

Time Magazine names AMD’s Lisa Su its ‘CEO of the Year’ for driving the company from near bankruptcy to a leading force in AI
How GenAI is reshaping tech hiring
OpenAI’s updated o1 model sure tries to deceive humans a lot
The creator of ChatGPT’s voice wants to build the tech from ‘Her,’ minus the dystopia
Apple partners with Broadcom to develop first AI server chips
AI firm’s ‘Stop Hiring Humans’ billboard campaign sparks outrage
Amazon opens new AI lab in San Francisco focused on long-term research bets
Venture capitalist David Sacks chosen by Trump to be White House AI and Crypto Czar
Meet the small AI chip maker now more valuable than Intel
X updates - the Grok AI chatbot and the Aurora image generator are now available to all users
An AI companion suggested he kill his parents - now his mom is suing

TRENDING AI TOOLS, APPS & SERVICES

Copilot Vision by Microsoft: now in preview - a new way to browse
Zoom AI Companion 2.0: Zoom’s next generation AI assistant
Remention: place your product in billions of online discussions with AI.
Peek AI: build a professional, shareable online portfolio in seconds with AI
Countless Dev: discover, compare, choose, and calculate costs for every type of AI model
AI Santa: send your loved ones free personalized videos from Santa
AISmartCube: low-code platform that allows you to create applications through graphical drag-and-drop interactions
Remento: the AI biographer for loved ones
Remy AI: charismatic AI sleep coach that takes care of tracking sleep metrics, circadian rhythms, evening routines, and sleep environment (iOS app)

GUIDES, LISTS, PRODUCTS, UPDATES, INFORMATIVE

15 times to use AI, and 5 not to
Apple Intelligence now features Image Playground, Genmoji, Writing Tools enhancements, seamless support for ChatGPT, and visual intelligence
Reddit’s new AI search tool aims to deliver answers without relying on Google
Android’s latest update is here with accessibility and Gemini AI improvements
Break down language barriers with auto dubbing on YouTube
Midjourney introduces Patchwork: a multiplayer infinite canvas for creating fictional worlds
Google says its new PaliGemma AI can identify emotions - and that has experts worried

VIDEOS, SOCIAL MEDIA & PODCASTS

Elon Musk shows off meme-generation capability for X’s new image generator, Aurora [X]
Prof. Ethan Mollick on making peace with hallucinations when adopting genAI [X]
Zuckerberg on Meta AI: 600 million monthly users, on track to be the most-used AI assistant in the world [Instagram]
Microsoft AI CEO Mustafa Suleyman on AI for agents, companions, infinite memory, gaming and more [YouTube]
60 Minutes: AI-powered tutor, teaching assistant tested as a way to help educators and students [YouTube]
"Stop Hiring Humans" ads all over SF [Reddit]
It's crazy how the public essentially doesn't care about Gemini [Reddit]

TECHNICAL NEWS, DEVELOPMENT, RESEARCH & OPEN SOURCE

Replit officially launches its upgraded AI development suite
Cognition’s Devin, the autonomous coding agent, is now generally available
Alpha launch of Memoire: a document retrieval pipeline “as-a-service”
Meta introduces Llama 3.3: a new 70B model that delivers performance comparable to the Llama 3.1 405B model but is easier and more cost-efficient to run
Google announces the general availability of Trillium, the sixth-generation TPU
Ollama 0.5 introduces JSON schema support for reliable structured outputs
Nous Research launches simulators to explore human-AI interaction (requires GitHub login)

That’s all for this week! We’ll see you next Thursday.