The week in AI: an 11 out of 10 on the "read if you care about AI" scale
Plus: OpenAI, Google, the US government's AI legislation roadmap, and so much more
Welcome to The Dispatch! We are the newsletter that keeps you informed about AI. Each Thursday, we aggregate the major developments in artificial intelligence; we pass along the news, useful resources, tools and services, and highlight the top research in the field as well as exciting developments in open source. Even if you aren’t a machine learning engineer, we’ll keep you in touch with the most important developments in AI.
This was a huge week in AI news - a week that highlights the importance of understanding hype versus substance in the new multimodal era of AI we are blitzing into. We strongly encourage you to investigate the newsletter and provided links thoroughly this week. Stay in touch with just how fast the world of AI (and therefore the world itself) is about to start moving, and come to your own conclusions. If you know anyone interested in AI, please share our newsletter with them.
NEWS & OPINION
-------------------------
OpenAI has just unveiled GPT-4o. It wasn't the search engine that everyone was expecting, but rather a new multimodal model that’s replacing the free GPT-3.5. It’s smarter, cheaper, faster, better at coding, and multimodal inside and out - and perfectly timed to steal the spotlight from Google and the rest of OpenAI’s competitors in artificial intelligence. It’s GPT-4 Omni, or GPT-4o. We’ve tried to rank the most important details from the announcement, top to bottom. And there are a lot of details:
It’s going to be free. It’s not available to everyone, yet, but it’s being rolled out over the next few weeks. Paid users will have up to 5x the capacity limits for usage, but opening a top model up to the public for free use is a huge (but calculated) gamble.
GPT-3.5 level chatbots are the highest level of AI most users have been exposed to. This simply is not representative of current AI development. We’re going full-steam ahead on multimodal. GPT-4o is going to be a mind-blowing surprise to most of the general public.
A hint of what GPT-4o can do was posted to OpenAI’s X/Twitter account. The demo went viral and has almost 20 million views in just a couple of days. It is rightfully drawing criticisms and allusions to the dystopian movie Her, and it’s not very difficult to see the societal implications of interacting with AI like this. On the other side, the potential practical applications for this sort of powerful and fast multimodal AI are immense.
There will finally be a ChatGPT desktop app, and Mac users get first dibs. An important note: the top Google Search results for "ChatGPT desktop download" are all scams. Don't go to Google for this app right now. It’s not ready yet; when it’s available, find the official download here.
OpenAI CEO Sam Altman outlined his vision for GPT-4o in a blog post, and here is the most salient thing he had to say: “The original ChatGPT showed a hint of what was possible with language interfaces; this new thing feels viscerally different. It is fast, smart, fun, natural, and helpful.”
We encourage you, again, to investigate more on your own. If you’re a ChatGPT Plus subscriber, check for access to the language-based version of the model now. It’s not perfect. It hasn’t solved hallucinations. It performs great (the best, technically) on benchmarks that aren’t entirely reliable. Full multimodal GPT-4o is not quite yet available for most subscribers. If you’re not a Plus subscriber, no problem - we’ll be letting our readers know when GPT-4o becomes widely accessible to free users.
-------------------------
Not to be outdone, Google then showed off an equally stunning vision at I/O 2024 for how AI will improve the products that billions of people use every day. The keynote ran two hours (click the headline above for the condensed version).
Here are our CliffsNotes:
Google mainly showcased that it wants its AI products to be part of everything you do: searching, shopping, watching YouTube videos, and analyzing live video recordings. While past experiences with Google might make us approach demo videos with a healthy dose of skepticism, these showcases undeniably confirm the seamless, natural integration of voice and video inputs to receive immediate outputs powered by Gemini.
Google will integrate many powerful AI functions into Android phones. Users will be able to drag and drop images created or edited by AI into Google Messages and Gmail instantly. They’ll be able to instantly interact with AI about YouTube videos and PDFs. A new built-in tool will help detect suspicious activity in the middle of a call, such as a scammer trying to imitate a user’s bank.
They highlighted various new features powered by Gemini 1.5 Pro, which itself is getting major upgrades. One new feature, called Ask Photos, allows users to search their saved pictures for deeper insights, such as asking when their child first learned to swim or what their license plate number is.
Google executives took turns demonstrating useful capabilities, including a virtual “teammate” with its own workspace account that can help stay on top of to-do lists, organize data, and manage workflow. Another demo showcased how the latest model could “read” a textbook and turn it into a kind of AI lecture, featuring natural-sounding teachers that answer questions.
The company highlighted various AI-powered search improvements. From The Verge: Google is using its Gemini AI to figure out exactly what you’re asking about, whether you’re typing, speaking, taking a picture, or shooting a video.
I/O 2024 was all about Google’s vision for AI. Many of these features still need more time and work before being rolled out. But you can already use Gemini inside Gmail, Docs, Sheets, Drive, etc. With Gemini Apps, you can utilize Google’s many other apps and services in helpful ways: Flights, YouTube, Maps, and more. A word of caution: We highlighted Google’s past demo gaffes, and even this time around they made an error in their presentation that highlights something very important: you still simply can’t 100% trust an AI’s output without due diligence in fact verification.
There’s more from Google this week. See our technical section at the bottom of the newsletter.
-------------------------
A bipartisan group of US senators has released a “roadmap” for regulating artificial intelligence. This initiative, endorsed by Senate Majority Leader Chuck Schumer, aims to address key areas of AI that Congress could take up this term. Schumer has led the US government’s charge to regulate AI, such as it is, for over a year now. In recent months, the group has hosted several AI briefings for senators to educate them on the technology's complexities and its broad impact. Here are the priorities outlined in the report:
Boosting funding for AI innovation (some see this as a conflict of interest)
Establishing nationwide standards for AI safety and fairness
Enhancing U.S. National Security through AI
Addressing job displacement caused by AI
Combating “deepfakes” in elections and “non-consensual distribution of intimate images”
Ensuring access to AI innovation for schools and companies
There are plenty of reasons to have some sort of critical lens on the government’s ability to effectively regulate artificial intelligence. But it’s not a hopeless situation, and leaving tech giants to self-regulate (which is largely the current status quo in the US) will not protect the public from a number of potentially damaging consequences - many of which are outlined in the bullet points above. Right now, legislators in Europe seem to be the only ones legitimately pushing the safety of the public over the promise of AI innovation; and even their stance is controversial.
MORE IN AI THIS WEEK
Apple closes in on deal with OpenAI to put ChatGPT on iPhone
NASA appoints its first Chief AI Officer
Is OpenAI’s superalignment team dead in the water after two key departures?
Reddit locks down its public data in new content policy, says use now requires a contract
Why it is so dangerous for AI to learn how to lie: ‘It will deceive us like the rich’
Meet the woman who showed President Biden ChatGPT - and helped set the course for AI.
Elon Musk’s lawyers succeed in challenge to remove OpenAI case judge
Altman’s fantasy of a GPT-centric world
U.S.-China talks on AI risks set to begin in Geneva
TRENDING AI TOOLS, APPS & SERVICES
Phew AI Tab: manage and retrieve tab information via AI-based grouping and spaces in a vertical sidebar
ManyExcel: generate formulas for Excel and Google Sheets from simple text
Julius AI for data analysis: GPT-4o is now live in Julius to analyze datasets, visualize, chat with files, solve math equations, and more
NerdSnipes: explore history, decade by decade, with tiny AI podcasts
Brick Center: create your own Lego design for free
devv: AI-powered search engine for developers
RateMyJD by Dover: improve your job description with AI-powered tips (try roast mode for some extra fun)
LLM Whisperer: get complex documents ready for LLM consumption
GUIDES, LISTS, PRODUCTS, UPDATES, USEFUL INFORMATION
ChatGPT Glossary: 44 AI terms that everyone should know
30+ AI tools for startups in 2024
How does ChatGPT ‘think’? Psychology and neuroscience crack open AI large language models
SoundHound and Perplexity partner to bring AI web search to voice assistants in cars, IoT devices and more
AI comes for YouTube’s thumbnail industry
Autodesk's AI turns text or still images into 3D models
VIDEOS, SOCIAL MEDIA & PODCASTS
Anthropic’s new tool automates optimized prompt crafting [X]
OpenAI CEO Sam Altman on the departure of Ilya Sutskever [X]
(Discussion) The 'Her' moment I was hoping for. Did not disappoint. Really curious how people will react to this. If we will have long conversations and if we prefer it over talking to people. [Reddit]
(Discussion) GPT-4o will be free for everyone in the next weeks [Reddit]
Altman’s first interview after the GPT-4o announcement [YouTube]
Another glorious battle for AI dominance… GPT-4o vs Google I/O [YouTube]
Powering AI with the world's largest computer chip [Podcast]
TECHNICAL, RESEARCH & OPEN SOURCE
-------------------------
We will have to refer you externally to our website for a breakdown of ALL of the other AI developments at Google I/O. Due to the length/amount of hyperlinks in this week’s Dispatch, we’ve triggered a spam warning from our newsletter platform based on kB size.
MORE IN T/R/OS:
Research: quantifying GitHub Copilot’s impact in the enterprise with Accenture
Stack Overflow users sabotage their posts after OpenAI deal
Inspect: an open-source framework from the UK AI Safety Institute for large language model evaluations
MIT studies AI’s capabilities in deception
Chrome Dev Tools: understand errors and warnings better with Gemini
Phew! That’s all for this week. We’ll see you next Thursday.