The week in AI: ChatGPT gets a memory & the AI company worth more than Google or Amazon

Plus: Co-founder Karpathy leaves OpenAI

Welcome to The Dispatch! We are the newsletter that keeps you informed about AI. Each Thursday, we round up the major developments in artificial intelligence: the news, useful resources, tools and services, and the top research in the field, along with exciting developments in open source. Even if you aren’t an engineer, we’ll keep you in touch with what’s going on in AI.

A quick note on the major development from last week: we have a number of links in today’s newsletter assessing the performance of Google’s recently released Gemini Advanced. The TL;DR version is that while Gemini Advanced has quite a few things going for it (no message cap, great with creative writing, increasingly useful integration with Google’s app suite, better web searching than ChatGPT with Bing), it’s surprisingly error-prone and still behind GPT-4 in reasoning and coding capabilities.

NEWS & OPINION

-------------------------

Let’s start this news round with a fun fact: investing $10,000 in Nvidia stock exactly one year ago would have earned you roughly $1,800 a month in gains - enough short-term capital to cover a median US mortgage for the year, with some left over for groceries.

This week, Nvidia surpassed both Amazon and Google in market capitalization and is now the fourth most valuable company in the world. No organization on the planet has cashed in on the AI boom as effectively as the chipmaker. And while many tech giants are racing to cut their dependence on Nvidia, the company is already planning to help design custom chips for everyone who can’t make their own. Additionally, they’re the preferred chipmaker for China’s booming AI sector (although they cannot sell China their most cutting-edge chips due to US sanctions). Even without the tech giants, Nvidia won’t be lacking for customers in the coming years.

In other chip news, OpenAI CEO Sam Altman is facing wide criticism for an extremely ambitious plan to secure between $5-7T in funding for a new chip venture. The number seems far-fetched: the AI-specific chip market is expected to hit $120B in sales per year by 2027 - a huge number, but not likely to merit the kind of moonshot investment Altman is looking for. There is also a massive talent shortage in advanced chipmaking, so raising trillions of dollars might actually be the easiest part of his plan. Astral Codex Ten has a rough breakdown of the $7T ask with a lively discussion in the comments section.

More developments from OpenAI and Nvidia this week:

  • ChatGPT is getting a dynamic memory. OpenAI is testing a new feature where ChatGPT will remember things discussed in conversations to make future interactions more helpful. Memory will allow for an evolving understanding based on what is shared over time in various conversations. The feature is currently being rolled out to select ChatGPT free and Plus users.

  • OpenAI and Microsoft have disclosed research indicating that nation-state hackers from China, Russia and Iran have utilized their language models and APIs to fuel cyberattacks. Rather than mounting sophisticated new attacks, the hackers used the technology the way most AI users do - composing emails, translating text, and debugging code. The researchers could not find any particularly novel or unique AI-enabled attack or abuse techniques.

  • A group of protesters descended on OpenAI’s San Francisco headquarters on Monday. Among the protesters’ concerns were OpenAI’s policy reversal on using AI in military applications and the company’s collaboration with the Pentagon. The protest was led by Holly Elmore, who heads US operations for the PauseAI movement.

  • Nvidia released a demo AI app called “Chat with RTX” that runs locally on your PC. This isn’t a fully-fledged LLM like ChatGPT, but installing and using localized assistants for a variety of tasks (like analyzing your own documents or parsing YouTube/Podcast transcripts for specific mentions) is becoming easier by the day. The biggest impact of this is privacy: your data stays on your PC.

  • Nvidia CEO Jensen Huang spoke at the World Government Summit 2024 on the future of AI. He highlighted the emerging importance of “Sovereign AI” - the idea that the tools and infrastructure for harnessing AI to improve national intelligence and refine data are readily available, so countries must now step up and make proper use of them rather than rely on other countries’ AI prowess.

-------------------------

As digital communication increases in the workplace, an AI-driven analytics company named Aware is ‘revolutionizing’ how companies monitor internal chatter. Aware’s AI analyzes messages across platforms like Slack, Microsoft Teams, and Zoom for a roster of major U.S. and European companies. The technology identifies risks within the communications, purporting to offer real-time insights into employee sentiment and to detect issues like harassment or noncompliance without compromising individual identities.

But while the analytics tool anonymizes data to track employee sentiment or toxicity, Aware’s eDiscovery tool can pinpoint individual names in cases of risk as determined by the client. Jutta Williams, co-founder of AI accountability nonprofit Humane Intelligence, told CNBC: “A lot of this becomes thought crime.” She added, “This is treating people like inventory in a way I’ve not seen.” The platform’s success (Aware reports 150% annual revenue growth and a massive client base) is challenging traditional notions of privacy and oversight in digital workspaces.

In other Big Brother news, blogger James O’Malley took a deep dive into a London Underground surveillance trial by analyzing Transport for London documents obtained through the Freedom of Information Act. The trial, aimed at combating fare evasion, used AI to analyze CCTV footage for up to 77 different types of incidents within the station. The results were undeniable, from an administrative standpoint: the trial caught over 26,000 suspected fare evasions in 11 months - almost 80 per day. But the implications of expanding, pervasive AI-powered surveillance are getting harder to ignore.

-------------------------

In the legal battle brought by Sarah Silverman and others over OpenAI’s use of copyrighted materials to train ChatGPT, a federal judge has largely ruled in favor of OpenAI. The court dismissed the majority of the claims brought by the authors, who argued that ChatGPT was trained on pirated copies of their books without their permission and characterized the technology as a sophisticated form of copyright infringement and an unfair business practice.

The judge did, however, allow the direct copyright infringement claim to proceed, while finding that the plaintiffs failed to provide sufficient evidence to support their allegations of vicarious copyright infringement and other related claims. The rulings from this case largely mirror those made by a federal judge in a separate case brought against AI image generators like Midjourney and Stable Diffusion.

In both instances, the federal judges allowed the core claims of direct copyright infringement to proceed towards trial, highlighting a legal recognition of potential copyright violations when AI technologies use copyrighted works as part of their training datasets. However, the judges also dismissed several claims, requiring plaintiffs to provide more concrete evidence or to clarify their allegations regarding how their copyrights were specifically violated, altered, or removed by these AI technologies.

MORE IN AI THIS WEEK

Looking for visuals and charts, rather than words, to understand the daily news?

Bay Area Times is a visual-based newsletter on business and tech, with 250,000+ subscribers.

TRENDING AI TOOLS & SERVICES

  • Soona Measure: first of its kind visual analytics engine designed to optimize visuals for product selling

  • Globe Explorer: visual-hierarchical search for any topic on the globe

  • Synthical: discover, learn, and share research, made easy with AI

  • Shop That Look: upload a photo of an outfit you like and find stores with similar items

  • VectorShift: the end-to-end AI automations platform to build AI search, assistants, chatbots, and automations

  • Wondera: when karaoke meets AI, the more songs you sing, the better your AI voice will perform

  • Crux: build your decision-making AI copilot faster than ever

  • Fina.xyz: flexible financial management platform enhanced with AI-powered tools

  • GOODY-2: the world’s most responsible AI model that won’t answer anything that could possibly be considered controversial (also, meet the pranksters behind GOODY-2)

GUIDES, LISTS, USEFUL INFO

VIDEOS, SOCIAL MEDIA & PODCASTS

  • (Discussion) OpenAI co-founder/researcher Andrej Karpathy departs company - again [Reddit]

  • (Discussion) ChatGPT has memory now - be careful about casually using the tipping trick [Reddit]

  • The truth about building AI startups today (Y Combinator) [YouTube]

  • Gemini Ultra - first impressions (vs ChatGPT 4) [YouTube]

  • What if your glasses gave you AI superpowers? [X]

  • OpenAI board member Bret Taylor has a new AI startup [Podcast]

TECHNICAL, RESEARCH & OPEN SOURCE

-------------------------

This week, Stability AI (known for their cutting-edge open source generative AI models, including Stable Diffusion) released a research preview of their advanced text-to-image model, Stable Cascade. The new model is built on an entirely new three-stage architecture for text-to-image generators called Würstchen, which is heavily optimized for efficient model training and provides much more detailed guidance than the latent representations used in earlier models.

That means it’s now possible to train and fine-tune a state-of-the-art image generator on consumer-grade hardware. It doesn’t look like Cascade can quite match Midjourney in terms of photorealism, but it checks every other box for extremely high-quality AI image creation. It’s particularly good at prompt adherence and at adding natural-looking text to images; it also has all of the next-gen image creation features artists will love, like reverse Canny edge detection (i.e., generating an entirely new image from a sketch or Canny edge map), inpainting/outpainting, and image-to-image variations.

The codebase includes training and inference scripts as well as a variety of different models you can use. The expected VRAM requirement for inference can be kept to around 20 GB, and it can be lowered further by using the smaller model variants. The model is not currently available for commercial use.
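If you want to try it locally, here is a minimal inference sketch. It assumes the Hugging Face diffusers integration (StableCascadePriorPipeline and StableCascadeDecoderPipeline with the stabilityai/stable-cascade checkpoints) rather than Stability’s own codebase, so exact model IDs, dtypes, and arguments may differ from the official scripts.

```python
# Minimal Stable Cascade text-to-image sketch, assuming the Hugging Face
# diffusers integration (not Stability's own training codebase).
import torch
from diffusers import StableCascadePriorPipeline, StableCascadeDecoderPipeline

device = "cuda"

# Prior (Stage C): generates highly compressed image embeddings from the prompt.
prior = StableCascadePriorPipeline.from_pretrained(
    "stabilityai/stable-cascade-prior", torch_dtype=torch.bfloat16
).to(device)

# Decoder (Stages B and A): expands those embeddings into the final image.
decoder = StableCascadeDecoderPipeline.from_pretrained(
    "stabilityai/stable-cascade", torch_dtype=torch.float16
).to(device)

prompt = "a hand-painted wooden sign reading 'Open Source', warm afternoon light"

prior_out = prior(
    prompt=prompt, height=1024, width=1024,
    guidance_scale=4.0, num_inference_steps=20,
)
image = decoder(
    image_embeddings=prior_out.image_embeddings.to(torch.float16),
    prompt=prompt, guidance_scale=0.0, num_inference_steps=10,
).images[0]
image.save("stable_cascade_sample.png")
```

The two-pipeline split mirrors the three-stage design described above: the prior works in a heavily compressed latent space, and the decoder turns those compressed embeddings into the full-resolution output.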

-------------------------

For centuries, efforts to mimic human speech through technology, from resonance tubes to machine learning, have fascinated society. Speech synthesis has notably progressed from the robotic tones of early systems to today's more realistic AI-generated voices - but achieving the nuanced expression found in human speech remains a huge challenge. This complexity is encapsulated in the "prosody problem," referring to the intricate conveyance of emotions, intentions, and nuances through pitch, rhythm, and intonation beyond mere words.

Even the most powerful models struggle to fully replicate the depth of human prosody, which includes the ability to express sarcasm, attitude, and other subtle communicative cues. Current state-of-the-art multimodal LLMs generate linguistic information and feed it to text-to-speech systems, which then infer the prosody. In humans, however, prosodic and linguistic planning are more closely intertwined. Machine learning teams will have to learn from and collaborate with anthropologists, linguists and other experts in the field of speech and language to deploy prosodic models globally.

-------------------------

OpenAI’s third-generation embedding models released a few weeks ago have a very useful new capability: they can ‘shorten’ their dimensions. This behavior is associated with Matryoshka Representation Learning (MRL; named after the Russian stacking dolls), a technique that embeds information at multiple levels of granularity within a single high-dimensional vector. Even if you truncate the embedding to a lower dimension, it still retains useful information, unlike traditional embeddings, which might lose their meaning completely.
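To make the shortening concrete, here is a small sketch using the OpenAI Python SDK with text-embedding-3-large; the model name and dimension sizes are illustrative, and the manual truncate-and-renormalize path is just one way to exploit the Matryoshka property.

```python
# Sketch: shortening a Matryoshka-style embedding. Assumes the OpenAI Python SDK (v1)
# and text-embedding-3-large (3072 dims); the 256-dim target is arbitrary.
import numpy as np
from openai import OpenAI

client = OpenAI()
text = "Matryoshka embeddings nest coarse-to-fine information in one vector."

# Option 1: ask the API for a shortened embedding directly.
short = client.embeddings.create(
    model="text-embedding-3-large", input=text, dimensions=256
).data[0].embedding  # 256-dim vector

# Option 2: take the full embedding, truncate it yourself, and re-normalize
# so cosine similarity still behaves as expected.
full = np.array(client.embeddings.create(
    model="text-embedding-3-large", input=text
).data[0].embedding)          # 3072-dim vector
truncated = full[:256]
truncated = truncated / np.linalg.norm(truncated)
```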

Matryoshka embeddings represent a big step forward in text embedding techniques. Their ability to encode rich semantic information hierarchically across multiple dimensions unlocks capabilities not possible with traditional embeddings. Lower dimensions can be used for a fast initial search, while higher dimensions refine the results - an optimization called Adaptive Retrieval. This provides speeds approaching those of low-dimensional embeddings with the accuracy of high-dimensional ones, as sketched below.
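Here is a rough, self-contained illustration of the two-pass Adaptive Retrieval idea; the corpus is random stand-in vectors and the pass sizes are made up, but the structure (cheap low-dimensional scan, then exact re-ranking of a shortlist) is the point.

```python
# Sketch of Adaptive Retrieval with Matryoshka embeddings: fast low-dimensional
# first pass over the whole corpus, then full-dimensional re-ranking of a shortlist.
import numpy as np

rng = np.random.default_rng(0)
D_FULL, D_LOW, N_DOCS, SHORTLIST = 3072, 256, 20_000, 200

def normalize(x, axis=-1):
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

# Stand-in corpus and query; in practice these would be Matryoshka-trained
# embeddings (e.g. from text-embedding-3-large), already unit-normalized.
docs = normalize(rng.standard_normal((N_DOCS, D_FULL)).astype(np.float32))
query = normalize(rng.standard_normal(D_FULL).astype(np.float32))

# Pass 1: cheap scan using only the first D_LOW dimensions (re-normalized).
low_scores = normalize(docs[:, :D_LOW]) @ normalize(query[:D_LOW])
candidates = np.argpartition(-low_scores, SHORTLIST)[:SHORTLIST]

# Pass 2: exact re-ranking of the shortlist with the full-dimensional vectors.
full_scores = docs[candidates] @ query
top_10 = candidates[np.argsort(-full_scores)[:10]]
print(top_10)
```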

The new techniques pioneered in Matryoshka embeddings - hierarchical training and adaptive search - will likely become foundational to the next generation of embedding architectures and search algorithms. Exciting progress in the continuing evolution of semantic search!

MORE IN T/R/OS

That’s it for this week! We’ll see you next Thursday.