The week in AI: OpenAI claims The New York Times 'hacked' ChatGPT for lawsuit

Plus: CEO Tim Cook says Apple will 'break new AI ground' this year

Welcome to The Dispatch! We are the newsletter that keeps you informed about AI. Each Thursday, we round up the major developments in artificial intelligence: the news, useful resources, tools and services, the top research in the field, and exciting developments in open source. Even if you aren’t an engineer, we’ll keep you in touch with what’s going on in AI.

NEWS & OPINION

-------------------------

The New York Times vs. OpenAI lawsuit is heating up. In a court filing on Monday, OpenAI claimed that NYT paid a ‘hacker’ to exploit ChatGPT. From the filing:

“The allegations in the Times’s Complaint do not meet its famously rigorous journalistic standards. The truth, which will come out in the course of this case, is that the Times paid someone to hack OpenAI’s products. It took them tens of thousands of attempts to generate the highly anomalous results that make up Exhibit J to the Complaint. They were able to do so only by targeting and exploiting a bug (which OpenAI has committed to addressing) by using deceptive prompts that blatantly violate OpenAI’s terms of use. And even then, they had to feed the tool portions of the very articles they sought to elicit verbatim passages of, virtually all of which already appear on multiple public websites. Normal people do not use OpenAI’s products in this way.”

OpenAI claims that The Times used deceptive prompts (e.g., repeatedly asking ChatGPT "what's the next sentence?"), accompanied by excerpts of its own articles, to target training-data regurgitation and induce model hallucination.

OpenAI has asked the court to dismiss the claims alleging direct copyright infringement, contributory infringement, and other allegations it calls “legally infirm.” AI is a new arena for copyright disputes, and the judges in these cases will have their work cut out for them in setting precedents. In our estimation, this argument doesn’t look like OpenAI’s best defense: it’s on OpenAI to provide working guardrails, and how ‘normal people’ use the product seems immaterial.

-------------------------

Apple has abruptly canceled its plans to release an electric car with self-driving abilities, a secretive product that had been in the works for nearly a decade. According to reports, the company told employees in an internal meeting on Tuesday that it had scrapped the project and that members of the group would be shifted to different roles - primarily to Apple’s generative AI divisions.

Tesla CEO Elon Musk appeared to be pleased about the report, tweeting a salute emoji with a burning cigar in response.

Separately, Apple CEO Tim Cook promised shareholders during the company’s annual meeting that Apple would “break new ground” in GenAI this year. What that ground might be is anyone’s guess at this point (Apple has been much quieter on the AI front than its big tech rivals), but there are hints about what we can expect. We know that Siri and iOS’s built-in search tool Spotlight are getting AI makeovers. Apple’s in-development LLM, Ajax, is reportedly expected to be on par with ChatGPT or better. Apple has also recently (finally) become more active in published research and open-source AI: projects like HUGS and MGIE will likely end up transforming how we interact with visual media.

Whatever the company ends up having to offer in 2024, it’ll be great to see more Apple in the AI space.

-------------------------

Swedish payment provider Klarna announced this week that its AI assistant, powered by OpenAI, handled two-thirds of all the company’s customer service chats over a month-long period. The AI handled 2.3 million conversations in that time, which is reportedly equivalent to the work of 700 full-time employees.

Customer satisfaction ratings for the AI service were on par with those of human agents, while the error rate in resolving queries actually declined by 25 percent with the AI. Additionally, Klarna customers now resolve their issues in less than two minutes, compared to eleven minutes previously.

Klarna CEO Sebastian Siemiatkowski discussed the results on X, highlighting the current (not future) issue of AI's impact on job markets. "We decided to share these statistics to raise the awareness and encourage a proactive approach to the topic of AI. For decision makers worldwide to recognise this is not just ‘in the future’, this is happening right now," Siemiatkowski wrote.

MORE IN AI THIS WEEK

TRENDING AI TOOLS & SERVICES

GUIDES, LISTS, UPDATES, INFO

VIDEOS, SOCIAL MEDIA & PODCASTS

  • (Discussion) What the actual f (new ‘Emote Portrait Alive’ tool; research linked below) [Reddit]

  • Gen-Z is now using deepfakes to teach each other calculus [X]

  • Gemini 1.5 can generate Selenium code to replicate a task it watched a user perform in a video [X]

  • Credit Karma’s CEO reflects on AI’s risks and rewards in helping customers manage their money [Podcast]

  • FreeCodeCamp’s Generative AI full course (30 hours) - Gemini Pro, OpenAI, Llama, Langchain, Pinecone, Vector Databases & more [YouTube]

  • Is AI actually useful? [YouTube]

TECHNICAL, RESEARCH & OPEN SOURCE

-------------------------

French AI company Mistral, creator of the popular open-source models Mistral 7B and Mixtral 8x7B, is looking to Microsoft to help it commercialize its newest cutting-edge model, Mistral Large. Mistral Large scored 81.2% on the MMLU (Massive Multitask Language Understanding) benchmark, closer to GPT-4’s 86.4% than Google’s Gemini Pro or Anthropic’s Claude. Its context window is 32k tokens, which is smaller than some competitors’ (128k for GPT-4 Turbo, 100k for Claude).

You can test the Mistral Large model for free with Mistral’s new ChatGPT-like conversational chatbot, Le Chat (message limits may apply for the Large model). The Mistral Large API is available on Mistral’s own infrastructure, hosted in Europe, or through Azure AI Studio and Azure Machine Learning. Mistral Small is also now available, offering improved latency over Mistral’s 8x7B model. Mistral Large is priced at $0.008 per 1k tokens for inputs and $0.024 per 1k tokens for outputs, roughly 20% cheaper than GPT-4 Turbo.
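
To make the pricing concrete, here is a minimal sketch of calling Mistral Large over HTTP and estimating a single request’s cost from the per-token prices above. It assumes Mistral’s OpenAI-style chat completions endpoint, the model name "mistral-large-latest", and the usual usage/choices response fields; treat those details as assumptions and check Mistral’s API documentation before relying on them.

    import os
    import requests

    PRICE_PER_1K_INPUT = 0.008   # USD per 1k input tokens (as quoted above)
    PRICE_PER_1K_OUTPUT = 0.024  # USD per 1k output tokens (as quoted above)

    resp = requests.post(
        "https://api.mistral.ai/v1/chat/completions",  # assumed endpoint path
        headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
        json={
            "model": "mistral-large-latest",  # assumed model identifier
            "messages": [
                {"role": "user", "content": "Summarize this week's AI news in one sentence."}
            ],
        },
        timeout=60,
    )
    resp.raise_for_status()
    data = resp.json()

    # Estimate cost from the token counts the API reports back.
    usage = data["usage"]
    cost = (usage["prompt_tokens"] / 1000) * PRICE_PER_1K_INPUT + \
           (usage["completion_tokens"] / 1000) * PRICE_PER_1K_OUTPUT

    print(data["choices"][0]["message"]["content"])
    print(f"Estimated cost: ${cost:.4f}")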

-------------------------

Phind’s newest coding assistant LLM, Phind-70B, outperforms GPT-4 Turbo on the HumanEval coding benchmark and in speed. Phind-70B is based on the CodeLlama-70B model and fine-tuned on an additional 50 billion tokens. It achieves a HumanEval score of 82.3% and has a context window of 32K tokens.
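
For context, HumanEval measures the fraction of 164 hand-written programming problems a model solves, as judged by unit tests, so an 82.3% score corresponds to roughly 135 problems solved. Below is a minimal sketch (in Python, not Phind’s own evaluation harness) of the unbiased pass@k estimator from the original HumanEval paper, which is how these scores are computed when a model generates several samples per problem.

    import numpy as np

    def pass_at_k(n: int, c: int, k: int) -> float:
        # Unbiased pass@k estimator from the HumanEval paper:
        # n = samples generated per problem, c = samples that pass the unit tests.
        if n - c < k:
            return 1.0  # every size-k subset of samples contains a passing one
        return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

    # Example: 10 samples per problem, 3 of which pass the tests.
    print(pass_at_k(n=10, c=3, k=1))  # 0.3
    print(pass_at_k(n=10, c=3, k=5))  # ~0.92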

You can try Phind for free without a login. Their UI has both a search and a chat function; for writing and manipulating code, the chat function seems to provide better results. They claim their model is less lazy than GPT-4 and doesn’t hesitate to generate detailed code examples. Phind plans to open-source its 34B model soon, with the 70B model to follow at a later date. Phind also integrates with Visual Studio Code.

MORE IN T/R/OS

That’s it for this week! We’ll see you next Thursday.