The week in AI: Microsoft is banning police from using its AI for facial recognition

Plus: A breakdown of how Google is using AI to advance medicine

Welcome to The Dispatch! We are the newsletter that keeps you informed about AI. Each Thursday, we aggregate the major developments in artificial intelligence; we pass along the news, useful resources, tools and services, and highlight the top research in the field as well as exciting developments in open source. Even if you aren’t a machine learning engineer, we’ll keep you in touch with the most important developments in the field.

NEWS & OPINION

-------------------------

In a quietly pushed but significant policy update, Microsoft has tightened the terms governing how law enforcement agencies can use facial recognition technology through its Azure OpenAI Service.

The revised code of conduct explicitly prohibits real-time facial recognition on mobile cameras used by any law enforcement entity worldwide, including body-worn police cameras and dashcams. The restrictions previously targeted primarily US state and local police; the latest amendments extend them to law enforcement agencies globally.

The policy’s wording – specifically “real-time facial recognition technology on mobile cameras” and “in the wild environments” – suggests that the company has not banned agencies outside the US from using its cloud service for facial recognition on fixed cameras or for post-event processing at a police station, but this is only speculation.

The waters here are murky, and stances and policies are still evolving. In January, it was reported that OpenAI is working with the Pentagon on a number of projects, including cybersecurity capabilities, a departure from the startup’s earlier ban on providing its AI to militaries. Microsoft itself has pitched using OpenAI’s image generation tool, DALL-E, to help the Department of Defense build software to execute military operations.

-------------------------

Apple finally wrapped up its special event introducing the new iPad line-up. Alongside a number of non-AI announcements, this was the first time the company leaned heavily on the word “AI” and explained how they will be integrating it across their various platforms. They also unveiled the much-anticipated M4 chip, with fundamental improvements to the CPU, GPU, Neural Engine, and memory system to power the next generation of AI apps on Apple’s devices. Here are your key AI highlights:

  • M4 chip for next-gen AI: The new iPad Pro has the groundbreaking M4 chip with an even more powerful neural engine capable of 38 trillion operations per second. This will allow for complex AI applications (like isolating subjects from video backgrounds) at incredible speeds. As powerful as the M4 is, it’s not even the most powerful AI chip about to hit the market. That might be Qualcomm’s Snapdragon X Elite, expected in Windows laptops soon.

  • M2 chip in iPad Air: The new iPad Air now boasts the M2 chip, previously exclusive to the iPad Pro, with a faster CPU and GPU. This significantly speeds up a number of ML-related tasks on iPads.

  • Third-party apps: While Apple didn’t showcase an AI model trained in-house, they highlighted Photomator, which uses AI models trained on millions of images to enhance your photos with a single click.

  • Adaptive flash: Scanning documents becomes easier with the new AI-powered flash in iPad Pro. It automatically detects documents and adjusts flash settings to minimize shadows, resulting in significantly clearer scans.

  • AI for Video Production: Final Cut Pro on iPad receives a major update with AI-powered features like Live Multicam, which connects and synchronizes up to four cameras simultaneously, transforming your iPad into a multicam production studio.

  • AI for Music Production: The Logic Pro app gets new AI features, like an AI-powered keyboard player and drummer to back your music, effects that add analog warmth to digital instruments, and the ability to isolate vocals and individual instruments from any recording for remixing.

The week in OpenAI

-------------------------

It was a typically busy week for OpenAI. They are apparently working on a web search product, and sources have told The Verge that the company is aggressively poaching Google employees to work on it.

They are tackling content authenticity with new tools and an industry-wide push for standards. This is all about figuring out where online images, video, and audio come from in the age of AI. OpenAI has noted that many of their tech peers are contributing to these efforts as well, but even more industry collaboration will be required to promote transparency online.

They also want to be the responsible one when it comes to using data and respecting creators. If you’ve been paying attention to how many lawsuits the company is facing over this exact issue, that might seem a bit ironic. They are working on a tool called “Media Manager”, coming in 2025, that will let creators control how their work is used in AI.

Finally, multiple OpenAI execs this week have stated just how ‘dumb’ and ‘terrible’ they believe GPT-4 is in comparison to what they have coming down the pipeline. COO Brad Lightcap said that in a year, GPT-4 will appear “laughably bad”. During a seminar at Stanford, CEO Sam Altman said, “GPT-4 is the dumbest model any of you will ever have to use again. By a lot.”

It’s clear OpenAI is quite confident about their future suite of AI products. We probably won’t be waiting long to find out if that’s bluster or not.

MORE IN AI THIS WEEK

Want to get the most out of ChatGPT?

Revolutionize your workday with the power of ChatGPT! Dive into HubSpot’s guide to discover how AI can elevate your productivity and creativity. Learn to automate tasks, enhance decision-making, and foster innovation, all through the capabilities of ChatGPT.

TRENDING AI TOOLS, APPS & SERVICES

  • Amazon’s Bedrock Studio: accelerate generative AI application development

  • Adobe Acrobat AI Assistant: get quick answers and one-click summaries from PDFs

  • NextCommit: land your dream remote tech job

  • Middlebop: build your AI applications with the flexibility to switch between OpenAI and Google models without changing your code (a sketch of the general pattern follows this list)

  • Dhime: learn dance anywhere, anytime with Dhime, your AI-powered dance coach

  • Storyville: personalized bedtime stories for your kids

  • Eraser AI: technical design copilot that helps users edit documents and generate diagrams easily
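
For the curious, here is what that provider-agnostic pattern behind a service like Middlebop generally looks like. This is a minimal Python sketch of the idea, not Middlebop’s actual API: the class and method names are ours, and the two backends simply wrap the standard OpenAI and Google Generative AI client calls.

```python
from typing import Protocol

class ChatModel(Protocol):
    """The one interface application code is allowed to depend on."""
    def complete(self, prompt: str) -> str: ...

class OpenAIBackend:
    """Wraps an OpenAI client (e.g. openai.OpenAI()) behind the interface."""
    def __init__(self, client, model="gpt-4o"):
        self.client, self.model = client, model
    def complete(self, prompt: str) -> str:
        resp = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

class GoogleBackend:
    """Wraps a google.generativeai GenerativeModel behind the same interface."""
    def __init__(self, model):
        self.model = model
    def complete(self, prompt: str) -> str:
        return self.model.generate_content(prompt).text

def summarize(model: ChatModel, text: str) -> str:
    # Application code depends only on the interface, so switching providers
    # means swapping the backend object, not rewriting call sites.
    return model.complete(f"Summarize in one sentence: {text}")
```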

VIDEOS, SOCIAL MEDIA & PODCASTS

  • Machine Learning Street Talk Podcast: Can machines replace us? (AI vs Humanity) with Maria Santacaterina [Podcast]

  • TED talk: How to govern AI - even if it’s hard to predict - with former OpenAI board member Helen Toner [YouTube]

  • Bloomberg: Google CEO Sundar Pichai and the Future of AI [YouTube]

  • The mysterious AI called “gpt2-chatbot” is back on Chatbot Arena - capabilities seem to exceed GPT-4, Gemini 1.5, Claude, and anything else currently available [X]

  • Introducing DevOn - an AI software engineer that can operate a Replit IDE in real-time like a human [X]

  • Microsoft CTO: "Thoughts on OpenAI" - June 12, 2019 [X]

  • (Discussion) Former Google CEO Eric Schmidt on AI: it’s under-hyped. [Reddit]

TECHNICAL, RESEARCH & OPEN SOURCE

How Google is using AI to advance medicine

-------------------------

One of the most exciting areas of AI model development is in healthcare, and we rarely cover what is going on under the hood in that field. Google has a long list of major healthcare partners including Mayo Clinic, Johnson & Johnson, and Stanford Medicine. Here are a couple of recent medical AI developments from Google worth knowing about:

Med-Gemini blows away GPT-4 benchmarks and outperforms doctors

Google DeepMind and Google Research have introduced Med-Gemini, a specialized AI model that has demonstrated exceptional capabilities in clinical diagnostics and has vast potential for real-world medical applications. Med-Gemini inherits all of the advantages of the foundational Gemini models, fine-tuned for medicine. The researchers tested these medicine-focused adaptations and reported the results in the hyperlinked 58-page paper; there’s a lot in it:

  • Advanced Clinical Reasoning: Med-Gemini integrates web search functionality, allowing the model to access a wide array of up-to-date medical information, which is crucial for forming accurate clinical assessments. The model uses an uncertainty-guided search strategy that enhances its diagnostic precision, especially in complex cases where traditional methods fall short (a sketch of this loop follows the list).

  • State-of-the-Art Multimodal Understanding: Building on the strengths of Gemini models, Med-Gemini excels in processing and interpreting data from diverse sources, including text, images, audio, and videos. This ability allows it to perform well on multimodal benchmarks, such as the NEJM Image Challenge, where it demonstrated superior diagnostic accuracy over existing AI models.

  • Unparalleled Long-Context Processing: Med-Gemini is adept at navigating extensive electronic health records (EHRs) to identify relevant medical conditions in a 'needle-in-a-haystack' scenario. This capability is crucial for reducing the cognitive load on clinicians by efficiently extracting and analyzing critical information from vast amounts of patient data.

  • Real-World Testing and Application: Beyond theoretical capabilities, Med-Gemini has been rigorously tested across 14 medical benchmarks, consistently outperforming other models, including GPT-4. Its real-world applications have shown promising results in tasks like medical summarization and generating referral letters, often surpassing human expert performance.

  • Self-Training with Web Search: Med-Gemini’s training incorporates novel datasets that extend standard medical question sets with synthesized reasoning paths and web search results, providing richer context and improving the model's decision-making processes.

  • Diagnostic and Conversational Abilities: In practical tests, Med-Gemini has successfully conducted diagnostic dialogues, exemplifying its potential to assist in real-time clinical decision-making. Its ability to understand and generate human-like dialogue allows for seamless interactions with both clinicians and patients, enhancing the diagnostic process and patient care.
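
To make the paper’s uncertainty-guided search strategy concrete, here is a minimal Python sketch of how such a loop can work: sample several answers, measure how much they disagree, and only fall back to web search when the model seems unsure. Note that `generate_answers` and `web_search` are hypothetical stand-ins, not Google APIs, and the entropy threshold is arbitrary.

```python
from collections import Counter
import math

def entropy(samples):
    """Shannon entropy of the sampled answers; higher means more disagreement."""
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def answer_with_search(question, generate_answers, web_search,
                       n_samples=5, threshold=0.5, max_rounds=3):
    """Sample answers; if they disagree too much, retrieve web evidence and retry."""
    context = ""
    for _ in range(max_rounds):
        # Draw several independent answers conditioned on any retrieved context.
        samples = generate_answers(question, context, n=n_samples)
        if entropy(samples) <= threshold:
            break  # answers agree: the model is confident enough to stop
        # High disagreement: fetch fresh evidence and fold it into the prompt.
        context += "\n".join(web_search(question)) + "\n"
    # Return the majority-vote answer from the final round of samples.
    return Counter(samples).most_common(1)[0][0]
```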

While the results are promising, there’s obviously a need for further research, particularly in enhancing the model's ability to filter and prioritize information from authoritative medical sources. The ongoing development will focus on refining these capabilities and ensuring the model’s applications are both effective and ethically sound. We’re not done with Google in medicine yet though!

AlphaFold 3 for advancing drug development

Google DeepMind and Isomorphic Labs have launched AlphaFold 3, a new model capable of predicting the 3D structure and interactions of various biomolecules, including proteins, DNA, RNA, and ligands. This model extends the capabilities of AlphaFold 2 to include a wider range of molecular interactions, enhancing both the accuracy and scope of predictions. Those improvements are pivotal for deepening our understanding of biological processes and accelerating drug discovery:

  • Detailed Interaction Mapping: AlphaFold 3 models how different molecules, such as proteins and pharmaceutical agents, fit together. This capability is crucial for identifying new drug targets and understanding drug efficacy.

  • Enhanced Accuracy: It achieves a significant improvement in prediction accuracy, outperforming existing methods by at least 50% in some categories. This leap in precision enhances the reliability of biomolecular studies.

  • Expanded Molecular Focus: Unlike its predecessors, AlphaFold 3 also predicts interactions involving DNA and RNA, along with small molecule ligands, broadening its utility beyond protein-only studies.

  • Refined Structural Visualization: The model employs a diffusion process that iteratively refines a cloud of atoms into a precise, final structure. This method mirrors techniques used in AI-based image generation, adapted for molecular biology (see the sketch after this list).

  • Broader Impact: The predictive capabilities of AlphaFold 3 are not just theoretical; they have practical implications for drug development, understanding cellular mechanisms like hormone interactions, and aiding DNA repair processes. The model's comprehensive approach provides a foundation for new biological insights and therapeutic strategies.
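
For intuition, here is a heavily simplified Python sketch of the kind of denoising loop described above: a random cloud of atom coordinates is nudged toward a final structure over many steps. `predict_denoised` stands in for the trained network, and the noise schedule is invented for illustration; this is not AlphaFold 3’s actual sampler.

```python
import numpy as np

def sample_structure(predict_denoised, n_atoms, n_steps=50, seed=0):
    """Iteratively refine a random cloud of 3D atom positions into a structure."""
    rng = np.random.default_rng(seed)
    coords = rng.normal(size=(n_atoms, 3))  # step 0: pure noise
    for t in reversed(range(1, n_steps + 1)):
        noise_level = t / n_steps  # simple linearly decaying schedule
        # The network predicts a fully denoised structure from the noisy one.
        denoised = predict_denoised(coords, noise_level)
        # Step toward that prediction, re-injecting a little noise so the
        # early, high-noise steps remain exploratory.
        coords = denoised + 0.1 * noise_level * rng.normal(size=coords.shape)
    return coords  # final 3D coordinates for all atoms
```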

That’s all for this week! We’ll see you next Thursday.