
Welcome back! A Harvard-led trial just pitted OpenAI's o1 against ER doctors using the same charts and the same minutes, and the 2024 AI came out with the higher diagnostic score. Meanwhile, prosecutors are pulling ChatGPT logs into criminal cases, the US Treasury Secretary went on Fox News to tell Americans their bank accounts are exposed to AI-driven attacks, and TikTok defaulted every public post into an AI Remix pool unless creators toggle it off post by post.

In today’s Generative AI Newsletter:

  • Harvard ER trial: Can OpenAI's o1 replace human ER doctors?

  • ChatGPT in court: Why are prosecutors treating chatbot conversations like public information?

  • Bessent on Fox: What did the Treasury Secretary just tell Americans about their bank accounts?

  • TikTok Remix: Why did public posts become AI fodder by default?

Latest Developments

AI outdiagnosed ER doctors in a Harvard-led trial

Researchers ran OpenAI's o1 reasoning model, released in 2024, against pairs of human emergency-medicine doctors using the same patient records and the same minutes. The AI got the right or near-right diagnosis 67% of the time. The doctors got there 50 to 55% of the time. The study was published in the journal Science.

The details:

  • The setup: 76 patients arriving at the emergency room of a Boston hospital. The model and pairs of doctors each got a standard electronic health record, vital signs, demographics and a few sentences from a triage nurse.

  • The score: 67% diagnostic accuracy for o1 against 50 to 55% for the doctors when working from the intake data alone.

  • With more detail: o1 climbed to 82% on harder questions when given more clinical information. Human doctors hit 70 to 79%. That gap was not statistically significant.

  • Beyond triage: A second experiment gave 46 doctors and the model five clinical cases requiring longer-term plans, including antibiotic regimens and end-of-life decisions. The AI outperformed the cohort.

  • Authors' line: The team wrote that LLMs "have eclipsed most benchmarks of clinical reasoning." Independent reviewers called the result "a genuine step forward."

This is the first time someone put a model in the seat where doctors operate with the least information and the highest stakes. Hospital systems will move past the question of whether to pilot LLMs at the front door. The arguments now are about who carries the liability and how quickly staff can be trained to work alongside a model that already outperforms them on these cases.

Special highlight from our network

Enterprise AI teams are looking for models that fit their workflows without adding complexity.

On May 12 at 6 PM PST, join Steve Nouri and Ash Lewis, co-founder and CEO of Fastino Labs, for a conversation on Pioneer – an agent that fine-tunes open-source models using production data.

Explore how open models are narrowing the gap with closed systems and what this means for AI ROI.

ChatGPT conversations are showing up as criminal evidence

We’re sure you’ll be more careful using AI after this one. Prosecutors and plaintiffs' attorneys are subpoenaing ChatGPT logs and using them in court. CNN reports that legal experts agree there is no expectation of privacy on AI chat apps and no privilege equivalent to what a lawyer, a doctor or a therapist provides.

The details:

  • Florida probe: State attorney general James Uthmeier opened a criminal investigation of OpenAI alleging ChatGPT gave the Florida State University shooter "significant advice." Subpoenas are being issued.

  • Canadian lawsuit: Families of victims in a February school shooting filed suit against OpenAI and Sam Altman this week, alleging ChatGPT was complicit in the attack.

  • Legal precedent: Courts are treating chatbot logs the way they already treat Google search history. No special protection. Subpoena-able. Discoverable. Admissible.

  • The therapy gap: Millions of users are sending ChatGPT legal questions, medical questions and emotional disclosures. None of those exchanges are protected. The doctor who diagnosed you cannot be subpoenaed for what you said in the room. The model can.

Every prompt typed into a chatbot is a written record on a server owned by a company that responds to court orders. No US court has carved out an AI-specific privilege. The legal profession will spend the next two years arguing whether one should exist. Until that fight is settled, every chat sits in subpoena range.

The Treasury Secretary told Americans to worry about AI bank hacks

Treasury Secretary Scott Bessent went on Fox News on Sunday and told Maria Bartiromo that Americans should be worried about AI hacking into their bank accounts. The comments followed a closed-door meeting Bessent and Federal Reserve Chair Jerome Powell held with executives at JPMorgan Chase and Bank of America about Anthropic's Mythos model.

The details:

  • The meeting: Officials told Wall Street to take Mythos seriously and use the model to find holes in their own defenses before attackers do.

  • What Mythos can do: Anthropic has said the model has surfaced thousands of high-severity vulnerabilities, including flaws in major operating systems and web browsers.

  • The two-sided problem: Banks and payment processors can use the same tool to patch weaknesses. The same capability gives attackers a way to discover and exploit flaws across the financial system at machine speed.

  • The on-record line: Asked whether Americans should be worried, Bessent answered "You should." He added the country needs a "very important calculus" between safety and innovation.

The Treasury Secretary using Fox primetime to tell viewers their bank accounts are exposed is the policy equivalent of pulling a fire alarm. Banks have been running internal drills against frontier models for over a year, but Mythos is making everybody extra anxious. Makes you wonder if this is why the White House wants to keep the model behind locked doors.

TikTok turned AI remixes on by default

TikTok rolled out a feature called Remixes that lets any viewer turn a public post into AI-generated images, text memes or other derivative content. The setting is on by default. Creators only found out because someone went digging through account settings.

The details:

  • What Remixes do: Public videos and posts can be turned into AI-generated images and text memes by viewers, with no notification to the creator.

  • The opt-out mess: There is no global toggle. Creators have to disable Remix on every individual post they have ever published.

  • The training question: TikTok told CNET that creator content will not be used to train its AI under the platform's new US ownership guidelines. Many creators do not believe it.

  • The pattern: Tako, TikTok's chatbot, shipped in 2022. AI Self lets users build AI replicas of themselves. OpenAI was forced to rebrand a similar Sora feature called Cameos earlier this year after backlash.

On by default is a tell about how much pushback a platform expects. TikTok shipped Remixes with no notification and no global toggle, which means creators learn about it through complaint threads. Anyone who cares about how their face and voice get reused now has to decide between toggling every post manually or accepting the new default.

Kling AI is Kuaishou's text-to-video model. Type a prompt, upload a reference image, and the model generates clips up to two minutes long at 1080p. That makes it the longest single-shot output among consumer video tools right now. Kling also handles image-to-video, lip-syncs animation to uploaded audio, and lets you direct per-region motion with a brush. Kuaishou trained it on a huge corpus of short-video footage, which shows up in how naturally human and animal motion renders.

To try this yourself, sign in and start with a single still image of a product or character. Add a prompt describing the camera move and the subject's action, then generate a 30 to 60 second clip. It works for social-first ads, mascot teasers, music video B-roll and pre-vis for live shoots. Best for marketing teams that need long coherent shots without scheduling a film crew.

Light Bytes

  • Microsoft ships Agent 365 at GA today: Microsoft pushed Microsoft 365 E7 and Agent 365 to general availability, giving IT admins the same identity, permissions and audit-log controls for AI workers that they already use for employees.

  • Mistral takes on $830M in debt: All of it earmarked for an NVIDIA-powered data center outside Paris.

  • Starcloud is a unicorn at 17 months: $170M to put AI data centers in low Earth orbit.

  • IBM ships Bob: A new AI coding platform with multi-model routing and human checkpoints baked in.

  • SoftBank spins up Roze AI: A robotics startup aimed at automating US data center construction.

  • Gemini lands in 4M GM cars: Cadillac, Chevrolet, Buick and GMC, model year 2022 and newer.
