
Harvard study: AI outperformed emergency physicians in diagnosis

In a Harvard study, language models showed higher diagnostic accuracy on real emergency care cases. One AI model performed better than two experienced physicians.

Source: TechCrunch. Collage: Hamidun News.

Harvard research has shown that large language models can diagnose acute conditions in emergency departments more accurately than experienced physicians. Scientists conducted large-scale testing of LLMs in various medical contexts, including real cases from emergency departments and archives of medical records.

How AI was tested

Researchers presented large language models with real clinical cases from emergency departments — exactly the data doctors see when admitting a patient: symptom descriptions, prior medical history, and results of initial examinations and laboratory tests. The models analyzed the information and gave a presumptive diagnosis in free-text form, just as a physician would in a written assessment. The results showed that at least one of the tested models produced the correct diagnosis significantly more often than two independently working emergency physicians who analyzed exactly the same clinical data without any supporting tools.

This was an unexpected result for many experts — it was previously unclear whether LLMs could surpass experienced doctors in the complex task of diagnosing an acute condition. The testing covered not only emergency care but also other medical contexts and specialties, which allowed the researchers to better understand how broadly LLMs might apply in clinical practice and to identify the areas of medicine where AI shows the most promising results.

  • Analysis of real emergency department intake cases with complete clinical information
  • Comparison of AI diagnostic accuracy with independent experienced physicians
  • Testing in various medical contexts and specialties
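To make the protocol above concrete, here is a minimal sketch of how free-text diagnoses from a model and from physicians could be scored against a gold-standard diagnosis. The matching heuristic, case data, and all names below are illustrative assumptions, not details from the Harvard study, which used its own grading methodology.

```python
# Illustrative sketch only: toy cases and a crude string-match scorer,
# not the study's actual data or evaluation method.

def is_correct(free_text_diagnosis: str, gold_diagnosis: str) -> bool:
    """Crude check: does the free-text answer mention the gold diagnosis?"""
    return gold_diagnosis.lower() in free_text_diagnosis.lower()

def accuracy(answers: list[str], gold: list[str]) -> float:
    """Fraction of answers that mention the corresponding gold diagnosis."""
    hits = sum(is_correct(a, g) for a, g in zip(answers, gold))
    return hits / len(gold)

# Hypothetical cases: (gold diagnosis, model answer, physician answer)
cases = [
    ("pulmonary embolism",
     "Most likely acute pulmonary embolism given tachycardia and hypoxia.",
     "Suspect pneumonia; recommend chest X-ray."),
    ("appendicitis",
     "RLQ pain with rebound tenderness suggests appendicitis.",
     "Appendicitis until proven otherwise."),
]

gold = [c[0] for c in cases]
model_acc = accuracy([c[1] for c in cases], gold)
doctor_acc = accuracy([c[2] for c in cases], gold)
print(f"model accuracy: {model_acc:.2f}, physician accuracy: {doctor_acc:.2f}")
```

In practice, grading free-text diagnoses is far harder than substring matching — synonyms, specificity, and partially correct answers all matter — which is one reason such studies typically rely on expert adjudication.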

Potential and Limitations

The results look impressive, but the study is only a first step. Serious questions remain: how well the model handles rare and atypical diagnoses, whether it can reliably explain its decisions to a doctor, and how AI can be integrated into real workflows without clinicians mechanically or blindly following its recommendations. Critically, AI cannot and should not replace a physician — it cannot see the patient, hear their voice, or conduct a physical examination, and it does not know their social circumstances or psychological state. A doctor's clinical judgment, experience, and intuition remain irreplaceable and critical for a good treatment outcome.

What this means

Language models can become a tool to support doctors — an assistant for a second opinion, quick verification of a diagnosis, or analysis of complex and contested cases. If the findings are confirmed on larger samples and in different geographic regions, this would open a new class of applications for LLMs in healthcare and could accelerate diagnosis. But the main thing remains unchanged: the doctor remains responsible for the clinical decision and for the patient.

ZK
Hamidun News
AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.