
MIT develops 'humble' AI for diagnosis that honestly shows doubts


AI-processed from MIT News; edited by Hamidun News
Source: MIT News. Collage: Hamidun News.

Researchers at MIT have proposed a new approach to medical AI: a system for clinical diagnosis should not pretend to be all-knowing, but openly show when it lacks data. Instead of an "oracle," the team wants to turn the model into a partner for doctors who helps gather missing information and doesn't push an authoritative answer.

Why AI Systems Need This

AI systems have long promised to accelerate diagnosis and help with treatment selection, but in clinics they have a dangerous weakness: they often sound too confident even when they're wrong. The MIT team cites previous research where ICU physicians leaned toward an algorithm's advice if they considered it reliable, even when their own intuition suggested otherwise. The same logic applies to patients: an authoritative tone increases trust even in incorrect recommendations.

The problem is not just the accuracy of models, but how they present answers. If a system issues a diagnosis as final truth with incomplete context, doctors get a false sense of certainty. In emergency departments and intensive care units this is especially risky: decisions are often made quickly, and the cost of error is high.

That is why the researchers propose teaching the model not just to answer, but to explicitly signal the limits of its confidence.

How the Approach Works

The team describes a framework that can be integrated into existing clinical decision support systems. Its first module forces the model to assess its own confidence before issuing a diagnostic prediction. To do this, the researchers use the Epistemic Virtue Score, a metric developed together with scientists from the University of Melbourne. Essentially, it is a self-awareness check: the model's confidence should match the complexity of the case and the amount of available data. If the system finds that its confidence is higher than the evidence allows, it should not force a conclusion but change its behavior:

  • mark the gap between confidence and data quality
  • request specific tests, medical history details, or missing symptoms
  • suggest a consultation with a specialist
  • hint at what information would reduce uncertainty
  • warn that the current answer requires cautious interpretation

"Right now we use AI as an oracle, but it can become a coach and true co-pilot," says Leo Anthony Celi.
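
The confidence check and the fallback behaviors listed above can be illustrated with a minimal Python sketch. It is not the MIT framework and does not compute the Epistemic Virtue Score, whose formula the article does not give. The `Assessment` class, the `evidence_support` field, the simple gap rule, and the `tolerance` value are all hypothetical and chosen only for illustration, as is the example case.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assessment:
    """Hypothetical container for one diagnostic suggestion."""
    diagnosis: str
    model_confidence: float   # model's self-reported confidence, 0..1
    evidence_support: float   # assumed 0..1 estimate of what the data can justify
    missing_items: List[str] = field(default_factory=list)


def humble_response(a: Assessment, tolerance: float = 0.1) -> dict:
    """Attach caveats when confidence outruns the evidence.

    The gap rule and tolerance are illustrative assumptions,
    not the Epistemic Virtue Score from the article.
    """
    gap = a.model_confidence - a.evidence_support
    if gap <= tolerance:
        # Confidence is roughly in line with the evidence: no caveats.
        return {"diagnosis": a.diagnosis, "overconfident": False, "actions": []}

    # Confidence exceeds what the data supports: change behavior.
    actions = [f"Flag gap of {gap:.2f} between confidence and data quality"]
    actions += [f"Request: {item}" for item in a.missing_items]
    if not a.missing_items:
        # Nothing specific to request, so escalate to a human expert.
        actions.append("Suggest consultation with a specialist")
    actions.append("Interpret this answer with caution")
    return {"diagnosis": a.diagnosis, "overconfident": True, "actions": actions}


# Made-up example: high confidence, thin evidence, two missing tests.
case = Assessment("sepsis", 0.9, 0.5, ["lactate level", "blood culture"])
result = humble_response(case)

# Made-up example: confidence close to what the evidence supports.
calm = humble_response(Assessment("sepsis", 0.6, 0.55))
```

In this sketch the model's answer is never suppressed; it is returned together with the list of caveats and requests, which matches the article's idea of a partner that helps gather missing information rather than an oracle.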

Celi's team previously helped create large datasets for medical AI, including the MIMIC database. Now researchers are trying to embed the new approach into systems running on that data, and show it to clinicians at Beth Israel Lahey Health. According to them, the same approach can be applied not only to text-based diagnostic assistants, but also to systems analyzing X-rays or selecting treatment tactics in the emergency room.

The Data Problem

Work on "humble" AI is part of a broader MIT program to create medical models that are designed not only for people, but together with them. The researchers separately emphasize the risk of data bias: many popular models are trained on public datasets from the US and inevitably inherit a particular perspective on disease, treatment, and healthcare organization. What is well described in one healthcare system may work worse in another or silently exclude entire patient groups.

There is also a more practical problem: electronic medical records were never designed as a source for training diagnostic models. They often lack the context that a doctor gets from conversation, observation, or experience. In addition, some patients never appear in such datasets at all because of limited access to healthcare, for example people from rural areas.

At MIT Critical Data workshops, researchers, doctors, sociologists, and patients themselves collectively check who's missing from the dataset and what structural biases the model might reinforce.

What It Means

The next stage of medical AI development is not just fighting for a few more percentage points of accuracy in benchmarks, but knowing how to doubt at the right moment. If this approach reaches real clinics, the model's value will lie not in replacing the doctor, but in more transparent, cautious, and collaborative work with them.

