AI chatbots misdiagnose more than 80% of early medical cases

A new study finds that consumer AI chatbots struggle to make accurate medical diagnoses, especially when patient information is incomplete.

Researchers tested 21 large language models and found that they often jump too quickly to a single diagnosis instead of considering multiple possibilities.

When the models were asked to perform a “differential diagnosis” with limited data, their failure rates exceeded 80%, showing major weaknesses in early clinical reasoning.

Accuracy improved significantly when more complete patient information was provided, with top models exceeding 90% accuracy in final diagnoses, the Financial Times has reported.

The study warns that while AI may support healthcare, it is not reliable enough to replace doctors, particularly in uncertain or early-stage assessments.