AI Health Tool Fails to Recognise Emergencies in Over Half of Cases, Study Finds

Study finds ChatGPT Health fails to recommend emergency care in over half of cases, raising safety concerns among medical experts.

Brit Brief 22/06/2026 00:43

A study published in the journal Nature Medicine has found that ChatGPT Health, OpenAI's AI-powered health advice platform, fails to recommend emergency care in more than half of cases where it is medically necessary. The research, led by Dr Ashwin Ramaswamy of the Icahn School of Medicine at Mount Sinai, tested the platform with 60 realistic patient scenarios and found it under-triaged 51.6% of emergencies, advising patients to stay home or book a routine appointment instead of seeking immediate hospital treatment.

Experts have described the findings as 'unbelievably dangerous', warning that the tool could lead to unnecessary harm or death. Alex Ruani, a doctoral researcher at University College London, noted that in one simulation, the platform advised a suffocating woman in 84% of cases to attend a future appointment she would not live to see. The study also found that the platform was nearly 12 times more likely to downplay symptoms when a 'friend' suggested the issue was not serious.

Dr Ramaswamy expressed particular concern about the platform's handling of suicidal ideation. When a patient described only suicidal thoughts, a crisis intervention banner appeared every time, but when normal lab results were added, the banner vanished in all 16 attempts. 'A crisis guardrail that depends on whether you mentioned your labs is not ready, and it's arguably more dangerous than having no guardrail at all,' he said.

An OpenAI spokesperson said the company welcomes independent research but argued that the study did not reflect real-world usage, noting that the model is continuously updated. However, researchers stressed that even simulated risks justify stronger safeguards and independent oversight.