Chatbots Turned Out to Be Incompetent in Medical Literacy

Half of the medical information provided by five popular AI chatbots is inaccurate or incomplete, according to a study published in BMJ Open.

The experiment tested models including Gemini, DeepSeek, ChatGPT, and Grok: each was asked ten open- and closed-ended questions across five categories. The goal was to assess the models' resilience to myths and potentially dangerous advice. Of the responses received, 30 percent were rated "moderately problematic" and 20 percent "extremely dangerous."

Grok performed worst: 58 percent of its responses were potentially dangerous. Gemini performed best, making fewer mistakes and providing more scientifically grounded answers. The models handled questions about vaccination and cancer well but made serious errors on topics related to nutrition, sports supplements, and stem cell therapy. Almost all responses were delivered with absolute confidence and without any recommendation to consult a doctor.

The researchers also found "hallucinations": the models fabricated non-existent articles and distorted quotations. Citation completeness averaged 40 percent, and the language of the responses was too complex for a lay reader. The authors called for oversight of AI and for public education: because verification methods lag behind the development of neural networks, relying on chatbots' medical advice is dangerous.

BB.LV Editorial