If you’ve been to a medical appointment in the past two or three years, chances are high that your doctor was using an AI scribe: software that listens in on the conversation, transcribing it and structuring it into the format of a medical note.
In theory it’s a cool idea, but pain points abound. Earlier this week, Ontario’s auditor general — an accountability officer acting under the Legislative Assembly of Ontario — released a special report warning that AI medical scribes were “not evaluated adequately,” and may present “fabricated information” to medical professionals.
First reported by Global News, the audit examined 20 AI scribe platforms and found that “all AI scribe systems from the 20 [government] approved vendors showed one or more inaccuracies at the procurement testing phase,” such as “hallucinations (fabrication), incorrect information, or missing or incomplete information.”
“Inaccuracies in medical notes generated by AI Scribe systems could potentially result in inadequate or harmful treatment plans that may potentially impact patient health outcomes,” the report declared.
Muddying the waters, Ontario’s Minister of Public and Business Service Delivery and Procurement, Stephen Crawford, noted that the hallucinations were observed during testing by provincial officials, and had not been recorded during actual medical visits.
“Let’s be very clear about that, that’s not actually in operational use with doctors, that’s in the optional stage where we’re reviewing the various scribes,” Crawford told Global News.
Still, the auditor general, Shelley Spence, noted that the various scribes are nonetheless in use by around 5,000 doctors across Ontario. Talking to reporters, Spence said that during her own medical visit, she went so far as to ask her physician to “please look at the transcript when you’re done.”
That news comes as another clinical AI tool, OpenEvidence, faces growing scrutiny in the US over hallucinations and incomplete answers.
As several doctors told NBC News, for example, OpenEvidence can occasionally draw overly strong conclusions from medical studies with relatively small sample sizes.
While many physicians express appreciation for the new tools, it remains to be seen how they fare under real-world conditions — and how the medical world will judge them once the AI hype wears off.
More on AI hallucinations: New Wikipedia Clone Made Entirely of AI Hallucinations