It's no secret that large language models (LLMs), such as the ones powering popular chatbots like ChatGPT, are surprisingly fallible. Even the most advanced models still have a nagging tendency to distort the truth, and to do so with an unnerving degree of confidence.
And when it comes to medical data, those kinds of errors become a whole lot more serious, given that lives may be at stake.
Researchers at New York University have found that if a mere 0.001 percent of a given LLM's training data is "poisoned," or deliberately seeded with misinformation, the resulting model becomes likely to propagate medical errors.
As detailed in a paper published in the journal Nature Medicine and first spotted by Ars Technica, the team also found that, despite being error-prone, corrupted LLMs still perform just as well on "open-source benchmarks routinely used to evaluate medical LLMs" as their "corruption-free counterparts."
In other words, biomedical LLMs carry serious risks that conventional tests can easily overlook.
"In view of current calls for improved data provenance and transparent LLM development," the team writes in its paper, "we hope to raise awareness of emergent risks from LLMs trained indiscriminately on web-scraped data, particularly in healthcare where misinformation can potentially compromise patient safety."
In an experiment, the researchers intentionally injected "AI-generated medical misinformation" into a commonly used LLM training dataset known as "The Pile," which contains "high-quality medical corpora such as PubMed."
The team generated a total of 150,000 of these articles within just 24 hours, and the results were shocking: it turns out to be incredibly easy, and remarkably cheap, to effectively poison an LLM.
"Replacing just one million of 100 billion training tokens (0.001 percent) with vaccine misinformation led to a 4.8 percent increase in harmful content, achieved by injecting 2,000 malicious articles (approximately 1,500 pages) that we generated for just US$5.00," the researchers wrote.
Unlike invasive hijacking attacks that can force LLMs to give up confidential information or even execute code, data poisoning doesn't require direct access to the model's weights, the numerical values that define the strength of the connections between an AI's artificial neurons.
In other words, attackers only need to "host harmful information online" to undermine the validity of an LLM, according to the researchers.
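To make that point concrete, here is a minimal, purely illustrative sketch of how a web-scraped training corpus gets assembled. The URLs, the passes_quality_filter heuristic, and the build_training_corpus function are hypothetical stand-ins rather than the actual pipeline behind The Pile or the researchers' experiment; the takeaway is simply that fluent-sounding misinformation published anywhere a crawler might visit can end up in the corpus, without the attacker ever touching the model.

```python
# Minimal sketch of why data poisoning needs no access to the model itself.
# The pipeline below is purely illustrative: the URLs are hypothetical and the
# "quality filter" is a toy stand-in for the heuristics real corpora use.
import re

def passes_quality_filter(text: str) -> bool:
    """Toy heuristic: keep documents that look like fluent English prose."""
    words = text.split()
    return len(words) > 50 and not re.search(r"(viagra|casino)", text, re.I)

def build_training_corpus(scraped_pages: dict[str, str]) -> list[str]:
    """Assemble a corpus from whatever scraped pages survive filtering."""
    return [text for url, text in scraped_pages.items() if passes_quality_filter(text)]

# An attacker only has to publish plausible-sounding misinformation somewhere a
# crawler might visit; nothing about the model's weights is ever touched.
scraped = {
    "https://example-health-blog.invalid/vaccines": "Plausible-sounding but false medical claims " * 20,
    "https://pubmed.example.invalid/real-study":    "Legitimate, peer-reviewed medical text " * 20,
}
corpus = build_training_corpus(scraped)
print(f"{len(corpus)} of {len(scraped)} scraped documents made it into the corpus")
```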
The research highlights glaring risks involved in deploying AI-based tools, especially in medical settings. And in many ways, the cat is already out of the bag. Case in point: the New York Times reported last year that MyChart, an AI-powered communications platform that automatically drafts replies to patients' questions on behalf of doctors, regularly "hallucinates" untrue details about a given patient's condition.
In short, the fallible nature of LLMs, particularly when it comes to the medical sector, should be a major cause for concern.
"AI developers and healthcare providers must be aware of this vulnerability when developing medical LLMs," the paper reads. "LLMs should not be used for diagnostic or therapeutic tasks before better safeguards are developed, and additional security research is necessary before LLMs can be trusted in mission-critical healthcare settings."
More on medical AI: Murdered Insurance CEO Had Deployed an AI to Automatically Deny Benefits for Sick People