
Scientists Can’t Replicate AI Studies. That’s Bad News.

If we want to trust AI, we're going to have to replicate it.

By Claudia Geib

Published Feb 16, 2018 4:12 PM EST

Image: Thomas Peham/Futurism

The specter of replication

The field of artificial intelligence (AI) may soon have to face a ghost that’s haunted many a scientific field lately: the specter of replication. For a research study to be considered scientifically robust, the scientific method says that it must be possible for other researchers to reproduce its results under the same conditions. Yet because most AI researchers don’t publish the source code they use to create their algorithms, it’s been largely impossible for researchers to do that.

Science magazine reports that at a meeting of the Association for the Advancement of Artificial Intelligence (AAAI), computer scientist Odd Erik Gundersen shared a report that found that only six percent of 400 algorithms presented at two AI conferences in the past few years came with the algorithm’s code. Only one in three shared the data used to test the program, and just half shared “pseudocode,” a summary that describes the algorithm in limited detail.

Gundersen says that a change is going to be necessary as the field grows. “It’s not about shaming,” he told Science. “It’s just about being honest.”

Harmful secrecy

Replication is essential to proving that the results an experiment produces hold up consistently in the real world, and that they didn’t arise by chance; an AI that was only tested by its creators might not produce the same results at all when run on a different computer, or when fed different data. That wouldn’t be very helpful if you were asking that AI to do a specific task, whether that’s searching for something on your phone or running a nuclear reactor. You want to be assured that the program you’re running will do what you want it to do.


The problem can be particularly acute when it comes to machine learning algorithms, which gain knowledge from experience. Using different data to train a machine learning AI can completely change how it reacts: “There’s randomness from one run to another,” Nan Rosemary Ke, a Ph.D. student at the University of Montreal, told Science. “[You can get] really, really lucky and have one run with a really good number. That’s usually what people report.”
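As a rough illustration of Ke’s point, here is a minimal Python sketch, not taken from any of the studies discussed. It trains a toy perceptron several times on the same synthetic data, changing only the random seed that controls the starting weights and the order of the training examples, then compares the best single run with the average across runs. The function names, the data, and every parameter value are hypothetical and chosen only for illustration.

```python
import random
import statistics

DATA_SEED = 0  # assumption: one fixed seed so every run sees the same data


def make_data(n, rng):
    """Draw n noisy 2-D points from two overlapping classes."""
    points = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 1.0 if label == 1 else -1.0
        points.append(((rng.gauss(centre, 1.5), rng.gauss(centre, 1.5)), label))
    return points


def predict(w, b, x):
    """Classify a point with a linear decision rule."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


def train_and_evaluate(seed, epochs=5, lr=0.1):
    """Train a toy perceptron and return its accuracy on held-out points.

    Only the initial weights and the example order depend on `seed`,
    so repeating a seed repeats the run exactly.
    """
    data_rng = random.Random(DATA_SEED)
    train = make_data(40, data_rng)
    test = make_data(200, data_rng)

    run_rng = random.Random(seed)
    w = [run_rng.uniform(-1, 1), run_rng.uniform(-1, 1)]
    b = 0.0
    for _ in range(epochs):
        run_rng.shuffle(train)
        for x, y in train:
            error = y - predict(w, b, x)  # -1, 0, or +1
            w[0] += lr * error * x[0]
            w[1] += lr * error * x[1]
            b += lr * error

    correct = sum(predict(w, b, x) == y for x, y in test)
    return correct / len(test)


def summarize(seeds):
    """Run once per seed and report the best run next to the average."""
    scores = [train_and_evaluate(s) for s in seeds]
    return {
        "scores": scores,
        "best": max(scores),
        "mean": statistics.mean(scores),
        "stdev": statistics.stdev(scores),
    }


if __name__ == "__main__":
    report = summarize(range(10))
    print(f"best single run: {report['best']:.3f}")
    print(f"mean of {len(report['scores'])} runs: "
          f"{report['mean']:.3f} +/- {report['stdev']:.3f}")
```

Reporting the mean and spread over several seeds, and publishing the seeds themselves, gives other researchers something they can actually rerun and check; reporting only the best run does not.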

There are lots of reasons not to share source code or data; it could be under construction, or proprietary information that belongs to a company. It might ride on the back of other code that’s not published. Researchers could be tight-fisted about sharing code because they’re worried about competition. In some cases, Science reports, code may even be missing entirely: on a broken or stolen laptop, stored on a lost disk, or eaten by the dog (so to speak).

That’s not great news for the future of the industry. Artificial intelligence has been seeing an incredible boom, and it’s likely we’ll have ultra-intelligent AI playing a growing role in society in the coming years. If we’re going to let that happen, we want to know we can trust every AI we implement (and we already don’t trust it very much); and if we want to trust it, we have to replicate it.