As smart as artificial intelligence systems seem to get, they’re still easily confused by hackers who launch so-called adversarial attacks: cyberattacks that trick algorithms into misinterpreting the data they’re given, sometimes to disastrous ends.
To bolster AI’s defenses against these dangerous hacks, scientists at the Australian research agency CSIRO say in a press release that they’ve created a sort of AI “vaccine” that trains algorithms on weak adversaries so they’re better prepared for the real thing, not entirely unlike how vaccines expose our immune systems to weakened or inactivated viruses so they can fight off infections in the future.
CSIRO found that AI systems like those that steer self-driving cars could easily be tricked into thinking that a stop sign on the side of the road was actually a speed limit sign, a particularly dangerous example of how adversarial attacks could cause harm.
The scientists developed a way to distort the training data fed into an AI system so that it isn’t as easily fooled later on, according to research presented at the International Conference on Machine Learning last week.
“We implement a weak version of an adversary, such as small modifications or distortion to a collection of images, to create a more ‘difficult’ training data set,” Richard Nock, head of machine learning at CSIRO, said in the press release. “When the algorithm is trained on data exposed to a small dose of distortion, the resulting model is more robust and immune to adversarial attacks.”
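To make the idea concrete, here is a minimal sketch of training on lightly distorted data. It is not CSIRO's method, which the press release does not spell out. It swaps in a well-known stand-in, a fast-gradient-sign style perturbation applied to a tiny logistic regression model. The function names and the `epsilon` "dose" parameter are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def input_gradient(w, b, X, y):
    # Gradient of the logistic loss with respect to each input row:
    # d(loss)/dx = (sigmoid(w.x + b) - y) * w
    return (sigmoid(X @ w + b) - y)[:, None] * w[None, :]

def weak_adversary(w, b, X, y, epsilon):
    # Assumed stand-in for the "weak adversary": move every feature by at
    # most epsilon in the direction that increases the model's loss.
    return X + epsilon * np.sign(input_gradient(w, b, X, y))

def train(X, y, epsilon=0.0, steps=500, lr=0.1):
    # Plain gradient descent; when epsilon > 0, each step trains on a
    # freshly distorted copy of the data instead of the clean data.
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        X_step = weak_adversary(w, b, X, y, epsilon) if epsilon > 0 else X
        err = sigmoid(X_step @ w + b) - y
        w -= lr * (X_step.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b

def accuracy(w, b, X, y):
    # Fraction of rows whose predicted class matches the 0/1 label.
    return float(((sigmoid(X @ w + b) > 0.5) == (y > 0.5)).mean())
```

One way to try it: train one model with `epsilon=0.0` and another with a small value such as `epsilon=0.1`, then compare `accuracy` on inputs produced by `weak_adversary` against each model. How much the small dose helps depends on the data and the size of the attack.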
READ MORE: Researchers develop ‘vaccine’ against attacks on machine learning [CSIRO newsroom]