In Brief
DeepMind Technologies developed an artificial intelligence program that mastered several Atari games in a matter of hours. A new artificial intelligence startup called Osaro now claims to have developed an AI program that could do the same but 100 times as fast.
The Breakthrough

A new artificial intelligence startup called Osaro aims to make robots learn faster by combining deep-reinforcement learning with observation of humans performing the same tasks. In fact, the company says it has already accomplished this in video games.

Back in December 2013, the British artificial intelligence company DeepMind Technologies unveiled an artificial intelligence program that had mastered seven Atari games after just a few hours (something that would impress even the most renowned gamer). Indeed, it could even play the games better than most human players.

Google soon acquired the company and its deep-reinforcement learning technology for $400 million. Now, Osaro claims to have improved on that technique, getting similar results, but 100 times as fast.

The Implications

Deep-reinforcement learning stems from deep learning, a method wherein mountains of raw data are processed efficiently by multiple layers of neural networks. Facial, voice, and text recognition software (as well as video classification software) all rely on deep learning. Deep-reinforcement learning takes deep learning's ability to classify inputs accurately and applies it toward achieving specified goals.

Deep-reinforcement learning improves on this by adding control to the mix. In short, systems with deep-reinforcement learning work by trial and error: they repeat a task over and over again, adjusting their behavior according to the outcome, until they reach their goal.
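To make the repeat-until-it-works idea concrete, here is a minimal sketch in Python. It is not DeepMind's or Osaro's code: it uses a plain lookup table in place of a deep neural network, and the toy "corridor" task, the function names, and the parameter values are all assumptions chosen for illustration. The program starts in the leftmost cell, earns a reward only when it reaches the rightmost cell, and gradually learns which move pays off.

```python
import random

def train_corridor(episodes=500, length=6, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy corridor: start in cell 0, goal is the last cell."""
    rng = random.Random(seed)
    goal = length - 1
    # q[state][action]: action 0 steps left, action 1 steps right.
    q = [[0.0, 0.0] for _ in range(length)]
    for _ in range(episodes):
        state = 0
        while state != goal:
            best = max(q[state])
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore: try a random move
            else:
                # exploit: take the best-known move, breaking ties at random
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])
            # Moving left from cell 0 leaves the program where it is.
            next_state = max(0, min(goal, state + (1 if action == 1 else -1)))
            reward = 1.0 if next_state == goal else 0.0
            # Nudge the estimate toward the reward plus discounted future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

def greedy_policy(q):
    """Preferred move in every non-goal cell after training."""
    return ["right" if right > left else "left" for left, right in q[:-1]]
```

Calling `greedy_policy(train_corridor())` should show the program preferring to move right in every cell, a habit it was never told about and picked up purely from repeated attempts. In a deep-reinforcement learning system, a neural network would stand in for the table so the same loop could cope with raw inputs such as video frames.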

In the press release, Osaro President and CEO Derik Pridmore noted that such systems often offer innovative insights: “The power of deep reinforcement is that you can discover behaviors that a human would not have guessed or thought to hand code.”

However, training a new AI system can be time-consuming. Supercomputers can compress the training needed to reach a given goal into several hours, but a real-world robot learning the same way would take years.

Hence, taking the technology from playing video games to robotics is Osaro’s main objective. “A robot is a physically embodied system that takes time to move through space,” says Pridmore. “If you want to use basic deep-reinforcement learning to teach a robot to pick up a cup from scratch, it would literally take a year or more.”

Osaro plans to accelerate this process by letting the system observe people performing tasks and use that information as a jumping-off point. “It doesn’t copy a human and you don’t have to play precisely or very well. You just give it a reasonable idea of what to do,” says Pridmore.

To speed up the process, Osaro put together a games-playing program that watches how a human player performs in several games, and then mimics that behavior to train itself. By doing this, Osaro says, its AI can learn a game 100 times as fast as DeepMind’s program.
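The article does not describe how Osaro's program actually uses what it watches, so the following Python sketch is only one simple way the idea of a human-provided head start could look. Everything in it is an assumption made for illustration: the toy corridor task (start in cell 0, goal in the last cell), the `DEMOS` recording, and both function names. Moves a human was seen making get a small initial preference, and the human is deliberately imperfect, stepping the wrong way twice.

```python
from collections import Counter, defaultdict

# Hypothetical recording of a human player: (cell, move) pairs,
# where move 0 steps left and move 1 steps right. Not every move is a good one.
DEMOS = [(0, 1), (1, 1), (1, 0), (1, 1), (2, 1),
         (3, 1), (3, 1), (4, 0), (4, 1), (4, 1)]

def seed_from_demonstrations(demos, n_states, n_actions, bonus=0.1):
    """Build a starting value table that mildly favors demonstrated moves."""
    counts = defaultdict(Counter)
    for state, action in demos:
        counts[state][action] += 1
    q = [[0.0] * n_actions for _ in range(n_states)]
    for state, seen in counts.items():
        total = sum(seen.values())
        for action, n in seen.items():
            # Moves the human made more often get a larger head start.
            q[state][action] = bonus * n / total
    return q

def greedy_rollout(q, start=0, max_steps=50):
    """Follow the highest-valued move each step; return steps to goal, or None."""
    goal = len(q) - 1
    state = start
    for step in range(max_steps):
        if state == goal:
            return step
        action = max(range(len(q[state])), key=lambda a: q[state][a])
        state = max(0, min(goal, state + (1 if action == 1 else -1)))
    return max_steps if state == goal else None
```

With the table built by `seed_from_demonstrations(DEMOS, 6, 2)`, `greedy_rollout` walks straight to the goal before any trial and error has happened, whereas an all-zero table never gets there within the step limit. Ordinary reinforcement learning could then refine the seeded table, which fits Pridmore's point that the human only needs to give "a reasonable idea of what to do."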

Osaro’s technology will likely prove useful in reprogramming assembly line robots, a task that currently takes several weeks. The company’s AI could reduce that period to just one week.

Eventually, says Pridmore, the training process should be almost effortless. “In the future, you will be able to give a robot three buckets of parts, show it a finished product, and simply say, ‘Make something like this.’”

That day won’t be for a while, though. In the meantime, Osaro is planning to run simulated robotic demos in a virtual environment called Gazebo before launching with industrial robot manufacturers and their customers in 2017.