A single algorithm can teach AI to do very different tasks, like turning a block in a robotic hand or playing video games.
ROBOTIC HIGH FIVE. On Monday, researchers at OpenAI, the nonprofit AI research company co-founded by Elon Musk, introduced Dactyl, an AI system trained to control a robotic hand. According to the researchers, the system can manipulate physical objects in the hand with dexterity never before possible for AI.
The task Dactyl tackled might sound like something you'd teach a toddler: take this six-sided block and move it around until a certain side is on top. Unlike a toddler, though, Dactyl needed more than a century's worth of experience to learn how to expertly complete the task. But thanks to powerful computers, the researchers were able to pack all that experience into just 50 "real-world" hours.
PRACTICE MAKES (ALMOST) PERFECT. The researchers trained Dactyl in a simulated environment (that is, a digital setting with a computer-generated hand) using a technique called domain randomization. They built certain parameters into their simulated environment, such as the cube's size or the angle of gravity, and then randomized those variables. Many simulated hands practiced under these varying conditions at once. By pushing Dactyl to adapt to so many different virtual scenarios, the researchers prepared the AI to adapt to scenarios in the real world.
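For readers who want to see the idea concretely, here is a minimal Python sketch of domain randomization. It is not OpenAI's code: the names `SimParams`, `sample_params`, and `make_training_batch` are hypothetical, and the numeric ranges are placeholder assumptions chosen only for illustration. It shows each simulated hand receiving its own randomly drawn cube size and gravity tilt.

```python
import random
from dataclasses import dataclass


@dataclass
class SimParams:
    """Physical settings for one simulated hand (hypothetical structure)."""
    cube_size_m: float       # edge length of the cube, in meters
    gravity_tilt_deg: float  # how far gravity is tilted from straight down


def sample_params(rng: random.Random) -> SimParams:
    """Draw one random set of simulation settings.

    The ranges below are illustrative assumptions, not values from the research.
    """
    return SimParams(
        cube_size_m=rng.uniform(0.045, 0.055),
        gravity_tilt_deg=rng.uniform(-5.0, 5.0),
    )


def make_training_batch(num_hands: int, seed: int = 0) -> list:
    """Give each simulated hand its own randomized environment."""
    rng = random.Random(seed)
    return [sample_params(rng) for _ in range(num_hands)]
```

Calling `make_training_batch(4)` would return four differently configured environments. In a real training setup, each would be handed to a physics simulator, so the learning system never sees exactly the same world twice.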
After 50 hours of training in the simulated environment, the AI could manipulate a real-world robotic hand to successfully complete its given task 50 times in a row (a successful completion was one in which the system didn’t drop the block or take longer than 80 seconds). To figure out how to move the hand to complete the task, it simply needed to look at the block through a trio of cameras.
ONE ALGORITHM TO TRAIN THEM ALL. As the researchers note in their blog post, they trained Dactyl using the same algorithm that they used for OpenAI Five, a team of five neural networks trained to play the computer strategy game Dota 2. Dactyl's success suggests it's possible to build a general-purpose algorithm that can teach AI to complete two very different tasks. This could make it much easier for researchers to train AI for lots of different purposes in the future, since they wouldn't need to start the process from scratch.
READ MORE: Learning Dexterity [OpenAI Blog]