- The problem with existing systems is the large variation in exactly where (in which frames) the action appears in a video; without 3D information, it's hard for the robot to infer which part of the image to model and how to associate a specific action, such as a hand movement, with a specific object.
- The researchers took a crack, if you will, at this problem by developing a deep-learning system that uses object recognition (it's an egg) and grasping-type recognition (the hand is using a "precision large diameter" grasp), and that can also learn which tool to use and how to grasp it.
- Their system handles all of this with recognition modules based on convolutional neural networks (CNNs), and it predicts the most likely action to take using language derived by mining a large database.
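The combination step described above can be sketched in miniature. This is an illustrative toy only, not the researchers' method: the actual system uses CNN-based recognition modules, while here the recognition outputs are stubbed as fixed probability dictionaries, and the action table, object names, and grasp labels are all hypothetical.

```python
# Toy sketch: combine stubbed object-recognition and grasp-type-recognition
# outputs to pick the most likely manipulation action. The real system's
# CNN modules and language-mined action model are replaced with fixed data.

def predict_action(object_probs, grasp_probs, action_model):
    """Score each candidate action by how well it matches the recognized
    object and grasping type, then return the highest-scoring action."""
    best_action, best_score = None, 0.0
    for action, (obj, grasp) in action_model.items():
        score = object_probs.get(obj, 0.0) * grasp_probs.get(grasp, 0.0)
        if score > best_score:
            best_action, best_score = action, score
    return best_action

# Stubbed recognition outputs (hypothetical confidence values).
object_probs = {"egg": 0.8, "bowl": 0.2}
grasp_probs = {"precision large diameter": 0.7, "power small diameter": 0.3}

# Hypothetical action model: each action maps to the (object, grasp type)
# pair that would suggest it.
action_model = {
    "crack egg": ("egg", "precision large diameter"),
    "stir bowl": ("bowl", "power small diameter"),
}

print(predict_action(object_probs, grasp_probs, action_model))  # → crack egg
```

The design point is simply that neither recognition signal alone is enough: the grasp type disambiguates what the hand intends to do with the object it is holding.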
Robot learns to use tools by ‘watching’ YouTube videos