Cornell researchers are using instructional videos from the Internet to teach robots the step-by-step instructions required to perform certain tasks. This ability may prove necessary in a future where robots are expected to handle mundane work such as cooking, cleaning, and other household chores.

Robots such as these could be of great help to the elderly and the disabled, though it remains to be seen when (or if) they will actually become available for use. Early tests like these may help answer that question.

The researchers call the project "RoboWatch." It is made possible by the common underlying structure found in most how-to videos, as well as by the sheer abundance of material available. There are hundreds of thousands of YouTube videos on basic topics, such as how to make an omelette.

After viewing numerous videos of a given task, the robot can note what they all have in common and build the necessary instructions from that.

Making Sense of the Information
Scanning several videos on the same how-to topic, a computer finds instructions they have in common and combines them into one step-by-step series. Credit: Cornell University

A key feature of the system is that it is "unsupervised," said Ozan Sener, graduate student and lead author of a paper on the video parsing method. Previously, a human had to explain to the robot what it was observing. For example, to teach a robot to recognize objects, it is shown pictures of the objects while a human identifies them by name.

Soon, a robot will be able to do that all on its own by looking up instructions itself.

When a robot must perform a new task, its computerized brain will first search YouTube for relevant videos on the subject. An algorithm attempts to filter out videos that are not instructional and are therefore of no use to the robot. The computer then scans the remaining videos thoroughly, looking for patterns while reading the subtitles for frequently used terms. It combines all of these into a single sequence of instructions it can follow.
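The subtitle part of that process can be illustrated with a small sketch. This is not the researchers' actual method, which also analyzes the video itself; it is a simplified stand-in that assumes each video's subtitles are already available as a list of text lines. The function name `common_steps`, the `min_share` threshold, and the tiny stop-word list are all hypothetical choices made for illustration. The sketch keeps terms that appear in most transcripts and orders them by how early, on average, each is first mentioned.

```python
from collections import defaultdict

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "and", "to", "of", "in", "into", "it", "then"}


def common_steps(transcripts, min_share=0.6):
    """Return terms found in at least `min_share` of the transcripts,
    ordered by the average relative position of their first mention."""
    positions = defaultdict(list)
    for lines in transcripts:
        # Flatten the subtitle lines into lowercase words without punctuation.
        words = [w.strip(".,!?").lower() for line in lines for w in line.split()]
        words = [w for w in words if w and w not in STOP_WORDS]
        if not words:
            continue
        # Record where each term first appears, scaled to the range [0, 1).
        first_seen = {}
        for i, w in enumerate(words):
            first_seen.setdefault(w, i / len(words))
        for w, pos in first_seen.items():
            positions[w].append(pos)
    needed = min_share * len(transcripts)
    # Keep only terms shared by enough videos, then sort by average position.
    shared = {w: sum(p) / len(p) for w, p in positions.items() if len(p) >= needed}
    return sorted(shared, key=shared.get)


# Made-up subtitles for three omelette videos, purely for demonstration.
transcripts = [
    ["Crack the eggs", "Whisk the eggs", "Pour into pan", "Fold the omelette"],
    ["First crack eggs", "whisk well", "pour in the pan", "fold and serve"],
    ["crack two eggs", "add salt", "whisk", "pour", "fold"],
]
print(common_steps(transcripts))
```

Terms that show up in only one video, such as "salt" here, fall below the threshold and are dropped, which loosely mirrors the idea of keeping only what the videos have in common.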

In similar projects, robots are being taught to receive verbal instructions from humans. In the future, it is hoped that information from other sources, such as Wikipedia, will also be accessible.

The knowledge acquired from the YouTube videos is made available through RoboBrain, an online knowledge database that robots anywhere can consult for help with their jobs. The video below shows what was discovered.

