Researchers Taught This AI the Same Way Your Parents May Have Taught You
Rewards and punishments can go a long way in the machine world.
Evolution of Learning
The process of learning through trial and error is something most humans take for granted, mainly because our brains have evolved to let us pick up new ideas as efficiently as possible. Humans are also capable of taking instructions and learning through guidance; this process is basically how parents teach kids basic language skills. By showing young children images, repeating lessons, and offering positive reinforcement, parents help kids absorb information and eventually associate an image with a word.
While artificial intelligence (AI) systems are capable of learning through trial and error, scientists are now exploring whether they are also capable of learning based on natural language commands. The goal is to teach a robot to do things in a human way, which makes it both faster and more convenient for humans.
The Chinese tech company Baidu made a breakthrough: its researchers successfully taught a virtual agent to navigate its 2D environment using a combination of reward and punishment. Whenever the agent hit a wall, it was punished; each time it successfully located a named object, it was rewarded. Through repeated commands, the study showed that the AI learned to associate each object with its word, and it even developed a basic sense of grammar over the course of the study.
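The reward-and-punishment loop described above is the core idea behind reinforcement learning. The sketch below is a generic, minimal illustration of that idea (tabular Q-learning on a small grid), not Baidu's actual system, which also grounds natural-language commands; the grid size, reward values, and learning parameters here are all illustrative assumptions.

```python
import random

# Minimal tabular Q-learning sketch of the reward/punishment idea:
# the agent roams a small 2D grid, is punished for hitting walls,
# and rewarded for reaching a goal cell. Illustrative only -- not
# Baidu's language-grounded agent.

GRID = 5                       # 5x5 grid (assumed size)
GOAL = (GRID - 1, GRID - 1)    # "object" sits in the far corner
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

Q = {}                         # Q[(state, action_index)] -> learned value
alpha, gamma, epsilon = 0.5, 0.9, 0.1  # illustrative hyperparameters

def step(state, action):
    """Apply an action; walls punish (-1), the goal rewards (+10)."""
    nx, ny = state[0] + action[0], state[1] + action[1]
    if not (0 <= nx < GRID and 0 <= ny < GRID):
        return state, -1.0     # bumped a wall: punishment, stay put
    if (nx, ny) == GOAL:
        return (nx, ny), 10.0  # found the object: reward
    return (nx, ny), 0.0

def train(episodes=500):
    random.seed(0)
    for _ in range(episodes):
        state = (0, 0)
        while state != GOAL:
            if random.random() < epsilon:   # occasionally explore
                a = random.randrange(len(ACTIONS))
            else:                           # otherwise exploit best known
                a = max(range(len(ACTIONS)),
                        key=lambda i: Q.get((state, i), 0.0))
            nxt, reward = step(state, ACTIONS[a])
            best_next = max(Q.get((nxt, i), 0.0)
                            for i in range(len(ACTIONS)))
            old = Q.get((state, a), 0.0)
            # Standard Q-learning update toward reward + discounted future
            Q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt

def greedy_path_length(start=(0, 0), limit=50):
    """Follow the learned policy greedily; steps to the goal, else None."""
    state, steps = start, 0
    while state != GOAL and steps < limit:
        a = max(range(len(ACTIONS)),
                key=lambda i: Q.get((state, i), 0.0))
        state, _ = step(state, ACTIONS[a])
        steps += 1
    return steps if state == GOAL else None

train()
```

After training, following the highest-valued action in each cell carries the agent from the start to the goal; the shortest possible route on this grid is 8 steps.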
What AI Understands
According to Baidu:
Applying past knowledge to a new task is very easy for humans but still difficult for current end-to-end learning machines. Although machines may know what a “dragon fruit” looks like, they can’t perform the task “cut the dragon fruit with a knife” unless they have been explicitly trained with the dataset containing this command. By contrast, our agent demonstrated the ability to transfer what they knew about the visual appearance of a dragon fruit as well as the task of “cut X with a knife” successfully, without explicitly being trained to perform “cut the dragon fruit with a knife.”
The research ultimately stands as proof of concept that algorithms can learn language and navigation simultaneously and apply the retained knowledge in a manner similar to humans. The team now hopes to repeat the study in a 3D environment, and eventually use the technique to make AI that’s more intuitive and useful for real-world applications. It is also a step toward more human-like AI: humans are experts at applying knowledge in ways different from how we first learned it (the skills behind building a sandcastle transfer to building a snowman, for example), while computers struggle to do the same. Though the initial study was simple, the implications are exciting for the future of AI.