In Brief
Researchers from DeepMind have allowed artificially intelligent systems to think more like humans by developing a neural network that specializes in relational reasoning. The neural network allows AIs to make sense of objects in a 3D environment.
The Relationship of Things
One of the many abilities of the human mind that we tend to take for granted is making sense of objects by relating them to one another. The mind can tell where an object is relative to what’s around it. This kind of inference is called relational reasoning. Now, it looks like an artificial intelligence (AI) might just be able to pull it off as well.
It’s no small feat for an AI to learn this 3-dimensional sense. Researchers from DeepMind, the AI arm of Google’s parent company, Alphabet, were able to develop a relational reasoning module for neural networks. The effort is part of developing AI systems capable of cognition with the same flexibility and efficiency that the human mind possesses.
This relation network (RN) module, which can be plugged into other neural networks, allows an AI to analyze pairs of objects and deduce the relationships between them. Details of the researchers’ work were published online in two separate studies.
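The pairwise idea can be illustrated with a rough sketch. The article does not describe the module's internals, so what follows is an assumption-laden toy rather than DeepMind's implementation: one small function `g` scores every ordered pair of objects, the scores are summed, and a second function `f` turns the sum into an answer. The helper names (`make_mlp`, `relation_network`), the layer sizes, and the random, untrained weights are all hypothetical.

```python
import random
from itertools import product


def make_mlp(in_dim, hidden_dim, out_dim, rng):
    """Return a tiny two-layer perceptron with random, untrained weights."""
    w1 = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.uniform(-1, 1) for _ in range(hidden_dim)] for _ in range(out_dim)]

    def forward(x):
        # Hidden layer with ReLU, followed by a linear output layer.
        hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    return forward


def relation_network(objects, g, f):
    """Apply g to every ordered pair of objects, sum the results, then apply f."""
    # o_i + o_j concatenates the two feature lists into one input for g.
    pair_outputs = [g(o_i + o_j) for o_i, o_j in product(objects, repeat=2)]
    summed = [sum(values) for values in zip(*pair_outputs)]
    return f(summed)


rng = random.Random(0)
OBJECT_DIM = 4  # hypothetical number of features per object

g = make_mlp(2 * OBJECT_DIM, 8, 8, rng)  # looks at one pair at a time
f = make_mlp(8, 8, 3, rng)               # reads the summed pair information

# Five made-up objects, each a list of OBJECT_DIM numbers.
objects = [[rng.uniform(-1, 1) for _ in range(OBJECT_DIM)] for _ in range(5)]
output = relation_network(objects, g, f)
print(output)
```

Because the pair scores are summed, the result in this sketch does not depend on the order in which the objects are listed, which suits a scene that has no natural ordering of its objects.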
From Abstractions to Realizations
The researchers trained their RN using images of 3D shapes in various colors and sizes. After it analyzed the objects, they asked the neural network questions such as, “What size is the cylinder that is left of the brown metal thing that is left of the big sphere?” The images and questions come from a visual question answering task called CLEVR. The results were impressive.
“State-of-the-art results on CLEVR using standard visual question answering architectures are 68.5 percent, compared to 92.5 percent for humans,” according to a post by DeepMind. “But using our RN-augmented network, we were able to show super-human performance of 95.5 percent.”
Such a system could greatly improve visual learning algorithms as well as the AI in virtual assistants. “You can imagine an application that automatically describes what is happening in a particular image, or even video for a visually impaired person,” DeepMind researcher Adam Santoro told New Scientist in an interview.
As clever as this system could be, DeepMind researchers believe there’s still a long way to go before it could be used in our daily lives. “There is a lot of work needed to solve richer real-world data sets,” Santoro added in the interview.