DeepMind Develops a Neural Network That Can Make Sense of Objects Around It

The better to see you with.

6. 15. 17 by Christianna Reedy

The Relationship of Things

One of the many abilities of the human mind that we tend to take for granted is making sense of objects by relating them to one another. The mind can tell where an object is relative to what’s around it. This kind of inference is called relational reasoning. Now, it looks like an artificial intelligence (AI) might just be able to pull it off as well.


It’s no small feat for an AI to learn this sense of how objects relate in space. Researchers from DeepMind, the AI arm of Google’s parent company, Alphabet, have developed a relational reasoning module for neural networks. The effort is part of a push to build AI systems capable of cognition with the same flexibility and efficiency as the human mind.

This relation network (RN) module, which can be plugged into other neural networks, could allow an AI to analyze pairs of objects and deduce the relationships between them. Details of the work were published online in two separate studies.
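
To make the idea of analyzing pairs of objects concrete, here is a minimal sketch in Python. It assumes the module scores every ordered pair of object vectors with one shared function, sums those scores, and passes the sum through a second function. The names (`relation_module`, `make_layer`, `dense_relu`), the layer sizes, and the random, untrained weights are all illustrative assumptions, not DeepMind's implementation.

```python
import itertools
import random


def make_layer(n_in, n_out, rng):
    """Random weights for one dense layer (illustrative and untrained)."""
    return [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]


def dense_relu(weights, x):
    """Apply one dense layer followed by a ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def relation_module(objects, g_weights, f_weights):
    """Score each ordered pair of objects with a shared function g,
    sum the pair scores, then pass the sum through a second function f."""
    pair_sum = [0.0] * len(g_weights)
    for a, b in itertools.product(objects, repeat=2):
        pair_out = dense_relu(g_weights, a + b)  # a + b concatenates the pair
        pair_sum = [s + p for s, p in zip(pair_sum, pair_out)]
    return dense_relu(f_weights, pair_sum)


rng = random.Random(0)
# Three hypothetical "objects", each described by a 4-number feature vector.
objects = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
g_weights = make_layer(8, 16, rng)   # input is a concatenated pair: 4 + 4
f_weights = make_layer(16, 2, rng)   # 2 outputs, e.g. scores for 2 answers
output = relation_module(objects, g_weights, f_weights)
```

Because the pair scores are summed, the result does not depend on the order in which the objects are listed, apart from floating-point rounding. In a real system the object vectors would come from another network and the weights would be learned during training.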

From Abstractions to Realizations

The researchers trained their RN using images of 3D shapes in various colors and sizes. After it analyzed the objects, the researchers used a visual question answering task called CLEVR to ask the neural network questions like, “What size is the cylinder that is left of the brown metal thing that is left of the big sphere?” The results were impressive.


A sample image from CLEVR. Image Credit: DeepMind

“State-of-the-art results on CLEVR using standard visual question answering architectures are 68.5 percent, compared to 92.5 percent for humans,” according to a post by DeepMind. “But using our RN-augmented network, we were able to show super-human performance of 95.5 percent.”

Such a system could greatly improve visual learning algorithms as well as the AI in virtual assistants. “You can imagine an application that automatically describes what is happening in a particular image, or even video for a visually impaired person,” DeepMind researcher Adam Santoro told New Scientist in an interview.

As clever as this system could be, DeepMind researchers believe there’s still a long way to go before it could be used in our daily lives. “There is a lot of work needed to solve richer real-world data sets,” Santoro added in the interview.
