New Processor Chips Promise Faster Neural Network Learning

These chips could fetch data as fast as it is being processed, thus speeding up processing and learning times.

By Jelor Gallego

Published Apr 4, 2016 10:00 AM EDT


Accelerating Learning

Deep neural networks (DNNs), like Google’s DeepMind or IBM’s Watson, are amazing machines. They can be taught to perform many mental tasks the way a human does, and they represent our best shot at actual artificial intelligence.

The challenge has always been training and teaching these machines. For most of the tasks they have to do, the machines tie up big-ticket supercomputers or data centers for days at a time. But scientists from IBM’s T.J. Watson Research Center are poised to change all that.

They have proposed using resistive processing units (RPUs) to drastically speed up learning times and cut computing power requirements. RPUs are theoretical chips that combine a CPU and non-volatile memory. They build on an emerging technology, resistive RAM, by putting large amounts of it directly onto a CPU. In theory, these chips could fetch data as fast as it is being processed, thus speeding up learning times.

According to the paper documenting the study, “problems that currently require days of training on a datacenter-size cluster with thousands of machines can be addressed within hours on a single RPU accelerator.

“For large DNNs with about 1 billion weights this massively parallel RPU architecture can achieve acceleration factors of 30,000 compared to state-of-the-art microprocessors while providing power efficiency.”

The Need for Speed

Current DNNs will need this upgrade, as modern neural networks must perform billions of operations simultaneously. That requires numerous CPU memory calls, which quickly add up over billions of cycles. The problem becomes more pronounced as harder tasks, such as natural voice recognition and true AI, are put on the table.
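To see why memory calls matter so much, here is a rough back-of-envelope sketch in Python. It is not taken from the IBM paper: the function name and every number in it are hypothetical placeholders, chosen only to show how a small per-operation fetch cost grows when it is paid a billion times.

```python
# Back-of-envelope model. Every number below is an assumed placeholder,
# not a measurement from the IBM paper.

def training_time_seconds(num_updates, compute_ns, fetch_ns):
    """Total time when each weight update pays for compute plus one memory fetch."""
    return num_updates * (compute_ns + fetch_ns) * 1e-9

NUM_UPDATES = 1_000_000_000   # one pass over roughly 1 billion weights
COMPUTE_NS = 1.0              # assumed cost of the arithmetic itself
OFF_CHIP_FETCH_NS = 100.0     # assumed cost of fetching a weight from separate memory
LOCAL_FETCH_NS = 0.0          # idealized case: the weight is stored where it is used

separate = training_time_seconds(NUM_UPDATES, COMPUTE_NS, OFF_CHIP_FETCH_NS)
colocated = training_time_seconds(NUM_UPDATES, COMPUTE_NS, LOCAL_FETCH_NS)

print(f"separate memory:   {separate:.1f} s per pass")
print(f"co-located memory: {colocated:.1f} s per pass")
print(f"ratio:             {separate / colocated:.0f}x")
```

The ratio printed here comes entirely from the placeholder values and says nothing about the acceleration factor quoted from the paper. The point is only that when fetching dominates each step, moving memory next to the processor removes most of the cost.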

The chips are still at the research stage, but scientists believe they can be produced using regular CMOS technology. Once developed, they could tackle Big Data problems such as natural speech recognition and translation between all world languages; real-time analytics on large streams of business and scientific data; and integration and analysis of multimodal sensory data flows from a massive number of IoT (Internet of Things) sensors.