This powerful brain-inspired chip is so efficient it could bring advanced AI to your phone

AI and conventional computers are a match made in hell.

The main reason is the way hardware chips are currently configured. Based on the traditional Von Neumann architecture, a chip isolates its memory storage from its main processors. Every calculation is a nightmarish Monday morning commute, with the chip constantly shuttling data back and forth between the two compartments, forming a notorious “memory wall”.

If you’ve ever been stuck in traffic, you know the frustration: it takes time and wastes energy. As AI algorithms get more and more complex, the problem gets worse and worse.

So why not design a brain-based chip, a potential perfect match for deep neural networks?

Enter compute-in-memory chips, or CIMs. True to their name, these chips compute and store data at the same site. Forget the commute; the chips are highly efficient work-from-home alternatives, removing the data traffic bottleneck and promising higher efficiency and lower power consumption.

Or so goes the theory. Most work on CIM chips running AI algorithms has focused solely on chip design, showcasing capabilities using simulations of the chip rather than running tasks on full-fledged hardware. The chips also struggle to adapt to multiple different AI tasks (image recognition, voice perception), which limits their integration into smartphones or other everyday devices.

This month, a study in Nature rebuilt CIM from the ground up. Rather than focusing only on the chip design, the international team, led by neuromorphic hardware experts Dr. HS Philip Wong at Stanford and Dr. Gert Cauwenberghs at UC San Diego, optimized the entire setup, from the technology to the architecture to the algorithms that calibrate the hardware.

The resulting NeuRRAM chip is a powerful neuromorphic computing juggernaut with 48 parallel cores and 3 million memory cells. Extremely versatile, the chip tackled several standard AI tasks, such as reading handwritten digits, identifying cars and other objects in images, and decoding voice recordings, with an accuracy of more than 84%.

Although the success rate may seem mediocre, it rivals existing digital chips while using significantly less power. For the authors, this is a step closer to bringing AI directly to our devices rather than having to send data to the cloud for computation.

“Performing these calculations on the chip instead of sending information to and from the cloud could enable faster, more secure, cheaper and more scalable AI in the future, and give more people access to the power of AI,” Wong said.

Neural inspiration

AI-specific chips are now an astonishing dime a dozen. From Google’s Tensor Processing Unit (TPU) and Tesla’s Dojo supercomputer architecture to Baidu and Amazon, tech giants are pouring millions into the AI chip gold rush to build processors that support increasingly sophisticated deep learning algorithms. Some are even leveraging machine learning to design chip architectures suitable for AI software, bringing the race full circle.

One particularly intriguing concept comes straight from the brain. As data passes through our neurons, they “wire up” into networks through physical “docks” called synapses. These structures, sitting on top of neural branches like little mushrooms, are multitaskers: they both compute and store data by modifying their protein composition.

In other words, neurons, unlike conventional computers, do not need to transfer data from memory to processors. This gives the brain its edge over digital devices: it’s very energy efficient and performs multiple calculations simultaneously, all packed into a three-pound jelly stuffed inside the skull.

Why not recreate aspects of the brain?

Enter neuromorphic computing. One hack was to use RRAMs, or resistive random-access memory devices (also known as “memristors”). RRAMs retain data even when cut off from power, storing it as changes in the resistance of their hardware. Similar to synapses, these components can be packed into dense arrays over a tiny area, creating circuits capable of very complex calculations without the bulk. When combined with CMOS, a manufacturing process for building circuitry in our current microprocessors and chips, the duo becomes even more powerful for running deep learning algorithms.
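
To make the idea concrete, here is a minimal Python sketch of how an array of resistive cells can multiply a matrix by a vector right where the weights are stored. It assumes an idealized array: each weight is held as a conductance, inputs are applied as voltages, and each output wire simply sums the currents flowing into it (Ohm's law plus Kirchhoff's current law). The function names, the paired-cell encoding for negative weights, and the conductance value are illustrative assumptions, not details taken from the study.

```python
# Idealized RRAM crossbar: weights live in the array as conductances,
# so the multiply-accumulate happens where the data is stored.

G_MAX = 40e-6  # assumed maximum cell conductance in siemens (illustrative)


def weights_to_conductances(weights):
    """Map signed weights in [-1, 1] to a (positive, negative) conductance pair.

    Differential encoding is one common scheme; it is an assumption here.
    """
    g_pos = [[max(w, 0.0) * G_MAX for w in row] for row in weights]
    g_neg = [[max(-w, 0.0) * G_MAX for w in row] for row in weights]
    return g_pos, g_neg


def crossbar_mvm(weights, voltages):
    """Return the output currents for input voltages applied to the rows."""
    g_pos, g_neg = weights_to_conductances(weights)
    n_rows, n_cols = len(weights), len(weights[0])
    currents = []
    for col in range(n_cols):
        # Ohm's law per cell (I = G * V); currents sum on the shared column wire.
        i_pos = sum(g_pos[row][col] * voltages[row] for row in range(n_rows))
        i_neg = sum(g_neg[row][col] * voltages[row] for row in range(n_rows))
        currents.append(i_pos - i_neg)
    return currents


weights = [[0.5, -1.0],
           [1.0, 0.25]]   # 2 inputs (rows) x 2 outputs (columns)
voltages = [0.2, 0.1]     # input activations encoded as volts

currents = crossbar_mvm(weights, voltages)
# Dividing by G_MAX recovers the ordinary dot products.
outputs = [i / G_MAX for i in currents]
print(outputs)
```

In this toy model no weight ever leaves the array; only the input voltages and the summed output currents move, which is the point of computing in memory.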

But it comes at a cost. “Highly parallel analog computing in the RRAM-CIM architecture brings higher efficiency, but it is difficult to achieve the same level of functional flexibility and computational accuracy as in digital circuits,” the authors said.

Optimization genius

The new study looked at every part of an RRAM-CIM chip, redesigning it for practical use.

It starts with technology. NeuRRAM has 48 cores that compute in parallel, with RRAM devices physically interleaved in CMOS circuitry. Like a neuron, each core can be turned off individually when not in use, conserving energy while its memory is stored in RRAM.

All three million of these RRAM cells are linked together so that data can flow in both directions. It’s a crucial design choice, allowing the chip to flexibly adapt to several different types of AI algorithms, the authors explained. For example, one type of deep neural network, the CNN (convolutional neural network), is particularly good at computer vision and only needs data to flow in one direction. In contrast, LSTMs, a type of deep neural network often used for audio recognition, process data recurrently to match signals over time. Like synapses, the chip encodes how strongly one RRAM “neuron” connects to another.
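
A rough way to picture this flexibility is that the same stored weights can be read in either direction: driving the rows gives a forward pass, driving the columns gives the transposed product, and looping the output back to the input gives a recurrent step. The Python sketch below illustrates that idea with plain lists. The function names and the tiny matrices are hypothetical, a real LSTM adds gates on top of the simple feedback shown here, and the code models the arithmetic only, not the chip's circuitry.

```python
def forward(weights, x):
    """Drive the rows, read the columns: y[j] = sum_i W[i][j] * x[i]."""
    return [sum(weights[i][j] * x[i] for i in range(len(weights)))
            for j in range(len(weights[0]))]


def backward(weights, y):
    """Drive the columns, read the rows: x[i] = sum_j W[i][j] * y[j]."""
    return [sum(weights[i][j] * y[j] for j in range(len(weights[0])))
            for i in range(len(weights))]


def recurrent(weights, state, steps):
    """Feed the output back in as the next input (square weights only)."""
    for _ in range(steps):
        state = forward(weights, state)
    return state


W = [[1, 2, 0],
     [0, 1, 3]]          # stored once; 2 rows x 3 columns

print(forward(W, [1, 1]))       # one-directional flow, as in a CNN layer
print(backward(W, [1, 0, 1]))   # same cells, opposite direction

R = [[0, 1],
     [1, 0]]             # square matrix that swaps the two entries
print(recurrent(R, [5, 7], 3))  # state fed back three times
```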

This architecture made it possible to fine-tune data flows to minimize traffic jams. Like expanding single-lane traffic to multiple lanes, the chip can duplicate a network’s current “memory” for the most computationally intensive problems, so that multiple cores work on the problem simultaneously.
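
The lane-widening analogy can be sketched in a few lines of Python: copy one layer's weights into several cores and split the inputs among them so the copies work side by side. Everything here is a simplified assumption for illustration, including the `Core` class, the round-robin split, and the use of threads to stand in for parallel hardware.

```python
from concurrent.futures import ThreadPoolExecutor


class Core:
    """Hypothetical stand-in for one compute core holding a copy of the weights."""

    def __init__(self, weights):
        self.weights = [row[:] for row in weights]  # each core stores its own copy

    def run(self, inputs):
        # Matrix-vector product for every input assigned to this core.
        return [[sum(w * x for w, x in zip(row, vec)) for row in self.weights]
                for vec in inputs]


def run_duplicated(weights, inputs, n_copies):
    """Duplicate the weights into n_copies cores and share the inputs among them."""
    cores = [Core(weights) for _ in range(n_copies)]
    # Round-robin split: input k goes to core k % n_copies.
    shares = [inputs[k::n_copies] for k in range(n_copies)]
    with ThreadPoolExecutor(max_workers=n_copies) as pool:
        partial = list(pool.map(lambda pair: pair[0].run(pair[1]),
                                zip(cores, shares)))
    # Stitch the results back into the original input order.
    results = [None] * len(inputs)
    for k, outputs in enumerate(partial):
        results[k::n_copies] = outputs
    return results


weights = [[1, 0], [0, 2]]
inputs = [[1, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
print(run_duplicated(weights, inputs, n_copies=2))
```

The answers are the same whether one copy or several handle the inputs; duplication trades extra memory cells for throughput on the busiest layers.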

A final tweak to previous CIM chips was a stronger bridge between brain-like computing – often analog – and digital processing. Here, the chip uses a neural circuit that can easily convert the analog calculation into digital signals. It’s a step up from previous “power and area-intensive” setups, the authors explained.
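
Conceptually, that bridge boils down to turning a continuous analog value into one of a fixed number of digital codes. The Python sketch below uses a plain uniform quantizer as a stand-in. The voltage range, the 4-bit resolution, and the function names are assumptions for illustration, and the chip's actual converter circuit is not modeled.

```python
def analog_to_digital(value, v_min=-1.0, v_max=1.0, bits=4):
    """Map an analog value to an integer code in [0, 2**bits - 1]."""
    levels = 2 ** bits - 1
    clipped = min(max(value, v_min), v_max)   # saturate out-of-range inputs
    return round((clipped - v_min) / (v_max - v_min) * levels)


def digital_to_analog(code, v_min=-1.0, v_max=1.0, bits=4):
    """Reconstruct the analog value that a code represents."""
    levels = 2 ** bits - 1
    return v_min + code / levels * (v_max - v_min)


for v in (-1.0, 0.3, 1.0, 2.5):
    code = analog_to_digital(v)
    print(v, code, round(digital_to_analog(code), 3))
```

Fewer bits mean a coarser staircase and larger rounding error, which is one reason analog computation has a harder time matching digital accuracy.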

The optimizations worked. Putting their theory to the test, the team fabricated the NeuRRAM chip and developed algorithms to program the hardware for different AI algorithms, much like a PlayStation 5 running different games.

In a host of benchmark tests, the chip performed like a champ. Running a seven-layer CNN on-chip, NeuRRAM had an error rate of less than 1% when recognizing handwritten digits using the popular MNIST database.

It also excelled at more difficult tasks. Loaded with another popular deep neural network, an LSTM, the chip was around 85% correct when challenged with Google’s voice command recognition. Using only eight cores, the chip, running yet another AI architecture, was able to recover noisy images, reducing errors by around 70%.

So what?

One word: energy.

Most AI algorithms are total energy hogs. NeuRRAM operated at half the power cost of previous state-of-the-art RRAM-CIM chips, further translating the promise of power savings with neuromorphic computing into reality.

But what sets the study apart is its strategy. Too often, when designing chips, scientists have to balance efficiency, versatility, and accuracy across multiple tasks, metrics that are often at odds with each other. The problem becomes even more difficult when all the computation is done directly on the hardware. NeuRRAM has shown that it is possible to take on all of these beasts at once.

The strategy used here can be used to optimize other neuromorphic computing devices such as phase-change memory technologies, the authors said.

For now, NeuRRAM is a proof of concept, showing that a physical chip, rather than a simulation of it, works as expected. But there is room for improvement, including further scaling of RRAMs and shrinking the chip so that it can one day fit into our phones.

“Maybe today it is used to perform simple AI tasks such as keyword detection or human detection, but tomorrow it could enable a completely different user experience. Imagine real-time video analytics combined with voice recognition in a small device,” said study author Dr. Weier Wan. “As a researcher and engineer, my ambition is to bring research innovations from laboratories into practical use.”

Image credit: David Baillot/University of California San Diego
