IBM Develops Hardware-Based Vector-Symbolic AI Architecture

IBM Research recently announced a memory-augmented neural network (MANN) system consisting of a neural network controller and phase-change memory (PCM) hardware. By performing analog in-memory computation on high-dimensional (HD) binary vectors, the system learns few-shot classification tasks on the Omniglot benchmark with an accuracy drop of only 2.7 percentage points compared to 32-bit software implementations.

The team described the system and a set of experiments in a paper published in Nature Communications. The neural network component of the system learns an attention mechanism that maps inputs to keys, which are then used to retrieve the value whose stored key is most similar. Drawing inspiration from vector-symbolic computing, the researchers chose to represent keys as high-dimensional binary vectors, that is, vectors whose elements are either 1 or 0. The keys are stored in a content-addressable memory, and their binary representation allows for an efficient O(1) complexity hardware implementation of the similarity query. In a few-shot image-classification experiment using the Omniglot dataset, the system achieved 91.83% accuracy, compared to 94.53% achieved by a software implementation using 32-bit real-valued vectors. According to the authors,

The critical insight provided by our work, namely, directed engineering of HD vector representations as explicit memory for MANNs, facilitates efficient few-shot learning tasks using in-memory computing. It could also enable applications beyond classification such as symbolic-level fusion, compression, and reasoning.

Although deep learning models are often quite good at generalizing from example data, they sometimes exhibit catastrophic forgetting: a loss of previously learned information that can occur when the model learns new information. To address this problem, MANN systems incorporate an external content-addressable memory that stores keys, or learned patterns, and values, which are outputs associated with the patterns. The neural network component of a MANN system then learns to map an input to a key-based query of the memory; an attention mechanism then uses the results of the query to produce a final output. In particular, the network learns to map dissimilar input items to keys that are nearly orthogonal, that is, keys with a low cosine similarity. However, to perform a query, the query key must be compared, using cosine similarity, to every key in the memory to find the best match. Because the cost of this comparison grows with the number of stored keys, it creates a performance bottleneck in the system.
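The query step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names and the toy keys and values are assumptions, and the soft attention over the query results is simplified to picking the single best match.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length real-valued vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def query_memory(query_key, keys, values):
    """Return the value whose stored key is most similar to query_key.

    Every stored key is compared against the query, so the cost of one
    lookup grows linearly with the number of entries in memory.
    """
    scores = [cosine_similarity(query_key, k) for k in keys]
    best = max(range(len(keys)), key=lambda i: scores[i])
    return values[best], scores

# Toy memory (hypothetical data): three orthogonal keys, one value each.
keys = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
values = ["class_a", "class_b", "class_c"]

label, scores = query_memory([0.1, 0.9, 0.2], keys, values)
print(label)  # class_b
```

The loop over all keys inside query_memory is the bottleneck the article refers to: in software, each lookup touches every stored key.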

To solve this performance bottleneck, the IBM researchers used a representation for the keys derived from vector-symbolic architectures (VSA). In these systems, concepts or symbols, such as "blue color" or "square shape", are represented as vectors with very high dimension, on the order of thousands. With so many dimensions, randomly chosen vectors are very likely to be nearly orthogonal, which makes the cosine similarity computation robust to noise. Because of this, the team showed that the vector components could be "clipped" to be simply either +1 or -1, and further could be stored as either 1 or 0. These vectors are then stored in a crossbar array of memristive devices, which provides an efficient hardware implementation of the cosine similarity.
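A small experiment illustrates why clipping to +1/-1 and storing as 1/0 loses little. The sketch below is an assumption-laden illustration, not the hardware mapping from the paper: the dimension of 2048, the 10% noise level, and the use of Hamming distance to recover the cosine from the binary form are all choices made here for demonstration.

```python
import random

D = 2048  # assumed dimensionality, chosen only for illustration

def random_bipolar(d, rng):
    """Random vector whose components are each +1 or -1."""
    return [rng.choice((-1, 1)) for _ in range(d)]

def to_binary(v):
    """Map +1 -> 1 and -1 -> 0."""
    return [(x + 1) // 2 for x in v]

def bipolar_cosine(a, b):
    """Every bipolar vector has norm sqrt(d), so cosine = dot product / d."""
    return sum(x * y for x, y in zip(a, b)) / len(a)

def binary_cosine(a_bits, b_bits):
    """Recover the bipolar cosine from the 0/1 form via Hamming distance."""
    hamming = sum(x != y for x, y in zip(a_bits, b_bits))
    return (len(a_bits) - 2 * hamming) / len(a_bits)

rng = random.Random(0)
a = random_bipolar(D, rng)
b = random_bipolar(D, rng)
# Flip roughly 10% of the components of a to simulate a noisy copy.
noisy_a = [-x if rng.random() < 0.1 else x for x in a]

print(bipolar_cosine(a, b))        # near 0: unrelated vectors are almost orthogonal
print(bipolar_cosine(a, noisy_a))  # near 0.8: the noisy copy is still clearly closest
print(binary_cosine(to_binary(a), to_binary(b)) == bipolar_cosine(a, b))
```

For two bipolar vectors of dimension d, the dot product equals d minus twice the number of positions where they differ, so the 0/1 form carries the same similarity information as the +1/-1 form.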

To demonstrate the system's performance, the researchers used it to perform few-shot image-classification tasks. The network was trained on a subset of the Omniglot dataset, which contains images of handwritten characters from 50 different alphabets. The training data consisted of five examples of each of 100 image classes, which the authors claim is the "largest problem ever tried" on Omniglot.
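To make the problem size concrete, the following sketch samples one 100-way, 5-shot episode in the way few-shot benchmarks are commonly set up. It is a generic illustration, not the authors' data pipeline: sample_episode, the toy dataset, and the one-query-per-class setting are all hypothetical.

```python
import random

def sample_episode(dataset, n_way, k_shot, n_query, rng):
    """Sample one N-way K-shot episode.

    dataset maps a class label to a list of examples. Returns a support
    set of n_way * k_shot labelled examples and a disjoint query set.
    """
    classes = rng.sample(sorted(dataset), n_way)
    support, query = [], []
    for label in classes:
        picked = rng.sample(dataset[label], k_shot + n_query)
        support += [(x, label) for x in picked[:k_shot]]
        query += [(x, label) for x in picked[k_shot:]]
    return support, query

# Toy stand-in for the image data: 120 classes with 20 placeholder items each.
toy = {f"char_{c}": [f"img_{c}_{i}" for i in range(20)] for c in range(120)}

support, query = sample_episode(
    toy, n_way=100, k_shot=5, n_query=1, rng=random.Random(0)
)
print(len(support), len(query))  # 500 100
```

In a MANN, the 500 support examples would be written to the key-value memory, and each query example would then be classified by a similarity lookup against those stored keys.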

In a discussion about the work on Hacker News, users noted the trend of "hybrid" AI systems that combine neural networks with other AI techniques:

I think old AI led to the first AI winter because it had poor mechanisms to deal with uncertainty. However, lots of mechanisms in old expert systems will make a comeback once we know how to combine symbolic with neural and probabilistic systems. Just take a look at The Art of Prolog. Many ideas there are getting reused in modern inductive logic and answer-set programming systems.

InfoQ has recently covered several such hybrid systems, as well as more efficient hardware implementations of AI computation. Earlier this year, Baidu announced their ERNIE model that combines deep learning on unstructured text with structured knowledge graph data, to produce more coherent generated responses, and Facebook open-sourced their BlenderBot chatbot that incorporates a long-term memory that can track conversation context over several weeks or even months. MIT recently announced a prototype photonic device for deep-learning inference that reduces energy consumption compared to traditional electronic devices.
