Efficient AI Computing,
Transforming the Future.

Projects


EIE: efficient inference engine on compressed deep neural network

ISCA 2016

We propose EIE, an energy-efficient inference engine that performs inference directly on a compressed deep neural network, accelerating the resulting sparse matrix-vector multiplication with weight sharing.
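A minimal NumPy sketch of the core kernel, assuming CSC-style column storage where each nonzero weight is a small index into a shared codebook; the function and variable names are illustrative, not from the paper or its hardware implementation.

```python
import numpy as np

def eie_style_matvec(n_rows, col_ptr, row_idx, weight_idx, codebook, x):
    """y = W @ x, with W stored column-wise (CSC): each nonzero is a small
    index into a shared codebook of weight values rather than a full float."""
    y = np.zeros(n_rows)
    for j, xj in enumerate(x):
        if xj == 0.0:
            # Skip zero input activations entirely (dynamic activation sparsity).
            continue
        for k in range(col_ptr[j], col_ptr[j + 1]):
            y[row_idx[k]] += codebook[weight_idx[k]] * xj
    return y

# Tiny usage example: a 3x4 matrix with 4 nonzeros drawn from a 2-entry codebook.
codebook = np.array([0.5, -1.0])
col_ptr = [0, 1, 2, 2, 4]           # column start offsets (4 columns)
row_idx = [0, 2, 1, 2]              # row of each nonzero
weight_idx = [0, 1, 0, 0]           # codebook index of each nonzero
x = np.array([1.0, 0.0, 3.0, 2.0])  # column 1 is zero and will be skipped
print(eie_style_matvec(3, col_ptr, row_idx, weight_idx, codebook, x))
```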

Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

ICLR 2016

We introduce “deep compression”, a three-stage pipeline in which pruning, trained quantization, and Huffman coding work together to reduce the storage requirements of neural networks by 35× to 49× without affecting their accuracy.
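A minimal sketch of the three stages on a single weight matrix, assuming magnitude pruning, 1-D k-means with linear centroid initialization, and Huffman coding of the cluster indices; the helper names are ours, and the real pipeline retrains the network between stages.

```python
import heapq
import numpy as np
from collections import Counter

def prune(W, threshold):
    """Stage 1: remove connections whose magnitude falls below a threshold."""
    return np.where(np.abs(W) >= threshold, W, 0.0)

def kmeans_quantize(w, n_clusters=16, iters=20):
    """Stage 2: cluster the surviving weights with 1-D k-means; the layer then
    stores one small index per weight plus a shared codebook of centroids."""
    centers = np.linspace(w.min(), w.max(), n_clusters)  # linear initialization
    for _ in range(iters):
        idx = np.argmin(np.abs(w[:, None] - centers[None, :]), axis=1)
        for c in range(n_clusters):
            if np.any(idx == c):
                centers[c] = w[idx == c].mean()
    return centers, idx

def huffman_code_lengths(indices):
    """Stage 3: Huffman-code the quantization indices; here we only compute
    per-symbol code lengths, enough to estimate the compressed size."""
    heap = [(n, [s]) for s, n in Counter(indices).items()]
    heapq.heapify(heap)
    lengths = Counter()
    while len(heap) > 1:
        n1, g1 = heapq.heappop(heap)
        n2, g2 = heapq.heappop(heap)
        for s in g1 + g2:
            lengths[s] += 1  # these symbols sit one level deeper in the tree
        heapq.heappush(heap, (n1 + n2, g1 + g2))
    return lengths

# Usage: prune, quantize the survivors, then estimate the Huffman-coded size.
W = np.random.randn(64, 64)
Wp = prune(W, threshold=0.5)
survivors = Wp[Wp != 0]
codebook, idx = kmeans_quantize(survivors)
bits = sum(n * huffman_code_lengths(idx)[s] for s, n in Counter(idx).items())
print(f"{len(survivors)} weights -> {bits} bits for the indices")
```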

Learning both Weights and Connections for Efficient Neural Network

NIPS 2015

We describe a method that learns only the important connections, reducing the storage and computation required by neural networks by an order of magnitude without affecting their accuracy.
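A minimal NumPy sketch of the train/prune/retrain loop for one weight matrix; `grad_fn`, the thresholds, and the plain SGD update are our assumptions, standing in for full backpropagation on the real network.

```python
import numpy as np

def prune_and_retrain(W, grad_fn, thresholds, lr=0.01, retrain_steps=1000):
    """Iteratively drop weak connections, then retrain the survivors.
    `grad_fn` is an assumed callback returning dLoss/dW for the current W;
    the mask keeps pruned weights at exactly zero during retraining."""
    mask = np.ones_like(W)
    for t in thresholds:                 # iterative pruning beats one-shot
        mask *= (np.abs(W) >= t)         # keep only the important connections
        W = W * mask
        for _ in range(retrain_steps):
            W -= lr * grad_fn(W) * mask  # masked SGD: pruned weights stay 0
    return W, mask
```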