Efficient AI Computing,
Transforming the Future.

Projects


APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

CVPR 2020

APQ is an efficient AutoML framework for joint optimization of neural architecture, pruning, and quantization.
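To make the idea of a joint search space concrete, here is a minimal sketch (not the released APQ code) of searching over coupled architecture, pruning, and quantization choices. The value ranges, the accuracy predictor, and the cost model are illustrative stand-ins for APQ's learned predictor and hardware feedback.

```python
# A minimal sketch (not the released APQ code) of searching a joint space that
# couples architecture, pruning, and quantization choices.
from itertools import product

ARCH_DEPTHS = [2, 3, 4]            # hypothetical architecture choices
PRUNE_RATIOS = [0.0, 0.3, 0.5]     # fraction of channels removed
BITWIDTHS = [4, 6, 8]              # weight/activation quantization bits

def predicted_accuracy(depth, prune_ratio, bits):
    """Stand-in for APQ's learned accuracy predictor (purely illustrative)."""
    return 0.70 + 0.02 * depth - 0.05 * prune_ratio - 0.01 * (8 - bits)

def cost(depth, prune_ratio, bits):
    """Toy latency/size proxy: deeper, denser, higher-precision models cost more."""
    return depth * (1.0 - prune_ratio) * bits

# Jointly pick the architecture, pruning ratio, and bitwidth under a budget,
# instead of optimizing the three dimensions in separate stages.
budget = 10.0
feasible = [c for c in product(ARCH_DEPTHS, PRUNE_RATIOS, BITWIDTHS) if cost(*c) <= budget]
best = max(feasible, key=lambda c: predicted_accuracy(*c))
print("best (depth, prune_ratio, bits):", best)
```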

GAN Compression: Efficient Architectures for Interactive Conditional GANs

CVPR 2020 & TPAMI

A general-purpose compression framework for reducing the inference time and model size of the generator in conditional GANs.
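As a rough illustration only (a toy generator, not the paper's architecture or training recipe), the sketch below shows the parameter saving from a channel-slimmed generator and a distillation-style loss that pushes the compressed generator toward the original one.

```python
# Toy illustration of compressing a generator by slimming its channels and
# distilling from the original generator; not the paper's actual method.
import torch
import torch.nn as nn
import torch.nn.functional as F

def toy_generator(width):
    """Tiny image-to-image generator used purely for illustration."""
    return nn.Sequential(
        nn.Conv2d(3, width, 3, padding=1), nn.ReLU(),
        nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
        nn.Conv2d(width, 3, 3, padding=1),
    )

def count_params(model):
    return sum(p.numel() for p in model.parameters())

teacher = toy_generator(width=64)   # original (uncompressed) generator
student = toy_generator(width=16)   # compressed generator with fewer channels

print(f"teacher: {count_params(teacher):,} params, student: {count_params(student):,} params")

# One distillation step: push the compressed generator's output toward the
# original generator's output on the same input.
x = torch.randn(1, 3, 64, 64)
loss = F.mse_loss(student(x), teacher(x).detach())
loss.backward()
```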

SpArch: Efficient Architecture for Sparse Matrix Multiplication

HPCA 2020

A hardware accelerator for sparse matrix-matrix multiplication (SpGEMM).
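For reference, the computation being accelerated is SpGEMM itself. Below is a minimal software sketch in an outer-product formulation with made-up values; it illustrates the arithmetic pattern, not SpArch's hardware design.

```python
# A minimal software sketch of SpGEMM (C = A @ B with both operands sparse).
# Matrices are stored as {(row, col): value} dictionaries of nonzeros.
from collections import defaultdict

A = {(0, 1): 2.0, (1, 0): 3.0, (2, 1): 4.0}
B = {(0, 2): 5.0, (1, 0): 1.0, (1, 2): 6.0}

def spgemm_outer_product(A, B):
    """Multiply column k of A by row k of B and merge the partial products."""
    a_cols, b_rows = defaultdict(list), defaultdict(list)
    for (i, k), v in A.items():
        a_cols[k].append((i, v))
    for (k, j), v in B.items():
        b_rows[k].append((j, v))

    C = defaultdict(float)
    for k in a_cols.keys() & b_rows.keys():
        for i, a in a_cols[k]:          # outer product of one column/row pair
            for j, b in b_rows[k]:
                C[(i, j)] += a * b      # merge (accumulate) partial products
    return dict(C)

# Nonzeros of C: (0,0)=2, (0,2)=12, (1,2)=15, (2,0)=4, (2,2)=24
print(spgemm_outer_product(A, B))
```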

Lite Transformer with Long-Short Range Attention

ICLR 2020

Lite Transformer is an efficient mobile NLP architecture. Its key primitive is Long-Short Range Attention (LSRA): one group of heads specializes in local context modeling (by convolution), while the other group specializes in long-distance relationship modeling (by attention).
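The sketch below is a minimal LSRA-style block, assuming toy dimensions and standard PyTorch modules rather than the released implementation: half of the features go through a convolution branch for local context, the other half through multi-head self-attention for long-range context, and the two branches are concatenated.

```python
# A minimal LSRA-style block (illustrative dimensions and standard modules,
# not the released Lite Transformer code).
import torch
import torch.nn as nn

class LSRABlock(nn.Module):
    def __init__(self, dim=256, heads=4, kernel_size=3):
        super().__init__()
        half = dim // 2
        # Local branch: 1D convolution over the sequence dimension.
        self.conv = nn.Conv1d(half, half, kernel_size, padding=kernel_size // 2)
        # Global branch: multi-head self-attention.
        self.attn = nn.MultiheadAttention(half, heads, batch_first=True)

    def forward(self, x):                      # x: (batch, seq_len, dim)
        local, global_ = x.chunk(2, dim=-1)    # split features between branches
        local = self.conv(local.transpose(1, 2)).transpose(1, 2)
        global_, _ = self.attn(global_, global_, global_)
        return torch.cat([local, global_], dim=-1)

block = LSRABlock()
out = block(torch.randn(2, 10, 256))           # -> (2, 10, 256)
```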