Efficient AI Computing,
Transforming the Future.

Projects

To filter the project list, check the boxes for categories, topics, and techniques.

GAN Compression: Efficient Architectures for Interactive Conditional GANs

CVPR 2020 & TPAMI

A general-purpose compression framework for reducing the inference time and model size of the generator in conditional GANs.
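
As a rough illustration of the general idea (not the paper's actual pipeline), the sketch below pairs a width-reduced student generator with output distillation from a full-size teacher; the module names, sizes, and the `width` parameter are all hypothetical.

```python
# Illustrative sketch only: a generic width-multiplier + distillation setup,
# not the GAN Compression method itself.
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True))

class TinyGenerator(nn.Module):
    """A toy conditional generator whose capacity scales with `width`."""
    def __init__(self, width=1.0, base=64):
        super().__init__()
        c = max(8, int(base * width))
        self.net = nn.Sequential(
            conv_block(3, c),
            conv_block(c, c),
            nn.Conv2d(c, 3, 3, padding=1),
        )
    def forward(self, x):
        return self.net(x)

teacher = TinyGenerator(width=1.0)    # full-size generator
student = TinyGenerator(width=0.25)   # compressed generator

x = torch.randn(2, 3, 64, 64)         # conditioning input (e.g., an image)
with torch.no_grad():
    target = teacher(x)
# Distill the teacher's outputs into the smaller student.
loss = nn.functional.mse_loss(student(x), target)
loss.backward()
```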

SpArch: Efficient Architecture for Sparse Matrix Multiplication

HPCA 2020

A hardware accelerator for sparse matrix-matrix multiplication (SpGEMM).
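
For readers unfamiliar with the kernel itself, the sketch below shows one common SpGEMM formulation in plain Python: an outer-product dataflow that merges partial products. The accelerator's hardware design is of course not captured here, and the dict-based sparse format is an illustrative assumption.

```python
# A software sketch of SpGEMM via outer products: for each k, combine
# column k of A with row k of B and accumulate the partial products.
from collections import defaultdict

def spgemm_outer(A, B):
    """A, B: sparse matrices as {(i, j): value} dicts. Returns A @ B."""
    # Index A by column and B by row so each outer product is a direct lookup.
    A_by_col = defaultdict(list)
    for (i, k), v in A.items():
        A_by_col[k].append((i, v))
    B_by_row = defaultdict(list)
    for (k, j), v in B.items():
        B_by_row[k].append((j, v))

    C = defaultdict(float)
    for k in A_by_col.keys() & B_by_row.keys():
        for i, a in A_by_col[k]:          # nonzeros in column k of A
            for j, b in B_by_row[k]:      # nonzeros in row k of B
                C[(i, j)] += a * b        # merge partial products
    return dict(C)

A = {(0, 0): 2.0, (1, 1): 3.0}
B = {(0, 1): 4.0, (1, 0): 5.0}
print(spgemm_outer(A, B))  # -> {(0, 1): 8.0, (1, 0): 15.0}
```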

Lite Transformer with Long-Short Range Attention

ICLR 2020

Lite Transformer is an efficient mobile NLP architecture. Its key primitive is Long-Short Range Attention (LSRA): one group of heads specializes in local context modeling (by convolution), while the other specializes in long-distance relationship modeling (by attention).
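
The minimal PyTorch sketch below illustrates this split-branch idea: the embedding is split in half, one half goes through a convolution (local context) and the other through self-attention (long range), and the results are concatenated. The exact dimensions, projection, and module wiring are illustrative assumptions, not the released Lite Transformer code.

```python
# A minimal sketch of the Long-Short Range Attention idea.
import torch
import torch.nn as nn

class LSRABlock(nn.Module):
    def __init__(self, d_model=256, n_heads=4, kernel_size=3):
        super().__init__()
        d_half = d_model // 2
        self.conv = nn.Conv1d(d_half, d_half, kernel_size,
                              padding=kernel_size // 2)       # local branch
        self.attn = nn.MultiheadAttention(d_half, n_heads,
                                          batch_first=True)   # global branch
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, x):                 # x: (batch, seq_len, d_model)
        local, global_ = x.chunk(2, dim=-1)
        # Conv1d expects (batch, channels, length), so transpose around it.
        local = self.conv(local.transpose(1, 2)).transpose(1, 2)
        global_, _ = self.attn(global_, global_, global_)
        return self.proj(torch.cat([local, global_], dim=-1))

block = LSRABlock()
out = block(torch.randn(2, 10, 256))
print(out.shape)  # torch.Size([2, 10, 256])
```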

Once-for-All: Train One Network and Specialize it for Efficient Deployment

ICLR 2020

OFA is an efficient AutoML technique that decouples model training from architecture search. Train one network once, then specialize it for many hardware platforms, from CPUs and GPUs to dedicated accelerators. OFA achieves a new state-of-the-art 80.0% ImageNet top-1 accuracy in the mobile setting (<600M FLOPs).
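
The toy sketch below illustrates the train-once, specialize-many idea with a single elastic layer whose sub-networks are slices of the full network's weights. It is a drastic simplification of OFA's elastic depth/width/kernel-size training, and all names here are hypothetical.

```python
# A toy sketch of the once-for-all idea: one over-parameterized layer from
# which smaller sub-networks (here: fewer output features) can be extracted
# without retraining.
import torch
import torch.nn as nn

class ElasticLinear(nn.Module):
    """Linear layer that can run with a reduced number of output features."""
    def __init__(self, in_f, max_out):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(max_out, in_f) * 0.02)
        self.bias = nn.Parameter(torch.zeros(max_out))

    def forward(self, x, out_f=None):
        out_f = out_f or self.weight.shape[0]
        # A sub-network is simply a slice of the full network's weights.
        return x @ self.weight[:out_f].t() + self.bias[:out_f]

layer = ElasticLinear(in_f=16, max_out=64)
x = torch.randn(4, 16)
full = layer(x)             # full-width network: 64 output features
small = layer(x, out_f=32)  # specialized sub-network: 32 output features
print(full.shape, small.shape)
```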