Efficient AI Computing, Transforming the Future.

Projects


SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

ICML 2023

We propose SmoothQuant, a training-free, accuracy-preserving, and general-purpose post-training quantization (PTQ) solution to enable 8-bit weight, 8-bit activation (W8A8) quantization for LLMs.
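The core observation behind SmoothQuant is that outliers in a few activation channels make activations much harder to quantize than weights, and that this difficulty can be migrated to the weights through a mathematically equivalent per-channel scaling. Below is a minimal PyTorch sketch of that smoothing step; the function name is ours, and alpha=0.5 mirrors the paper's default migration strength, so treat this as an illustration rather than the released implementation.

```python
import torch

def smooth_weights_activations(W, X, alpha=0.5):
    """Per-channel smoothing (sketch of SmoothQuant's core transform).

    W: (out_features, in_features) weight of a linear layer
    X: (num_tokens, in_features) calibration activations
    Returns scaled (X', W') with X' @ W'.T == X @ W.T, where X' has a
    much flatter per-channel range and thus quantizes well to INT8.
    """
    act_max = X.abs().amax(dim=0)          # per-channel |activation| max
    w_max = W.abs().amax(dim=0)            # per-channel |weight| max
    # s_j = max|X_j|^alpha / max|W_j|^(1 - alpha): alpha balances how much
    # quantization difficulty moves from activations to weights.
    s = (act_max.pow(alpha) / w_max.pow(1 - alpha)).clamp(min=1e-5)
    return X / s, W * s

# Equivalence check on random data: the transform changes ranges, not math.
X, W = torch.randn(128, 512), torch.randn(256, 512)
X_s, W_s = smooth_weights_activations(X, W)
assert torch.allclose(X @ W.T, X_s @ W_s.T, atol=1e-4)
```

Because the scaling folds into the previous layer's parameters offline, it adds no runtime overhead before W8A8 quantization.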

SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer

CVPR 2023

Vision transformers on high-resolution images can learn richer visual representations. However, the improved performance comes at the cost of substantial computational complexity. We therefore present SparseViT, which accelerates high-resolution visual processing by skipping less important regions during computation.
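The sketch below illustrates the window-level activation pruning idea in PyTorch: score each window by its activation magnitude and keep only the top fraction for subsequent computation. The helper name and fixed keep_ratio are our simplification; SparseViT searches layer-wise pruning ratios rather than using a single hand-set value.

```python
import torch

def prune_windows(x, keep_ratio=0.5):
    """Window-level activation pruning (illustrative sketch).

    x: (num_windows, tokens_per_window, channels) windowed features
    Scores each window by its L2 activation magnitude and keeps only
    the top-scoring fraction; attention/MLP then run on fewer windows.
    """
    scores = x.flatten(1).norm(dim=1)            # one score per window
    k = max(1, int(keep_ratio * x.shape[0]))     # number of windows to keep
    keep = scores.topk(k).indices
    return x[keep], keep                         # kept features + their indices

# Example: keep the 32 most informative of 64 windows.
x = torch.randn(64, 49, 96)
kept, idx = prune_windows(x, keep_ratio=0.5)
print(kept.shape)  # torch.Size([32, 49, 96])
```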

FlatFormer: Flattened Window Attention for Efficient Point Cloud Transformer

CVPR 2023

We present FlatFormer, an efficient transformer architecture with flattened window attention for large-scale point cloud analysis.
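The efficiency comes from trading equal-shape spatial windows, which contain varying numbers of points, for equal-size point groups, so every group presents an identical dense attention workload. Below is a rough PyTorch sketch of that grouping, assuming points already carry quantized window coordinates; all names here are illustrative.

```python
import torch

def equal_size_groups(coords, feats, group_size=64):
    """Equal-size grouping (sketch of the flattened-window idea).

    coords: (N, 2) integer window coordinates of N points
    feats:  (N, C) point features
    Sorts points so spatial neighbors are adjacent, then splits them
    into equal-SIZE groups: unlike equal-shape windows, every group is
    the same dense attention workload.
    """
    key = coords[:, 0] * (coords[:, 1].max() + 1) + coords[:, 1]
    order = key.argsort()                    # window-major point order
    feats = feats[order]
    pad = (-feats.shape[0]) % group_size     # pad to fill the last group
    feats = torch.cat([feats, feats.new_zeros(pad, feats.shape[1])])
    return feats.view(-1, group_size, feats.shape[1])

# Each group can now go through standard multi-head self-attention.
groups = equal_size_groups(torch.randint(0, 100, (1000, 2)), torch.randn(1000, 32))
print(groups.shape)  # torch.Size([16, 64, 32])
```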

Retrospective: EIE: Efficient Inference Engine on Sparse and Compressed Neural Network

ISCA 2023

EIE proposed accelerating pruned and compressed neural networks by exploiting weight sparsity, activation sparsity, and 4-bit weight sharing in a dedicated inference accelerator.
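In software terms, the datapath amounts to a sparse matrix-vector product over a compressed, weight-shared format: zero weights are never stored and zero activations are never multiplied. The NumPy sketch below is a toy model of that computation, not the hardware; EIE itself uses a relative-indexed CSC variant executed by a parallel array of processing elements.

```python
import numpy as np

def eie_spmv(n_rows, codebook, codes, col_ptr, row_idx, x):
    """Compressed sparse matrix-vector product in the spirit of EIE.

    codebook: (16,) shared weight values, one per 4-bit code
    codes:    4-bit codebook index of each nonzero weight
    col_ptr:  start offset of each column's nonzeros (len n_cols + 1)
    row_idx:  output row of each nonzero weight
    x:        (n_cols,) input activation vector
    """
    y = np.zeros(n_rows)
    for j, xj in enumerate(x):                 # stream input activations
        if xj == 0.0:                          # activation sparsity:
            continue                           #   zero inputs are skipped
        for p in range(col_ptr[j], col_ptr[j + 1]):       # nonzeros only
            y[row_idx[p]] += codebook[codes[p]] * xj      # 4-bit weight sharing
    return y
```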