Efficient AI Computing,
Transforming the Future.

Projects

To filter projects, check the boxes for the relevant categories, topics, and techniques.

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

MLSys 2024

Low-bit weight-only quantization for LLMs.
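Below is a minimal sketch of the activation-aware idea behind AWQ, assuming a PyTorch linear weight of shape [out_features, in_features] and a batch of sample activations; pseudo_quantize, awq_like_quantize, and the alpha exponent are illustrative names and defaults, not the released AWQ API.

```python
import torch

def pseudo_quantize(w: torch.Tensor, n_bits: int = 4, group_size: int = 128) -> torch.Tensor:
    """Simulated (fake) low-bit quantization with symmetric per-group scales.
    Assumes the number of weight elements is divisible by group_size."""
    orig_shape = w.shape
    w = w.reshape(-1, group_size)
    q_max = 2 ** (n_bits - 1) - 1
    scale = w.abs().amax(dim=1, keepdim=True).clamp(min=1e-5) / q_max
    w_q = (w / scale).round().clamp(-q_max - 1, q_max) * scale
    return w_q.reshape(orig_shape)

def awq_like_quantize(weight: torch.Tensor, act_sample: torch.Tensor,
                      n_bits: int = 4, alpha: float = 0.5) -> torch.Tensor:
    """Scale each input channel by the magnitude of its activations before
    quantization, then fold the scale back so W @ x is roughly preserved;
    salient channels therefore see a smaller quantization error.
    weight: [out_features, in_features], act_sample: [tokens, in_features]."""
    s = act_sample.abs().mean(dim=0).pow(alpha).clamp(min=1e-5)  # per-input-channel scale
    w_q = pseudo_quantize(weight * s, n_bits=n_bits)             # quantize the scaled weights
    return w_q / s                                               # undo the scale
```

In the actual method, the folded-back scale is fused into the preceding operator so inference remains weight-only quantized; this sketch folds it directly into the weight for brevity.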

LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

ICLR 2024

LongLoRA uses shifted sparse attention to greatly reduce the fine-tuning cost of long-context LLMs.
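Below is a minimal sketch of the shifted sparse attention (S2-Attn) pattern described in the paper, assuming [batch, heads, seq_len, head_dim] tensors with seq_len divisible by group_size; shifted_sparse_attention is an illustrative name, causal masking is omitted for brevity, and this is not the released LongLoRA code.

```python
import torch
import torch.nn.functional as F

def shifted_sparse_attention(q, k, v, group_size: int):
    """Attention is computed only within local groups of tokens; half of the
    heads use groups shifted by group_size // 2 so information can still flow
    across group boundaries."""
    b, h, n, d = q.shape
    assert n % group_size == 0 and h % 2 == 0
    shift = group_size // 2

    def attend_in_groups(q, k, v, roll):
        if roll:
            # shift tokens so this half of the heads sees straddling groups
            q, k, v = (t.roll(-roll, dims=2) for t in (q, k, v))
        hh, g = q.shape[1], group_size

        def split(t):  # [b, hh, n, d] -> [b * n // g, hh, g, d]
            return (t.reshape(b, hh, n // g, g, d).permute(0, 2, 1, 3, 4)
                     .reshape(b * n // g, hh, g, d))

        out = F.scaled_dot_product_attention(split(q), split(k), split(v))
        out = (out.reshape(b, n // g, hh, g, d).permute(0, 2, 1, 3, 4)
                  .reshape(b, hh, n, d))
        return out.roll(roll, dims=2) if roll else out

    half = h // 2
    out_plain = attend_in_groups(q[:, :half], k[:, :half], v[:, :half], roll=0)
    out_shift = attend_in_groups(q[:, half:], k[:, half:], v[:, half:], roll=shift)
    return torch.cat([out_plain, out_shift], dim=1)
```

Because each attention computation is restricted to short groups during fine-tuning, the cost grows roughly linearly with context length, while the model can still use standard full attention at inference time.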

Tiny Machine Learning Projects

NeurIPS 2020/2021/2022, MICRO 2023, ICML 2023, MLSys 2024, IEEE CAS Magazine 2023 (Feature)

This TinyML project aims to enable efficient AI computing on the edge through innovations in model compression and high-performance system design.

Tiny Machine Learning: Progress and Futures [Feature]

IEEE CAS Magazine (Feature)

We discuss the definition, challenges, and applications of TinyML.