Efficient AI Computing,
Transforming the Future.

Projects


LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

ICLR 2024

LongLoRA leverages shifted sparse attention to greatly reduce the fine-tuning cost of long-context LLMs.
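Below is a minimal PyTorch sketch of the shifted sparse attention (S²-Attn) pattern: tokens are split into groups, attention is computed within each group, and half of the heads operate on a half-group-shifted sequence so information can flow across group boundaries. The function name, shapes, and shift convention are illustrative assumptions; causal masking and the LoRA adapters themselves are omitted for brevity.

```python
import torch
import torch.nn.functional as F


def shifted_sparse_attention(q, k, v, group_size):
    """q, k, v: (batch, heads, seq_len, head_dim); seq_len must be divisible by group_size."""
    b, h, n, d = q.shape
    half = h // 2

    def shift(x, offset):
        # Roll along the sequence dimension so shifted heads see offset groups.
        return torch.roll(x, shifts=offset, dims=2)

    # Shift the second half of the heads by half a group.
    q = torch.cat([q[:, :half], shift(q[:, half:], -group_size // 2)], dim=1)
    k = torch.cat([k[:, :half], shift(k[:, half:], -group_size // 2)], dim=1)
    v = torch.cat([v[:, :half], shift(v[:, half:], -group_size // 2)], dim=1)

    # Reshape so attention is only computed within each group of tokens.
    g = n // group_size
    q = q.reshape(b, h, g, group_size, d)
    k = k.reshape(b, h, g, group_size, d)
    v = v.reshape(b, h, g, group_size, d)
    out = F.scaled_dot_product_attention(q, k, v)  # per-group attention (no causal mask here)
    out = out.reshape(b, h, n, d)

    # Undo the shift for the second half of the heads.
    out = torch.cat([out[:, :half], shift(out[:, half:], group_size // 2)], dim=1)
    return out
```

Because each head only attends within a group (of, say, 2048 tokens), the attention cost grows linearly rather than quadratically with context length during fine-tuning, while the shifted heads keep neighboring groups connected.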

Tiny Machine Learning Projects

NeurIPS 2020/2021/2022, MICRO 2023, ICML 2023, MLSys 2024, IEEE CAS Magazine 2023 (Feature)

This TinyML project aims to enable efficient AI computing on the edge through innovations in model compression techniques and high-performance system design.
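As a small illustration of the model-compression side, the sketch below applies magnitude-based weight pruning, one representative technique, in PyTorch. The helper name, layer, and sparsity level are arbitrary examples for exposition, not the project's actual compression recipe.

```python
import torch
import torch.nn as nn


def magnitude_prune(layer: nn.Linear, sparsity: float) -> None:
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    w = layer.weight.data
    k = int(w.numel() * sparsity)
    if k == 0:
        return
    threshold = w.abs().flatten().kthvalue(k).values
    layer.weight.data = w * (w.abs() > threshold)


layer = nn.Linear(256, 256)
magnitude_prune(layer, sparsity=0.9)
print((layer.weight == 0).float().mean())  # roughly 0.9 of the weights are now zero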

Tiny Machine Learning: Progress and Futures [Feature]

IEEE CAS Magazine (Feature)

We discuss the definition, challenges, and applications of TinyML.

PockEngine: Sparse and Efficient Fine-tuning in a Pocket

MICRO 2023
 (
)

This project introduces PockEngine, a tiny, sparse, and efficient engine that enables fine-tuning on various edge devices. PockEngine supports sparse backpropagation: it prunes the backward graph and sparsely updates the model, delivering measured memory savings and latency reduction while maintaining model quality.
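The sketch below illustrates the core idea of sparse backpropagation in plain PyTorch: when most parameters are frozen, their gradients are never materialized and the corresponding parts of the backward graph can be pruned. The toy model and the choice of which layers to keep trainable are stand-ins for illustration; PockEngine derives its sparse update schedule ahead of time and compiles it into the engine.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 10),
)

# Freeze everything, then re-enable gradients only for a sparse subset of layers.
for p in model.parameters():
    p.requires_grad_(False)
for idx in (2, 4):  # arbitrary choice of layers to update
    for p in model[idx].parameters():
        p.requires_grad_(True)

opt = torch.optim.SGD([p for p in model.parameters() if p.requires_grad], lr=0.01)

x, y = torch.randn(32, 128), torch.randint(0, 10, (32,))
loss = F.cross_entropy(model(x), y)
loss.backward()  # gradients are materialized only for the trainable subset
opt.step()
```

Skipping gradient computation and optimizer state for the frozen layers is what yields the memory and latency savings on memory-constrained edge devices.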