Efficient AI Computing,
Transforming the Future.

Publications

TorchSparse++: Efficient Training and Inference Framework for Sparse Convolution on GPUs

Haotian Tang*¹, Shang Yang*¹², Zhijian Liu¹, Ke Hong², Zhongming Yu³, Xiuyu Li⁴, Guohao Dai⁵, Yu Wang², Song Han¹
MICRO 2023

PockEngine: Sparse and Efficient Fine-tuning in a Pocket

Ligeng Zhu, Lanxiang Hu, Ji Lin, Wei-Chen Wang, Wei-Ming Chen, Chuang Gan, Song Han
MICRO 2023

EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction

Han Cai, Junyan Li, Muyan Hu, Chuang Gan, Song Han
ICCV 2023

Efficient Streaming Language Models with Attention Sinks

Guangxuan Xiao¹, Yuandong Tian², Beidi Chen³, Song Han¹, Mike Lewis²
arXiv

LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

Yukang Chen¹, Shengju Qian¹, Haotian Tang², Xin Lai¹, Zhijian Liu², Song Han², Jiaya Jia¹
arXiv

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Guangxuan Xiao*¹, Ji Lin*¹, Mickael Seznec², Hao Wu², Julien Demouth², Song Han¹
ICML 2023

FlatFormer: Flattened Window Attention for Efficient Point Cloud Transformer

Zhijian Liu*, Xinyu Yang*, Haotian Tang, Shang Yang, Song Han
CVPR 2023

SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer

Xuanyao Chen*¹, Zhijian Liu*², Haotian Tang², Li Yi¹, Hang Zhao¹, Song Han²
CVPR 2023

Retrospective: EIE: Efficient Inference Engine on Sparse and Compressed Neural Network

Song Han¹³, Xingyu Liu⁴, Huizi Mao³, Jing Pu⁵, Ardavan Pedram²⁶, Mark A. Horowitz², William J. Dally²³
ISCA 2023

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Ji Lin, Jiaming Tang, Haotian Tang, Shang Yang, Xingyu Dang, Song Han
arXiv 2023

BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation

Zhijian Liu*, Haotian Tang*, Alexander Amini, Xinyu Yang, Huizi Mao, Daniela L. Rus, Song Han
ICRA 2023

FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention

Guangxuan Xiao*¹, Tianwei Yin*¹, William T. Freeman¹, Frédo Durand¹, Song Han¹
arXiv

Offsite-Tuning: Transfer Learning without Full Model

Guangxuan Xiao¹, Ji Lin¹, Song Han¹
arXiv

Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models

Muyang Li¹, Ji Lin¹, Chenlin Meng³, Stefano Ermon³, Song Han¹, and Jun-Yan Zhu²
NeurIPS 2022 & TPAMI

TorchSparse: Efficient Point Cloud Inference Engine

Haotian Tang*, Zhijian Liu*, Xiuyu Li*, Yujun Lin, Song Han
MLSys 2022

QuantumNAT: Quantum Noise-Aware Training with Noise Injection, Quantization and Normalization

Hanrui Wang¹, Jiaqi Gu², Yongshan Ding³, Zirui Li⁴, David Z. Pan³, Frederic T. Chong⁵, Song Han¹
DAC 2022

QOC: Quantum On-Chip Training with Parameter Shift and Gradient Pruning

Hanrui Wang¹, Zirui Li², Jiaqi Gu³, Yongshan Ding⁴, Yujun Lin¹, David Z. Pan³, Frederic T. Chong⁵, Song Han¹
DAC 2022

On-Device Training Under 256KB Memory

Ji Lin*, Ligeng Zhu*, Wei-Ming Chen, Wei-Chen Wang, Chuang Gan, Song Han
NeurIPS 2022

Lite Pose: Efficient Architecture Design for 2D Human Pose Estimation

Yihan Wang, Muyang Li, Han Cai, Wei-Ming Chen, Song Han
CVPR 2022

QuantumNAS: Noise-Adaptive Search for Robust Quantum Circuits

Hanrui Wang¹, Yongshan Ding², Jiaqi Gu³, Zirui Li⁴, Yujun Lin¹, David Z. Pan³, Frederic T. Chong⁵, Song Han¹
HPCA 2022

Network Augmentation for Tiny Deep Learning

Han Cai, Chuang Gan, Ji Lin, Song Han
ICLR 2022

NAAS: Neural Accelerator Architecture Search

Yujun Lin*¹, Mengtian Yang*², Song Han¹
DAC 2021

MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning

Ji Lin, Wei-Ming Chen, Han Cai, Chuang Gan, Song Han
NeurIPS 2021

PointAcc: Efficient Point Cloud Accelerator

Yujun Lin, Zhekai Zhang, Haotian Tang, Hanrui Wang, Song Han
MICRO 2021

SemAlign: Annotation-Free Camera-LiDAR Calibration with Semantic Alignment Loss

Zhijian Liu*, Haotian Tang*, Sibo Zhu*, Song Han
IROS 2021

Differentiable Augmentation for Data-Efficient GAN Training

Shengyu Zhao, Zhijian Liu, Ji Lin, Jun-Yan Zhu, Song Han
NeurIPS 2020

Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution

Haotian Tang*, Zhijian Liu*, Shengyu Zhao, Yujun Lin, Ji Lin, Hanrui Wang, Song Han
ECCV 2020

MCUNet: Tiny Deep Learning on IoT Devices

Ji Lin, Wei-Ming Chen, Yujun Lin, John Cohn, Chuang Gan, Song Han
NeurIPS 2020

APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

Tianzhe Wang, Kuan Wang, Han Cai, Ji Lin, Zhijian Liu, Song Han
CVPR 2020

GAN Compression: Efficient Architectures for Interactive Conditional GANs

Muyang Li¹, Ji Lin², Yaoyao Ding³, Zhijian Liu², Jun-Yan Zhu¹ and Song Han²
CVPR 2020 & TPAMI

Once-for-All: Train One Network and Specialize it for Efficient Deployment

Han Cai, Chuang Gan, Tianzhe Wang, Zhekai Zhang, Song Han
ICLR 2020

HAT: Hardware-Aware Transformers for Efficient Natural Language Processing

Hanrui Wang¹, Zhanghao Wu¹, Zhijian Liu¹, Han Cai¹, Ligeng Zhu¹, Chuang Gan², Song Han¹
ACL 2020

Point-Voxel CNN for Efficient 3D Deep Learning

Zhijian Liu*, Haotian Tang*, Yujun Lin, Song Han
NeurIPS 2019

TSM: Temporal Shift Module for Efficient Video Understanding

Ji Lin¹, Chuang Gan², Song Han¹
ICCV 2019

Deep Gradient Compression: Reducing the Communication Bandwidth in Distributed Training

Yujun Lin¹, Song Han², Huizi Mao², Yu Wang¹, William J. Dally²³
ICLR 2018