Efficient AI Computing,
Transforming the Future.
Publications
TorchSparse++: Efficient Training and Inference Framework for Sparse Convolution on GPUs
Haotian Tang*¹, Shang Yang*¹², Zhijian Liu¹, Ke Hong², Zhongming Yu³, Xiuyu Li⁴, Guohao Dai⁵, Yu Wang², Song Han¹
MICRO 2023
PockEngine: Sparse and Efficient Fine-tuning in a Pocket
Ligeng Zhu, Lanxiang Hu, Ji Lin, Wei-Chen Wang, Wei-Ming Chen, Chuang Gan, Song Han
MICRO 2023
EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction
Han Cai, Junyan Li, Muyan Hu, Chuang Gan, Song Han
ICCV 2023
Efficient Streaming Language Models with Attention Sinks
Guangxuan Xiao¹, Yuandong Tian², Beidi Chen³, Song Han¹, Mike Lewis²
arXiv
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Yukang Chen¹, Shengju Qian¹, Haotian Tang², Xin Lai¹, Zhijian Liu², Song Han², Jiaya Jia¹
arXiv
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Guangxuan Xiao*¹, Ji Lin*¹, Mickael Seznec², Hao Wu², Julien Demouth², Song Han¹
ICML 2023
FlatFormer: Flattened Window Attention for Efficient Point Cloud Transformer
Zhijian Liu*, Xinyu Yang*, Haotian Tang, Shang Yang, Song Han
CVPR 2023
SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer
Xuanyao Chen*¹, Zhijian Liu*², Haotian Tang², Li Yi¹, Hang Zhao¹, Song Han²
CVPR 2023
Retrospective: EIE: Efficient Inference Engine on Sparse and Compressed Neural Network
Song Han¹³, Xingyu Liu⁴, Huizi Mao³, Jing Pu⁵, Ardavan Pedram²⁶, Mark A. Horowitz², William J. Dally²³
ISCA 2023
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Ji Lin, Jiaming Tang, Haotian Tang, Shang Yang, Xingyu Dang, Song Han
arXiv 2023
BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation
Zhijian Liu*, Haotian Tang*, Alexander Amini, Xinyu Yang, Huizi Mao, Daniela L. Rus, Song Han
ICRA 2023
FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention
Guangxuan Xiao*¹, Tianwei Yin*¹, William T. Freeman¹, Frédo Durand¹, Song Han¹
arXiv
Offsite-Tuning: Transfer Learning without Full Model
Guangxuan Xiao¹, Ji Lin¹, Song Han¹
arXiv
Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models
Muyang Li¹, Ji Lin¹, Chenlin Meng³, Stefano Ermon³, Song Han¹, Jun-Yan Zhu²
NeurIPS 2022 & TPAMI
TorchSparse: Efficient Point Cloud Inference Engine
Haotian Tang*, Zhijian Liu*, Xiuyu Li*, Yujun Lin, Song Han
MLSys 2022
QuantumNAT: Quantum Noise-Aware Training with Noise Injection, Quantization and Normalization
Hanrui Wang¹, Jiaqi Gu², Yongshan Ding³, Zirui Li⁴, David Z. Pan³, Frederic T. Chong⁵, Song Han¹
DAC 2022
QOC: Quantum On-Chip Training with Parameter Shift and Gradient Pruning
Hanrui Wang¹, Zirui Li², Jiaqi Gu³, Yongshan Ding⁴, Yujun Lin¹, David Z. Pan³, Frederic T. Chong⁵, Song Han¹
DAC 2022
On-Device Training Under 256KB Memory
Ji Lin*, Ligeng Zhu*, Wei-Ming Chen, Wei-Chen Wang, Chuang Gan, Song Han
NeurIPS 2022
Lite Pose: Efficient Architecture Design for 2D Human Pose Estimation
Yihan Wang, Muyang Li, Han Cai, Wei-Ming Chen, Song Han
CVPR 2022
QuantumNAS: Noise-Adaptive Search for Robust Quantum Circuits
Hanrui Wang¹, Yongshan Ding², Jiaqi Gu³, Zirui Li⁴, Yujun Lin¹, David Z. Pan³, Frederic T. Chong⁵, Song Han¹
HPCA 2022
Network Augmentation for Tiny Deep Learning
Han Cai, Chuang Gan, Ji Lin, Song Han
ICLR 2022
NAAS: Neural Accelerator Architecture Search
Yujun Lin*¹, Mengtian Yang*², Song Han¹
DAC 2021
MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning
Ji Lin, Wei-Ming Chen, Han Cai, Chuang Gan, Song Han
NeurIPS 2021
PointAcc: Efficient Point Cloud Accelerator
Yujun Lin, Zhekai Zhang, Haotian Tang, Hanrui Wang, Song Han
MICRO 2021
SemAlign: Annotation-Free Camera-LiDAR Calibration with Semantic Alignment Loss
Zhijian Liu*, Haotian Tang*, Sibo Zhu*, Song Han
IROS 2021
SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
Hanrui Wang, Zhekai Zhang, Song Han
HPCA 2021
Differentiable Augmentation for Data-Efficient GAN Training
Shengyu Zhao, Zhijian Liu, Ji Lin, Jun-Yan Zhu, Song Han
NeurIPS 2020
TinyTL: Reduce Activations, Not Trainable Parameters for Efficient On-Device Learning
Han Cai, Chuang Gan, Ligeng Zhu, Song Han
NeurIPS 2020
Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution
Haotian Tang*, Zhijian Liu*, Shengyu Zhao, Yujun Lin, Ji Lin, Hanrui Wang, Song Han
ECCV 2020
MCUNet: Tiny Deep Learning on IoT Devices
Ji Lin, Wei-Ming Chen, Yujun Lin, John Cohn, Chuang Gan, Song Han
NeurIPS 2020
APQ: Joint Search for Network Architecture, Pruning and Quantization Policy
Tianzhe Wang, Kuan Wang, Han Cai, Ji Lin, Zhijian Liu, Song Han
CVPR 2020
GAN Compression: Efficient Architectures for Interactive Conditional GANs
Muyang Li¹, Ji Lin², Yaoyao Ding³, Zhijian Liu², Jun-Yan Zhu¹, Song Han²
CVPR 2020 & TPAMI
Once-for-All: Train One Network and Specialize it for Efficient Deployment
Han Cai, Chuang Gan, Tianzhe Wang, Zhekai Zhang, Song Han
ICLR 2020
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
Hanrui Wang¹, Zhanghao Wu¹, Zhijian Liu¹, Han Cai¹, Ligeng Zhu¹, Chuang Gan², Song Han¹
ACL 2020
Point-Voxel CNN for Efficient 3D Deep Learning
Zhijian Liu*, Haotian Tang*, Yujun Lin, Song Han
NeurIPS 2019
TSM: Temporal Shift Module for Efficient Video Understanding
Ji Lin¹, Chuang Gan², Song Han¹
ICCV 2019
ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
Han Cai, Ligeng Zhu, Song Han
ICLR 2019
Deep Gradient Compression: Reducing the Communication Bandwidth in Distributed Training
Yujun Lin¹, Song Han², Huizi Mao², Yu Wang¹, William J. Dally²³
ICLR 2018