Efficient AI Computing,
Transforming the Future.
Publications
DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
Muyang Li*, Tianle Cai*, Jiaxin Cao, Qinsheng Zhang, Han Cai, Junjie Bai, Yangqing Jia, Ming-Yu Liu, Kai Li, and Song Han
CVPR 2024
VILA: On Pre-training for Visual Language Models
Ji Lin*, Hongxu Yin*, Wei Ping, Yao Lu, Pavlo Molchanov, Andrew Tao, Huizi Mao, Jan Kautz, Mohammad Shoeybi, Song Han
CVPR 2024
Condition-Aware Neural Network for Controlled Image Generation
Han Cai, Muyang Li, Zhuoyang Zhang, Qinsheng Zhang, Ming-Yu Liu, Song Han
CVPR 2024
Efficient Streaming Language Models with Attention Sinks
Guangxuan Xiao¹, Yuandong Tian², Beidi Chen³, Song Han¹⁴, Mike Lewis²
ICLR 2024
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Ji Lin*, Jiaming Tang*, Haotian Tang, Shang Yang, Wei-Ming Chen, Wei-Chen Wang, Guangxuan Xiao, Xingyu Dang, Chuang Gan, and Song Han
MLSys 2024
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Yukang Chen¹, Shengju Qian¹, Haotian Tang², Xin Lai¹, Zhijian Liu², Song Han², Jiaya Jia¹
ICLR 2024
Tiny Machine Learning Projects
Ji Lin, Ligeng Zhu, Wei-Ming Chen, Wei-Chen Wang, Han Cai, Guangxuan Xiao, Haotian Tang, Shang Yang, Yujun Lin, and Song Han
NeurIPS 2020/2021/2022, MICRO 2023, ICML 2023, MLSys 2024, IEEE CAS Magazine 2023
Tiny Machine Learning: Progress and Futures [Feature]
Ji Lin, Ligeng Zhu, Wei-Ming Chen, Wei-Chen Wang, and Song Han
IEEE CAS Magazine
TorchSparse++: Efficient Training and Inference Framework for Sparse Convolution on GPUs
Haotian Tang*¹, Shang Yang*¹², Zhijian Liu¹, Ke Hong², Zhongming Yu³, Xiuyu Li⁴, Guohao Dai⁵, Yu Wang², Song Han¹
MICRO 2023
PockEngine: Sparse and Efficient Fine-tuning in a Pocket
Ligeng Zhu, Lanxiang Hu, Ji Lin, Wei-Chen Wang, Wei-Ming Chen, Chuang Gan, Song Han
MICRO 2023
EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction
Han Cai, Junyan Li, Muyan Hu, Chuang Gan, Song Han
ICCV 2023
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Guangxuan Xiao*¹, Ji Lin*¹, Mickael Seznec², Hao Wu², Julien Demouth², Song Han¹
ICML 2023
FlatFormer: Flattened Window Attention for Efficient Point Cloud Transformer
Zhijian Liu*, Xinyu Yang*, Haotian Tang, Shang Yang, Song Han
CVPR 2023
SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer
Xuanyao Chen*¹, Zhijian Liu*², Haotian Tang², Li Yi¹, Hang Zhao¹, Song Han²
CVPR 2023
Retrospective: EIE: Efficient Inference Engine on Sparse and Compressed Neural Network
Song Han¹³, Xingyu Liu⁴, Huizi Mao³, Jing Pu⁵, Ardavan Pedram²⁶, Mark A. Horowitz², William J. Dally²³
ISCA 2023
BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation
Zhijian Liu*, Haotian Tang*, Alexander Amini, Xinyu Yang, Huizi Mao, Daniela L. Rus, Song Han
ICRA 2023
Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models
Muyang Li¹, Ji Lin¹, Chenlin Meng³, Stefano Ermon³, Song Han¹, and Jun-Yan Zhu²
NeurIPS 2022 & TPAMI
TorchSparse: Efficient Point Cloud Inference Engine
Haotian Tang*, Zhijian Liu*, Xiuyu Li*, Yujun Lin, Song Han
MLSys 2022
QuantumNAT: Quantum Noise-Aware Training with Noise Injection, Quantization and Normalization
Hanrui Wang¹, Jiaqi Gu², Yongshan Ding³, Zirui Li⁴, David Z. Pan³, Frederic T. Chong⁵, Song Han¹
DAC 2022
QOC: Quantum On-Chip Training with Parameter Shift and Gradient Pruning
Hanrui Wang¹, Zirui Li², Jiaqi Gu³, Yongshan Ding⁴, Yujun Lin¹, David Z. Pan³, Frederic T. Chong⁵, Song Han¹
DAC 2022
On-Device Training Under 256KB Memory
Ji Lin*, Ligeng Zhu*, Wei-Ming Chen, Wei-Chen Wang, Chuang Gan, Song Han
NeurIPS 2022
Lite Pose: Efficient Architecture Design for 2D Human Pose Estimation
Yihan Wang, Muyang Li, Han Cai, Wei-Ming Chen, Song Han
CVPR 2022
QuantumNAS: Noise-Adaptive Search for Robust Quantum Circuits
Hanrui Wang¹, Yongshan Ding², Jiaqi Gu³, Zirui Li⁴, Yujun Lin¹, David Z. Pan³, Frederic T. Chong⁵, Song Han¹
HPCA 2022
Network Augmentation for Tiny Deep Learning
Han Cai, Chuang Gan, Ji Lin, Song Han
ICLR 2022
NAAS: Neural Accelerator Architecture Search
Yujun Lin*¹, Mengtian Yang*², Song Han¹
DAC 2021
MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning
Ji Lin, Wei-Ming Chen, Han Cai, Chuang Gan, Song Han
NeurIPS 2021
PointAcc: Efficient Point Cloud Accelerator
Yujun Lin, Zhekai Zhang, Haotian Tang, Hanrui Wang, Song Han
MICRO 2021
SemAlign: Annotation-Free Camera-LiDAR Calibration with Semantic Alignment Loss
Zhijian Liu*, Haotian Tang*, Sibo Zhu*, Song Han
IROS 2021
Anycost GANs for Interactive Image Synthesis and Editing
Ji Lin, Richard Zhang, Frieder Ganz, Song Han, Jun-Yan Zhu
CVPR 2021
SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
Hanrui Wang, Zhekai Zhang, Song Han
HPCA 2021
Differentiable Augmentation for Data-Efficient GAN Training
Shengyu Zhao, Zhijian Liu, Ji Lin, Jun-Yan Zhu, Song Han
NeurIPS 2020
TinyTL: Reduce Activations, Not Trainable Parameters for Efficient On-Device Learning
Han Cai, Chuang Gan, Ligeng Zhu, Song Han
NeurIPS 2020
Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution
Haotian Tang*, Zhijian Liu*, Shengyu Zhao, Yujun Lin, Ji Lin, Hanrui Wang, Song Han
ECCV 2020
MCUNet: Tiny Deep Learning on IoT Devices
Ji Lin, Wei-Ming Chen, Yujun Lin, John Cohn, Chuang Gan, Song Han
NeurIPS 2020
APQ: Joint Search for Network Architecture, Pruning and Quantization Policy
Tianzhe Wang, Kuan Wang, Han Cai, Ji Lin, Zhijian Liu, Song Han
CVPR 2020
GAN Compression: Efficient Architectures for Interactive Conditional GANs
Muyang Li¹, Ji Lin², Yaoyao Ding³, Zhijian Liu², Jun-Yan Zhu¹ and Song Han²
CVPR 2020 & TPAMI
SpArch: Efficient Architecture for Sparse Matrix Multiplication
Zhekai Zhang*, Hanrui Wang*, Song Han, William J. Dally
HPCA 2020
Once-for-All: Train One Network and Specialize it for Efficient Deployment
Han Cai, Chuang Gan, Tianzhe Wang, Zhekai Zhang, Song Han
ICLR 2020
Lite Transformer with Long-Short Range Attention
Zhanghao Wu*, Zhijian Liu*, Ji Lin, Yujun Lin, Song Han
ICLR 2020
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
Hanrui Wang¹, Zhanghao Wu¹, Zhijian Liu¹, Han Cai¹, Ligeng Zhu¹, Chuang Gan², Song Han¹
ACL 2020
Point-Voxel CNN for Efficient 3D Deep Learning
Zhijian Liu*, Haotian Tang*, Yujun Lin, Song Han
NeurIPS 2019
TSM: Temporal Shift Module for Efficient Video Understanding
Ji Lin¹, Chuang Gan², Song Han¹
ICCV 2019
HAQ: Hardware-Aware Automated Quantization
Kuan Wang*, Zhijian Liu*, Yujun Lin*, Ji Lin, and Song Han
CVPR 2019
ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
Han Cai, Ligeng Zhu, Song Han
ICLR 2019
Deep Gradient Compression: Reducing the Communication Bandwidth in Distributed Training
Yujun Lin¹, Song Han², Huizi Mao², Yu Wang¹, William J. Dally²³
ICLR 2018
AMC: AutoML for Model Compression and Acceleration on Mobile Devices
Yihui He*, Ji Lin*, Zhijian Liu, Hanrui Wang, Li-Jia Li, and Song Han
ECCV 2018
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
Song Han, Huizi Mao, and William J. Dally
ICLR 2016
EIE: Efficient Inference Engine on Compressed Deep Neural Network
Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark A. Horowitz, William J. Dally
ISCA 2016
Learning both Weights and Connections for Efficient Neural Networks
Song Han, Jeff Pool, John Tran, William Dally
NIPS 2015