Efficient AI Computing,
Transforming the Future.

Team

Principal Investigator

Song Han is an Associate Professor at MIT EECS. He received his PhD degree from Stanford University. His research focuses on efficient deep learning computing. He proposed the “deep compression” technique, which can reduce neural network size by an order of magnitude without losing accuracy, and the hardware implementation “Efficient Inference Engine,” which was the first to exploit pruning and weight sparsity in deep learning accelerators. His team’s work on hardware-aware neural architecture search (ProxylessNAS, Once-for-All Network (OFA), MCUNet) was highlighted by MIT News, Wired, Qualcomm News, VentureBeat, and IEEE Spectrum, integrated into PyTorch and AutoGluon, received six low-power computer vision contest awards at flagship AI conferences, and set a world record in the open division of the MLPerf inference benchmark (1.078M img/s). Song received Best Paper awards at ICLR’16 and FPGA’17, the Amazon Machine Learning Research Award, the SONY Faculty Award, the Facebook Faculty Award, and the NVIDIA Academic Partnership Award. He was named one of MIT Technology Review’s “35 Innovators Under 35” for his contribution to the “deep compression” technique, which “lets powerful artificial intelligence (AI) programs run more efficiently on low-power mobile devices.” Song received the NSF CAREER Award for “efficient algorithms and hardware for accelerated machine learning” and the IEEE “AI’s 10 to Watch: The Future of AI” award.

Song’s cutting-edge research in efficient AI computing has profoundly influenced the industry. He was a cofounder of DeePhi (now part of AMD) and a cofounder of OmniML (now part of NVIDIA).

Team Members

Postdoctoral

Wei-Chen's research focuses on efficient deep learning, TinyML, embedded systems, and memory/storage systems. He has received several accolades for his work, including the ACM/IEEE CODES+ISSS Best Paper Award, the IEEE NVMSA Best Paper Award, and the Best Poster Award at the NSF Athena AI Institute. In addition, he received first place (among 150 teams) in the flash consumption track of the ACM/IEEE TinyML Design Contest at ICCAD 2022. His research has received over 1,300 stars on GitHub, and his work "On-device training under 256KB memory" (MCUNetV3) was highlighted on the MIT homepage.

Ph.D

His research focuses on the development of efficient algorithms and systems for deep learning, specifically large foundation models. His work has received over 7,000 stars on GitHub and has real-world impact: SmoothQuant has been integrated into NVIDIA's FasterTransformer and Intel's Neural Compressor, and is used in the LLMs of companies such as Amazon, Meta, and Hugging Face.

Han Cai is a fifth-year Ph.D. student. His research focuses on algorithms and applications of efficient deep learning. He is a recipient of the Qualcomm Innovation Fellowship.

Hanrui Wang is a final-year Ph.D. student at MIT EECS advised by Prof. Song Han. His research focuses on quantum computer architecture and machine learning for quantum computing. His work has been recognized with a first-place award in the ACM Student Research Competition, a Best Poster Award at the NSF AI Institute, and a Best Presentation Award as a DAC Young Fellow, and appears in top conferences such as MICRO, HPCA, DAC, ICCAD, and NeurIPS. His co-authored paper received the ICML RL4RL Best Paper Award. He is a recipient of the Qualcomm Fellowship and the Unitary Fund, and an NVIDIA Fellowship finalist. He is the creator of the TorchQuantum library, which has been adopted by the IBM Qiskit Ecosystem and the NVIDIA cuQuantum Appliance. He is also the co-founder of the QuCS lecture series for quantum education.

Haotian's research interests lie at the intersection of computer systems and machine learning, specifically full-stack efficient 3D deep learning for autonomous driving. He has authored multiple papers in this field, with over 1,300 citations and over 4,200 GitHub stars. He has successfully advised several undergraduate students, and the interns he mentored have gone on to Ph.D. programs at MIT and UC Berkeley.

Ligeng's research focuses on the intersection of efficient deep learning systems and algorithms. His work "On-device training under 256KB memory" (MCUNetV3) was highlighted on the MIT homepage in fall 2022. His projects have been integrated into frameworks such as PyTorch and AutoGluon and have been covered by media outlets including MIT News and IEEE Spectrum. He was awarded the Qualcomm Innovation Fellowship, and his research has received over 3,100 citations on Google Scholar and over 8,000 stars on GitHub.

Muyang Li is a first-year Ph.D. student at MIT, advised by Prof. Song Han. He obtained his master’s degree from the Robotics Institute at CMU, advised by Prof. Jun-Yan Zhu, and his bachelor’s degree from Zhiyuan College (ACM Class), Shanghai Jiao Tong University. His research interests lie at the intersection of machine learning, systems, and computer graphics. He is currently working on building efficient and hardware-friendly generative models and their applications in computer vision and graphics.

Shang Yang is a first-year Ph.D. student at MIT, advised by Prof. Song Han. He received his B.Eng. degree from Tsinghua University. His research interests lie in efficient systems for machine learning applications, such as LLMs and point clouds.

Yujun's research focuses on the intersection of computer architecture and machine learning, particularly software-hardware co-design for deep learning and its applications. He was awarded the 2021 Qualcomm Innovation Fellowship, and he is a founding member of the teaching crew for MIT 6.S965, the new course on TinyML and efficient deep learning computing, which has received 12k views on YouTube.

Zhekai's research focuses on the development of high-performance and efficient hardware architectures for sparse linear algebra and deep learning. He has published several papers in the field of computer architecture, which have received over 250 citations.

Zhijian's research focuses on algorithms, systems, and applications for efficient deep learning. He is a recipient of the Qualcomm Innovation Fellowship and an ML and Systems Rising Star. He received his B.Eng. degree from Shanghai Jiao Tong University.

Master

She completed her undergrad in CS at MIT in 2023, and is currently an MEng student interested in performant systems and machine learning.

Undergraduate

Angela Li is a fourth-year undergraduate student studying Computer Science and Economics. Her research interests lie in natural language processing and machine learning.

Maggie Liu is a first-year undergraduate student who loves competitive programming and hackathons and is interested in machine learning.

Visiting

Jiaming Tang is a visiting student at MIT HAN Lab. He is an undergraduate at ACM Honors Class, Shanghai Jiao Tong University. His research interests lie in efficient systems and algorithms for large language models.

Graduated

Ph.D

Ji's research focuses on efficient deep learning computing, systems for ML, and, more recently, accelerating large language models (LLMs). He is a pioneer in the field of TinyML. His research has received over 6,500 citations on Google Scholar and over 7,000 stars on GitHub. Ji was an NVIDIA Graduate Fellowship finalist in 2020 and a Qualcomm Innovation Fellowship recipient in 2022.

Postdoctoral

Wei-Ming Chen was a Postdoctoral Associate at MIT EECS advised by Professor Song Han. His research focuses on TinyML, embedded systems, and real-time systems, with a particular emphasis on enabling efficient deep learning on Internet of Things (IoT) devices such as microcontrollers. His recent work on the MCUNet series (MCUNet, MCUNetV2, and MCUNetV3) has enabled efficient inference and training on devices with limited memory through the co-design of systems and algorithms. He is also a key contributor to and maintainer of TinyEngine, an open-source library for high-performance and memory-efficient deep learning on microcontrollers. His work "On-device training under 256KB memory" (MCUNetV3) was highlighted on the MIT homepage in fall 2022. He received first place (among 150 teams) in the flash consumption track of the ACM/IEEE TinyML Design Contest at ICCAD 2022. He also developed TinyChatEngine, which enables LLM inference on the edge (laptops, Raspberry Pi). His research has received more than 1,000 stars on GitHub. He has since joined NVIDIA as a senior deep learning engineer working on large language model acceleration.

Master

Kevin Shao was an M.Eng student at MIT HAN Lab, working on autonomous driving and efficient 3D deep learning. After graduation, he joined Two Sigma as a Quantitative Researcher.

Driss Hafdi was an M.Eng student at MIT EECS, working on specialized hardware for mixed-precision quantization. After graduation, he joined Hudson River Trading as an FPGA developer.

Openings

If you work on efficient LLMs, VLMs, or GenAI and are interested in joining us, please fill in the recruiting form. Inquiry emails will not be replied to if the recruiting form is incomplete. PhD applicants: select the "ML+System" track in the MIT PhD application system.

Sponsors

We actively collaborate with industry partners on efficient AI, model compression, and acceleration. Our research has influenced and landed in many industrial products: Intel OpenVINO, Intel Neural Network Distiller, Intel Neural Compressor, Apple Neural Engine, NVIDIA Sparse Tensor Core, NVIDIA TensorRT-LLM, AMD-Xilinx Vitis AI, Qualcomm AI Model Efficiency Toolkit (AIMET), Amazon AutoGluon, Facebook PyTorch, Microsoft NNI, SONY Neural Architecture Search Library, SONY Model Compression Toolkit, and ADI MAX78000/MAX78002 Model Training and Synthesis Tool.