MCUNet: Tiny Deep Learning on IoT Devices

Ji Lin 1 , Wei-Ming Chen 1,2 , Yujun Lin 1 , John Cohn 3 , Chuang Gan 3 , Song Han 1
Massachusetts Institute of Technology, National Taiwan University, MIT-IBM Watson AI Lab


Machine learning on tiny IoT devices based on microcontroller units (MCU) is appealing but challenging: the memory of microcontrollers is 2-3 orders of magni-tude smaller even than mobile phones. We propose MCUNet, a framework that jointly designs the efficient neural architecture (TinyNAS) and the lightweight infer-ence engine (TinyEngine), enabling ImageNet-scale inference on microcontrollers.TinyNAS adopts a two-stage neural architecture search approach that first opti-mizes the search space to fit the resource constraints, then specializes the networkarchitecture in the optimized search space. TinyNAS can automatically handle diverse constraints (i.e. device, latency, energy, memory) under low search costs. TinyNAS is co-designed with TinyEngine, a memory-efficient inference library to expand the search space and fit a larger model. TinyEngine adapts the memory scheduling according to the overall network topology rather than layer-wise optimization, reducing the memory usage by 3.4x, and accelerating the inference by 1.7-3.3x compared to TF-Lite Micro and CMSIS-NN. MCUNet is the first to achieves >70% ImageNet top1 accuracy on an off-the-shelf commercial microcontroller, using 3.5x less SRAM and 5.7x less Flash compared to quantized MobileNetV2 and ResNet-18. On visual&audio wake words tasks, MCUNet achieves state-of-the-art accuracy and runs 2.4-3.4x faster than MobileNetV2and ProxylessNAS-based solutions with 3.7-4.1x smaller peak SRAM. Our study suggests that the era of always-on tiny machine learning on IoT devices has arrived.

Challenge: Memory Too Small to Hold DNNs

Existing Methods Reduce Model Size, but not the Activation Size

MCUNet: System-Algorithm Co-design

1. TinyNAS: Two-Stage NAS for Tiny Memory

2. TinyEngine: Memory-Efficient Inference Library

Experimental Results


  title={MCUNet: Tiny Deep Learning on IoT Devices},
  author={Lin, Ji and Chen, Wei-Ming and Cohn, John and Gan, Chuang and Han, Song},
  booktitle={Annual Conference on Neural Information Processing Systems (NeurIPS)},


Acknowledgments: We thank MIT Satori cluster for providing the computation resource. We thank MIT-IBM Watson AILab, Qualcomm, NSF CAREER Award #1943349 and NSF RAPID Award #2027266 for supportingthis research..