Network Augmentation for Tiny Deep Learning

Han Cai, Chuang Gan, Ji Lin, Song Han
MIT, MIT-IBM Watson AI Lab
(* indicates equal contribution)




We introduce Network Augmentation (NetAug), a new training method for improving the performance of tiny neural networks. Existing regularization techniques (e.g., data augmentation, dropout) have shown much success on large neural networks by adding noise to overcome over-fitting. However, we found that these techniques hurt the performance of tiny neural networks. We argue that training tiny models is different from training large models: rather than augmenting the data, we should augment the model, since tiny models tend to suffer from under-fitting rather than over-fitting due to limited capacity. To alleviate this issue, NetAug augments the network (reverse dropout) instead of inserting noise into the dataset or the network. It puts the tiny model into larger models and encourages it to work as a sub-model of larger models to get extra supervision, in addition to functioning as an independent model. At test time, only the tiny model is used for inference, incurring zero inference overhead. We demonstrate the effectiveness of NetAug on image classification and object detection. NetAug consistently improves the performance of tiny models, achieving up to 2.2% accuracy improvement on ImageNet. On object detection, NetAug reaches the same level of performance with 41% fewer MACs on Pascal VOC and 38% fewer MACs on COCO than the baseline.

Training Tiny Neural Networks is Different from Training Large Neural Networks

Augment Tiny Neural Networks to Get More Supervision During Training
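
The abstract describes the tiny model as a sub-model of larger models that share its weights, trained both on its own and inside the larger model. The following is a minimal pure-Python sketch of that idea, not the authors' implementation. It assumes that augmentation widens a single hidden layer, that the tiny model uses the first `tiny_width` hidden units of the shared weights, and that the two losses are combined with a weight `alpha`. The names `forward`, `netaug_loss`, `macs`, and `alpha`, and the toy one-hidden-layer regression setup, are all hypothetical choices made for illustration.

```python
import random

def forward(x, w1, w2, width):
    """One-hidden-layer ReLU network that uses only the first `width` hidden units."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(w1[j], x))) for j in range(width)]
    return sum(w2[j] * hidden[j] for j in range(width))

def netaug_loss(x, y, w1, w2, tiny_width, aug_width, alpha=1.0):
    """Loss of the tiny model alone plus alpha times the loss of the augmented model.

    Both models read from the same weight lists, so the augmented term also
    supervises the weights that the tiny model keeps at test time.
    """
    loss_tiny = (forward(x, w1, w2, tiny_width) - y) ** 2
    loss_aug = (forward(x, w1, w2, aug_width) - y) ** 2
    return loss_tiny + alpha * loss_aug

def macs(in_dim, width):
    """Multiply-accumulate count of one forward pass at the given hidden width."""
    return in_dim * width + width

# Toy setup (assumed sizes): the augmented model is 3x wider than the tiny one.
rng = random.Random(0)
IN_DIM, TINY_WIDTH, AUG_WIDTH = 4, 2, 6
w1 = [[rng.uniform(-1, 1) for _ in range(IN_DIM)] for _ in range(AUG_WIDTH)]
w2 = [rng.uniform(-1, 1) for _ in range(AUG_WIDTH)]
x, y = [0.5, -1.0, 2.0, 0.1], 1.0

# Training objective: the tiny model gets supervision from both terms.
train_loss = netaug_loss(x, y, w1, w2, TINY_WIDTH, AUG_WIDTH, alpha=1.0)

# Inference: only the tiny slice is evaluated, so the extra units cost nothing.
tiny_prediction = forward(x, w1, w2, TINY_WIDTH)
tiny_macs = macs(IN_DIM, TINY_WIDTH)
aug_macs = macs(IN_DIM, AUG_WIDTH)
```

In this sketch the extra hidden units exist only during training. Calling `forward` with `TINY_WIDTH` never touches them, which is the sense in which the augmented model adds no inference overhead.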

Experiment Results





title={Network Augmentation for Tiny Deep Learning},
author={Han Cai and Chuang Gan and Ji Lin and Song Han},
booktitle={International Conference on Learning Representations},







We thank the National Science Foundation, MIT-IBM Watson AI Lab, Hyundai, Ford, Intel, and Amazon for supporting this research.

Team Members