Once-for-All: Train One Network and Specialize it for Efficient Deployment

Competition Awards

First Place, Low-Power Computer Vision Challenge (CPU Detection, FPGA) @ CVPR 2020

First Place, Low-Power Computer Vision Workshop at ICCV 2019 (DSP) @ ICCV 2019

First Place, Low-Power Image Recognition Challenge (classification, detection) @ IEEE 2019

Abstract

We address the challenging problem of efficient inference across many devices and resource constraints, especially on edge devices. Conventional approaches either manually design a specialized neural network or use neural architecture search (NAS) to find one, and then train it from scratch for each case, which is computationally prohibitive (causing CO2 emissions as much as five cars' lifetimes) and thus unscalable. In this work, we propose to train a once-for-all (OFA) network that supports diverse architectural settings by decoupling training and search, to reduce the cost. We can quickly get a specialized sub-network by selecting from the OFA network without additional training. To efficiently train OFA networks, we also propose a novel progressive shrinking algorithm, a generalized pruning method that reduces the model size across many more dimensions than pruning (depth, width, kernel size, and resolution). It can obtain a surprisingly large number of sub-networks (> 10^19) that can fit different hardware platforms and latency constraints while maintaining the same level of accuracy as training independently. On diverse edge devices, OFA consistently outperforms state-of-the-art (SOTA) NAS methods (up to 4.0% ImageNet top-1 accuracy improvement over MobileNetV3, or the same accuracy but 1.5x faster than MobileNetV3 and 2.6x faster than EfficientNet in terms of measured latency) while reducing GPU hours and CO2 emissions by many orders of magnitude. In particular, OFA achieves a new SOTA 80.0% ImageNet top-1 accuracy under the mobile setting (<600M MACs). OFA is the winning solution for the 3rd Low Power Computer Vision Challenge (LPCVC), DSP classification track, and the 4th LPCVC, both the classification track and the detection track. Code and 50 pre-trained models (for many devices and many latency constraints) are released on GitHub.
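To give a feel for how the sub-network count can exceed 10^19, here is a minimal Python sketch. It assumes a search space that is not spelled out on this page: 5 units, each with 2, 3, or 4 layers, where every layer picks a kernel size from {3, 5, 7} and a width expansion ratio from {3, 4, 6}. Input resolution is left out of the count. The names `count_subnets` and `sample_subnet` are hypothetical and are not part of the released OFA code.

```python
import random

# Assumed search space (an illustration, not taken from this page):
# 5 units, each with a variable number of layers; every layer picks
# a kernel size and a width expansion ratio.
NUM_UNITS = 5
DEPTHS = (2, 3, 4)
KERNEL_SIZES = (3, 5, 7)
EXPAND_RATIOS = (3, 4, 6)


def count_subnets():
    """Count distinct architectures in the assumed search space."""
    per_layer = len(KERNEL_SIZES) * len(EXPAND_RATIOS)  # 9 choices per layer
    per_unit = sum(per_layer ** d for d in DEPTHS)       # 9^2 + 9^3 + 9^4
    return per_unit ** NUM_UNITS


def sample_subnet(rng):
    """Draw one configuration, choosing each setting independently at random."""
    config = []
    for _ in range(NUM_UNITS):
        depth = rng.choice(DEPTHS)
        layers = [
            {"kernel": rng.choice(KERNEL_SIZES), "expand": rng.choice(EXPAND_RATIOS)}
            for _ in range(depth)
        ]
        config.append(layers)
    return config


if __name__ == "__main__":
    print(f"{count_subnets():.2e} candidate sub-networks")
    print(sample_subnet(random.Random(0)))
```

Under these assumptions each unit has 9^2 + 9^3 + 9^4 = 7371 possible layouts, so five units give 7371^5, roughly 2 x 10^19 architectures. In OFA, a sampled configuration like this would select weights from the already trained network rather than trigger new training.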

News

Once-for-All is available at PyTorch Hub now!
Once-for-All (OFA) Network is adopted by SONY Neural Architecture Search Library.
Once-for-All (OFA) Network is adopted by ADI MAX78000/MAX78002 Model Training and Synthesis Tool.
Once-for-All (OFA) Network is adopted by Alibaba and ranked 1st in the open division of the MLPerf Inference Benchmark (Datacenter and Edge).
First place in the CVPR 2020 Low-Power Computer Vision Challenge, CPU detection and FPGA tracks.
OFA-ResNet50 is released.
The hands-on tutorial of OFA is released!
OFA is available via pip! Run pip install ofa to install the whole OFA codebase.
First place in the 4th Low-Power Computer Vision Challenge, both classification and detection tracks.
First place in the 3rd Low-Power Computer Vision Challenge, DSP track at ICCV’19, using the Once-for-All network.

Train Once, Specialize for Many Deployment Scenarios

Accelerate Search with Accuracy/Latency Prediction Models

Video

Citation

@inproceedings{cai2020once,
  title={Once for All: Train One Network and Specialize it for Efficient Deployment},
  author={Han Cai and Chuang Gan and Tianzhe Wang and Zhekai Zhang and Song Han},
  booktitle={International Conference on Learning Representations},
  year={2020},
  url={https://arxiv.org/pdf/1908.09791.pdf}
}

Media

Acknowledgment

We thank NSF Career Award #1943349, the MIT-IBM Watson AI Lab, the Google-Daydream Research Award, Samsung, Intel, Xilinx, SONY, and the AWS Machine Learning Research Award for supporting this research. We thank Samsung, Google, and LG for donating mobile phones.

Team Members

Han Cai

Zhekai Zhang

Song Han