Anycost GANs for Interactive Image Synthesis and Editing

Ji Lin, Richard Zhang, Frieder Ganz, Song Han, Jun-Yan Zhu
MIT, Adobe Research, CMU
(* indicates equal contribution)

News

Waiting for more news.

Awards

No items found.

Competition Awards

No items found.

Abstract

Generative adversarial networks (GANs) have enabled photorealistic image synthesis and editing. However, due to the high computational cost of large-scale generators (e.g., StyleGAN2), it usually takes seconds to see the results of a single edit on edge devices, prohibiting interactive user experience. In this paper, inspired by quick preview features in modern rendering software, we propose Anycost GAN for interactive natural image editing. We train the Anycost GAN to support elastic resolutions and channels for faster image generation at versatile speeds. Running subsets of the full generator produce outputs that are perceptually similar to the full generator, making them a good proxy for quick preview. By using sampling-based multi-resolution training, adaptive-channel training, and a generator-conditioned discriminator, the anycost generator can be evaluated at various configurations while achieving better image quality compared to separately trained models. Furthermore, we develop new encoder training and latent code optimization techniques to encourage consistency between the different sub-generators during image projection. Anycost GAN can be executed at various cost budgets (up to 10× computation reduction) and adapt to a wide range of hardware and latency requirements. When deployed on desktop CPUs and edge devices, our model can provide perceptually similar previews at 6-12× speedup, enabling interactive image editing.

Open In Colab

Overview

Anycost generators can be run at diverse computation costs by using different channel and resolution configurations. Sub-generators achieve high output consistency compared to the full generator, providng a fast preview.

With ① Sampling-based multi-resolution training; ② adaptive-channel training; ③ generator-conditioned discriminator, we can achieve high image quality and consistency at different resolutions and channels.

Results

Anycost GAN (uniform channel version) supports 4 resolutions and 4 channel ratios, producing visually consistent images with different image fidelity.

The consistency retains during image projection and editing:

Video

Citation

@inproceedings{lin2021anycost,

author    = {Lin, Ji and Zhang, Richard and Ganz, Frieder and Han, Song and Zhu, Jun-Yan},

title     = {Anycost GANs for Interactive Image Synthesis and Editing},

booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},

year      = {2021},

}

Media

No media articles found.

Acknowledgment

We thank Taesung Park, Zhixin Shu, Muyang Li, and Han Cai for the helpful discussion. Part of the work is supported by NSF CAREER Award #1943349, Adobe, Naver Corporation, and MIT-IBM Watson AI Lab.

Team Members