We propose SmoothQuant, a training-free, accuracy-preserving, and general-purpose post-training quantization (PTQ) solution to enable 8-bit weight, 8-bit activation (W8A8) quantization for LLMs.
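The key mechanism is an offline per-channel rescaling that migrates quantization difficulty from activations (which have large outlier channels) to weights (which are easy to quantize), while leaving the layer's output mathematically unchanged. A minimal sketch for a linear layer, using the paper's default migration strength α = 0.5 (the helper name `smooth_scales` is ours):

```python
import torch

def smooth_scales(x_absmax, weight, alpha=0.5):
    """Per-input-channel factors s_j = max|X_j|^alpha / max|W_j|^(1 - alpha).

    x_absmax: activation abs-max per input channel (from calibration), shape [in].
    weight:   linear weight of shape [out, in].
    """
    w_absmax = weight.abs().amax(dim=0)                 # per input channel, shape [in]
    return (x_absmax.pow(alpha) / w_absmax.pow(1 - alpha)).clamp(min=1e-5)

# Toy calibration batch with one outlier activation channel.
x = torch.randn(16, 4)
x[:, 0] *= 50                                           # channel 0 dominates the range
w = torch.randn(8, 4)

s = smooth_scales(x.abs().amax(dim=0), w)
x_s, w_s = x / s, w * s                                 # X' = X diag(s)^-1, W' = W diag(s)

# The output is unchanged, but X' has a much flatter per-channel range,
# so INT8 activation quantization loses far less accuracy.
assert torch.allclose(x @ w.t(), x_s @ w_s.t(), rtol=1e-4, atol=1e-4)
```

Because X′W′ᵀ = XWᵀ exactly, the division by s can be folded into the preceding operation (e.g., a LayerNorm's affine parameters), so smoothing adds no runtime overhead.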
An engine that selectively performs computation only in the edited regions to accelerate image editing applications.
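The caching idea can be sketched for a single 3×3 convolution: precompute the output on the original image, detect which tiles the edit affects, and recompute only those tiles (gathered with a one-pixel halo), reusing the cache everywhere else. This is an illustrative sketch under those assumptions; `sparse_edit_conv` and the tiling scheme are ours, and the actual engine fuses the gather/scatter into optimized kernels:

```python
import torch
import torch.nn.functional as F

def sparse_edit_conv(conv, x_orig, x_edit, y_cache, tile=16, tol=1e-6):
    """Recompute a stride-1, padding-1 3x3 conv only on tiles the edit affects.

    y_cache = conv(x_orig) is precomputed. The change map is dilated by the
    conv's 1-pixel receptive-field overlap, affected tiles are gathered with
    a 1-pixel halo and recomputed, and everything else reuses the cache.
    """
    _, _, H, W = x_edit.shape
    diff = (x_edit - x_orig).abs().amax(dim=1, keepdim=True)
    diff = F.max_pool2d(diff, 3, stride=1, padding=1)[0, 0]  # dilate change map
    y = y_cache.clone()
    for i in range(0, H, tile):
        for j in range(0, W, tile):
            if diff[i:i + tile, j:j + tile].max() <= tol:
                continue                                     # tile unaffected: keep cache
            th, tw = min(tile, H - i), min(tile, W - j)
            i0, j0 = max(i - 1, 0), max(j - 1, 0)            # gather with 1-pixel halo
            i1, j1 = min(i + th + 1, H), min(j + tw + 1, W)
            out = conv(x_edit[:, :, i0:i1, j0:j1])
            y[:, :, i:i + th, j:j + tw] = out[:, :, i - i0:i - i0 + th,
                                              j - j0:j - j0 + tw]
    return y

conv = torch.nn.Conv2d(3, 8, 3, padding=1)
x0 = torch.randn(1, 3, 64, 64)
x1 = x0.clone()
x1[:, :, 20:30, 20:30] += 1.0                                # a small local edit
assert torch.allclose(sparse_edit_conv(conv, x0, x1, conv(x0)), conv(x1), atol=1e-5)
```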
Anycost GAN generates consistent outputs under a wide range of fine-grained computation budgets.
Differentiable augmentation to improve the data efficiency of GAN training.
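The mechanism: augment both real and generated images with the same random transforms before the discriminator, and keep every transform differentiable so gradients still reach the generator. A minimal sketch with a toy policy (`diff_augment` here is our simplified version; the DiffAugment paper uses color, translation, and cutout transforms):

```python
import torch

def diff_augment(x):
    """Random brightness shift + cutout, both differentiable w.r.t. x."""
    n, _, h, w = x.shape
    x = x + (torch.rand(n, 1, 1, 1, device=x.device) - 0.5)   # brightness
    mask = torch.ones_like(x[:, :1])                           # cutout via mask multiply
    for k, (cy, cx) in enumerate(zip(torch.randint(h, (n,)).tolist(),
                                     torch.randint(w, (n,)).tolist())):
        mask[k, :, max(cy - h // 8, 0):cy + h // 8,
                   max(cx - w // 8, 0):cx + w // 8] = 0
    return x * mask

# Gradients flow through the augmentation, so the generator can be
# trained against a discriminator that only sees augmented images:
fake = torch.randn(4, 3, 32, 32, requires_grad=True)          # stand-in for G(z)
diff_augment(fake).sum().backward()
assert fake.grad is not None
```

During training the same policy is applied to both real and fake batches in both the D and G updates, so the discriminator cannot exploit a distribution shift and the augmentations do not leak into what the generator produces.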
A general-purpose compression framework for reducing the inference time and model size of the generator in conditional GANs.
In this blog post, we introduce DistriFusion, a training-free algorithm that harnesses multiple GPUs to accelerate diffusion model inference without sacrificing image quality. It reduces SDXL latency by up to 6.1× on 8 A100 GPUs. Our work was accepted to CVPR 2024 as a highlight. Code: https://github.com/mit-han-lab/distrifusion
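At a high level, DistriFusion splits the image into patches, one per GPU, and exploits the similarity between adjacent diffusion steps: each device computes fresh activations only for its own patch and reuses slightly stale activations from the previous step for the rest (the paper calls this displaced patch parallelism). Below is a loose, single-process sketch of the idea at the output level; the actual method reuses stale per-layer activations and hides the patch exchange behind asynchronous all-gather, and all names here are ours:

```python
import torch

def displaced_patch_step(denoise, x_t, prev_full, rank, world):
    """One toy denoising step from the perspective of one 'GPU'.

    The rank freshly denoises only its own horizontal stripe (1/world of the
    work) and fills the remaining stripes with the previous step's (stale)
    output, which is nearly identical between adjacent diffusion steps.
    """
    stripes = list(prev_full.chunk(world, dim=2))       # stale context
    h = x_t.size(2) // world
    stripes[rank] = denoise(x_t.narrow(2, rank * h, h)) # fresh compute
    # In DistriFusion, fresh patches are exchanged with an async all-gather
    # that overlaps with the next step's computation.
    return torch.cat(stripes, dim=2)

denoise = lambda p: 0.9 * p                             # stand-in for the UNet
x_t = torch.randn(1, 4, 64, 64)
prev = denoise(x_t)                                     # pretend previous full step
out = displaced_patch_step(denoise, x_t, prev, rank=0, world=4)
print(out.shape)                                        # torch.Size([1, 4, 64, 64])
```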