Condition-Aware Neural Network for Controlled Image Generation

Han Cai, Muyang Li, Zhuoyang Zhang, Qinsheng Zhang, Ming-Yu Liu, Song Han
MIT, Tsinghua University, NVIDIA
(* indicates equal contribution)

Abstract

We present Condition-Aware Neural Network (CAN), a new method for adding control to image generative models. In parallel to prior conditional control methods, CAN controls the image generation process by dynamically manipulating the weight of the neural network. This is achieved by introducing a condition-aware weight generation module that generates conditional weight for convolution/linear layers based on the input condition. We test CAN on class-conditional image generation on ImageNet and text-to-image generation on COCO. CAN consistently delivers significant improvements for diffusion transformer models, including DiT and UViT. In particular, CAN combined with EfficientViT (CaT) achieves 2.78 FID on ImageNet 512x512, surpassing DiT-XL/2 while requiring 52x fewer MACs per sampling step.

Task: Controlled Image Generation

  • Adding control is a critical step to convert diffusion models into productive tools for humans.

CAN: A New Control Method for Diffusion Models

  • Prior conditional control methods (attention, adaptive normalization) add conditions in the feature space while sharing the model weight.
  • CAN adds conditions by adaptively manipulating the weight.
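
The contrast between the two bullets above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the class names `SharedWeightLinear` and `ConditionAwareLinear`, the single-matrix weight generator, the additive feature shift standing in for adaptive normalization, and all dimensions are hypothetical choices made only for this example.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Small random matrix used as a stand-in for learned parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


class SharedWeightLinear:
    """Feature-space control (simplified): the weight is shared across all
    conditions, and the condition only shifts the output features."""

    def __init__(self, in_dim, out_dim, cond_dim, seed=0):
        rng = random.Random(seed)
        self.weight = random_matrix(out_dim, in_dim, rng)       # static weight
        self.cond_proj = random_matrix(out_dim, cond_dim, rng)  # condition -> shift

    def __call__(self, x, cond):
        features = matvec(self.weight, x)
        shift = matvec(self.cond_proj, cond)
        return [f + s for f, s in zip(features, shift)]


class ConditionAwareLinear:
    """Weight-space control (hypothetical sketch): the layer weight itself is
    generated from the condition embedding before being applied to x."""

    def __init__(self, in_dim, out_dim, cond_dim, seed=0):
        rng = random.Random(seed)
        self.in_dim = in_dim
        self.out_dim = out_dim
        # Assumed generator: one linear map from the condition embedding
        # to a flattened (out_dim * in_dim) weight.
        self.generator = random_matrix(out_dim * in_dim, cond_dim, rng)

    def weight_for(self, cond):
        flat = matvec(self.generator, cond)
        # Reshape the flat vector into out_dim rows of in_dim entries.
        return [flat[r * self.in_dim:(r + 1) * self.in_dim]
                for r in range(self.out_dim)]

    def __call__(self, x, cond):
        return matvec(self.weight_for(cond), x)


x = [1.0, 2.0, 3.0, 4.0]
cond_a = [1.0, 0.0]  # e.g. a one-hot embedding for one class
cond_b = [0.0, 1.0]  # a different class

can_layer = ConditionAwareLinear(in_dim=4, out_dim=3, cond_dim=2)
y_a = can_layer(x, cond_a)
y_b = can_layer(x, cond_b)
```

In this sketch the same input `x` passes through a different weight matrix for each condition, whereas `SharedWeightLinear` keeps a single `weight` and lets the condition act only on the features.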

Applying CAN to Diffusion Transformer

[Figure: applying CAN to a diffusion transformer]

CaT: Marrying CAN and EfficientViT

[Figure: CaT, combining CAN and EfficientViT]

Experiment Results

[Figure: Class-Conditional Image Generation on ImageNet]
[Figure: Text-to-Image Generation on COCO]

Citation

@article{cai2024condition,
  title={Condition-Aware Neural Network for Controlled Image Generation},
  author={Cai, Han and Li, Muyang and Zhang, Zhuoyang and Zhang, Qinsheng and Liu, Ming-Yu and Han, Song},
  journal={arXiv preprint arXiv:2404.01143},
  year={2024}
}

Acknowledgment

This work is supported by the MIT-IBM Watson AI Lab, Amazon, the MIT Science Hub, and the National Science Foundation.

Team Members