We propose a new approach for high-resolution semantic image synthesis. It consists of one base image generator and multiple class-specific generators. The base generator produces high-quality images from a segmentation map. To further improve the quality of individual objects, we create a bank of Generative Adversarial Networks (GANs) by separately training class-specific models. This has several benefits: dedicated weights for each class; centrally aligned data for each model; additional training data from other sources; the potential for higher resolution and quality; and easy manipulation of a specific object in the scene. Experiments show that our approach can generate high-quality images at high resolution while offering object-level control through the class-specific generators.
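To make the collaging idea concrete, here is a minimal sketch of the pipeline as we describe it above: a base pass synthesizes the full scene from the segmentation map, then each class-specific generator re-synthesizes its object, which is alpha-composited back using that class's mask. Every name, shape, and the compositing step below is an illustrative assumption, not the released implementation.

```python
import torch

# Hypothetical sketch of the collaging pipeline. `base_generator` maps a
# segmentation map (plus a latent code) to an image; each entry of
# `class_generators` re-synthesizes one object class.
def collage(seg_map, base_generator, class_generators, masks, z_dim=512):
    batch = seg_map.size(0)

    # Base pass: synthesize the whole scene from the segmentation map.
    image = base_generator(seg_map, torch.randn(batch, z_dim))

    # Refinement pass: for every class with a dedicated generator,
    # re-synthesize that object and alpha-composite it back into the
    # scene using its segmentation mask.
    for cls, gen in class_generators.items():
        mask = masks[cls]                     # (B, 1, H, W), 1 inside the object
        refined = gen(seg_map, torch.randn(batch, z_dim))
        image = mask * refined + (1 - mask) * image
    return image

# Toy usage with stand-in callables in place of trained networks.
B, H, W = 1, 256, 256
seg = torch.randn(B, 8, H, W)
base = lambda s, z: torch.tanh(torch.randn(B, 3, H, W))
gens = {"bed": lambda s, z: torch.tanh(torch.randn(B, 3, H, W))}
masks = {"bed": (torch.rand(B, 1, H, W) > 0.5).float()}
out = collage(seg, base, gens, masks)         # (1, 3, 256, 256)
```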
|
Visual comparison of segmentation-map-to-image synthesis results on the ADE20K bedroom (512x512), full human body (512x512), and Cityscapes (1024x512) datasets.
Multi-modal synthesis results on the bedroom, human, and Cityscapes datasets by our base model. Each row shows multiple generations for the same semantic mask.
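The multi-modal behavior comes from holding the segmentation map fixed while resampling the latent code, assuming (as we read it) a noise-conditioned base model; a minimal sketch along the same hypothetical lines as above:

```python
import torch

# Multi-modal synthesis sketch: one segmentation map, several latent
# draws. `base_generator` stands in for any noise-conditioned generator.
def sample_variations(seg_map, base_generator, n=4, z_dim=512):
    return [base_generator(seg_map, torch.randn(seg_map.size(0), z_dim))
            for _ in range(n)]  # same mask every time, fresh style code each draw
```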
Our CollageGAN model uses the class-specific generators to add detail to specific classes in the image generated by the base model.
Images in the red box are real. Here we use the bed- and uppercloth-specific generators to replace the original objects.
Mixed-resolution result. The base image is first resized to 4096x4096, and the face region is then composited with a high-quality face generated by the face-specific model.
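The composite described above can be reproduced with ordinary image operations; the sketch below uses Pillow, with the file names, paste location, and matte all as illustrative placeholders rather than our actual assets.

```python
from PIL import Image

# Upsample the base image, then alpha-composite a high-resolution face
# patch on top. All paths and coordinates are illustrative placeholders.
base = Image.open("base_512.png").resize((4096, 4096), Image.LANCZOS)
face = Image.open("face_1024.png")                       # face-specific GAN output
matte = Image.open("face_matte_1024.png").convert("L")   # soft alpha matte

top_left = (1536, 512)               # where the aligned face region sits
base.paste(face, top_left, matte)    # matte blends the patch into the scene
base.save("mixed_resolution_4096.png")
```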
Yuheng Li, Yijun Li, Jingwan Lu, Eli Shechtman, Yong Jae Lee, Krishna Kumar Singh. Collaging Class-specific GANs for Semantic Image Synthesis.
Acknowledgements: This template was originally made by Phillip Isola and Richard Zhang for a colorful ECCV project; the code can be found here.