This is because Euclidean distance is minimized by averaging all plausible outputs, which causes blurring.
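The averaging effect can be checked on a toy case. The sketch below assumes two equally plausible "sharp" outputs for one input (made-up 1-D pixel rows, not data from the paper) and compares the expected squared Euclidean error of their pixel-wise average against that of either sharp output.

```python
# Two equally plausible sharp outputs for the same input (toy data, assumed).
# Each is a 1-D row of pixel intensities with an edge at a different position.
plausible_outputs = [
    [0.0, 0.0, 1.0, 1.0, 1.0, 1.0],  # edge after pixel 1
    [0.0, 0.0, 0.0, 0.0, 1.0, 1.0],  # edge after pixel 3
]


def expected_squared_error(prediction, targets):
    """Mean squared Euclidean distance from `prediction` to each target."""
    total = 0.0
    for target in targets:
        total += sum((p - t) ** 2 for p, t in zip(prediction, target))
    return total / len(targets)


# Pixel-wise average of the plausible outputs: the edge becomes a grey ramp.
average = [sum(column) / len(column) for column in zip(*plausible_outputs)]

loss_average = expected_squared_error(average, plausible_outputs)
loss_sharp = [expected_squared_error(o, plausible_outputs) for o in plausible_outputs]

print("average output:", average)
print("loss of average:", loss_average)
print("loss of each sharp output:", loss_sharp)
```

The averaged row has intermediate grey values where the two candidates disagree, and it still scores a lower expected error than either sharp candidate, which is why a pure Euclidean loss favours blurry predictions.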
Earlier papers have focused on specific applications, and it has remained unclear how effective image-conditional GANs can be as a general-purpose solution for image-to-image translation.
2. Contribution
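The paper's first contribution is showing that a conditional GAN (cGAN) works as a single, unified approach across many tasks, with good results.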
Our primary contribution is to demonstrate that on a wide variety of problems, conditional GANs produce reasonable results.
The paper's second contribution is a simple framework.
Our second contribution is to present a simple framework sufficient to achieve good results, and to analyze the effects of several important architectural choices.
Unlike past work, for our generator we use a “U-Net”-based architecture.
And for our discriminator we use a convolutional “PatchGAN” classifier, which only penalizes structure at the scale of image patches.
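A minimal sketch of the patch-level idea, assuming the per-patch scores are simply averaged into one image-level score. `toy_patch_score` is a hypothetical stand-in for the learned convolutional classifier (it just measures local contrast), and the patch size and stride are illustrative, not the authors' settings.

```python
def patch_scores(image, patch_size, score_patch):
    """Slide a patch_size x patch_size window (stride 1) and score each patch."""
    height, width = len(image), len(image[0])
    scores = []
    for top in range(height - patch_size + 1):
        for left in range(width - patch_size + 1):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            scores.append(score_patch(patch))
    return scores


def toy_patch_score(patch):
    """Hypothetical stand-in for a learned classifier: contrast inside the patch."""
    values = [v for row in patch for v in row]
    return max(values) - min(values)


def patch_discriminator(image, patch_size=2):
    """Average the per-patch scores into a single score for the whole image."""
    scores = patch_scores(image, patch_size, toy_patch_score)
    return sum(scores) / len(scores)


flat = [[0.5] * 4 for _ in range(4)]                            # no local structure
checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]   # strong local structure

print(patch_discriminator(flat))
print(patch_discriminator(checker))
```

Because each score depends only on one small window, the discriminator is sensitive to local texture and detail but says nothing about structure larger than a patch.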
3. Method
3.1 Objective
The cGAN objective can be written as Equation 1, where x is the condition:
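For reference, the conditional GAN objective in its usual form is reproduced here, with x the conditioning input image, y the target image, z the noise, G the generator, and D the discriminator (which sees the condition x alongside the real or generated output):

```latex
\mathcal{L}_{cGAN}(G, D) =
    \mathbb{E}_{x,y}\big[\log D(x, y)\big]
  + \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big]
\tag{1}
```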
The final objective is:
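For reference, the final objective in its usual form combines the conditional GAN loss \mathcal{L}_{cGAN} with an L1 reconstruction term weighted by \lambda, where x is the input, y the target, and z the noise:

```latex
\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\big[\, \lVert y - G(x, z) \rVert_1 \,\big]

G^{*} = \arg\min_{G} \max_{D} \; \mathcal{L}_{cGAN}(G, D) + \lambda \, \mathcal{L}_{L1}(G)
```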
For the noise z, the authors use dropout:
Instead, for our final models, we provide noise only in the form of dropout, applied on several layers of our generator at both training and test time.
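A minimal sketch of dropout used as the noise source, assuming a hypothetical one-layer `toy_generator` (not the authors' network) and an illustrative drop probability of 0.5. The point is only that dropout stays active at test time, so repeated calls on the same input can give different outputs.

```python
import random


def dropout(values, drop_prob, rng):
    """Inverted dropout: zero each value with probability drop_prob, rescale the rest."""
    keep_prob = 1.0 - drop_prob
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


def toy_generator(x, rng, drop_prob=0.5):
    """Hypothetical generator: one scaling layer, dropout, then a sum."""
    hidden = [2.0 * v for v in x]
    # Dropout is NOT switched off at test time; it is the only source of noise.
    hidden = dropout(hidden, drop_prob, rng)
    return sum(hidden)


rng = random.Random(0)
x = [1.0, 2.0, 3.0, 4.0]

# Same input, repeated "test-time" calls: the outputs are samples, not one fixed value.
samples = [toy_generator(x, rng) for _ in range(50)]
print(sorted(set(samples)))
```

In a conventional setup dropout would be disabled at inference, which would make this toy generator deterministic.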
In addition, for the problems we consider, the input and output differ in surface appearance, but both are renderings of the same underlying structure.
Therefore, structure in the input is roughly aligned with structure in the output.
For the generator design, the authors follow U-Net and use skip connections:
To give the generator a means to circumvent the bottleneck for information like this, we add skip connections, following the general shape of a “U-Net”.
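A toy 1-D sketch of why skip connections help, under loud simplifications: averaging and repetition stand in for strided and transposed convolutions, and the merge rule (averaging the decoder and encoder values) is a hypothetical placeholder for the learned layers that would follow a channel concatenation.

```python
def downsample(features):
    """Halve the length by averaging neighbouring pairs (stand-in for a strided conv)."""
    return [(features[i] + features[i + 1]) / 2 for i in range(0, len(features), 2)]


def upsample(features):
    """Double the length by repeating each value (stand-in for a transposed conv)."""
    return [v for v in features for _ in range(2)]


def toy_unet(signal, depth=2, use_skips=True):
    """Encoder-decoder over a 1-D signal whose length is divisible by 2**depth."""
    # Encoder: remember every activation before it is downsampled.
    skips = []
    current = signal
    for _ in range(depth):
        skips.append(current)
        current = downsample(current)

    # Decoder: mirror the encoder, deepest skip first.
    for skip in reversed(skips):
        current = upsample(current)
        if use_skips:
            # Pair decoder and encoder values position by position
            # (the "concatenation"), then merge with a hypothetical rule.
            current = [(d + e) / 2 for d, e in zip(current, skip)]
    return current


signal = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]  # fine detail the bottleneck cannot hold
with_skips = toy_unet(signal, use_skips=True)
without_skips = toy_unet(signal, use_skips=False)

print(with_skips)
print(without_skips)
```

Without skips, the alternating pattern is averaged away at the bottleneck and cannot be recovered; with skips, the decoder receives the encoder's high-resolution activations directly, so the pattern survives.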
We employ two tactics. First, we run “real vs. fake” perceptual studies on Amazon Mechanical Turk (AMT).
Second, we measure whether or not our synthesized cityscapes are realistic enough that an off-the-shelf recognition system can recognize the objects in them.
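One way to turn the second tactic into a number is to run a pretrained segmentation model on the synthesized image and compare its predicted labels with the label map the image was generated from. The sketch below assumes the predicted labels are already available (the segmentation model itself is out of scope, and the label maps are made-up toy data) and computes plain per-pixel accuracy.

```python
def per_pixel_accuracy(predicted_labels, true_labels):
    """Fraction of pixels whose predicted class matches the ground-truth class."""
    correct = 0
    total = 0
    for predicted_row, true_row in zip(predicted_labels, true_labels):
        for predicted, true in zip(predicted_row, true_row):
            correct += int(predicted == true)
            total += 1
    return correct / total


# Toy 2x2 label maps (assumed data).
true_labels = [["road", "car"],
               ["road", "sky"]]
predicted_labels = [["road", "car"],
                    ["road", "building"]]  # labels a recognizer might output

accuracy = per_pixel_accuracy(predicted_labels, true_labels)
print(accuracy)
```

A higher score suggests the synthesized image contains objects that a recognizer trained on real images can still identify.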