Explained: A Style-Based Generator Architecture for GANs (StyleGAN)

2022-01-04 11:07:08

Tags: GANs, Style, StyleGAN, features, AdaIN, images, input, resolution

Contents

Background
How StyleGAN works
Conclusion

The biggest challenge in generating images with GANs is controlling their output, i.e. changing specific features such as pose, face shape and hair style in an image of a face.

A Style-Based Generator Architecture for GANs (StyleGAN) proposes a new model that addresses this problem.

StyleGAN generates the artificial image gradually, starting from a very low resolution and continuing to a high resolution (1024×1024). By modifying the input of each level separately, it controls the visual features that are expressed in that level, from coarse features (pose, face shape) to fine details (hair color), without affecting other levels.

Background

A GAN consists of two networks:

a generator that synthesizes new samples from scratch
a discriminator that takes samples from both the training data and the generator's output and predicts if they are "real" or "fake".

The generator:

takes a random vector (noise) as input and learns over time, from the discriminator's feedback, to produce more realistic samples.

The discriminator:

also improves over time by comparing generated samples with real samples, making it harder for the generator to deceive it.

StyleGAN builds on ProGAN. The key innovation of ProGAN is progressive training: it starts by training the generator and the discriminator on very low-resolution images (e.g. 4×4) and adds a higher-resolution layer at each stage.
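The progressive schedule can be sketched in a few lines. This is only an illustration: the function name `progressive_resolutions` is made up here, and the sketch assumes that each added layer doubles the width and height, from 4×4 up to the 1024×1024 output mentioned above.

```python
def progressive_resolutions(start=4, final=1024):
    """List the image sizes visited when the resolution doubles at each stage."""
    if start < 1 or final < start:
        raise ValueError("expected 1 <= start <= final")
    resolutions = []
    size = start
    while size <= final:
        resolutions.append(size)
        size *= 2  # assumption: each new layer doubles width and height
    return resolutions


schedule = progressive_resolutions()
for stage, size in enumerate(schedule):
    print(f"stage {stage}: train at {size}x{size}")
```

Under these assumptions the generator and the discriminator pass through nine stages before reaching full resolution.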

ProGAN generates high-quality images but, as in most models, its ability to control specific features of the generated image is very limited.

How StyleGAN works

The StyleGAN paper offers an upgraded version of ProGAN’s image generator, with a focus on the generator network.

The lower the layer (and the resolution), the coarser the features it affects.

The features fall into three categories:

Coarse - resolution of up to 8×8 - affects pose, general hair style, face shape, etc.
Middle - resolution of 16×16 to 32×32 - affects finer facial features, hair style, eyes open/closed, etc.
Fine - resolution of 64×64 to 1024×1024 - affects color scheme (eye, hair and skin) and micro features.
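As a rough illustration of these three bands, the sketch below sorts the generator's resolution levels into coarse, middle and fine groups. The function name `group_resolutions` and the list of levels are assumptions made for this example; only the thresholds (8 and 32) come from the categories above.

```python
def group_resolutions(resolutions):
    """Assign each resolution level to the feature band it mainly controls."""
    groups = {"coarse": [], "middle": [], "fine": []}
    for size in resolutions:
        if size <= 8:
            groups["coarse"].append(size)   # pose, general hair style, face shape
        elif size <= 32:
            groups["middle"].append(size)   # finer facial features, eyes open/closed
        else:
            groups["fine"].append(size)     # color scheme and micro features
    return groups


# Assumed levels: every power of two from 4x4 to 1024x1024.
levels = [4, 8, 16, 32, 64, 128, 256, 512, 1024]
groups = group_resolutions(levels)
for name, sizes in groups.items():
    print(name, sizes)
```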

Open question: how were these boundaries determined?

Additions to the generator:

Mapping Network

The ability to control visual features with the input vector is limited, because the input vector must follow the probability density of the training data. As a result, individual features cannot be adjusted independently, a phenomenon called feature entanglement.
Solution: by using another neural network (the Mapping Network), the model can generate a vector that doesn't have to follow the training data distribution and can reduce the correlation between features.
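A mapping network of this kind can be sketched as a small stack of fully-connected layers that turns the input vector z into an intermediate vector ⱳ. The sketch below is not the authors' implementation: the paper describes 8 fully-connected layers with 512-dimensional vectors, while the dimension here is shrunk to 16 to keep the example fast, and the random weights, the leaky ReLU slope and the helper names `make_layer` and `mapping_network` are assumptions chosen for illustration.

```python
import random

DIM = 16        # assumption for speed; the paper uses 512-dimensional vectors
NUM_LAYERS = 8  # number of fully-connected layers described in the paper


def make_layer(n_in, n_out, rng):
    """Create one fully-connected layer with random (untrained) weights."""
    scale = (2.0 / n_in) ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def mapping_network(z, layers):
    """Map the input vector z to the intermediate vector w."""
    hidden = z
    for weights, biases in layers:
        hidden = [
            leaky_relu(sum(weight * x for weight, x in zip(row, hidden)) + bias)
            for row, bias in zip(weights, biases)
        ]
    return hidden


rng = random.Random(0)
layers = [make_layer(DIM, DIM, rng) for _ in range(NUM_LAYERS)]
z = [rng.gauss(0.0, 1.0) for _ in range(DIM)]  # random input vector
w = mapping_network(z, layers)                 # intermediate vector
print(len(w))
```

Because ⱳ is the output of a learned function rather than a direct sample, its distribution is free to differ from that of z.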

Style Modules (AdaIN)

The AdaIN (Adaptive Instance Normalization) module transfers the encoded information ⱳ, created by the Mapping Network, into the generated image.

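1. Each channel of the convolution layer output is first normalized, so that the scaling and shifting of step 3 have the expected effect.
2. The intermediate vector ⱳ is transformed by another fully-connected layer (marked A) into a scale and a bias for each channel.
3. The scale and bias are applied to each normalized channel of the convolution output.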
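The normalize-then-restyle idea behind AdaIN can be written out as a minimal sketch. It assumes each channel is a flat list of activations and that the per-channel scales and biases have already been produced from ⱳ by the fully-connected layer A; the concrete numbers and the function name `adain` are made up for the example.

```python
import math


def adain(feature_maps, scales, biases, eps=1e-8):
    """Normalize each channel, then apply its style scale and bias."""
    styled = []
    for channel, scale, bias in zip(feature_maps, scales, biases):
        mean = sum(channel) / len(channel)
        variance = sum((v - mean) ** 2 for v in channel) / len(channel)
        std = math.sqrt(variance + eps)
        # After normalization the channel has mean 0 and standard deviation ~1,
        # so the scale and bias fully determine its new statistics.
        styled.append([scale * (v - mean) / std + bias for v in channel])
    return styled


# Two channels of a hypothetical convolution output.
channels = [[1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 20.0, 20.0]]
# Hypothetical outputs of layer A for these two channels.
styled = adain(channels, scales=[2.0, 0.5], biases=[1.0, -1.0])
print(styled[0])
```

After the call, each channel's mean equals its bias and its standard deviation is approximately its scale, regardless of the statistics it had before.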

Removing traditional input

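Traditional models use the random input to create the generator's initial image (i.e. the input of the 4×4 level).
In StyleGAN, however, the image features are controlled by ⱳ and AdaIN, so the initial input can be omitted and replaced by constant values.
This works, possibly because it reduces feature entanglement: it is easier for the network to learn using only ⱳ, without relying on the entangled input vector.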

Stochastic Variation

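Small details such as freckles, the exact placement of hairs, and wrinkles increase the variety of the output.
The common method to insert these small features into GAN images is adding random noise to the input vector.
The noise in StyleGAN is added in a similar way to the AdaIN mechanism: a scaled noise is added to each channel before the AdaIN module, slightly changing the visual expression of the features at the resolution level it operates on.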
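The per-channel noise injection might look like the sketch below. It assumes one noise value per pixel, shared across channels and multiplied by a per-channel scale (a learned parameter in the real model, hard-coded here); the function name `add_noise` and all numbers are illustrative only.

```python
import random


def add_noise(feature_maps, noise_scales, rng):
    """Add the same noise image to every channel, scaled per channel."""
    num_pixels = len(feature_maps[0])
    noise = [rng.gauss(0.0, 1.0) for _ in range(num_pixels)]
    return [
        [value + scale * n for value, n in zip(channel, noise)]
        for channel, scale in zip(feature_maps, noise_scales)
    ]


rng = random.Random(0)
features = [[0.5, -0.2, 0.1, 0.9], [1.0, 1.0, 1.0, 1.0]]
# The second channel gets scale 0.0, so the noise leaves it unchanged.
noisy = add_noise(features, noise_scales=[0.1, 0.0], rng=rng)
print(noisy[0])
print(noisy[1])
```

A channel whose scale is close to zero ignores the noise, which gives the network a way to decide where stochastic detail should appear.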


Style Mixing

The model randomly selects two input vectors and generates the intermediate vector ⱳ for each of them.
In effect, the model generates two images A and B and then combines them by taking the features of the low-resolution levels from A and the rest of the features from B.
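Style mixing can be sketched as choosing, for every style input of the generator, which of the two intermediate vectors to use. The sketch assumes 18 style inputs (two per resolution from 4×4 to 1024×1024) and a single crossover point; the name `mix_styles` and the short 4-dimensional vectors are hypothetical.

```python
import random

NUM_LAYERS = 18  # assumption: two style inputs per resolution, 4x4 to 1024x1024


def mix_styles(w_a, w_b, crossover, num_layers=NUM_LAYERS):
    """Use w_a for layers below the crossover point and w_b for the rest."""
    if not 0 <= crossover <= num_layers:
        raise ValueError("crossover must lie between 0 and num_layers")
    return [w_a if layer < crossover else w_b for layer in range(num_layers)]


rng = random.Random(0)
w_a = [rng.gauss(0.0, 1.0) for _ in range(4)]  # stands in for the w of image A
w_b = [rng.gauss(0.0, 1.0) for _ in range(4)]  # stands in for the w of image B
crossover = rng.randint(1, NUM_LAYERS - 1)     # random switch point
per_layer = mix_styles(w_a, w_b, crossover)

# Taking only the coarse styles (4x4 and 8x8, i.e. the first 4 inputs) from A:
coarse_from_a = mix_styles(w_a, w_b, crossover=4)
```

With `crossover=4`, pose and face shape would follow A while everything finer follows B, matching the coarse band described earlier.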

Truncation trick in W

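One of the challenges in generative models is dealing with areas that are poorly represented in the training data.
To avoid generating poor images, StyleGAN truncates the intermediate vector ⱳ, forcing it to stay close to the "average" intermediate vector.

The average ⱳ_avg is obtained by selecting many random inputs, generating their intermediate vectors with the Mapping Network, and taking the mean.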

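When generating new images, instead of using the Mapping Network output directly, ⱳ is transformed into ⱳ_new = ⱳ_avg + ψ(ⱳ - ⱳ_avg), where ψ sets how far the result may deviate from the average (ψ = 1 leaves ⱳ unchanged, ψ = 0 collapses it to ⱳ_avg).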
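A minimal sketch of the truncation step, assuming the formula from the StyleGAN paper, w_new = w_avg + psi * (w - w_avg). The random vectors below merely stand in for Mapping Network outputs, and the function names and the default psi value are illustrative choices, not values taken from this article.

```python
import random


def mean_vector(vectors):
    """Component-wise mean of a list of equally sized vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def truncate(w, w_avg, psi=0.7):
    """Pull w toward w_avg; psi=1 keeps w, psi=0 returns w_avg."""
    return [avg + psi * (value - avg) for value, avg in zip(w, w_avg)]


rng = random.Random(0)
# Stand-ins for the intermediate vectors of many random inputs.
samples = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(1000)]
w_avg = mean_vector(samples)

w = samples[0]
w_new = truncate(w, w_avg, psi=0.5)  # halfway between w and the average
print(w_new)
```

Smaller values of psi trade diversity for safety: the outputs look more like the "average" image but are less likely to come from poorly represented regions.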

Source: https://blog.csdn.net/NGUever15/article/details/122269704
