Generative Adversarial Networks

GANs (mainly in image synthesis)

Survey Papers / Repos

Are GANs Created Equal? A Large-Scale Study [1711.10337]
Which Training Methods for GANs do actually Converge? [1801.04406]
A Large-Scale Study on Regularization and Normalization in GANs [1807.04720]
hindupuravinash/the-gan-zoo

Resources

google/compare_gan
TF-GAN: TensorFlow-GAN
torchgan
PyTorch-GAN
lzhbrian/metrics: IS, FID implementation in TF, PyTorch

Models

Loss functions

Vanilla GAN [1406.2661]
EBGAN [1609.03126]
LSGAN [1611.04076]
WGAN [1701.07875]
BEGAN [1703.10717]
Hinge Loss [1705.02894]

Regularization

Gradient Penalty [1704.00028]
DRAGAN [1705.07215]
SNGAN [1802.05957]
Consistency Regularization [1910.12027]

Architecture

Deep Convolutional GAN (DCGAN) [1511.06434]
Progressive Growing of GANs (PGGAN) [1710.10196]
Self-Attention GAN (SAGAN) [1805.08318]
BigGAN [1809.11096]
Style-based Generator (StyleGAN) [1812.04948]
Mapping Network (StyleGAN) [1812.04948]
LOGAN: Latent Optimisation for Generative Adversarial Networks [1912.00953]

Conditional GANs

Vanilla Conditional GANs [1411.1784]
Auxiliary Classifier GAN (ACGAN) [1610.09585]

Others

Tricks

Two time-scale update rule (TTUR) [bioinf-jku/TTUR] [1706.08500]
Self-Supervised GANs via Auxiliary Rotation Loss (SS-GAN) [1811.11212]

Metrics (my implementation: lzhbrian/metrics)

Inception Score [1606.03498] [1801.01973]
- Assumptions
  - MEANINGFUL: A generated image should be clear, so the output probability of a classifier network for it should be sharply peaked on one class, e.g. [0.9, 0.05, ...]. That is, $p(y|\mathbf{x})$ has low entropy.
  - DIVERSITY: With, say, 10 classes, the generated images should be spread evenly across the classes, so the marginal distribution $p(y) = \frac{1}{N} \sum_{i=1}^{N} p(y|\mathbf{x}^{(i)})$ has high entropy.
  - Better models: when both hold, the KL divergence between $p(y|\mathbf{x})$ and $p(y)$ is high, so a higher score is better.
- Formulation
  - $\text{IS} = \exp (\mathbb{E}_{\mathbf{x} \sim p_g} D_{KL} [p(y|\mathbf{x}) || p(y)] )$
  - where
    $\mathbf{x}$ is a sample from the generated data
    $p(y|\mathbf{x})$ is the output probability of Inception v3 for input $\mathbf{x}$
    $p(y) = \frac{1}{N} \sum_{i=1}^{N} p(y|\mathbf{x}^{(i)})$ is the average output probability over all $N$ generated samples (a 1000-dim vector from Inception v3)
    $D_{KL} (\mathbf{p}||\mathbf{q}) = \sum_{j} p_{j} \log \frac{p_j}{q_j}$, where $j$ indexes the entries (classes) of the output probability vector
- Reference
  - Official TF implementation: openai/improved-gan
  - PyTorch implementation: sbarratt/inception-score-pytorch
  - TensorFlow also seems to provide a good implementation
  - scipy.stats.entropy
  - zhihu: Principles and Limitations of the Inception Score
  - A Note on the Inception Score
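
The formulation above can be sketched in a few lines of NumPy. This is a minimal illustration rather than any of the referenced implementations: it assumes the class probabilities $p(y|\mathbf{x}^{(i)})$ have already been computed (for example by Inception v3) and are passed in as an `(N, num_classes)` array. The function name `inception_score` and the `eps` smoothing term are assumptions made for this example, and splitting the samples into several groups, as some implementations do, is omitted.

```python
import numpy as np


def inception_score(probs, eps=1e-12):
    """Inception Score from precomputed class probabilities.

    probs: array of shape (N, num_classes); row i is p(y | x_i).
    eps:   small constant (an assumption of this sketch) to avoid log(0).
    """
    probs = np.asarray(probs, dtype=np.float64)
    # Marginal p(y): average of the conditional distributions over all samples.
    p_y = probs.mean(axis=0, keepdims=True)
    # Per-sample KL divergence D_KL[p(y|x) || p(y)], summed over classes.
    kl = np.sum(probs * (np.log(probs + eps) - np.log(p_y + eps)), axis=1)
    # IS = exp of the mean KL divergence over generated samples.
    return float(np.exp(kl.mean()))
```

Two limiting cases make the assumptions concrete: perfectly confident predictions spread evenly over 10 classes (for instance `np.eye(10)`) give a score of about 10, while identical uniform predictions for every sample give a score of 1.
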
Fréchet Inception Distance (FID) [1706.08500]
- Formulation
  - $\text{FID} = ||\mu_r - \mu_g||^2 + Tr(\Sigma_{r} + \Sigma_{g} - 2(\Sigma_r \Sigma_g)^{1/2})$
  - where
    $Tr$ is the trace of a matrix (wikipedia)
    $X_r \sim \mathcal{N}(\mu_r, \Sigma_r)$ and $X_g \sim \mathcal{N}(\mu_g, \Sigma_g)$ are the 2048-dim activations of the Inception v3 pool3 layer for real and generated images, each modeled as a Gaussian
    $\mu_r$ is the mean of the real images' features
    $\mu_g$ is the mean of the generated images' features
    $\Sigma_r$ is the covariance matrix of the real images' features
    $\Sigma_g$ is the covariance matrix of the generated images' features
- Reference
  - Official TF implementation: bioinf-jku/TTUR
  - PyTorch implementation: mseitzer/pytorch-fid
  - TensorFlow also seems to provide a good implementation
  - zhihu: Fréchet Inception Distance (FID)
  - Explanation from Neal Jean
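
The FID formulation above can be sketched with NumPy alone. This is an illustrative sketch, not the code from the referenced implementations: it assumes the pool3 features have already been extracted and are passed in as `(N, D)` arrays, and the function names `frechet_distance` and `fid_from_features` are made up for this example. Instead of an explicit matrix square root, it uses the identity that $Tr((\Sigma_r \Sigma_g)^{1/2})$ equals the sum of the square roots of the eigenvalues of $\Sigma_r \Sigma_g$, which are real and non-negative for covariance matrices.

```python
import numpy as np


def frechet_distance(mu_r, sigma_r, mu_g, sigma_g):
    """Frechet distance between N(mu_r, sigma_r) and N(mu_g, sigma_g)."""
    diff = mu_r - mu_g
    # Eigenvalues of sigma_r @ sigma_g; tiny imaginary or negative parts
    # caused by floating-point error are discarded.
    eigvals = np.linalg.eigvals(sigma_r @ sigma_g)
    sqrt_trace = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma_r) + np.trace(sigma_g)
                 - 2.0 * sqrt_trace)


def fid_from_features(feat_r, feat_g):
    """FID from two feature arrays of shape (N, D): real and generated."""
    feat_r = np.asarray(feat_r, dtype=np.float64)
    feat_g = np.asarray(feat_g, dtype=np.float64)
    # Mean and covariance of each feature set (rows are samples).
    mu_r, mu_g = feat_r.mean(axis=0), feat_g.mean(axis=0)
    sigma_r = np.atleast_2d(np.cov(feat_r, rowvar=False))
    sigma_g = np.atleast_2d(np.cov(feat_g, rowvar=False))
    return frechet_distance(mu_r, sigma_r, mu_g, sigma_g)
```

As a sanity check, two identical feature sets give a distance of approximately 0, and shifting every feature by a constant vector leaves the covariances unchanged, so the result reduces to the squared norm of that shift.
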

