Generative Adversarial Networks

GANs (mainly in image synthesis)

Survey Papers / Repos

Resources

Models

Loss functions

Regularization

Architecture

Conditional GANs

Others

Tricks

Metrics (my implementation: lzhbrian/metrics)

  • Inception Score [1606.03498] [1801.01973]

    • Assumption

      • MEANINGFUL: Each generated image should be clear, so the output probability of a classifier network should look like [0.9, 0.05, ...] (heavily skewed toward one class). That is, $p(y|\mathbf{x})$ has low entropy.

      • DIVERSITY: If we have 10 classes, the generated images should be spread evenly across them, so that the marginal distribution $p(y) = \frac{1}{N} \sum_{i=1}^{N} p(y|\mathbf{x}^{(i)})$ has high entropy.

      • Better models: The KL divergence between $p(y|\mathbf{x})$ and $p(y)$ should be high.
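A short derivation may help show why the assumptions above fit together: averaged over generated samples, the KL divergence between $p(y|\mathbf{x})$ and $p(y)$ splits into the entropy of the marginal minus the average entropy of the per-image predictions. The symbols $H(y)$ and $H(y|\mathbf{x})$ are notation introduced here for illustration, not taken from the notes.

```latex
\begin{aligned}
\mathbb{E}_{\mathbf{x} \sim p_g} D_{KL}[p(y|\mathbf{x}) || p(y)]
  &= \mathbb{E}_{\mathbf{x} \sim p_g} \sum_{j} p(y_j|\mathbf{x}) \log p(y_j|\mathbf{x})
   - \sum_{j} \Big( \mathbb{E}_{\mathbf{x} \sim p_g} p(y_j|\mathbf{x}) \Big) \log p(y_j) \\
  &= -H(y|\mathbf{x}) + H(y),
  \qquad \text{since } \mathbb{E}_{\mathbf{x} \sim p_g} p(y_j|\mathbf{x}) = p(y_j).
\end{aligned}
```

So a high divergence needs both low $H(y|\mathbf{x})$ (MEANINGFUL) and high $H(y)$ (DIVERSITY).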

    • Formulation

      • $\text{IS} = \exp \left( \mathbb{E}_{\mathbf{x} \sim p_g} D_{KL} [p(y|\mathbf{x}) || p(y)] \right)$

      • where

        • $\mathbf{x}$ is sampled from the generated data

        • $p(y|\mathbf{x})$ is the output probability of Inception v3 when the input is $\mathbf{x}$

        • $p(y) = \frac{1}{N} \sum_{i=1}^{N} p(y|\mathbf{x}^{(i)})$ is the average output probability over all generated data (from Inception v3, a 1000-dim vector)

        • $D_{KL} (\mathbf{p}||\mathbf{q}) = \sum_{j} p_{j} \log \frac{p_j}{q_j}$, where $j$ indexes the dimensions (classes) of the output probability.
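The formulation above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation in lzhbrian/metrics: the function name `inception_score`, the `eps` smoothing term, and the assumption that the Inception v3 softmax outputs are already collected in an `(N, C)` array are all hypothetical choices made here.

```python
import numpy as np


def inception_score(probs, eps=1e-12):
    """Inception Score from classifier outputs.

    probs: array of shape (N, C); row i is p(y | x^(i)) and sums to 1.
    eps:   small constant (an assumption of this sketch) to avoid log(0).
    """
    probs = np.asarray(probs, dtype=np.float64)
    # Marginal p(y): average of the per-sample distributions, shape (1, C).
    p_y = probs.mean(axis=0, keepdims=True)
    # Per-sample KL divergence D_KL(p(y|x) || p(y)), shape (N,).
    kl = np.sum(probs * (np.log(probs + eps) - np.log(p_y + eps)), axis=1)
    # IS = exp of the mean KL over generated samples.
    return float(np.exp(kl.mean()))


if __name__ == "__main__":
    # Confident predictions spread evenly over 10 classes: score near 10.
    print(inception_score(np.eye(10)))
    # Every prediction is uniform: score near 1.
    print(inception_score(np.full((5, 10), 0.1)))
```

The two toy inputs match the assumptions: one-hot predictions covering every class give the maximum score (the number of classes), while uninformative predictions give the minimum of 1.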

    • Reference

  • FID Score [1706.08500]

    • Formulation

      • $\text{FID} = ||\mu_r - \mu_g||^2 + \text{Tr}(\Sigma_{r} + \Sigma_{g} - 2(\Sigma_r \Sigma_g)^{1/2})$

      • where

        • $X_r \sim \mathcal{N}(\mu_r, \Sigma_r)$ and $X_g \sim \mathcal{N}(\mu_g, \Sigma_g)$ are the 2048-dim activations of the Inception v3 pool3 layer for real and generated images, each modeled as a Gaussian

        • $\mu_r$ is the mean of the real images' features

        • $\mu_g$ is the mean of the generated images' features

        • $\Sigma_r$ is the covariance matrix of the real images' features

        • $\Sigma_g$ is the covariance matrix of the generated images' features
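As a rough sketch of the formula above, the distance can be computed from two feature matrices with NumPy alone. The function name `fid_from_features` and the assumption that pool3 activations have already been extracted into `(N, D)` arrays are hypothetical. Note one swapped-in technique: instead of forming the matrix square root explicitly, this sketch takes $\text{Tr}((\Sigma_r \Sigma_g)^{1/2})$ as the sum of the square roots of the eigenvalues of $\Sigma_r \Sigma_g$, which are real and non-negative for covariance matrices up to numerical noise.

```python
import numpy as np


def fid_from_features(feat_r, feat_g):
    """Frechet distance between Gaussians fitted to two feature sets.

    feat_r, feat_g: arrays of shape (N, D) with D >= 2, one row per image
    (assumed here to be precomputed Inception v3 pool3 activations).
    """
    feat_r = np.asarray(feat_r, dtype=np.float64)
    feat_g = np.asarray(feat_g, dtype=np.float64)

    # Means and covariances of the real and generated features.
    mu_r, mu_g = feat_r.mean(axis=0), feat_g.mean(axis=0)
    sigma_r = np.cov(feat_r, rowvar=False)
    sigma_g = np.cov(feat_g, rowvar=False)

    # Tr((Sigma_r Sigma_g)^(1/2)) via eigenvalues; drop tiny imaginary
    # parts and clip small negative values caused by round-off.
    eigvals = np.linalg.eigvals(sigma_r @ sigma_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(sigma_r) + np.trace(sigma_g) - 2.0 * tr_sqrt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(200, 4))
    # Same covariance, mean shifted by 1 in each of 4 dims: only the
    # squared mean difference remains.
    print(fid_from_features(real, real + 1.0))
```

With real 2048-dim features the covariance estimate needs many more samples than dimensions to be well conditioned, so the toy sizes here are for illustration only.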

    • Reference