Generative Adversarial Networks

GANs (mainly in image synthesis)

Survey Papers / Repos

  • Are GANs Created Equal? A Large-Scale Study [1711.10337]

  • Which Training Methods for GANs do actually Converge? [1801.04406]

  • A Large-Scale Study on Regularization and Normalization in GANs [1807.04720]

  • hindupuravinash/the-gan-zoo

Resources

  • google/compare_gan

  • TF-GAN: TensorFlow-GAN

  • torchgan

  • PyTorch-GAN

  • lzhbrian/metrics: IS, FID implementation in TF, PyTorch

Models

Loss functions

  • Vanilla GAN [1406.2661]

  • EBGAN [1609.03126]

  • LSGAN [1611.04076]

  • WGAN [1701.07875]

  • BEGAN [1703.10717]

  • Hinge Loss [1705.02894]
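
These objectives differ mainly in the scalar criterion applied to the discriminator's output. As one concrete example, here is a minimal PyTorch sketch of the hinge loss [1705.02894], the variant later used by SAGAN and BigGAN; the `d_real`/`d_fake` tensor names are mine:

```python
import torch
import torch.nn.functional as F

# Minimal sketch of the hinge adversarial loss [1705.02894].
# d_real / d_fake are raw (unbounded) discriminator logits.

def d_hinge_loss(d_real: torch.Tensor, d_fake: torch.Tensor) -> torch.Tensor:
    # Discriminator: push real logits above +1 and fake logits below -1.
    return F.relu(1.0 - d_real).mean() + F.relu(1.0 + d_fake).mean()

def g_hinge_loss(d_fake: torch.Tensor) -> torch.Tensor:
    # Generator: raise the discriminator's score on generated samples.
    return -d_fake.mean()
```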

Regularization

  • Gradient Penalty [1704.00028]

  • DRAGAN [1705.07215]

  • SNGAN [1802.05957]

  • Consistency Regularization [1910.12027]
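
A minimal PyTorch sketch of the gradient penalty [1704.00028]: penalize the discriminator's gradient norm on random interpolates between real and fake batches. The λ = 10 default follows the paper; `disc` and the other names are mine, and 4-D image batches are assumed:

```python
import torch

# WGAN-GP gradient penalty [1704.00028]: push ||∇_x D(x)|| toward 1
# on samples interpolated between real and generated images.

def gradient_penalty(disc, real, fake, lambda_gp=10.0):
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    d_interp = disc(interp)
    grads, = torch.autograd.grad(
        outputs=d_interp, inputs=interp,
        grad_outputs=torch.ones_like(d_interp),
        create_graph=True, retain_graph=True)
    grad_norm = grads.view(grads.size(0), -1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1.0) ** 2).mean()
```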

Architecture

  • Deep Convolutional GAN (DCGAN) [1511.06434]

  • Progressive Growing of GANs (PGGAN) [1710.10196]

  • Self-Attention GAN (SAGAN) [1805.08318]

  • BigGAN [1809.11096]

  • Style-Based Generator (StyleGAN) [1812.04948]

  • Mapping Network (StyleGAN) [1812.04948] (see the sketch after this list)

  • LOGAN: Latent Optimisation for Generative Adversarial Networks [1912.00953]
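
As a small illustration of one of these designs, a sketch of StyleGAN's mapping network [1812.04948]: an 8-layer, 512-dim MLP that maps a latent code z to an intermediate latent w, which then modulates the synthesis network. This sketch omits the paper's input normalization and equalized learning rate:

```python
import torch
from torch import nn

# Sketch of the StyleGAN mapping network [1812.04948]: z -> w.
# 8 fully connected layers, 512-dim, LeakyReLU(0.2), per the paper;
# PixelNorm on z and equalized learning rate are omitted here.
mapping = nn.Sequential(
    *[m for _ in range(8) for m in (nn.Linear(512, 512), nn.LeakyReLU(0.2))]
)

w = mapping(torch.randn(16, 512))  # a batch of 16 intermediate latents w
```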

Conditional GANs

  • Vanilla Conditional GANs [1411.1784]

  • Auxiliary Classifier GAN (ACGAN) [1610.09585]
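
A minimal sketch of the vanilla conditioning mechanism [1411.1784]: the class label is embedded (or one-hot encoded) and concatenated to the generator's noise input; the discriminator conditions on the label the same way. Sizes here are illustrative, not from the paper:

```python
import torch
from torch import nn

# Conditional-GAN input conditioning [1411.1784]: concatenate a label
# embedding to the noise vector before feeding the generator.
embed = nn.Embedding(num_embeddings=10, embedding_dim=50)  # 10 classes (toy)
z = torch.randn(16, 100)                                   # noise batch
y = torch.randint(0, 10, (16,))                            # class labels
g_input = torch.cat([z, embed(y)], dim=1)                  # (16, 150) -> fed to G
```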

Others

Tricks

  • Two time-scale update rule (TTUR) [bioinf-jku/TTUR] [1706.08500]

  • Self-Supervised GANs via Auxiliary Rotation Loss (SS-GAN) [1811.11212]
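
TTUR in particular needs no extra machinery: it amounts to giving the discriminator a larger learning rate than the generator. A sketch, assuming two toy `nn.Module`s and the 1e-4 / 4e-4 split popularized by SAGAN:

```python
import torch
from torch import nn

# TTUR [1706.08500] in practice: separate learning rates for G and D.
# The toy G/D below are placeholders; the lr split and Adam betas
# follow the SAGAN convention, not anything specific to this wiki.
G = nn.Sequential(nn.Linear(128, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=1e-4, betas=(0.0, 0.9))
opt_d = torch.optim.Adam(D.parameters(), lr=4e-4, betas=(0.0, 0.9))
```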

Metrics (my implementation: lzhbrian/metrics)

  • Inception Score [1606.03498] [1801.01973]

    • Assumption

      • MEANINGFUL: each generated image should be clear and recognizable, so the output probability of a classifier network should be heavily skewed toward one class, e.g. [0.9, 0.05, ...]; that is, $p(y|\mathbf{x})$ has low entropy.

      • DIVERSITY: if there are 10 classes, the generated images should be evenly distributed across them, so the marginal distribution $p(y) = \frac{1}{N} \sum_{i=1}^{N} p(y|\mathbf{x}^{(i)})$ has high entropy.

      • Better models: the KL divergence between $p(y|\mathbf{x})$ and $p(y)$ should therefore be high.

    • Formulation (see the NumPy sketch after this list)

      $$\text{IS} = \exp\left( \mathbb{E}_{\mathbf{x} \sim p_g} \, D_{KL}\big[\, p(y|\mathbf{x}) \,\|\, p(y) \,\big] \right)$$

      where

      • $\mathbf{x}$ is sampled from the generated data

      • $p(y|\mathbf{x})$ is the output probability of Inception v3 when the input is $\mathbf{x}$

      • $p(y) = \frac{1}{N} \sum_{i=1}^{N} p(y|\mathbf{x}^{(i)})$ is the average output probability over all generated data (from Inception v3, a 1000-dim vector)

      • $D_{KL}(\mathbf{p} \,\|\, \mathbf{q}) = \sum_{j} p_j \log \frac{p_j}{q_j}$, where $j$ indexes the dimensions of the output probability

    • Reference

      • Official TF implementation is in openai/improved-gan

      • PyTorch implementation: sbarratt/inception-score-pytorch

      • TF seems to provide a good implementation

      • scipy.stats.entropy

      • zhihu: Inception Score 的原理和局限性 (the principle and limitations of the Inception Score)

      • A Note on the Inception Score
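
A minimal NumPy sketch of the formulation above, assuming `probs` is an (N, 1000) array of Inception v3 softmax outputs for N generated images (obtaining `probs` is out of scope here; the official implementation also averages the score over 10 splits, which this sketch skips):

```python
import numpy as np

# Sketch of IS = exp(E_x KL[p(y|x) || p(y)]); each row of `probs` is p(y|x).
def inception_score(probs: np.ndarray, eps: float = 1e-16) -> float:
    p_y = probs.mean(axis=0, keepdims=True)  # marginal p(y), shape (1, 1000)
    kl = (probs * (np.log(probs + eps) - np.log(p_y + eps))).sum(axis=1)
    return float(np.exp(kl.mean()))          # exponentiated mean KL
```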

  • FID Score [1706.08500]

    • Formulation (see the NumPy sketch at the end of this page)

      $$\text{FID} = \| \mu_r - \mu_g \|^2 + \mathrm{Tr}\left( \Sigma_r + \Sigma_g - 2 (\Sigma_r \Sigma_g)^{1/2} \right)$$

      where

      • $\mathrm{Tr}$ is the trace of a matrix (wikipedia)

      • $X_r \sim \mathcal{N}(\mu_r, \Sigma_r)$ and $X_g \sim \mathcal{N}(\mu_g, \Sigma_g)$ are the 2048-dim activations of the Inception v3 pool3 layer for real and generated data, respectively

      • $\mu_r$ is the mean of the real photos' features

      • $\mu_g$ is the mean of the generated photos' features

      • $\Sigma_r$ is the covariance matrix of the real photos' features

      • $\Sigma_g$ is the covariance matrix of the generated photos' features

    • Reference

      • Official TF implementation: bioinf-jku/TTUR

      • PyTorch implementation: mseitzer/pytorch-fid

      • TF seems to provide a good implementation

      • zhihu: Frechet Inception Score (FID)

      • Explanation from Neal Jean
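
To close the loop, a minimal NumPy/SciPy sketch of the FID formula above, assuming `feat_r` and `feat_g` are (N, 2048) arrays of Inception v3 pool3 activations for real and generated images (feature extraction is again out of scope):

```python
import numpy as np
from scipy import linalg

# Sketch of FID = ||mu_r - mu_g||^2 + Tr(S_r + S_g - 2 (S_r S_g)^(1/2)).
def fid(feat_r: np.ndarray, feat_g: np.ndarray) -> float:
    mu_r, mu_g = feat_r.mean(axis=0), feat_g.mean(axis=0)
    sigma_r = np.cov(feat_r, rowvar=False)
    sigma_g = np.cov(feat_g, rowvar=False)
    covmean, _ = linalg.sqrtm(sigma_r @ sigma_g, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard tiny imaginary parts from sqrtm
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(sigma_r + sigma_g - 2.0 * covmean))
```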