MEANINGFUL: Each generated image should be clear, so the output probability of a classifier network for that image should be heavily skewed toward one class, e.g. [0.9, 0.05, ...]. In other words, $p(y \mid x)$ has low entropy.
DIVERSITY: If we have 10 classes, the generated images should be spread evenly across them, so that the marginal distribution $p(y) = \frac{1}{N}\sum_{i=1}^{N} p(y \mid x^{(i)})$ has high entropy.
Better models: the KL divergence between $p(y \mid x)$ and $p(y)$ should be high.
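
As a short sketch of why a high KL divergence captures both criteria at once: assuming $p(y)$ is the average of $p(y \mid x^{(i)})$ over the same $N$ generated samples, the average divergence splits into two entropy terms. The symbol $H$ (Shannon entropy) is introduced here for illustration and is not part of the original notes.

$$
\frac{1}{N}\sum_{i=1}^{N} D_{KL}\left[p(y \mid x^{(i)}) \,\|\, p(y)\right] = H\big(p(y)\big) - \frac{1}{N}\sum_{i=1}^{N} H\big(p(y \mid x^{(i)})\big)
$$

The first term is large when the marginal is spread out (DIVERSITY), and the subtracted term is small when each prediction is sharp (MEANINGFUL).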
Formulation
$$
\mathrm{IS} = \exp\left(\mathbb{E}_{x \sim p_g}\, D_{KL}\left[\,p(y \mid x) \,\|\, p(y)\,\right]\right)
$$
where

- $x$ is sampled from the generated data,
- $p(y \mid x)$ is the output probability of Inception v3 when the input is $x$,
- $p(y) = \frac{1}{N}\sum_{i=1}^{N} p(y \mid x^{(i)})$ is the average output probability over all generated data (from Inception v3, a 1000-dim vector).
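
The formula can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: it assumes the class probabilities have already been computed by Inception v3 and are passed in as an `(N, C)` array. The function name `inception_score` and the `eps` smoothing constant are hypothetical choices made for this example.

```python
import numpy as np


def inception_score(probs, eps=1e-12):
    """Compute exp(mean_i KL[p(y|x_i) || p(y)]) from class probabilities.

    probs: array of shape (N, C); row i is p(y | x^(i)) and sums to 1.
    eps:   small constant (an assumption of this sketch) to avoid log(0).
    """
    probs = np.asarray(probs, dtype=np.float64)
    # Marginal p(y): average of the conditional distributions over all samples.
    p_y = probs.mean(axis=0, keepdims=True)
    # Per-sample KL divergence between p(y|x_i) and the marginal p(y).
    kl = np.sum(probs * (np.log(probs + eps) - np.log(p_y + eps)), axis=1)
    # Average over samples, then exponentiate.
    return float(np.exp(kl.mean()))


if __name__ == "__main__":
    # Toy inputs with 10 classes instead of Inception v3's 1000.
    sharp_and_diverse = np.eye(10)                      # one confident image per class
    sharp_but_collapsed = np.tile(np.eye(10)[0], (10, 1))  # every image is class 0
    blurry = np.full((10, 10), 0.1)                     # uniform predictions

    print(inception_score(sharp_and_diverse))    # close to 10
    print(inception_score(sharp_but_collapsed))  # close to 1
    print(inception_score(blurry))               # close to 1
```

With these toy inputs the score reaches the number of classes only when predictions are both confident and evenly spread, and falls to about 1 when either criterion fails.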