An Explanatory Analysis of the Geometry of Latent Variables Learned by Variational Auto-Encoders
Alexandra Pește, Luigi Malagò, Septimia Sârbu
Variational Auto-Encoders (VAEs) are generative models consisting of two cascaded networks: the recognition network and the generative network. Within the framework of variational inference, the original training algorithms for VAEs optimize a lower bound on the log-likelihood, derived using the Kullback-Leibler divergence. More recent literature has focused on improving the log-likelihood using alternative bounds, such as those derived from the Rényi divergence and their reformulations in terms of importance sampling. A thorough description of the influence of such bounds on the quality of the latent representation is lacking. Defining what makes one latent representation better than another is not trivial, yet learning adequate representations is one of the main determinants of the performance of VAEs. Representations in the latent space are reportedly distributed in a coherent way, and the sub-manifold of observations appears to be mapped into an affine space. However, the explicit choice of the prior over the latent space remains the only known element in the construction of the geometry of this space. In this work-in-progress paper, we use an explanatory analysis to investigate the factors that shape the geometry of the latent space of VAEs. We evaluate the impact of different structural parameters of the model and of the cost function optimized during training.
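For reference, the bounds mentioned above can be written in their standard forms. This is a sketch in assumed notation that does not appear in the abstract itself: x is an observation, z a latent variable, q_φ(z|x) the recognition network, p_θ(x,z) the generative model, α the Rényi order, and K the number of importance samples.

```latex
% Notation is assumed for illustration; it is not taken from the abstract.
\begin{align}
\log p_\theta(x) \;\ge\; \mathcal{L}_{\mathrm{KL}}(x)
  &= \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x, z) - \log q_\phi(z \mid x) \right] \\
  &= \log p_\theta(x) - \mathrm{KL}\!\left( q_\phi(z \mid x) \,\|\, p_\theta(z \mid x) \right), \\
\mathcal{L}_{\alpha}(x)
  &= \frac{1}{1-\alpha} \log \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \left( \frac{p_\theta(x, z)}{q_\phi(z \mid x)} \right)^{1-\alpha} \right],
  \qquad \alpha \neq 1, \\
\mathcal{L}_{K}(x)
  &= \mathbb{E}_{z_1, \dots, z_K \sim q_\phi(z \mid x)}\!\left[ \log \frac{1}{K} \sum_{k=1}^{K} \frac{p_\theta(x, z_k)}{q_\phi(z_k \mid x)} \right].
\end{align}
```

Under these definitions, the Rényi bound tends to the Kullback-Leibler bound as α → 1, and the importance-sampling bound reduces to it for K = 1.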