Optimal learning in Shallow AutoEncoders

  • Bousquet, Rémi (LISN)
  • Nore, Caroline (LISN)
  • Lucor, Didier (LISN)

Autoencoders have become a cornerstone of nonlinear dimensionality reduction and have been successfully applied to the efficient integration of various dynamical systems, the prediction of future trajectories, and the reconstruction of statistical properties. Early in this line of research, Bourlard and Kamp established a fundamental link between Proper Orthogonal Decomposition (POD), computed via the Singular Value Decomposition (SVD), and the minimum mean square error (MSE) achievable by a fully connected autoencoder with a single hidden layer. Building on their methodology, we extend the investigation of the optimal MSE to networks with two hidden layers and to networks that use convolutional layers. Using the flow around an oscillating cylinder as a test case, we expand upon a previous study to characterize more precisely the relationships between architecture, nonlinearities, and optimal MSE. This work defines a set of optimal error curves within the hyper-parameter space (Pareto fronts), which are shown to depend on the chosen architecture, the activation functions, and the underlying data as described through its POD.
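
As a rough illustration of the POD baseline mentioned above, the following sketch computes the MSE of a rank-k POD reconstruction in two ways: directly, and from the discarded singular values (the Eckart-Young result for truncated SVD). This is the reference error against which a single-hidden-layer autoencoder with k latent units can be compared. It is not the authors' code: the function name `pod_reconstruction_mse`, the synthetic snapshot matrix, and the normalization by the total number of entries are all assumptions made for the example, and the data are not the oscillating-cylinder flow.

```python
import numpy as np


def pod_reconstruction_mse(X, k):
    """Return the MSE of the rank-k POD reconstruction of X, computed two ways.

    X is a snapshot matrix of shape (n_samples, n_features); the layout and
    the per-entry normalization are assumptions of this sketch.
    """
    # Center the snapshots so the POD modes describe fluctuations about the mean.
    Xc = X - X.mean(axis=0, keepdims=True)
    # Thin SVD: Xc = U @ diag(s) @ Vt, singular values sorted in decreasing order.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Rank-k reconstruction from the k leading POD modes.
    Xk = (U[:, :k] * s[:k]) @ Vt[:k]
    mse_direct = np.mean((Xc - Xk) ** 2)
    # Same quantity from the discarded part of the spectrum.
    mse_from_spectrum = np.sum(s[k:] ** 2) / Xc.size
    return mse_direct, mse_from_spectrum


# Synthetic low-rank data plus small noise (purely illustrative).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 30))
X += 0.01 * rng.standard_normal(X.shape)

for k in (1, 3, 5, 8):
    direct, spectrum = pod_reconstruction_mse(X, k)
    print(f"k={k}: direct={direct:.3e}, from spectrum={spectrum:.3e}")
```

Plotting this error against k gives the linear reference curve; the optimal error curves discussed in the abstract concern how deeper or convolutional architectures with nonlinear activations compare with it.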