Preconditioning for Neural Networks and Vice Versa

  • Heinlein, Alexander (Delft University of Technology)

Neural networks are a central technique in artificial intelligence, which has transformed many research fields in recent years. This talk focuses on scientific machine learning, the combination of scientific computing and machine learning, and specifically on how neural networks can enhance preconditioners and vice versa. Neural networks are typically trained with gradient-descent-type methods and exhibit a spectral bias: low-frequency components are learned quickly, while high-frequency components are learned only slowly. The opposite holds for stationary iterations applied to classical discretizations, where high-frequency error components are reduced first. We relate this difference to the global support of common activation functions in neural networks versus the local support of basis functions in classical methods. In the first part of the presentation, we discuss how techniques from numerical linear algebra can address the spectral bias of neural networks to improve training performance. We employ localization via domain decomposition to combat the spectral bias. We then consider Gauss-Newton-type linearization and preconditioning of the resulting linear systems. In the second part, we show how preconditioning ideas can exploit the complementary spectral biases of classical stationary iterations and neural networks. We use this complementarity to enhance preconditioners for classical problems and explore how performance depends on training paradigms and update strategies. This presentation is based on joint work with J.W. Beek (TU/e), V. Dolean (TU/e), T. Kapoor (WUR), Y. Meng (TU Delft), S. Mishra (ETH), B. Moseley (ICL), Y. Shang (XJTU), J. Taraz (TU Delft), F. Wang (XJTU), Y. Wu (KTH), and J. Zhao (Deltares).
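The claim that stationary iterations reduce high-frequency errors first can be illustrated with a small textbook experiment. The sketch below is not taken from the talk: it assumes a 1D Poisson problem discretized by the standard tridiag(-1, 2, -1) finite-difference matrix with homogeneous Dirichlet boundaries, and a weighted Jacobi iteration with damping 2/3. All function names (`weighted_jacobi_error_step`, `sine_mode`, `damping_after`) and parameter values are hypothetical choices made for illustration.

```python
import math


def weighted_jacobi_error_step(e, omega=2.0 / 3.0):
    """Apply one weighted Jacobi sweep to the error vector e.

    Assumes the matrix A = tridiag(-1, 2, -1) / h^2 with zero Dirichlet
    boundaries. The error propagates as e_new = e - omega * D^{-1} A e,
    where D = diag(A), so the 1/h^2 scaling cancels.
    """
    n = len(e)
    new = [0.0] * n
    for i in range(n):
        left = e[i - 1] if i > 0 else 0.0
        right = e[i + 1] if i < n - 1 else 0.0
        new[i] = e[i] - omega * (2.0 * e[i] - left - right) / 2.0
    return new


def sine_mode(k, n):
    """Discrete sine mode with wavenumber k on n interior grid points."""
    h = 1.0 / (n + 1)
    return [math.sin(k * math.pi * (i + 1) * h) for i in range(n)]


def max_norm(v):
    return max(abs(x) for x in v)


def damping_after(k, n=63, sweeps=10):
    """Ratio of error norms after and before `sweeps` Jacobi sweeps."""
    e = sine_mode(k, n)
    start = max_norm(e)
    for _ in range(sweeps):
        e = weighted_jacobi_error_step(e)
    return max_norm(e) / start


low = damping_after(k=1)    # smooth (low-frequency) error mode
high = damping_after(k=48)  # oscillatory (high-frequency) error mode
print(f"low-frequency mode  k=1:  remaining error fraction {low:.3e}")
print(f"high-frequency mode k=48: remaining error fraction {high:.3e}")
```

Under these assumptions, the smooth mode is left almost untouched after ten sweeps, while the oscillatory mode is reduced by many orders of magnitude. This is the reverse of the behavior described above for gradient-based neural network training, and it is this complementarity that the second part of the talk proposes to exploit.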