Continuous-time Latent Dynamics: from Trajectories to Velocities

  • Farenga, Nicola (Politecnico di Milano)
  • Manzoni, Andrea (Politecnico di Milano)

Latent dynamics is a core concept in many recent deep learning architectures for modeling high-dimensional time-evolving processes. Architectures with a continuous-time inductive bias, such as Neural ODEs, coupled with suitable nonlinear dimensionality reduction techniques, have enabled data-driven continuous-time order reduction strategies for time-dependent PDEs, giving rise to the broader notion of latent dynamics models. Despite the advantages of this continuous-time inductive bias, which range from mathematical interpretability to zero-shot time super-resolution, training these models is nontrivial. Whether training proceeds in two stages or end-to-end, the objectives are commonly trajectory-based and therefore require unrolling predictions over multiple steps, either in state space or in latent space. Rollout-based training is motivated by empirical evidence that it yields stable predictions, but it incurs high computational and memory costs due to backpropagation through time, which are further compounded by high-order numerical integration of the Neural ODE. In this talk, we first address the pitfalls of rollout-based training when learning continuous-time dynamics, analyzing the bias that the numerical solver introduces when predictions are unrolled in the infinite-horizon limit, a bias that hinders proper identification of the underlying dynamics. We then discuss the advantages of a velocity-based training objective, both in state space and in latent space, and propose a stochastic objective that yields a higher-order approximation of the population risk and has an implicit regularization effect.
Numerical experiments, carried out in the context of dynamical systems and time-dependent PDEs, validate the efficiency of the proposed approach, highlighting faster convergence and improved generalization compared to a range of rollout-based training strategies.
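
The contrast between trajectory-based and velocity-based objectives can be illustrated with a toy example. The sketch below is not the method presented in the talk: it assumes a hypothetical linear system dx/dt = A x, exact velocity samples (which are rarely available in practice), and a single explicit Euler step in place of a multi-step rollout. All names (`A`, `dt`, `theta_vel`, `theta_roll`) are illustrative. Under these assumptions, regressing on velocities recovers A, while fitting an Euler step to sampled states identifies (exp(A dt) - I)/dt instead, so the solver's discretization error is absorbed into the learned vector field.

```python
import numpy as np

def matrix_exp(M, terms=30):
    """Truncated Taylor series for exp(M); adequate for the small matrix used here."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

# Hypothetical ground-truth system dx/dt = A x (a lightly damped oscillator).
A = np.array([[0.0, 1.0], [-1.0, -0.1]])
dt = 0.1  # assumed sampling interval of the observed trajectories
rng = np.random.default_rng(0)

X0 = rng.normal(size=(200, 2))       # sampled states x(t), one per row
X1 = X0 @ matrix_exp(A * dt).T       # exact states x(t + dt)
V0 = X0 @ A.T                        # exact velocities dx/dt at x(t) (assumed known)

# Velocity-based objective: minimize ||Theta x - dx/dt||^2 over the samples.
theta_vel = np.linalg.lstsq(X0, V0, rcond=None)[0].T

# One-step rollout objective with explicit Euler:
# minimize ||x + dt * Theta x - x(t + dt)||^2 over the samples.
theta_roll = (np.linalg.lstsq(X0, X1, rcond=None)[0].T - np.eye(2)) / dt

err_vel = np.linalg.norm(theta_vel - A)    # close to machine precision
err_roll = np.linalg.norm(theta_roll - A)  # O(dt) bias inherited from the solver

print(f"velocity-based error: {err_vel:.2e}")
print(f"rollout-based error:  {err_roll:.2e}")
```

In this linear setting both objectives have closed-form least-squares solutions, which keeps the comparison free of optimizer effects. A higher-order integrator or a smaller `dt` would shrink the rollout bias but not remove it.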