Active sampling for sparse dictionary learning of nonlinear dynamical systems under data constraints

  • Larrañaga, Ana (University of Washington)
  • Fasel, Urban (Imperial College London)
  • Brunton, Steven L (University of Washington)

Choosing where and how to sample becomes critical when the goal is not only high predictive accuracy, but also the identification of the underlying dynamics of a potentially unknown system in a simple yet meaningful form. Many existing approaches rely on uncertainty-driven sampling strategies that focus on regions of poor predictive accuracy. While effective in some cases, these methods may explore the domain insufficiently and skip regions that, although not immediately informative, are necessary to recover the dynamics governing the system. We investigate this challenge using the Sparse Identification of Nonlinear Dynamics (SINDy) framework, a computationally efficient algorithm for recovering the structure of dynamical systems from time-series data. We analyze the impact of noise levels and data availability on SINDy’s performance, with the goal of identifying governing equations from as little data as possible. The study considers both ODEs and PDEs, with particular focus on sampling strategies in the PDE setting that reduce the number of evaluations and refitting steps required by the algorithm. Active learning approaches typically aim to select samples that are informative, representative, and diverse. In this work, uncertainty estimates and coefficient stability are computed from an ensemble of SINDy-based models and used to inform an active sampling strategy that targets sparsity and stability. Compared with uniform random sampling, the method recovers the correct dynamics with substantially fewer observations in low-data regimes.
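
To make the idea of ensemble-informed sampling concrete, the following is a minimal sketch and not the authors' method. It assumes a two-state ODE (a Lotka-Volterra system chosen here purely for illustration), a quadratic polynomial library, direct noisy measurements of the derivative at any chosen state, and a bootstrap ensemble fitted with sequentially thresholded least squares. The acquisition rule (pick the candidate state where the ensemble's predicted derivatives disagree most) and all names (`library`, `stlsq`, `ensemble_fit`, `next_sample`, `rhs`, `measure`) are assumptions introduced for this example.

```python
import numpy as np

def library(X):
    """Candidate terms [1, x, y, x^2, xy, y^2] for a two-state system."""
    x, y = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])

def stlsq(Theta, dX, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    Xi = np.linalg.lstsq(Theta, dX, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        for k in range(dX.shape[1]):
            big = ~small[:, k]
            if big.any():
                Xi[big, k] = np.linalg.lstsq(Theta[:, big], dX[:, k], rcond=None)[0]
    return Xi

def ensemble_fit(X, dX, n_models=50, threshold=0.1, rng=None):
    """Fit one sparse model per bootstrap resample of the data."""
    rng = np.random.default_rng(rng)
    Theta = library(X)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        models.append(stlsq(Theta[idx], dX[idx], threshold))
    return np.stack(models)  # shape (n_models, n_terms, n_states)

def next_sample(models, candidates):
    """Index of the candidate where ensemble predictions disagree most."""
    Theta_c = library(candidates)
    preds = np.einsum("ct,mts->mcs", Theta_c, models)
    disagreement = preds.var(axis=0).sum(axis=1)
    return int(np.argmax(disagreement))

def rhs(X):
    """Illustrative ground truth: dx/dt = x - 0.5xy, dy/dt = -y + 0.5xy."""
    x, y = X[:, 0], X[:, 1]
    return np.column_stack([x - 0.5 * x * y, -y + 0.5 * x * y])

rng = np.random.default_rng(0)
pool = rng.uniform(0.5, 3.0, size=(200, 2))  # candidate states to query

def measure(X):
    """Hypothetical measurement: true derivative plus small Gaussian noise."""
    return rhs(X) + 1e-3 * rng.standard_normal(X.shape)

chosen = list(range(10))  # initial design: first 10 (random) pool points
X, dX = pool[chosen], measure(pool[chosen])
for _ in range(20):  # active loop: refit ensemble, query one new state
    models = ensemble_fit(X, dX, rng=rng)
    remaining = np.setdiff1d(np.arange(len(pool)), chosen)
    j = int(remaining[next_sample(models, pool[remaining])])
    chosen.append(j)
    X = np.vstack([X, pool[j:j + 1]])
    dX = np.vstack([dX, measure(pool[j:j + 1])])

models = ensemble_fit(X, dX, rng=rng)
inclusion = (models != 0).mean(axis=0)  # fraction of models keeping each term
coef_median = np.median(models, axis=0)
coef_std = models.std(axis=0)           # coefficient stability across models
support = inclusion > 0.5
```

Here `inclusion` and `coef_std` play the role of the stability and uncertainty estimates mentioned in the abstract; a different acquisition rule (for example, one based on coefficient variance rather than prediction variance) could be swapped into `next_sample` without changing the rest of the loop.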