Data assimilation and discrepancy modeling with shallow recurrent decoders

  • Bao, Yuxuan (University of Washington)
  • Kutz, Nathan (University of Washington)


Data assimilation integrates observational data with predictive simulation models to produce coherent estimates of the full state of complex physical systems. However, simulation models inevitably neglect small-scale processes, are sensitive to perturbations, or oversimplify parameter correlations, leading to reconstructions that diverge from sensor measurements of the real physics. This creates a critical simulation-to-reality (SIM2REAL) gap. We propose DA-SHRED (Data Assimilation with SHallow REcurrent Decoder), a machine learning framework that bridges the SIM2REAL gap by leveraging the latent space learned from reduced simulation models via the SHRED architecture. SHRED exploits three key concepts to reconstruct full state-space dynamics from sparse sensor measurements: separation of variables, Takens' embedding theorem, and a decoding-only strategy. Our framework updates the latent variables using real sensor data to accurately reconstruct full system states that cannot be directly observed. Furthermore, DA-SHRED incorporates a sparse identification of nonlinear dynamics (SINDy) regression model in the latent space to identify functionals corresponding to dynamics missing from the simulation models. Two SINDy-based algorithms are presented for extracting parsimonious missing physics from libraries of candidate terms. The method is demonstrated on four challenging systems: 2D damped Kuramoto-Sivashinsky equations, 2D Kolmogorov flow, 2D Gray-Scott reaction-diffusion equations, and rotating detonation engines. Results show that DA-SHRED reduces the SIM2REAL error by approximately an order of magnitude within short evolution windows while successfully recovering missing physics terms. Theoretical analysis establishes that DA-SHRED preserves representational structure under suitable assumptions, with port-Hamiltonian systems providing rigorous examples of eigenspace preservation under perturbations. The algorithm is robust to noise, requires minimal training data, and runs efficiently on a laptop. DA-SHRED demonstrates a powerful synergy between data assimilation, reduced-order modeling, and physics-informed model discovery for real-time, data-efficient state estimation in scientific and engineering applications.
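
To illustrate the idea of recovering missing physics from a library of candidate terms, the following is a minimal sketch of sparse regression by sequentially thresholded least squares, a common SINDy solver. It is not the authors' implementation, and the abstract does not specify the two SINDy-based algorithms. The scalar latent variable `z`, the simulated and "real" right-hand sides, the candidate library, and the threshold value are all assumptions chosen for a toy example.

```python
import numpy as np

def stlsq(theta, target, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares.

    Finds a sparse coefficient matrix xi such that theta @ xi ~ target.
    theta: (n_samples, n_terms) candidate library; target: (n_samples, n_vars).
    """
    xi = np.linalg.lstsq(theta, target, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0  # prune weak candidate terms
        for k in range(target.shape[1]):
            big = ~small[:, k]
            if big.any():
                # refit using only the surviving terms
                xi[big, k] = np.linalg.lstsq(theta[:, big], target[:, k], rcond=None)[0]
    return xi

# Hypothetical toy setup: the simulation model assumes dz/dt = -z,
# while the "real" dynamics contain an extra cubic term -0.5 z^3.
rng = np.random.default_rng(0)
z = rng.uniform(-2.0, 2.0, size=(200, 1))
sim_rhs = -z
real_rhs = -z - 0.5 * z**3

# Discrepancy between real and simulated dynamics, with small measurement noise.
discrepancy = real_rhs - sim_rhs + 1e-3 * rng.standard_normal(z.shape)

# Candidate library of terms: [1, z, z^2, z^3].
names = ["1", "z", "z^2", "z^3"]
library = np.hstack([np.ones_like(z), z, z**2, z**3])

xi = stlsq(library, discrepancy, threshold=0.1)
for name, coef in zip(names, xi[:, 0]):
    if coef != 0.0:
        print(f"missing term: {coef:+.3f} * {name}")
```

In this toy case the regression keeps only the cubic term, with a coefficient close to -0.5. In a setting like the one described above, `z` would presumably be replaced by the learned latent variables, and the discrepancy would come from comparing sensor-informed and simulation-driven latent dynamics.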