Knowledge-informed machine learning for shaping model behaviour

  • Fang, Zhou (National University of Singapore)
  • Mengaldo, Gianmarco (National University of Singapore)

Machine learning (ML) methods have become essential tools for modeling and forecasting complex, nonlinear systems across a wide range of disciplines, including climate science, fluid dynamics, neuroscience, finance, and environmental engineering. In particular, ML-based forecasting models often achieve predictive performance comparable to, or even exceeding, that of traditional statistical or physics-based approaches, while operating at significantly lower computational cost. This makes ML especially attractive for applications involving large-scale and high-dimensional datasets. However, purely data-driven ML models frequently struggle in regions characterized by data scarcity and high complexity, leading to degraded predictive skill and rapid error growth. To address these challenges, recent research has increasingly focused on knowledge-informed machine learning, which incorporates prior knowledge, such as physical laws, invariants, or structural constraints, into different stages of the ML pipeline to guide model behavior. In this work, we propose a general framework that embeds intrinsic dynamical information into ML models, thereby shaping model behavior in the dynamical regimes deemed most relevant for forecasting performance. Unlike physics-informed approaches that rely on known governing equations or explicit physical constraints, our framework leverages general dynamical characteristics, making it broadly applicable across problem settings. We evaluate the proposed framework on a range of canonical dynamical systems, including the Lorenz system, the Kuramoto–Sivashinsky equation, and Kolmogorov flow, as well as on realistic atmospheric data from ERA5 reanalysis. Across these experiments, our approach consistently reduces mean squared error for direct forecasting and effectively mitigates error growth in recursive forecasting compared to baseline ML models.
Beyond empirical improvements, we analyze the underlying mechanisms through which dynamical information enhances learning and identify the conditions under which the proposed framework is most effective. Overall, this work provides a principled and flexible strategy for improving ML-based forecasting of dynamical systems and offers new insights into understanding ML model behavior in complex dynamical regimes.
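
To make the distinction between direct and recursive forecasting concrete, the following is a minimal sketch on the Lorenz system. It is not the framework proposed in this work: the affine least-squares one-step model is a hypothetical stand-in for a "baseline ML model", and the helper names (`lorenz_rhs`, `rk4_step`, `simulate`, `predict`), the step size, the standard Lorenz parameters, and the train/test split are all assumptions chosen for illustration.

```python
import numpy as np

def lorenz_rhs(s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz-63 system (standard parameters assumed)."""
    x, y, z = s
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_step(s, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = lorenz_rhs(s)
    k2 = lorenz_rhs(s + 0.5 * dt * k1)
    k3 = lorenz_rhs(s + 0.5 * dt * k2)
    k4 = lorenz_rhs(s + dt * k3)
    return s + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

def simulate(s0, n_steps, dt=0.01):
    """Integrate from s0 and return an array of shape (n_steps + 1, 3)."""
    traj = np.empty((n_steps + 1, 3))
    traj[0] = s0
    for i in range(n_steps):
        traj[i + 1] = rk4_step(traj[i], dt)
    return traj

# Generate data and drop the initial transient (lengths are illustrative).
traj = simulate(np.array([1.0, 1.0, 1.0]), 6000)[1000:]
train, test = traj[:4000], traj[4000:]

# Hypothetical baseline: affine one-step map s_{t+1} ~ [s_t, 1] @ W,
# fitted by ordinary least squares. This stands in for any learned model.
X_train = np.hstack([train[:-1], np.ones((len(train) - 1, 1))])
W, *_ = np.linalg.lstsq(X_train, train[1:], rcond=None)

def predict(s):
    """Apply the fitted one-step map to a single state."""
    return np.append(s, 1.0) @ W

# Direct forecasting: every prediction starts from the true current state.
X_test = np.hstack([test[:-1], np.ones((len(test) - 1, 1))])
direct_mse = np.mean((X_test @ W - test[1:]) ** 2)

# Recursive forecasting: the model is fed its own previous output,
# so one-step errors can compound along the rollout.
horizon = 200
state = test[0].copy()
recursive_err = np.empty(horizon)
for k in range(horizon):
    state = predict(state)
    recursive_err[k] = np.mean((state - test[k + 1]) ** 2)

print("direct one-step MSE:", direct_mse)
print("recursive MSE at steps 1, 50, 200:", recursive_err[[0, 49, 199]])
```

In a setup like this, the direct error measures one-step accuracy only, whereas the recursive error curve shows how quickly a rollout departs from the true trajectory. It is this second quantity that the abstract refers to as error growth in recursive forecasting.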