Gradient-Informed Neural Networks for Data-Efficient Surrogate Modeling Using Prior Beliefs: Application to Diesel Engine Emissions Estimation
Please login to view abstract download link
Scientific ML aims to bridge data-driven models and physical reasoning in order to enable reliable and data-efficient surrogate models for complex systems. While PINN and gradient-based approaches such as Sobolev training have proven effective when PDE or explicit derivative information are available, many engineering systems exhibit complex or poorly understood physics, for which no explicit PDE or ground-truth gradients are available. In these settings, the available information therefore consists of sparse measurements and qualitative domain knowledge about the expected behavior of the system. In this context, we propose Gradient-Informed NN (GradINN), a framework that enables the incorporation of unstructured prior beliefs on the target function gradients through soft constraints, without requiring access to governing equations or explicit derivative observations. The approach relies on a coupled neural architecture, in which an auxiliary network encodes qualitative assumptions on gradient behavior and constrains the training of a primary network via a customized loss function. This formulation allows prior beliefs to guide the learning process while remaining adaptive to the available data. The framework is evaluated using smoothness-based and highly oscillatory gradient priors, showing improved gradient shaping and generalization in low-data regimes on several benchmark problems. GradINN is then applied to a real industrial problem involving surrogate modeling for diesel engine emissions. Results show that the introduction of smooth prior beliefs improves data efficiency, achieving comparable predictive accuracy with approximately 25% fewer training samples than standard NNs and Gaussian process regressors. This improvement enables either a reduction of experimental activities or a redistribution of experiments toward less explored regions of the operating space. Overall, the proposed approach highlights that qualitative assumptions on gradient behavior, often available through domain expertise, can be exploited to guide learning and enhance data efficiency in complex engineering systems.
