Injecting Domain Knowledge into Bayesian Optimization: A Comparative Study

  • Schuscha, Bernd (Materials Center Leoben Forschung GmbH)
  • Rohrhofer, Franz Martin (Know-Center Research GmbH)
  • Scheiber, Daniel (Materials Center Leoben Forschung GmbH)
  • Geiger, Bernhard C (Graz University of Technology)

Bayesian optimization (BO) is widely used for data-efficient optimization of expensive black-box functions and has become increasingly important in materials design, where experiments are costly and data are scarce. In such settings, optimization must cope with observational noise and model misspecification, while strong prior domain knowledge is available in heterogeneous forms. Although numerous domain-informed BO approaches have been proposed, they are typically developed in isolation, making it difficult to understand how different methods of knowledge integration influence optimization behavior. This work presents a systematic study of strategies for injecting domain knowledge into Bayesian optimization and analyzes how these strategies alter optimization performance and decision-making. The investigated approaches span multiple levels of the BO pipeline, including domain-informed surrogate models, physically motivated priors over the search space, and acquisition-level regularization through auxiliary domain models. Knowledge is incorporated through feature transformations, kernel and mean-function design, simplified algebraic relations, and static physics-based models, all embedded within a unified Bayesian framework. Focusing on the small-data regime, we analyze how the representation and placement of domain knowledge affect exploration–exploitation dynamics, uncertainty calibration, robustness to noise, and behavior near domain boundaries. A set of physics-based optimization problems from materials science is used to demonstrate these effects in realistic process–structure–property scenarios. Rather than proposing a single optimal strategy, this study reveals the trade-offs associated with different forms of knowledge integration and their interaction with uncertainty-driven decision-making. The results provide practical guidance for the principled design of domain-informed Bayesian optimization workflows in data-scarce scientific applications.
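
As an illustration of one strategy named above, mean-function design, the following minimal sketch shows how a simplified algebraic relation could serve as the prior mean of a Gaussian-process surrogate, with expected improvement as the acquisition function. It is not the implementation used in this study: the function `prior_mean`, the RBF kernel, the hyperparameter values, and the choice of expected improvement are all assumptions made for illustration only.

```python
import numpy as np
from math import erf, pi, sqrt


def rbf_kernel(a, b, length_scale=0.3, variance=1.0):
    """Squared-exponential kernel between two 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)


def prior_mean(x):
    # Hypothetical simplified algebraic relation standing in for domain
    # knowledge (an assumption, not a model taken from the study).
    return -(x - 0.6) ** 2


def gp_posterior(x_train, y_train, x_test, noise=1e-4, mean_fn=prior_mean):
    """GP posterior mean and standard deviation with a domain-informed prior mean."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_test)
    L = np.linalg.cholesky(K)
    # The GP models only the residual between observations and the prior mean.
    resid = y_train - mean_fn(x_train)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, resid))
    mu = mean_fn(x_test) + Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(rbf_kernel(x_test, x_test)) - np.sum(v ** 2, axis=0)
    return mu, np.sqrt(np.clip(var, 0.0, None))


def expected_improvement(mu, sigma, best):
    """Expected improvement for maximization."""
    sigma = np.maximum(sigma, 1e-12)
    z = (mu - best) / sigma
    cdf = 0.5 * (1.0 + np.vectorize(erf)(z / sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / sqrt(2.0 * pi)
    return (mu - best) * cdf + sigma * pdf


# Two noisy-free toy observations of a hypothetical objective.
x_train = np.array([0.1, 0.9])
y_train = np.array([-0.30, -0.05])
x_test = np.linspace(0.0, 1.0, 101)

mu, sigma = gp_posterior(x_train, y_train, x_test)
ei = expected_improvement(mu, sigma, best=y_train.max())
x_next = x_test[np.argmax(ei)]  # candidate for the next experiment
```

In this sketch, the surrogate reverts to the algebraic relation wherever no data are nearby, so with few observations the prior mean largely determines where the acquisition function proposes the next experiment. This is one simple way in which the placement of domain knowledge can shift exploration–exploitation behavior.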