On the Effect of Prior Specification in Bayesian Symbolic Regression
Symbolic regression (SR) has emerged as a powerful tool for discovering interpretable mathematical models from data, and it is particularly relevant in computational mechanics, where equation-based models are fundamental to simulation workflows. This work extends our previous Sequential Monte Carlo (SMC) based Bayesian symbolic regression framework (SMC-SR) to systematically quantify the effect of prior distributions on algorithm efficacy and model interpretability. Our earlier work demonstrated that SMC-SR provides enhanced robustness in noisy environments and enables uncertainty quantification, but it employed uniform priors, whereas the choice of prior distribution over symbolic expressions has been shown to play a crucial role in Bayesian SR performance. We develop and evaluate domain-informed priors on functional forms and incorporate them into the SMC-SR framework to guide posterior sampling toward physically meaningful regions of the expression space. We evaluate the impact of various prior specifications on standard benchmark problems, demonstrating how informed priors affect discovery accuracy, generalization performance, and convergence characteristics. We then apply the enhanced framework to identify hardening laws for crystal plasticity models using micromechanical data. The discovered hardening laws are designed for direct incorporation into structure-property relationship modeling pipelines for additively manufactured metals, where interpretable constitutive models are essential. Results demonstrate that judicious prior selection significantly impacts both predictive accuracy and model interpretability, particularly in the data-limited regimes common to materials characterization.
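As a rough illustration of how a prior on functional forms can shift posterior mass among candidate expressions, the following minimal Python sketch compares a uniform prior with a hypothetical informed prior. It is not the SMC-SR implementation: the tuple-based expression encoding, the operator weights in `OP_LOG_WEIGHT`, the `size_penalty` value, and the log-likelihoods are all assumptions chosen for illustration, and the single reweighting step stands in for a full SMC sampler.

```python
import math

# Assumed encoding: an expression is a nested tuple such as ("add", left, right)
# or ("exp", arg); leaves are strings such as "x" (input) or "c" (constant).

def nodes(expr):
    """Yield every node label (operator or leaf) in the expression tree."""
    if isinstance(expr, tuple):
        yield expr[0]
        for child in expr[1:]:
            yield from nodes(child)
    else:
        yield expr

def log_prior_uniform(expr):
    """Uniform prior: every expression gets the same (unnormalized) log-mass."""
    return 0.0

# Hypothetical domain preferences: favor exponential forms, disfavor oscillatory ones.
OP_LOG_WEIGHT = {"add": 0.0, "mul": 0.0, "exp": math.log(2.0), "sin": math.log(0.1)}

def log_prior_informed(expr, size_penalty=0.5):
    """Unnormalized log-prior: operator preferences minus a per-node size penalty."""
    labels = list(nodes(expr))
    op_term = sum(OP_LOG_WEIGHT.get(label, 0.0) for label in labels)
    return op_term - size_penalty * len(labels)

def normalized_weights(log_liks, log_priors):
    """Normalize exp(log-likelihood + log-prior) across candidates, stably."""
    log_post = [ll + lp for ll, lp in zip(log_liks, log_priors)]
    shift = max(log_post)
    unnorm = [math.exp(v - shift) for v in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]

candidates = [
    ("mul", "c", ("exp", "x")),                 # c * exp(x)
    ("add", ("sin", "x"), ("mul", "c", "x")),   # sin(x) + c * x
]
log_liks = [-10.0, -10.0]  # made-up values: both candidates fit the data equally well

w_uniform = normalized_weights(log_liks, [log_prior_uniform(e) for e in candidates])
w_informed = normalized_weights(log_liks, [log_prior_informed(e) for e in candidates])
```

With equal likelihoods, the uniform prior leaves the two candidates equally weighted, while the informed prior concentrates weight on the shorter exponential form. The prior matters most when the data cannot separate candidates on their own, which is why its effect is expected to be largest in data-limited settings.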
