WCCM ECCOMAS 2026

On the Effect of Prior Specification in Bayesian Symbolic Regression

Bomarito, Geoffrey (NASA Langley Research Center)
Leser, Patrick (NASA Langley Research Center)
Pribe, Joshua (Analytical Mechanics Associates)
Weber, George (NASA Langley Research Center)

In session: MS233A - Advancing Computational Mechanics with Equation Discovery and Symbolic Regression I

Symbolic regression (SR) has emerged as a powerful tool for discovering interpretable mathematical models from data, which is particularly relevant in computational mechanics, where equation-based models are fundamental to simulation workflows. This work extends our Sequential Monte Carlo (SMC) based Bayesian symbolic regression framework (SMC-SR) to systematically quantify the effect of prior distributions on algorithm efficacy and model interpretability. Our earlier work demonstrated that SMC-SR provides enhanced robustness to noisy data and enables uncertainty quantification, but it employed uniform priors, and the choice of prior distribution over symbolic expressions has been shown to play a crucial role in Bayesian SR performance. We develop and evaluate domain-informed priors on functional forms and incorporate them into the SMC-SR framework to guide posterior sampling toward physically meaningful regions of the expression space. We evaluate the impact of various prior specifications on standard benchmark problems, demonstrating how informed priors affect discovery accuracy, generalization performance, and convergence characteristics. We then apply the enhanced framework to identify hardening laws for crystal plasticity models from micromechanical data. The discovered hardening laws are designed for direct incorporation into structure-property relationship modeling pipelines for additively manufactured metals, where interpretable constitutive models are essential. Results demonstrate that judicious prior selection significantly impacts both predictive accuracy and model interpretability, particularly in the data-limited regimes common to materials characterization.
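
As an illustration of how a prior over functional forms can enter a sampler of this kind, the following minimal Python sketch contrasts a uniform prior with a non-uniform one. It is not the SMC-SR implementation: the nested-tuple expression encoding, the operator weights, the `size_penalty` parameter, and the likelihood-tempered target are all assumptions made for this example, and the weights are purely illustrative rather than derived from any physical argument.

```python
import math

# Hypothetical encoding: an expression is a nested tuple such as
# ("mul", "c", ("exp", "x")); strings are leaves (variables or constants).

# Uniform prior: every operator carries the same weight.
UNIFORM = {"add": 1.0, "mul": 1.0, "exp": 1.0, "sin": 1.0}
# Illustrative non-uniform prior: weights chosen arbitrarily for this sketch.
INFORMED = {"add": 1.0, "mul": 1.0, "exp": 2.0, "sin": 0.1}


def operators(expr):
    """Yield the operator name of every internal node of the expression tree."""
    if isinstance(expr, tuple):
        yield expr[0]
        for child in expr[1:]:
            yield from operators(child)


def log_prior(expr, weights, size_penalty=0.0):
    """Unnormalized log prior: sum of log operator weights minus a penalty per operator."""
    ops = list(operators(expr))
    return sum(math.log(weights[op]) for op in ops) - size_penalty * len(ops)


def log_target(expr, log_likelihood, weights, phi, size_penalty=0.0):
    """Likelihood-tempered target: log prior + phi * log likelihood, with 0 <= phi <= 1."""
    return log_prior(expr, weights, size_penalty) + phi * log_likelihood


exp_form = ("mul", "c", ("exp", "x"))
sin_form = ("mul", "c", ("sin", "x"))

# Under the uniform prior the two forms are indistinguishable a priori.
uniform_gap = log_prior(exp_form, UNIFORM) - log_prior(sin_form, UNIFORM)
# Under the non-uniform prior the exponential form is favored before any data are seen.
informed_gap = log_prior(exp_form, INFORMED) - log_prior(sin_form, INFORMED)
```

In this sketch, `uniform_gap` is zero while `informed_gap` equals `log(2.0 / 0.1)`, so two candidates with equal likelihood would be ranked differently under the two priors. At `phi = 0` the target reduces to the prior alone, which is why the prior choice matters most when the data carry little information.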