WCCM ECCOMAS 2026

Uncertainty Quantification of Foundation Models for Scientific Machine Learning

Zaki, Mohd (JOHNS HOPKINS UNIVERSITY)
Thiagarajan, Ponkrshnan (JOHNS HOPKINS UNIVERSITY)
Chakraborty, Souvik (INDIAN INSTITUTE OF TECHNOLOGY DELHI)
Shields, Michael (JOHNS HOPKINS UNIVERSITY)

In session: MS316B - Fundamental Concepts in Scientific ML II


Data-driven foundation models (FMs) for scientific machine learning (SciML) have enabled generalisable modelling of complex physical systems while simultaneously amplifying the challenge of understanding and managing uncertainty. Developing FMs requires large multi-domain datasets to train models with parameter counts ranging from a few million to billions. Uncertainty quantification (UQ) for FMs must therefore be considered at every stage of the model development lifecycle. In this work, we present a systematic framework to guide UQ across four critical stages: (1) data and dataset curation; (2) tokenisation and discretisation; (3) model architecture; and (4) training and fine-tuning, in order to quantify epistemic and aleatoric uncertainties. To demonstrate the proposed approach, we perform a comparative study of SciML FMs spanning several orders of magnitude in parameter count, from specialised architectures with a few thousand parameters to large-scale foundation models with billions of parameters. Specifically, we use the recently proposed technique of combining variational inference (VI) with Hamiltonian Monte Carlo (HMC) to perform UQ for neural networks and neural operators [1]. VI-based training of the FMs identifies the model parameters that contribute to the uncertainty in the model output; HMC is then applied only to the identified parameters. Through our study, we highlight how uncertainty evolves with model scale, data scale, and task complexity during pretraining and fine-tuning. The proposed approach will enable researchers and FM developers to effectively probe their development pipelines and create reliable models.
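
As a rough illustration of the two-step idea described in the abstract (VI to pick out uncertain parameters, then HMC restricted to them), the following is a minimal sketch on a toy Bayesian linear regression, not the method of [1]. Everything in it is an assumption made for illustration: the toy data, the use of the closed-form mean-field Gaussian as a stand-in for VI training, the selection rule (largest variational variance), and the names `vi_var`, `selected`, and `hmc`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Bayesian linear regression: y = X w + noise, prior w ~ N(0, I / alpha).
n, d, sigma, alpha = 50, 6, 0.1, 1.0
# Assumption: the last two inputs are weakly informative, so their weights stay uncertain.
col_scale = np.array([1.0, 1.0, 1.0, 1.0, 0.05, 0.05])
X = rng.normal(size=(n, d)) * col_scale
w_true = rng.normal(size=d)
y = X @ w_true + sigma * rng.normal(size=n)

# Step 1 (stand-in for VI training): for a Gaussian posterior with precision
# matrix Lam, the optimal mean-field Gaussian has the posterior mean and
# per-parameter variances 1 / Lam_ii.
Lam = alpha * np.eye(d) + X.T @ X / sigma**2
vi_mean = np.linalg.solve(Lam, X.T @ y / sigma**2)
vi_var = 1.0 / np.diag(Lam)

# Step 2 (hypothetical selection rule): keep the k parameters with the
# largest variational variance; all others are frozen at their VI mean.
k = 2
selected = np.sort(np.argsort(vi_var)[-k:])

def log_post_and_grad(w_sel):
    """Unnormalised log posterior and its gradient w.r.t. the selected parameters."""
    w = vi_mean.copy()
    w[selected] = w_sel
    resid = y - X @ w
    logp = -0.5 * resid @ resid / sigma**2 - 0.5 * alpha * w @ w
    grad = X[:, selected].T @ resid / sigma**2 - alpha * w_sel
    return logp, grad

def hmc(w0, n_samples, step=0.05, n_leapfrog=10):
    """Plain HMC with a leapfrog integrator and Metropolis correction."""
    samples, accepted = [], 0
    w = w0.copy()
    logp, grad = log_post_and_grad(w)
    for _ in range(n_samples):
        p = rng.normal(size=w.size)            # resample momentum
        h_old = -logp + 0.5 * p @ p            # Hamiltonian at the start
        w_new, grad_new = w.copy(), grad.copy()
        p = p + 0.5 * step * grad_new          # initial half step for momentum
        for i in range(n_leapfrog):
            w_new = w_new + step * p           # full step for position
            logp_new, grad_new = log_post_and_grad(w_new)
            if i < n_leapfrog - 1:
                p = p + step * grad_new        # full step for momentum
        p = p + 0.5 * step * grad_new          # final half step for momentum
        h_new = -logp_new + 0.5 * p @ p
        if np.log(rng.uniform()) < h_old - h_new:
            w, logp, grad = w_new, logp_new, grad_new
            accepted += 1
        samples.append(w.copy())
    return np.array(samples), accepted / n_samples

# Step 3: run HMC only over the selected parameters, starting from the VI mean.
samples, accept_rate = hmc(vi_mean[selected], n_samples=2000)
hmc_mean = samples[200:].mean(axis=0)          # discard the first 200 draws as burn-in
hmc_std = samples[200:].std(axis=0)

print("selected parameters:", selected, "acceptance rate:", round(accept_rate, 2))
print("VI std :", np.sqrt(vi_var[selected]))
print("HMC std:", hmc_std)
```

In this toy setting the sampler works in 2 dimensions instead of 6; the same restriction is what would make HMC affordable for much larger models, provided the frozen parameters really do contribute little to the output uncertainty.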