Calibrated Uncertainty for Fine-tuned Scientific Foundation Models via Stochastic Attention

  • Yadav, Akash (University of Houston)
  • Adebiyi, Taiwo (University of Houston)
  • Zhang, Ruda (University of Houston)


Scientific foundation models (SciFMs) are increasingly used for predictive modeling of complex systems, but their deployment is hindered by the lack of practical uncertainty quantification (UQ) and error characterization. A key challenge is that UQ approaches developed for traditional computational models do not transfer directly to large transformer-based models, whose black-box nature and exposure to distribution shift pose significant hurdles. We introduce stochastic attention, a lightweight mechanism that injects controlled randomness into the attention layers of transformer-based SciFMs through a single scalar hyperparameter. This induces a distribution over predictions and yields a scalable procedure for post-hoc UQ. The hyperparameter is tuned by minimizing a univariate objective function so that the predictive uncertainty is both well calibrated and sharp. Our approach operates entirely at inference time, requires no retraining, and adds negligible runtime and storage overhead, making it practical for large-scale foundation models. We demonstrate our method on SciFMs for Earth system prediction, characterizing post–fine-tuning model error. We compare against probabilistic fine-tuning baselines, including non-Bayesian and approximate Bayesian methods. Results indicate that stochastic attention delivers well-calibrated, sharp uncertainty estimates and maintains competitive point forecast accuracy with minimal added complexity.
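To make the idea concrete, the following is a minimal sketch of one way randomness could be injected into an attention layer and controlled by a single scalar. The abstract does not specify the noise model or the tuning objective, so everything here is an assumption for illustration: Gaussian noise of scale `tau` added to the attention logits, a Gaussian negative log-likelihood as the univariate objective, a plain grid search, and the hypothetical function names `stochastic_attention`, `predictive_samples`, `gaussian_nll`, and `tune_tau`.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def stochastic_attention(q, k, v, tau, rng):
    """Single-head attention with noise on the logits.

    ASSUMPTION: Gaussian noise with standard deviation `tau` is added to the
    scaled dot-product logits. The abstract only states that a single scalar
    hyperparameter controls the injected randomness.
    """
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)
    if tau > 0:
        logits = logits + tau * rng.standard_normal(logits.shape)
    return softmax(logits, axis=-1) @ v


def predictive_samples(q, k, v, tau, n_samples, seed=0):
    # Repeated stochastic forward passes give samples from the induced
    # predictive distribution; no weights are changed or retrained.
    rng = np.random.default_rng(seed)
    return np.stack(
        [stochastic_attention(q, k, v, tau, rng) for _ in range(n_samples)]
    )


def gaussian_nll(samples, target, eps=1e-6):
    # ASSUMPTION: a Gaussian negative log-likelihood stands in for the
    # univariate calibration objective, which the abstract does not specify.
    mu = samples.mean(axis=0)
    var = samples.var(axis=0) + eps
    return float(np.mean(0.5 * (np.log(2 * np.pi * var) + (target - mu) ** 2 / var)))


def tune_tau(q, k, v, target, grid, n_samples=64):
    # One-dimensional search over the scalar hyperparameter on held-out data.
    scores = [
        gaussian_nll(predictive_samples(q, k, v, t, n_samples), target)
        for t in grid
    ]
    return grid[int(np.argmin(scores))]
```

In this sketch, setting `tau = 0` recovers the deterministic attention output, and the tuning step only needs forward passes on a held-out set, which matches the inference-time, no-retraining character described in the abstract.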