Probabilistic recovery of unobserved exogenous variables in expert-defined causal DAGs

  • Boussaid, Badr-Eddine (École nationale supérieure d'arts et métiers)
  • Ghnatios, Chady (University of North Florida)
  • Jebahi, Mohamed (École nationale supérieure d'arts et métiers)
  • Bonidal, Rémi (ArcelorMittal Maizières Research SA)
  • Chinesta, Francisco (École nationale supérieure d'arts et métiers)

Unobserved exogenous variables are critical drivers in many physical and engineered systems, yet they are typically not measured. Examples include hidden initial conditions, unknown operating regimes, and latent root causes identified by domain experts. The inability to measure these variables complicates inference from partial observations and limits the interpretability and robustness of predictive models. When the causal structure is known, leveraging it can substantially improve latent inference. We present a probabilistic modeling framework for recovering unobserved exogenous variables in expert-defined directed acyclic graphs (DAGs). Each conditional relationship is represented by a flexible neural conditional probability distribution, which accommodates nonlinear mechanisms while preserving the prescribed graph semantics. Root-variable priors can be specified or learned to capture non-Gaussian variability. Latent roots (and any intermediate latent nodes) are inferred from the observed variables using an amortized variational guide, enabling scalable inference and explicit posterior uncertainty quantification. We evaluate the approach on synthetic ground-truth DAGs designed to mimic partially observable subsystems. Metrics include latent recovery accuracy, measured as the correlation between inferred and ground-truth latent values, and predictive performance on observed variables. Results demonstrate robust recovery of unobserved exogenous drivers while maintaining the interpretable causal structure of the expert-defined DAG.
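
The evaluation protocol described above (sample from a ground-truth DAG, hide the root, infer it from the observed nodes, and score recovery by correlation) can be illustrated with a minimal sketch. This is not the authors' implementation: the neural conditional distributions and the amortized variational guide are replaced here by a linear-Gaussian toy model whose posterior is available in closed form, and the graph (Z -> X1, Z -> X2), the weights, the noise level, and the function names `sample_dag` and `posterior_of_root` are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical toy parameters (not from the abstract): edge weights and noise level.
WEIGHTS = (1.5, -0.8)
NOISE_STD = 0.5


def sample_dag(n, rng, w=WEIGHTS, noise_std=NOISE_STD):
    """Ancestral sampling from a toy DAG Z -> X1, Z -> X2, where Z is unobserved."""
    w = np.asarray(w)
    z = rng.standard_normal(n)  # latent exogenous root with prior N(0, 1)
    noise = noise_std * rng.standard_normal((n, w.size))
    x = z[:, None] * w[None, :] + noise  # observed children of Z
    return z, x


def posterior_of_root(x, w=WEIGHTS, noise_std=NOISE_STD):
    """Closed-form Gaussian posterior p(z | x) for the linear-Gaussian stand-in.

    In the framework described in the abstract, this step would instead be an
    amortized variational guide mapping observations to a posterior over Z.
    """
    w = np.asarray(w)
    precision = 1.0 + (w @ w) / noise_std**2  # prior precision + likelihood precision
    mean = (x @ w) / noise_std**2 / precision
    std = np.sqrt(1.0 / precision)  # same for every sample in this toy model
    return mean, std


rng = np.random.default_rng(0)
z_true, x_obs = sample_dag(5000, rng)
z_mean, z_std = posterior_of_root(x_obs)

# Latent recovery accuracy: Pearson correlation with the ground-truth root.
corr = np.corrcoef(z_true, z_mean)[0, 1]

# Uncertainty check: fraction of true values inside the 95% posterior interval.
coverage = np.mean(np.abs(z_true - z_mean) <= 1.96 * z_std)

print(f"latent recovery correlation: {corr:.3f}")
print(f"95% interval coverage: {coverage:.3f}")
```

Because the toy posterior is exact, the interval coverage should sit near its nominal level; with a learned guide, the same two quantities would serve as checks on both recovery accuracy and the calibration of the reported posterior uncertainty.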