Multimodal Atmospheric Super-resolution With Deep Generative Models

  • Chakraborty, Dibyajyoti (Penn State)
  • Guan, Haiwen (Penn State)
  • Stock, Jason (Argonne National Lab)
  • Arcomano, Troy (AI2)
  • Cervone, Guido (Penn State)
  • Maulik, Romit (Purdue University)

Diffusion models are a class of generative machine learning algorithms that can be used to sample from complex distributions. They achieve this by learning a score function, i.e., the gradient of the log-probability density of the data, and using it to reverse a noising process. Once trained, these diffusion models not only generate new samples but also enable zero-shot conditioning of the generated samples on observed data. This promises a novel paradigm for data and model fusion, wherein the implicitly learned distributions of pretrained diffusion models can be updated in a Bayesian formulation as online data become available. In this article, we apply this concept to the super-resolution of a high-dimensional dynamical system, given the real-time availability of low-resolution data and sparse, experimentally observed measurements from multiple modalities. Our experiments are performed for a super-resolution task that generates the ERA5 atmospheric dataset given sparse observations from a coarse-grained representation of the same dataset and/or unstructured experimental observations from the IGRA radiosonde dataset. We also perform a data fusion task that can additionally leverage predictions from a data-driven atmospheric emulator. Furthermore, we introduce a novel method for optimal sensor placement that improves posterior sampling by identifying the most informative observation locations within the dynamical field. We find that the generative model can balance the influence of multiple dataset modalities and detect important spatial features for spatiotemporal state reconstructions of high-dimensional dynamical systems.
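
The zero-shot conditioning idea described above can be illustrated with a toy sketch. This is not the authors' method: it assumes a Gaussian prior whose score is known analytically (standing in for a learned score network), a sparse point-observation operator with Gaussian noise, and plain unadjusted Langevin dynamics in place of a reverse diffusion process. All function names and parameter values are hypothetical. The point is only that, by Bayes' rule, the posterior score is the prior score plus the gradient of the log-likelihood, so observations can steer sampling without retraining.

```python
import numpy as np

def prior_score(x, mu=0.0, sigma=1.0):
    # Analytic score of N(mu, sigma^2 I); a stand-in for a learned score network.
    return -(x - mu) / sigma**2

def likelihood_score(x, y, obs_idx, obs_std):
    # Gradient of log p(y | x) for y = x[obs_idx] + Gaussian noise of std obs_std.
    g = np.zeros_like(x)
    g[:, obs_idx] = (y - x[:, obs_idx]) / obs_std**2
    return g

def posterior_langevin(y, obs_idx, obs_std, dim,
                       n_chains=4000, n_steps=500, step=0.01, seed=0):
    # Unadjusted Langevin dynamics driven by the posterior score
    # (prior score + likelihood score), run as independent parallel chains.
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_chains, dim))
    for _ in range(n_steps):
        score = prior_score(x) + likelihood_score(x, y, obs_idx, obs_std)
        noise = rng.standard_normal(x.shape)
        x = x + step * score + np.sqrt(2.0 * step) * noise
    return x

# Hypothetical setup: a 5-dimensional state observed at components 0 and 3.
y = np.array([1.0, -2.0])
obs_idx = np.array([0, 3])
samples = posterior_langevin(y, obs_idx, obs_std=0.5, dim=5)

# For this Gaussian toy problem the exact posterior mean of an observed
# component is y / (1 + obs_std^2) = 0.8 * y; unobserved components keep
# the prior mean of 0.
print(samples.mean(axis=0))
```

In this toy setting the sample means of the observed components should move toward the analytic posterior means, while the unobserved components remain distributed according to the prior. In a realistic application the analytic prior score would be replaced by a pretrained network and the observation operator by coarse-graining or sparse station sampling.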