Multi-Level Patch Diffusion Models

  • Jiang, Shuai (Sandia National Labs)
  • Cyr, Eric (Sandia National Labs)
  • Zhang, Guannan (Oak Ridge National Labs)
  • Zhang, Zezhong (Auburn University)
  • Heinlein, Alexander (Delft University of Technology)

Diffusion models are gaining popularity in scientific domains beyond traditional image generation. Recent work includes learning stochastic differential equations \cite{liu2025training} and efficiently sampling high-dimensional distributions \cite{tran2025diffusion}. While latent space models excel in image synthesis, scientific applications often necessitate high-resolution generation directly in physical space to maintain accuracy and bypass encoder training difficulties. However, this direct approach incurs significant computational and memory costs, particularly for large-scale or 3D problems. We propose a patch-based diffusion framework inspired by multi-level methods in numerical PDEs. Standard patch-based models process segments independently, which can sacrifice global statistical coherence. Drawing inspiration from DDU-Net \cite{verburg2025ddu}, we introduce a two-level architecture where patches are coupled through a Vision Transformer (ViT) coarse-grid network. Each patch is processed by a shared U-Net encoder-decoder, with the ViT facilitating cross-patch information exchange. This design enables patches to communicate within a compressed latent space, preserving global statistical properties while leveraging the benefits of domain decomposition. Although the transformer layer introduces memory overhead, the patch-based decomposition allows for parallel processing and scalability to large domains. The shared U-Net architecture, combined with positional encoding, ensures consistent local dynamics across all patches. We demonstrate the effectiveness of this approach on image generation benchmarks and scientific datasets, such as turbulent flows, where maintaining multi-scale statistics and coherent structures is critical. This approach opens new avenues for scaling data-driven modeling in computational science and engineering.
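
The two-level data flow described above (shared per-patch encoder, cross-patch exchange in latent space, shared decoder) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the single linear-plus-tanh encoder and linear decoder are stand-ins for the shared U-Net, a single-head self-attention layer is a stand-in for the ViT coarse-grid network, and all names (`split_into_patches`, `two_level_forward`, `init_params`, the latent width `d`) are hypothetical. The diffusion time step and the training loss are omitted; only one forward pass of a denoiser-like map is shown.

```python
import numpy as np

def split_into_patches(image, p):
    """Split an (H, W) array into non-overlapping (p, p) patches in row-major order."""
    H, W = image.shape
    assert H % p == 0 and W % p == 0, "image size must be divisible by patch size"
    patches = image.reshape(H // p, p, W // p, p).swapaxes(1, 2)
    return patches.reshape(-1, p, p)  # (num_patches, p, p)

def merge_patches(patches, H, W):
    """Inverse of split_into_patches: reassemble patches into an (H, W) array."""
    p = patches.shape[-1]
    grid = patches.reshape(H // p, W // p, p, p).swapaxes(1, 2)
    return grid.reshape(H, W)

def self_attention(tokens, Wq, Wk, Wv):
    """Single-head self-attention over patch tokens, (N, d) -> (N, d)."""
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

def init_params(num_patches, p, d, seed=0):
    """Random weights for the sketch (assumed shapes, not from the abstract)."""
    rng = np.random.default_rng(seed)
    s = 0.1
    return {
        "enc": s * rng.standard_normal((p * p, d)),      # shared encoder stand-in
        "dec": s * rng.standard_normal((d, p * p)),      # shared decoder stand-in
        "pos": s * rng.standard_normal((num_patches, d)),  # positional encoding
        "Wq": s * rng.standard_normal((d, d)),
        "Wk": s * rng.standard_normal((d, d)),
        "Wv": s * rng.standard_normal((d, d)),
    }

def two_level_forward(image, p, params):
    """One forward pass: fine-level encode, coarse-level coupling, fine-level decode."""
    H, W = image.shape
    patches = split_into_patches(image, p)
    flat = patches.reshape(len(patches), -1)              # (N, p*p)
    # Fine level: the same encoder weights are applied to every patch.
    latent = np.tanh(flat @ params["enc"]) + params["pos"]  # (N, d)
    # Coarse level: patches exchange information in the compressed latent space.
    latent = latent + self_attention(latent, params["Wq"], params["Wk"], params["Wv"])
    # Fine level: the same decoder weights map each latent back to a patch.
    out = latent @ params["dec"]                          # (N, p*p)
    return merge_patches(out.reshape(-1, p, p), H, W)

# Usage: an 8x8 field split into four 4x4 patches with latent width 6.
image = np.random.default_rng(1).standard_normal((8, 8))
params = init_params(num_patches=4, p=4, d=6)
output = two_level_forward(image, 4, params)
```

In this sketch the encode and decode steps act on each patch independently and could run in parallel, while the attention step is the only place where patches interact; its cost grows with the number of patches rather than with the full resolution.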