Deep Convolutional–Transformer Models for Stable Autoregressive Evolution of Parametric Partial Differential Equations

  • Mikuš, Iva (Faculty of Electrical Engineering and Computi)
  • Muha, Boris (Faculty of Science, Department of Mathematics)
  • Vlah, Domagoj (Faculty of Electrical Engineering and Computi)

Accurate temporal evolution of parametric partial differential equations (PDEs) with deep learning remains challenging, in particular due to instability and error accumulation in autoregressive rollouts. In this work, we propose a model based on an encoder–processor–decoder architecture that incorporates parametric conditioning and spatial information to predict PDE dynamics in an autoregressive setting. The model is trained directly on solution fields and does not rely on explicit latent-space evolution. To improve rollout stability, we employ scheduled sampling during training. Empirically, we observe that the relative rollout error grows approximately linearly in time, rather than exponentially, even well beyond the scheduled sampling window. This behavior is consistent across different parameter regimes and long prediction horizons. We compare the proposed approach with autoencoder-based models that use flattened latent representations combined with learned latent evolution operators, including transformer-based latent dynamics. On a parametric advection–reaction–diffusion benchmark, our method achieves more accurate and stable long-term predictions. The difference becomes more pronounced for Navier–Stokes flow past a cylinder, where the model is trained on velocity and pressure fields. In addition, our approach requires significantly fewer trainable parameters than latent-evolution-based alternatives, while achieving substantially lower relative error. These results suggest that preserving spatial structure throughout the model and avoiding explicit latent temporal evolution can lead to more stable and parameter-efficient autoregressive predictors for time-dependent parametric PDEs.
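The rollout procedure and error measure described above can be illustrated with a minimal sketch. The abstract does not specify the sampling schedule, the error norm, or the model interface, so everything here is an assumption for illustration: `step` is a hypothetical one-step predictor standing in for the encoder–processor–decoder network, `teacher_prob` is a hypothetical probability of feeding the reference state instead of the model's own prediction, and the error is taken as a relative L2 norm per time step.

```python
import numpy as np


def relative_error(pred, true):
    """Relative L2 error per time step for arrays of shape (T, ...).

    The choice of the L2 norm is an assumption; the abstract only says
    "relative rollout error".
    """
    axes = tuple(range(1, pred.ndim))
    num = np.sqrt(np.sum((pred - true) ** 2, axis=axes))
    den = np.sqrt(np.sum(true ** 2, axis=axes))
    return num / den


def rollout(step, u0, mu, n_steps, truth=None, teacher_prob=0.0, rng=None):
    """Autoregressive rollout, optionally with scheduled sampling.

    step         : hypothetical one-step predictor, (u_t, mu) -> u_{t+1}
    u0           : initial solution field
    mu           : PDE parameter(s) used for conditioning
    truth        : optional reference states, shape (n_steps + 1, ...)
    teacher_prob : probability of feeding the reference state as the next
                   input instead of the model's own prediction
    """
    if rng is None:
        rng = np.random.default_rng(0)
    states = [u0]
    current = u0
    for t in range(n_steps):
        nxt = step(current, mu)
        states.append(nxt)
        # Scheduled sampling: sometimes continue from the reference state,
        # otherwise continue from the model's own (possibly wrong) output.
        use_truth = truth is not None and rng.random() < teacher_prob
        current = truth[t + 1] if use_truth else nxt
    return np.stack(states)


if __name__ == "__main__":
    # Toy dynamics u_{t+1} = mu * u_t and a predictor with a 1% bias.
    u0, mu, n = np.ones((4, 4)), 0.9, 5
    truth = np.stack([u0 * mu**t for t in range(n + 1)])

    def biased_step(u, m):
        return 1.01 * m * u

    free = rollout(biased_step, u0, mu, n)
    forced = rollout(biased_step, u0, mu, n, truth=truth, teacher_prob=1.0)
    print("free rollout error:  ", relative_error(free, truth))
    print("teacher-forced error:", relative_error(forced, truth))
```

In a training loop, `teacher_prob` would typically be decreased over epochs so that the model is gradually exposed to its own predictions; the particular schedule is a design choice not given in the abstract. In the toy example the free-rollout error compounds from step to step, whereas feeding the reference state at every step keeps the per-step error constant, which is the gap that scheduled sampling is meant to close.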