Identifying State Variables from Video for Predicting Physical Processes

  • Metschies, Fabian (TU Darmstadt)
  • Brugger, Jannis (TU Darmstadt)
  • Hemschik, Rico (Fraunhofer IWS)
  • Riede, Mirko (Fraunhofer IWS)
  • Mezini, Mira (TU Darmstadt)

Directed Energy Deposition (DED) can reduce material waste by depositing metal only where needed, yet its industrial eco-efficiency is often limited by process instabilities that cause defects, rework, and wasted energy. Building on the premise that robust forecasting is a key enabler of smart, sustainability-oriented manufacturing, we present a video-driven latent-state model that predicts DED process evolution over long horizons from the planned control program. We propose the Recurrent Dynamics Decoder (RDD), a learned reduced-order model that identifies a compact latent state directly from on-axis melt-pool video and evolves it forward in time conditioned on control setpoints. From this latent trajectory, the model jointly predicts future video frames and post-process track-quality metrics derived from 3D scans (e.g., cross-section area, peak height/width, peak offset), enabling a digital-twin-like simulation capability for pre-execution assessment. After a short observation window, rollouts are performed using only the programmed controls, matching practical deployment, where future sensor measurements are unavailable. We evaluate RDD on real DED runs with synchronized video and control signals and compare (i) black-box long-horizon rollouts optimized for predictive accuracy and (ii) a neuro-symbolic variant that fits sparse equation models in latent space for interpretability. Results show stable long-horizon prediction after a short warm-up and indicate that compact latent spaces (4 to 8 dimensions) can be sufficient at 128×128 resolution, while higher resolution and larger latent spaces improve the predictability of fine-grained effects (e.g., peak offset and melt-pool boundary structure). Overall, the approach demonstrates how video-driven latent-state identification can support smart monitoring, and it paves the way toward pre-execution program optimization for improved process stability and eco-efficiency.
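
The warm-up-then-rollout scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' RDD implementation: the weights are random stand-ins for trained networks, and all names here (`encode`, `step`, `decode_frame`, `decode_quality`, `rollout`), the three-dimensional control vector, and the simple tanh recurrence are assumptions made only for illustration. The sketch shows the control flow only: observed frames update the latent state during a short window, after which the state is advanced from the programmed controls alone and decoded into frames and track-quality metrics.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 8            # the abstract reports 4 to 8 dims can suffice at 128x128
CONTROL_DIM = 3           # hypothetical: e.g. laser power, feed rate, powder flow
FRAME_SHAPE = (128, 128)  # melt-pool frame resolution named in the abstract
N_METRICS = 4             # cross-section area, peak height, peak width, peak offset
N_PIXELS = FRAME_SHAPE[0] * FRAME_SHAPE[1]

# Random stand-in weights; a real model would learn these from data.
W_enc = rng.normal(scale=0.01, size=(LATENT_DIM, N_PIXELS))
W_zz = rng.normal(scale=0.1, size=(LATENT_DIM, LATENT_DIM))
W_zu = rng.normal(scale=0.1, size=(LATENT_DIM, CONTROL_DIM))
W_ze = rng.normal(scale=0.1, size=(LATENT_DIM, LATENT_DIM))
W_dec = rng.normal(scale=0.1, size=(N_PIXELS, LATENT_DIM))
W_q = rng.normal(scale=0.1, size=(N_METRICS, LATENT_DIM))


def encode(frame):
    """Map one video frame to a latent-sized observation embedding."""
    return np.tanh(W_enc @ frame.ravel())


def step(z, u, obs=None):
    """Advance the latent state by one time step.

    The control setpoint u is always used; the observation embedding is
    used only when a frame is available (during the warm-up window).
    """
    pre = W_zz @ z + W_zu @ u
    if obs is not None:
        pre = pre + W_ze @ obs
    return np.tanh(pre)


def decode_frame(z):
    """Predict a video frame from the latent state."""
    return (W_dec @ z).reshape(FRAME_SHAPE)


def decode_quality(z):
    """Predict the track-quality metrics from the latent state."""
    return W_q @ z


def rollout(warmup_frames, controls):
    """Warm up on observed frames, then predict from controls only."""
    n_warm = len(warmup_frames)
    if len(controls) <= n_warm:
        raise ValueError("controls must extend beyond the warm-up window")
    z = np.zeros(LATENT_DIM)
    frames, quality = [], []
    for t, u in enumerate(controls):
        obs = encode(warmup_frames[t]) if t < n_warm else None
        z = step(z, u, obs)
        if t >= n_warm:  # open-loop phase: no sensor input is used
            frames.append(decode_frame(z))
            quality.append(decode_quality(z))
    return np.stack(frames), np.stack(quality)


# Example: 5 observed frames, 50 programmed control steps (synthetic data).
warmup = rng.random((5, *FRAME_SHAPE))
program = rng.normal(size=(50, CONTROL_DIM))
pred_frames, pred_quality = rollout(warmup, program)
print(pred_frames.shape, pred_quality.shape)
```

In this sketch the predictions after the warm-up depend only on the final warm-up state and the remaining control program, which mirrors the deployment setting in which future sensor measurements are unavailable.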