WCCM ECCOMAS 2026

Toolpath Planning for Bed-Based Additive Manufacturing using Reinforcement Learning

Schmeitz, Ruben (Eindhoven University of Technology)
Wolfs, Rob (Eindhoven University of Technology)
Remmers, Joris (Eindhoven University of Technology)

In session: MS344G - Agentic AI and Physics-Informed Machine Learning for Next-Generation Design and Manufacturing VII


Additive manufacturing enables complex, customized parts, but print quality depends on how the heat source is driven. In laser powder bed fusion (PBF), scan strategy affects thermal gradients and residual stress, influencing defects and distortion [1]. In practice, toolpaths are often generated with simple, fixed patterns (e.g., zigzag) that do not adapt to local geometry or the evolving thermal field. Reinforcement learning (RL) provides a framework for learning adaptive decisions from process feedback. Recent studies have used RL to improve thermal uniformity or generate toolpaths, but often rely on simplified thermal assumptions [2, 3]. In this work, we couple RL directly to a finite element thermal simulation so that policies learn from temperature evolution rather than from precomputed datasets [4]. We formulate toolpath planning as a sequential decision problem where an agent selects the next scan move based on the current state. To separate routing from thermal effects, we use two environments: (i) a geometric grid environment that isolates coverage and travel efficiency, and (ii) a thermally coupled environment that augments the grid with a diffusive Gaussian heat source and spatially resolved temperature feedback. Procedurally generated target geometries provide a training set and enable evaluation on held-out irregular shapes. We train Deep Q-Network (DQN) and Proximal Policy Optimization (PPO) agents and benchmark them against a zigzag baseline. Across test geometries, the learned policies complete targets faster (more filled cells per step) and, in the thermal environment, reduce peak temperatures without increasing path length. The policies generalize beyond the training set and exhibit thermally informed behavior, scanning boundaries first to preheat interior regions before a final pass. 
These results demonstrate the potential of simulation-in-the-loop RL for part-scale, physics-informed toolpath optimization in PBF, and motivate extensions to stacked 3D layers and defect-aware objectives.
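
As a rough illustration of the thermally coupled grid environment and the zigzag baseline described above, the following Python sketch deposits a Gaussian heat source at each visited cell and then applies one diffusion step. It is a minimal sketch under stated assumptions, not the authors' implementation: it replaces the finite element thermal simulation with an explicit finite-difference update, and all function names (`zigzag_path`, `gaussian_source`, `diffuse`, `simulate`) and parameter values (`alpha`, `sigma`, `power`) are hypothetical.

```python
import numpy as np


def zigzag_path(mask):
    """Boustrophedon visiting order over the target cells of a boolean mask:
    left to right on even rows, right to left on odd rows."""
    path = []
    for r in range(mask.shape[0]):
        cols = np.flatnonzero(mask[r])
        if r % 2 == 1:
            cols = cols[::-1]
        path.extend((r, int(c)) for c in cols)
    return path


def gaussian_source(shape, center, sigma, power):
    """Gaussian heat input centered on the cell currently being scanned."""
    rows, cols = np.indices(shape)
    d2 = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    return power * np.exp(-d2 / (2.0 * sigma ** 2))


def diffuse(T, alpha):
    """One explicit finite-difference diffusion step with zero-flux edges.
    Assumption: stands in for the finite element solver; needs alpha <= 0.25."""
    P = np.pad(T, 1, mode="edge")
    lap = P[:-2, 1:-1] + P[2:, 1:-1] + P[1:-1, :-2] + P[1:-1, 2:] - 4.0 * T
    return T + alpha * lap


def simulate(mask, path, alpha=0.2, sigma=1.0, power=1.0):
    """Follow a path over the grid and record coverage and peak temperature."""
    T = np.zeros(mask.shape)
    filled = np.zeros(mask.shape, dtype=bool)
    peak = 0.0
    for cell in path:
        T = T + gaussian_source(mask.shape, cell, sigma, power)
        T = diffuse(T, alpha)
        filled[cell] = True
        peak = max(peak, float(T.max()))
    coverage = float((filled & mask).sum() / mask.sum())
    return {"steps": len(path), "coverage": coverage, "peak": peak}


# Hypothetical L-shaped target geometry on an 8 x 8 grid.
target = np.zeros((8, 8), dtype=bool)
target[:, :3] = True
target[5:, :] = True

baseline = simulate(target, zigzag_path(target))
print(baseline)
```

Any other visiting order, for example one produced by a trained DQN or PPO policy, could be passed to `simulate` in place of the zigzag path to compare step count and peak temperature on the same geometry.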