WCCM ECCOMAS 2026

Predictive Maintenance Planning with Influence-Based Abstraction and Reinforcement Learning

Bhustali, Prateek (Delft University of Technology)
Andriotis, Charalampos (Delft University of Technology)
Oliehoek, Frans (Delft University of Technology)

In session: MS139C - Optimization under Uncertainty III


Engineering systems operate in complex environments with multiple uncertainties and demand proactive inspection and maintenance (I&M). Planning these interventions optimally, however, remains challenging. I&M is a sequential decision-making problem characterized by stochastic deterioration processes, partial observability of damage states, long planning horizons, and operational risk and budget constraints. These difficulties are amplified in multi-component I&M settings, such as k-out-of-n and networked systems, where optimal solutions require reasoning over an exponentially large set of joint policies. This curse of dimensionality renders the computation of optimal I&M policies cumbersome at scale. To scale to large policy spaces, recent approaches predicated on multi-agent reinforcement learning (MARL) decentralize decision-making across system components by learning component-level policies under a shared long-term objective. Through repeated interaction with an environment simulator, MARL agents can learn coordinated policies, but they require extensive sampling, which renders training impractical when the simulation is computationally intensive. To address this, we introduce influence-based abstraction (IBA) to I&M planning problems. IBA is a principled approach that summarizes inter-agent interaction through compact yet sufficient influences. Under this solution framework, each agent learns policies in an influence-augmented local model (IALM) conditioned on a low-dimensional influence variable summarizing the remainder of the system. These influences are periodically estimated using trajectories from the expensive full-system simulator. Consequently, policy learning depends on this fast, compact abstraction rather than on the full joint state and joint policy, improving sample efficiency.
This separation of model estimation and policy optimization yields a model-based reinforcement learning formulation that leverages structured abstraction informed by engineering knowledge, rather than relying solely on implicit representation learning. To highlight the gains in sample efficiency enabled by IBA, we evaluate it on canonical benchmark environments designed for I&M planning and compare it against widely used MARL solutions.
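
To make the separation between influence estimation and local simulation concrete, the following is a minimal sketch and not the authors' implementation. Everything in it is an assumption made for illustration: the two-component toy system, the deterioration probabilities, the cost values, and all function names. The influence on component 0 is taken to be a binary "neighbour has failed" indicator, and its distribution is conditioned only on the time step, which is a strong simplification of an influence predictor that would normally condition on local history.

```python
import random

N_STATES = 4   # hypothetical damage levels: 0 (intact) .. 3 (failed)
HORIZON = 10   # hypothetical planning horizon


def step_component(damage, action, neighbour_failed, rng):
    """One toy deterioration step for a single component."""
    if action == 1:                           # repair resets the component
        return 0
    p = 0.4 if neighbour_failed else 0.2      # assumed deterioration rates
    if damage < N_STATES - 1 and rng.random() < p:
        return damage + 1
    return damage


def simulate_global(policy, rng):
    """Stand-in for the expensive full simulator (both components).

    Returns, for every step, the influence seen by component 0:
    1 if component 1 is in the failed state, else 0.
    """
    d = [0, 0]
    influences = []
    for _ in range(HORIZON):
        influences.append(int(d[1] == N_STATES - 1))
        actions = [policy(d[0]), policy(d[1])]
        failed = [d[1] == N_STATES - 1, d[0] == N_STATES - 1]
        d = [step_component(d[i], actions[i], failed[i], rng) for i in range(2)]
    return influences


def estimate_influence(policy, n_traj, rng):
    """Empirical P(influence = 1 | t) from full-simulator trajectories."""
    counts = [0] * HORIZON
    for _ in range(n_traj):
        for t, u in enumerate(simulate_global(policy, rng)):
            counts[t] += u
    return [c / n_traj for c in counts]


def simulate_local(policy, influence_probs, rng):
    """Cheap local rollout: only component 0 is simulated and the
    neighbour's effect is sampled from the estimated influence."""
    d, cost = 0, 0.0
    for t in range(HORIZON):
        u = rng.random() < influence_probs[t]
        a = policy(d)
        cost += 1.0 * a + 5.0 * (d == N_STATES - 1)  # assumed repair and failure costs
        d = step_component(d, a, u, rng)
    return cost


def repair_when_failed(damage):
    """Hypothetical baseline policy: repair only once the component has failed."""
    return int(damage == N_STATES - 1)
```

In this sketch, `estimate_influence` would be called occasionally with the current policies, while the many rollouts needed for policy learning would use `simulate_local`, which never touches the second component's state.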