Assessment of three methods to train deep learning models with hybrid data in the context of forging processes

  • Pasquelin, Marius (LCFC Arts et Metiers Sciences et Technologie)
  • Durand, Camille (LCFC Arts et Metiers Sciences et Technologie)
  • Baudouin, Cyrille (LCFC Arts et Metiers Sciences et Technologie)
  • Bigot, Régis (LCFC Arts et Metiers Sciences et Technologie)

Surrogate models based on deep learning can be both accurate and computationally efficient. Such models are attractive for forging processes, where instrumentation is often difficult because of harsh operating conditions (high temperatures, high strain rates, vibrations), and they could provide real-time feedback to the operator during processing. A pipeline relying on hybrid training data, which combine synthetic data generated from finite element (FE) simulations with experimental data from measurements, is proposed. Three approaches enabling such hybrid training are examined in this study: transfer learning, residual learning, and Model-Agnostic Meta-Learning (MAML). Their ability to transfer knowledge from one or several initial datasets, which are rich but biased, to a target dataset containing few samples is assessed. Particular attention is given to the performance of each method as a function of the amount of available target data. In this work, the so-called experimental data are themselves generated using FE simulations in order to evaluate model performance. Having full control over these “experimental” data makes it possible to quantify the impact of potential sensor failures by adding noise, and to determine the amount of data required to achieve satisfactory training. The models employed are multilayer perceptrons trained with a Mean Squared Error (MSE) loss function, while the Mean Absolute Percentage Error (MAPE) is used as the evaluation metric. Preliminary results indicate that transfer learning and MAML clearly outperform the other approaches in low-data regimes, with performance differences between the methods diminishing as the size of the target dataset increases.
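
As an illustration only, the following minimal sketch shows how two of the strategies named above, transfer learning and residual learning, differ in structure, together with the MSE loss and the MAPE metric. It is not the authors' pipeline: simple linear models stand in for the multilayer perceptrons, and the functions `simulation` and `experiment`, the dataset sizes, the noise level, and the learning rate are all hypothetical choices made for this toy example.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error (used here as the training loss)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent; targets must be nonzero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_line(xs, ys):
    """Closed-form least-squares fit of y = a * x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def fine_tune(a, b, xs, ys, lr=0.01, steps=200):
    """Gradient descent on the MSE loss, warm-started from (a, b)."""
    n = len(xs)
    for _ in range(steps):
        errs = [a * x + b - y for x, y in zip(xs, ys)]
        grad_a = 2.0 * sum(e * x for e, x in zip(errs, xs)) / n
        grad_b = 2.0 * sum(errs) / n
        a, b = a - lr * grad_a, b - lr * grad_b
    return a, b


def predict(a, b, xs):
    return [a * x + b for x in xs]


# Hypothetical processes: the "simulation" is a biased version of the "experiment".
def simulation(x):
    return 2.0 * x + 1.0


def experiment(x):
    return 2.2 * x + 1.5


rng = random.Random(0)

# Rich but biased source dataset, and a sparse, noisy target dataset.
x_src = [1.0 + 9.0 * i / 199 for i in range(200)]
y_src = [simulation(x) for x in x_src]
x_tgt = [1.0, 3.0, 5.0, 7.0, 9.0]
y_tgt = [experiment(x) + rng.gauss(0.0, 0.05) for x in x_tgt]
x_test = [1.0 + 9.0 * i / 49 for i in range(50)]
y_test = [experiment(x) for x in x_test]

# Source model: trained on the source data only.
a_src, b_src = fit_line(x_src, y_src)

# Transfer learning: reuse the source parameters as the starting point,
# then fine-tune them on the few target samples.
a_ft, b_ft = fine_tune(a_src, b_src, x_tgt, y_tgt)

# Residual learning: keep the source model fixed and fit a second model
# to the gap between the target data and the source predictions.
gaps = [y - p for y, p in zip(y_tgt, predict(a_src, b_src, x_tgt))]
a_res, b_res = fit_line(x_tgt, gaps)
y_residual = [s + r for s, r in zip(predict(a_src, b_src, x_test),
                                    predict(a_res, b_res, x_test))]

mape_source = mape(y_test, predict(a_src, b_src, x_test))
mape_finetuned = mape(y_test, predict(a_ft, b_ft, x_test))
mape_residual = mape(y_test, y_residual)

print(f"source only: {mape_source:.2f} %")
print(f"fine-tuned:  {mape_finetuned:.2f} %")
print(f"residual:    {mape_residual:.2f} %")
```

Because both models are linear here, fine-tuning and residual correction would converge to the same fit, so this toy example says nothing about how the methods rank against each other. It only shows where the source knowledge enters: as an initialization in transfer learning, and as a fixed base prediction in residual learning.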