Hyper Reduction on Retraining of Data-Augmented Predictive Deep Neural Network
The Data-Augmented Predictive Deep Neural Network (DAPredDNN) framework introduced in~[1] combines convolutional autoencoders (CAE), feed-forward neural networks (FFNN), and kernel dynamic mode decomposition (KDMD) to enhance the time extrapolation capabilities of non-intrusive surrogate models for parametric, nonlinear dynamical systems. The key ingredient of the framework is data augmentation: the latent dynamics are first learned and extrapolated by KDMD, and the KDMD-extrapolated data are then decoded and integrated into the original training data. The network is retrained with this augmented data to overcome the extrapolation limitations encountered by many state-of-the-art methods. However, retraining the full network on the enlarged, augmented dataset still incurs a non-negligible computational cost, which motivates further reduction strategies. This work focuses on hyper-reducing the computational effort of the DAPredDNN retraining stage from two perspectives: reducing the network complexity and reducing the size of the retraining dataset. Low-rank adaptation (LoRA)~[2] is employed to introduce a parameter-efficient update of the pretrained network, thereby significantly decreasing the number of trainable parameters during retraining. In addition, discrete empirical interpolation in the tensor t-product framework~[3] is used to identify and select representative samples from the augmented dataset, enabling retraining on a significantly smaller subset while preserving the essential dynamical information. Together, retraining a low-rank network on a small but informative dataset forms a hyper-reduction strategy that accelerates retraining within the original DAPredDNN pipeline.
Numerical experiments show substantial reductions in both the number of trainable parameters and the amount of data required during the retraining phase, yielding significant speedups with only minor accuracy loss and thereby improving the efficiency of data-augmented surrogate modeling for extrapolation in the time domain. We further present ablation studies and analyze a special case that incorporates limited real data into the augmentation process, which improves prediction accuracy.
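
To make the two reduction ingredients concrete, the following is a minimal NumPy sketch and not the implementation used in this work. It assumes a single dense layer as a stand-in for the pretrained network, and it uses the classical matrix DEIM greedy selection in place of the tensor t-product variant of~[3]. The names `LoRALinear`, `deim_indices`, `rank`, and `alpha` are hypothetical and chosen only for illustration.

```python
import numpy as np


class LoRALinear:
    """Frozen dense layer W plus a trainable low-rank update B @ A (hypothetical helper)."""

    def __init__(self, W, rank, alpha=1.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = W.shape
        self.W = W                                   # pretrained weights, kept frozen
        self.A = rng.normal(scale=0.01, size=(rank, in_dim))
        self.B = np.zeros((out_dim, rank))           # zero init: the update starts at zero
        self.scale = alpha / rank

    def __call__(self, x):
        # Effective weight is W + scale * B @ A; only A and B would be trained.
        return x @ (self.W + self.scale * self.B @ self.A).T

    def num_trainable(self):
        return self.A.size + self.B.size


def deim_indices(U):
    """Classical matrix DEIM: greedily pick one row index per column of the basis U (n x r)."""
    n, r = U.shape
    idx = [int(np.argmax(np.abs(U[:, 0])))]
    for j in range(1, r):
        # Interpolate column j at the already selected rows using the previous columns.
        c = np.linalg.solve(U[idx, :j], U[idx, j])
        residual = U[:, j] - U[:, :j] @ c
        # The next index is where the interpolation error is largest.
        idx.append(int(np.argmax(np.abs(residual))))
    return idx


# Assumed toy setup: 200 samples (rows) with 50 features each.
rng = np.random.default_rng(1)
W = rng.normal(size=(64, 32))
layer = LoRALinear(W, rank=4)

X = rng.normal(size=(200, 10)) @ rng.normal(size=(10, 50))
U, _, _ = np.linalg.svd(X, full_matrices=False)      # columns of U span the sample space
selected = deim_indices(U[:, :6])                    # indices of 6 representative samples
X_small = X[selected]                                # reduced retraining set
```

In this sketch the low-rank factors contribute `rank * (in_dim + out_dim)` trainable parameters instead of `in_dim * out_dim`, and the DEIM step keeps as many samples as retained basis vectors. How the rank and the number of retained samples are chosen in practice is not specified here and would have to follow the full paper.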
