Multi-backend Xcompact3D for Heterogeneous Architectures
Xcompact3D [1] is a Computational Fluid Dynamics framework for studying turbulent flows on supercomputers. The next generation of Xcompact3D, code‑named x3d2, targets heterogeneous CPU-GPU architectures while preserving the high‑order compact finite‑difference schemes, spectral Poisson solvers, and immersed boundary method capabilities of the original solver. Designed for high-fidelity, scale-resolving simulations of incompressible flows, the new architecture introduces a modular backend layer that separates the numerical algorithms from their hardware‑specific implementations. This enables a single modern Fortran codebase to drive distinct hardware-optimised backends (CUDA and OpenMP). The CUDA backend supports NVIDIA GPUs, while the OpenMP backend targets multicore CPUs, with planned extensions for AMD GPU offloading. x3d2 focuses on bandwidth‑intensive simulations of turbulent flows, where memory access patterns and communication overheads are critical bottlenecks on multi‑GPU systems. Data layout, kernel interfaces, and halo exchanges are expressed in a hardware‑agnostic layer and specialised at compile time. This approach allows backend‑specific optimisation while maintaining numerical consistency with the validated legacy Xcompact3D implementation. Parallel I/O is handled through ADIOS2 [2], integrated for scalable checkpoint‑restart, MPI‑parallel output, and coupling to post‑processing workflows. The design exploits GPU‑aware ADIOS2 features to minimise data movement and enable in‑situ analysis. We present canonical validation benchmarks, including channel flow and Taylor-Green vortex configurations, to assess backend consistency. Building on this infrastructure, we present simulations of the turbulent wake behind a smooth fixed cylinder (Re = 300) and the wakes of two aligned turbines.
These cases evaluate scaling and throughput on multi‑GPU systems, demonstrating that x3d2 successfully evolves a mature CPU‑oriented code into a performance‑portable solver for exascale-class heterogeneous platforms.

REFERENCES
[1] S. Laizet and N. Li, Incompact3d: A powerful tool to tackle turbulence problems with up to O(10^5) computational cores, Int. J. Numer. Methods Fluids, Vol. 67(11), pp. 1735–1757, (2011).
[2] W. F. Godoy et al., ADIOS 2: The Adaptable Input Output System. A framework for high-performance data management, SoftwareX, Vol. 12, 100561, (2020).
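
ILLUSTRATIVE SKETCH
The separation between numerical algorithms and hardware-specific backends described in the abstract can be illustrated with a small, language-neutral sketch. The Python below is not x3d2 code: x3d2 is written in modern Fortran and selects its backend at compile time, whereas this sketch dispatches at run time. The class names (Backend, LoopBackend, BlockedBackend) and the second-order periodic central difference are assumptions chosen for brevity, standing in for the compact finite-difference kernels. It shows a numerical routine that only talks to an abstract interface, plus a simple backend-consistency check of the kind the validation benchmarks are meant to provide.

```python
import math
from abc import ABC, abstractmethod


class Backend(ABC):
    """Hypothetical hardware-agnostic interface seen by the numerical layer."""

    @abstractmethod
    def ddx(self, f, dx):
        """Return the first derivative of periodic samples f with spacing dx."""


class LoopBackend(Backend):
    """Stand-in for a CPU backend: a single pass over all grid points."""

    def ddx(self, f, dx):
        n = len(f)
        # f[i - 1] wraps to the last point for i == 0 (periodic domain).
        return [(f[(i + 1) % n] - f[i - 1]) / (2.0 * dx) for i in range(n)]


class BlockedBackend(Backend):
    """Stand-in for an accelerator backend: same stencil, applied block by block."""

    def __init__(self, block=8):
        self.block = block

    def ddx(self, f, dx):
        n = len(f)
        out = [0.0] * n
        for start in range(0, n, self.block):
            for i in range(start, min(start + self.block, n)):
                out[i] = (f[(i + 1) % n] - f[i - 1]) / (2.0 * dx)
        return out


def max_abs_diff(a, b):
    """Largest pointwise difference between two equally sized fields."""
    return max(abs(x - y) for x, y in zip(a, b))


def derivative_of_sine(backend, n=64):
    """Numerical-layer routine: it depends only on the Backend interface."""
    dx = 2.0 * math.pi / n
    f = [math.sin(i * dx) for i in range(n)]
    exact = [math.cos(i * dx) for i in range(n)]
    return backend.ddx(f, dx), exact


if __name__ == "__main__":
    d_loop, exact = derivative_of_sine(LoopBackend())
    d_block, _ = derivative_of_sine(BlockedBackend(block=8))
    print("backend mismatch:", max_abs_diff(d_loop, d_block))
    print("discretisation error:", max_abs_diff(d_loop, exact))
```

In this toy setting both backends perform the same floating-point operations per grid point, so their results should agree exactly; real CPU and GPU backends may differ at round-off level, which is why consistency is normally assessed against a tolerance.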
