Performance Improvement of Flow Computation on Overset Grids Method with Multigrid Method and Domain Decomposition Hybrid Parallelization
This study presents performance improvements for flow computations that combine the overset grids method with the multigrid method and hybrid parallelization based on domain decomposition. The results are based on an in-house structured CFD solver using the finite volume method and an in-house system for determining overset relations. First, the integration of the aggregation multigrid method into the overset grids method and its extension to the full multigrid (FMG) method are described. Interpolation weights for the overset grids can be determined on the finest grid level of the multigrid hierarchy, but they may be difficult to determine on coarser grid levels. Therefore, the cell flags that mark cells as solved or unsolved are inherited from finer grid levels to coarser grid levels. The multigrid method is then extended to FMG, in which the computation starts on a coarser grid level and the solution is interpolated to finer grids to provide better initial solutions there. FMG achieves fast convergence, and its effectiveness is demonstrated. Next, hybrid parallelization that combines domain decomposition with shared-memory parallelization is introduced. The computational domains and interpolation data are divided with the overlapping regions taken into account, including those on the multigrid levels, and the hybrid method achieves faster computation than the flat MPI and pure OpenMP methods. Finally, flow computation with the dynamic overset grids method is conducted. The computational grids used to generate the pre-decomposition overset information are also loaded and stored. After the coordinates are updated according to the body motions, the overset information is updated on each rank using its assigned threads, again with the overlapping regions taken into account. Based on the relationship between the pre- and post-decomposition grids, the overset information is then reallocated and updated. The performance improvement is examined on a supercomputer system.
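The inheritance of solved/unsolved cell flags from finer to coarser multigrid levels can be illustrated with a minimal sketch. It is not the in-house solver's implementation. It assumes a 2D structured grid, aggregation of 2x2 fine cells into one coarse cell, and a rule in which a coarse cell is solved only when all of its child cells are solved. The abstract does not state the actual rule, and the names `coarsen_flags` and `build_flag_hierarchy` are hypothetical.

```python
SOLVED, UNSOLVED = 1, 0


def coarsen_flags(fine):
    """Aggregate 2x2 blocks of fine-cell flags into one coarse-cell flag each."""
    nj, ni = len(fine), len(fine[0])
    coarse = []
    for j in range(0, nj, 2):
        row = []
        for i in range(0, ni, 2):
            # Collect the child cells of this coarse cell (clipped at the boundary).
            children = [
                fine[jj][ii]
                for jj in range(j, min(j + 2, nj))
                for ii in range(i, min(i + 2, ni))
            ]
            # Assumed rule: the coarse cell is solved only if every child is solved.
            all_solved = all(c == SOLVED for c in children)
            row.append(SOLVED if all_solved else UNSOLVED)
        coarse.append(row)
    return coarse


def build_flag_hierarchy(fine, levels):
    """Return the flag arrays for all levels, finest first."""
    hierarchy = [fine]
    for _ in range(levels - 1):
        hierarchy.append(coarsen_flags(hierarchy[-1]))
    return hierarchy


# A 4x4 fine grid whose lower-right 2x2 block is unsolved (e.g. a hole region).
fine_flags = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
hierarchy = build_flag_hierarchy(fine_flags, levels=3)
```

In this sketch the flags are derived purely from the finer level, so no interpolation weights have to be searched for on the coarse grids. A different rule, such as marking a coarse cell as solved when any child is solved, would only change the single line that combines the child flags.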
