Parallel Solution of High-Order Numerical Schemes for Solving Incompressible Flows


Book Description

A new parallel numerical scheme for solving incompressible steady-state flows is presented. The algorithm uses a finite-difference approach to solving the Navier-Stokes equations. The algorithms are scalable and expandable. They may be used with only two processors or with as many processors as are available. The code is general and expandable. Any size grid may be used. Four processors of the NASA LeRC Hypercluster were used to solve for steady-state flow in a driven square cavity. The Hypercluster was configured in a distributed-memory, hypercube-like architecture. By using a 50-by-50 finite-difference solution grid, an efficiency of 74 percent (a speedup of 2.96) was obtained. Milner, Edward J. and Lin, Avi and Liou, May-Fun and Blech, Richard A. Glenn Research Center NASA-TM-4451, E-7345, NAS 1.15:4451 NASA ORDER C-99066-G; RTOP 505-62-52...






















Solving Incompressible Navier-Stokes Equations on Heterogeneous Parallel Architectures


Book Description

In this PhD thesis, we present our research in the domain of high performance software for computational fluid dynamics (CFD). With the increasing demand of high-resolution simulations, there is a need of numerical solvers that can fully take advantage of current manycore accelerated parallel architectures. In this thesis we focus more specifically on developing an efficient parallel solver for 3D incompressible Navier-Stokes (NS) equations on heterogeneous CPU/GPU architectures. We first present an overview of the CFD domain along with the NS equations for incompressible fluid flows and existing numerical methods. We describe the mathematical model and the numerical method that we chose, based on an incremental prediction-projection method.A balanced distribution of the computational workload is obtained by using a domain decomposition method. A two-level parallelization combined with SIMD vectorization is used in our implementation to take advantage of the current distributed multicore machines. Numerical experiments on various parallel architectures show that this solver provides satisfying performance and good scalability.In order to further improve the performance of the NS solver, we integrate GPU computing to accelerate the most time-consuming tasks. The resulting solver can be configured for running on various heterogeneous architectures by specifying explicitly the numbers of MPI processes, threads and GPUs. This thesis manuscript also includes simulation results for two benchmarks designed from real physical cases. The computed solutions are compared with existing reference results. The code developed in this work will be the base for a future CFD library for parallel CPU/GPU computations.