| industrial collaborators: | Roxar |
| academic collaborators: | Bournemouth University |
| initiated: | 2009/09/07 |
| last updated: | 2010/01/05 |
The problem
Newton-Raphson iteration and the solution of sparse linear systems are standard numerical techniques used in many fields. In oilfield reservoir simulation, the bulk of the computational time is spent solving these linear systems.
The aim of the project is to accelerate the solution of sparse linear systems of equations using GPUs. A key component is sparse matrix-vector multiplication, which is limited by memory bandwidth: a typical high-end CPU chip has a memory bandwidth of about 30 GB/s, whereas a typical high-end GPU has about 100 GB/s. We would therefore expect a significant performance improvement on the GPU compared with a CPU. Indications are that future GPUs will increase memory bandwidth faster than CPUs.
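To illustrate why this operation is bandwidth-bound, here is a minimal C++ sketch of a matrix-vector multiply in compressed sparse row (CSR) form. It is not the code of the benchmarked application; the names `CsrMatrix`, `csr_spmv` and `csr_min_bytes` are hypothetical and chosen only for this example. Each nonzero contributes a single multiply-add but requires loading a value, a column index and an entry of `x`, so memory traffic dominates the arithmetic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR container (illustrative names, not from the original application).
struct CsrMatrix {
    std::size_t rows = 0;
    std::size_t cols = 0;
    std::vector<std::size_t> row_ptr;  // size rows + 1; row i owns entries [row_ptr[i], row_ptr[i+1])
    std::vector<std::size_t> col_idx;  // size nnz; column of each stored entry
    std::vector<double> values;        // size nnz; value of each stored entry
};

// Computes y = A * x for a CSR matrix.
std::vector<double> csr_spmv(const CsrMatrix& a, const std::vector<double>& x) {
    assert(a.row_ptr.size() == a.rows + 1);
    assert(x.size() == a.cols);
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t i = 0; i < a.rows; ++i) {
        double sum = 0.0;
        for (std::size_t k = a.row_ptr[i]; k < a.row_ptr[i + 1]; ++k) {
            // One multiply-add per nonzero; x is read indirectly through col_idx.
            sum += a.values[k] * x[a.col_idx[k]];
        }
        y[i] = sum;
    }
    return y;
}

// Rough lower bound on bytes moved per multiply, assuming every array
// (values, indices, row pointers, x, y) is streamed through memory once.
std::size_t csr_min_bytes(const CsrMatrix& a) {
    const std::size_t nnz = a.values.size();
    return nnz * (sizeof(double) + sizeof(std::size_t))
         + (a.rows + 1) * sizeof(std::size_t)
         + (a.cols + a.rows) * sizeof(double);
}
```

Dividing the result of `csr_min_bytes` by the memory bandwidth (about 30 GB/s for the CPU or 100 GB/s for the GPU, as quoted above) gives an optimistic lower bound on the time per multiply, under the assumption that the data does not fit in cache.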
The approach
Our approach was to benchmark an existing MPI-parallel C++ application using both a single core and all 8 cores of a high-end CPU-based system. These results were then compared with results from the latest NVIDIA GPUs.
| | Nehalem Xeon X5560 2.80 GHz x 2, 24 GB RAM | GPU GTX 295 (using single GPU) |
| 1 Core | 14.7 ms | 5.94 ms |
| 8 Cores | 4.4 ms | |
Table 1: CSR Matrix Vector Multiplication.
| | Nehalem Xeon X5560 2.80 GHz x 2, 24 GB RAM | GPU GTX 295 (using single GPU) |
| 1 Core | 14.7 ms | 1.12 ms |
| 8 Cores | 4.4 ms | |
Table 2: Diagonal Matrix Representation (DIA) Matrix Vector Multiplication.
The initial results were disappointing (see Table 1): using all cores of a 2-chip CPU system gave performance similar to that of the GPU (4.4 ms versus 5.94 ms). When the data representation was changed, the GPU was able to use its maximum memory bandwidth and was significantly faster (see Table 2).
A benchmark with the Diagonal Matrix Representation (DIA) showed that the GPU is about 4 times faster than the latest multi-core CPU system (1.12 ms versus 4.4 ms on 8 cores).
The results show that a single GPU can run significantly faster than multiple high-end CPUs. However, attaining good performance depends heavily on the memory layout of the data. Furthermore, GPU performance is much more sensitive than CPU performance to how the algorithm is implemented.
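The effect of memory layout can be sketched with a matrix-vector multiply in diagonal (DIA) form. The exact layout used in the project is not described here, so this C++ example assumes the common textbook arrangement in which each stored diagonal occupies one contiguous, zero-padded array of length `rows`. The names `DiaMatrix` and `dia_spmv` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical DIA container (assumed layout, not taken from the original application).
struct DiaMatrix {
    std::size_t rows = 0;
    std::size_t cols = 0;
    std::vector<long> offsets;  // 0 = main diagonal, +1 = first superdiagonal, -1 = first subdiagonal
    std::vector<double> data;   // data[d * rows + i] holds A(i, i + offsets[d]); unused slots are zero
};

// Computes y = A * x for a DIA matrix.
std::vector<double> dia_spmv(const DiaMatrix& a, const std::vector<double>& x) {
    assert(x.size() == a.cols);
    assert(a.data.size() == a.offsets.size() * a.rows);
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t d = 0; d < a.offsets.size(); ++d) {
        const long off = a.offsets[d];
        for (std::size_t i = 0; i < a.rows; ++i) {
            // The column follows from the row and the diagonal offset,
            // so no per-entry column index has to be loaded.
            const long j = static_cast<long>(i) + off;
            if (j >= 0 && j < static_cast<long>(a.cols)) {
                y[i] += a.data[d * a.rows + i] * x[static_cast<std::size_t>(j)];
            }
        }
    }
    return y;
}
```

In this layout, consecutive rows of one diagonal sit at consecutive addresses, and `x` is also read at consecutive positions. If a GPU kernel assigns one thread per row, neighbouring threads then touch neighbouring memory locations, which is the access pattern GPUs handle most efficiently. The format only pays off when the nonzeros are concentrated on a small number of diagonals, since every stored diagonal is padded to full length.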
The internship has enabled Roxar to evaluate the value and cost of developing GPU support in our applications. The work has given a better understanding of the link between numerical mathematics and how it maps onto computer hardware. Finally, the intern, Ehtzaz, has been exposed to problems outside his academic field and to the issues involved in developing commercial products.
related resources:
| Using a Graphical Processing Unit in oilfield reservoir simulation | |
| » | Technical Summary |