2D diffusion equation
The assignment is to numerically solve the diffusion (heat) equation
in two dimensions, using GPU acceleration, in either Python or C++.
You can find serial, CPU-based solutions in both languages in the course’s source tarball.
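For reference, the field T(x, y, t) being evolved obeys

$$\frac{\partial T}{\partial t} = D\left(\frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2}\right),$$

where D is the diffusion coefficient, and the standard explicit (forward-time, centred-space) discretization on a uniform grid with spacing Δx = Δy updates each interior point as

$$T^{\,n+1}_{i,j} = T^{\,n}_{i,j} + \frac{D\,\Delta t}{\Delta x^{2}}\left(T^{\,n}_{i+1,j} + T^{\,n}_{i-1,j} + T^{\,n}_{i,j+1} + T^{\,n}_{i,j-1} - 4\,T^{\,n}_{i,j}\right).$$

This is the generic form of such a scheme; the notation and details in the provided diff2d codes may differ slightly.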
If you choose Python, you can follow the instructions on the appropriate slide in the handouts, titled Setting up the environment (Python). If you want graphics on Mist, please also install the matplotlib package in your Conda environment using conda install (no need on Graham, as the package is provided by the scipy-stack module).
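For example, with your Conda environment activated:
conda install matplotlib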
You could modify the file diff2d.py such that the bulk of the calculation (within the time loop) is done using the GPU. You can follow the gravitational potential calculation example shown in class; Numba and/or CuPy can be used in the solution. Note that the sample solution in Python is the equivalent of the “naïve” solution to the gravitational potential problem, and therefore performs poorly.
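To give an idea of the kind of change that is expected, here is a minimal CuPy sketch of an explicit update carried out entirely on the GPU. The grid size, parameters, initial condition and variable names are made up for illustration and do not correspond to diff2d.py, and the boundary values are simply held fixed.

```python
import cupy as cp

# Illustrative parameters only; not the values used in diff2d.py.
n      = 1000                    # grid points per side
D      = 1.0                     # diffusion coefficient
dx     = 0.01                    # grid spacing
dt     = 0.2 * dx**2 / (4 * D)   # time step, well inside the stability limit
nsteps = 1000

# The field lives on the GPU for the whole calculation.
T = cp.zeros((n, n))
T[n//3:2*n//3, n//3:2*n//3] = 1.0   # some initial hot square

for step in range(nsteps):
    # Explicit (FTCS) update of the interior points via CuPy array slicing;
    # the right-hand side is evaluated before the in-place update is applied.
    T[1:-1, 1:-1] += (D * dt / dx**2) * (
        T[2:, 1:-1] + T[:-2, 1:-1] + T[1:-1, 2:] + T[1:-1, :-2]
        - 4.0 * T[1:-1, 1:-1]
    )

# Copy back to the host only when needed (e.g. for plotting with matplotlib).
result = cp.asnumpy(T)
```

A per-point Numba CUDA kernel is the other natural route; either way, the main point is to keep the field on the GPU for the entire time loop rather than transferring it back and forth every step.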
If you choose C++, we count on you being familiar with how to compile source code. You could modify the file diff2d.cpp such that the bulk of the calculation (within the time loop) is done using the GPU. On both Mist and Graham, load the following modules: cuda, gcc, and pgplot (you may skip PGPLOT, but then please remove the plotting calls from the main source file and do not compile diff2dplot.cpp). If you choose to work with HIP instead of CUDA, on Mist you can load the hip module in addition to the cuda module (which is still necessary, since HIP uses the CUDA compiler under the hood when compiling for Nvidia GPUs). HIP is not currently available on Graham, but you can try to install it locally there.
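For example, something along these lines should set things up (exact module names, versions and load order may differ, and on Mist you would also load hip if you take the HIP route):
module load cuda gcc pgplot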
The sample C++ code uses rarray; you can install it locally or just pull the header.
The usual suffix of CUDA files is .cu, and the nvcc command is used to compile the source (instead of g++, for example). You can keep the .cpp suffix, but then you have to pass --x=cu to nvcc; this is not recommended for files that contain a kernel launch with the triple-angle-bracket syntax, since such a file is not legal C++ anyway. The usual suffix of HIP files is just .cpp, and the hipcc command is used to compile. GPU kernels can be in the same file as the main function, as we saw in the vector addition example, but in more complex applications the GPU code (including kernel launches) would generally be separated out into one or more compilation units that are linked to the rest of the code during the build process.
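To make the compile step concrete, the commands below are rough examples only; file names, optimization flags and any PGPLOT-related linking are illustrative and will depend on how you organize your code:
nvcc -O2 -o diff2d diff2d.cu
nvcc -O2 --x=cu -o diff2d diff2d.cpp
hipcc -O2 -o diff2d diff2d.cpp
With the GPU code split into its own compilation unit (here hypothetically called kernels.cu), the build could instead look like:
nvcc -O2 -c kernels.cu
g++ -O2 -c diff2d.cpp
nvcc -o diff2d diff2d.o kernels.o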
The Mist login node has four GPUs that can be used for the assignment; on Graham, one has to submit a job to the scheduler instead. Unlike for submitted jobs, there is no guarantee that a GPU on the Mist login node will be free when you launch your application, as the node is shared by everyone. You can use the nvidia-smi command to see the current usage of the GPUs. By default, the first device (number 0) is used, but this behaviour can be changed by setting an environment variable, as shown below. For example, to use device number 1:
CUDA_VISIBLE_DEVICES=1 python code.py
There are three “bonus” tasks that you can try for your own amusement (tasks 2 and 3 are beyond the scope of this workshop):
- The smaller Δx is, the more accurate but also the more computationally demanding the solution becomes. Plot the timing of your solution and of the serial CPU-based solution (and possibly of improved CPU-based solutions) as a function of Δx.
- Decompose the domain and solve the problem with multiple GPUs on the same node.
- Use a distributed memory library to deploy your solution on multiple nodes.
Hint: for a single node you could use multiprocessing in Python, or threads or OpenMP in C++. For multiple nodes you could use mpi4py (Python) or MPI (C++).
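To sketch the decomposition idea behind bonus tasks 2 and 3: split the grid into horizontal strips, give each strip a ghost (halo) row on each side, and exchange those ghost rows with the neighbouring strips every time step. The snippet below shows only that halo exchange, using mpi4py with host arrays and made-up sizes; the per-rank GPU update and the per-rank device selection are left out.

```python
# Minimal halo-exchange sketch with mpi4py (illustrative only, not a full solver).
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

nrows, ncols = 250, 1000              # local strip size (made-up numbers)
T = np.zeros((nrows + 2, ncols))      # two extra ghost rows for neighbour data

up   = rank - 1 if rank > 0 else MPI.PROC_NULL
down = rank + 1 if rank < size - 1 else MPI.PROC_NULL

for step in range(100):
    # Send edge rows to the neighbours and receive their edge rows into the
    # ghost rows; communication with MPI.PROC_NULL is a no-op, so the outer
    # strips keep fixed boundary rows.
    comm.Sendrecv(T[1],  dest=up,   recvbuf=T[0],  source=up)
    comm.Sendrecv(T[-2], dest=down, recvbuf=T[-1], source=down)
    # ... here each rank would update the interior of its strip on its GPU ...
```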