# Assignment 9: Parallelize NetCDF data analysis

**Opened:** Sunday, 2 April 2023, 12:00 AM

**Due:** Monday, 10 April 2023, 11:59 PM

**Analysis of Wave Energies**

The wave2d application of the previous assignment has been used to produce a large NetCDF file "precisewave.nc" containing a time sequence of snapshots of the damped two-dimensional wave at a high resolution, along with all relevant parameters.

What we are after now is to compute the overall potential energy \(V\) and the overall kinetic energy \(T\) of the wave field \(\rho\), as well as the sum of these two energies \(E=T+V\), as a function of time.

The potential energy can be expressed in terms of the spatial derivatives as (in arbitrary units):

$$V(t) = \iint c^2 \left[ \left(\frac{\partial\rho(x,y,t)}{\partial x}\right)^2+\left(\frac{\partial\rho(x,y,t)}{\partial y}\right)^2 \right] dx\,dy$$

while the kinetic energy can be expressed in terms of the temporal derivative (in the same arbitrary units):

$$T(t)= \iint \left(\frac{\partial\rho(x,y,t)}{\partial t}\right)^2 dx\,dy$$

and the total energy is simply \(E(t)=T(t)+V(t)\). If the wave had not been damped, this total energy would have been conserved, but in this case, we expect it to decay in time.

**Serial Starting Code**

As starting code, you are given a serial implementation of this calculation. Because this assignment is both I/O and CPU heavy, you should work on a compute node of the Teach cluster. To do so interactively, after logging in, you can type, e.g.,

`$ debugjob -n 4`

to get a quarter of a compute node (save using full nodes for your scaling analysis).

Since home directories are read-only on compute nodes, you need to work on the scratch file system, i.e., do

`$ cd $SCRATCH`

You can get the serial code and the data file on the Teach cluster with

`$ git clone /scinet/course/phy1610/a9analyzewave`

`$ cd a9analyzewave`

`$ source setup`

`$ make`

`$ cp /scinet/course/phy1610/a9analyzewave/precisewave.nc .`

`$ ./analyzewave precisewave.nc energies.tsv`

We use cp above because the data file is 17 GB, which is too large to keep in a repo in practice. The analyzewave code takes two arguments: the first is the input file name and the second is the output file name. The output file consists of columns separated by tabs, hence the extension tsv (tab-separated values).
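Because the output is plain tab-separated text, it is easy to load for plotting or for comparing the results of two runs. The following is a minimal sketch in Python, not part of the provided code. It assumes every data line holds only numbers and that any comment line starts with `#`; the actual column layout of energies.tsv is not specified here. The helper names `read_tsv` and `tables_match` are hypothetical.

```python
import io
import math


def read_tsv(stream):
    """Parse tab-separated numeric rows from a text stream.

    Assumption: blank lines and lines starting with '#' carry no data.
    """
    rows = []
    for line in stream:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        rows.append([float(field) for field in line.split("\t")])
    return rows


def tables_match(a, b, rel_tol=1e-9, abs_tol=0.0):
    """Return True if two tables have the same shape and nearly equal values."""
    if len(a) != len(b):
        return False
    for row_a, row_b in zip(a, b):
        if len(row_a) != len(row_b):
            return False
        for x, y in zip(row_a, row_b):
            if not math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol):
                return False
    return True


if __name__ == "__main__":
    # Two small in-memory stand-ins for output files (made-up values).
    reference = read_tsv(io.StringIO("0.0\t1.0\t2.0\n0.5\t0.75\t1.5\n"))
    candidate = read_tsv(io.StringIO("0.0\t1.0\t2.0\n0.5\t0.75\t1.5000000001\n"))
    print(tables_match(reference, candidate))
```

Comparing with a tolerance rather than exact equality is a deliberate choice: a parallel run adds partial sums in a different order than a serial run, so the last few digits can differ.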

In the serial code, the partial derivatives are replaced with finite differences, so that the expressions for \(V\) and \(T\) have become

$$V = \sum_{i=1}^{n-2}\sum_{j=1}^{n-2} c^2 \left[\left(\rho_{s,i,j}-\rho_{s,i,j-1}\right)^2 +\left(\rho_{s,i,j}-\rho_{s,i-1,j}\right)^2 \right]$$

and

$$T= \sum_{i=1}^{n-2}\sum_{j=1}^{n-2} \frac{\Delta x^2}{t_{out}^2}\left(\rho_{s,i,j}-\rho_{s-1,i,j}\right)^2$$

where \(s\) is the time index, \(i\) and \(j\) are spatial grid indices, \(\Delta x\) is the grid spacing, and \(t_{out}\) is the time between snapshots, so \(t=s\times t_{out}\). Note that \(t_{out}\) can differ from the time step \(\Delta t\) used in wave2d.
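The prefactors in these sums can be understood as follows. This is a sketch, assuming the same grid spacing \(\Delta x\) in both directions, backward differences for the derivatives, and an integral approximated by adding the integrand times the cell area \(\Delta x^2\) for each grid point. For a spatial derivative, the spacing cancels:

$$\left(\frac{\rho_{s,i,j}-\rho_{s,i,j-1}}{\Delta x}\right)^2 \Delta x^2 = \left(\rho_{s,i,j}-\rho_{s,i,j-1}\right)^2$$

For the time derivative, the cell area remains:

$$\left(\frac{\rho_{s,i,j}-\rho_{s-1,i,j}}{t_{out}}\right)^2 \Delta x^2 = \frac{\Delta x^2}{t_{out}^2}\left(\rho_{s,i,j}-\rho_{s-1,i,j}\right)^2$$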

Remember that the \(i=0\) and \(i=n-1\) rows and the \(j=0\) and \(j=n-1\) columns were set to zero as boundary conditions in wave2d. They are therefore not included in the above sums (their stencils would not allow it in any case). Note also that we need \(\rho\) at two time points, \(s\) and \(s-1\), to compute the kinetic energy \(T\).
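To make the finite-difference sums concrete, here is a minimal sketch in Python. It is not the provided serial code; the function name `energies` and the list-of-lists layout `rho[i][j]` are assumptions made for illustration. The loops cover interior points only, as described above.

```python
def energies(rho_prev, rho_curr, c, dx, t_out):
    """Return (T, V, E) for one snapshot from two consecutive n-by-n fields.

    rho_prev is the field at time index s-1, rho_curr the field at s.
    Only interior points (1 <= i, j <= n-2) contribute to the sums.
    """
    n = len(rho_curr)
    kinetic = 0.0
    potential = 0.0
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            d_col = rho_curr[i][j] - rho_curr[i][j - 1]   # difference along j
            d_row = rho_curr[i][j] - rho_curr[i - 1][j]   # difference along i
            d_time = rho_curr[i][j] - rho_prev[i][j]      # difference in time
            potential += c * c * (d_col * d_col + d_row * d_row)
            kinetic += (dx * dx) / (t_out * t_out) * d_time * d_time
    return kinetic, potential, kinetic + potential


if __name__ == "__main__":
    n = 4
    rho_prev = [[0.0] * n for _ in range(n)]
    rho_curr = [[0.0] * n for _ in range(n)]
    rho_curr[1][1] = 1.0  # a single bump at an interior point
    print(energies(rho_prev, rho_curr, c=2.0, dx=0.5, t_out=0.25))
    # prints (4.0, 16.0, 20.0)
```

In this small example the bump contributes to \(V\) through its own two differences and through the differences of its two neighbours at larger \(i\) and \(j\), which is why a row-wise decomposition needs the row just above each block.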

**Your Assignment**

It is now your task to parallelize this code using domain decomposition with MPI, and to determine the scaling of your MPI code on a single Teach node. Keep in mind that:

- You should create an integrated test based on the output of the serial code.
- No MPI process should contain the full wave at any moment.
- It is okay to use the usual netcdf routines to read part of the data from each process (no need to use the specialized parallel netcdf routines in this case).
- You will need to adapt the Makefile for MPI as well.
- We still expect you to use git commits.
- For the scaling analysis, you must write and submit a jobscript that runs the analysis on one node for 1, 2, 3, 5, 9, and 16 processes and collects the timings. Collect/copy these in a table in a text file.
- You should produce a plot of the speedup as a function of the number of processes.
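One possible way to think about the domain decomposition is to split the interior rows into contiguous blocks, one per process. The sketch below is in Python for illustration only and is not a prescribed design. The helper names `owned_rows` and `rows_to_read` are hypothetical, and the choice to split by rows is an assumption. Because the stencil uses row \(i-1\), each process would also need the one row just above its block.

```python
def owned_rows(n, size, rank):
    """Return (first, stop) so that rank owns interior rows first..stop-1.

    The n-2 interior rows (1..n-2) are split as evenly as possible;
    the first (n-2) % size ranks get one extra row.
    """
    interior = n - 2
    base, extra = divmod(interior, size)
    first = 1 + rank * base + min(rank, extra)
    count = base + (1 if rank < extra else 0)
    return first, first + count


def rows_to_read(n, size, rank):
    """Return (first, stop) of rows to load, including one ghost row above."""
    first, stop = owned_rows(n, size, rank)
    return first - 1, stop


if __name__ == "__main__":
    n, size = 10, 3
    for rank in range(size):
        print(rank, owned_rows(n, size, rank), rows_to_read(n, size, rank))
```

With such a split, each process could read only its own window of rows from the file for time indices \(s\) and \(s-1\), compute partial sums for \(T\) and \(V\), and combine them with a sum reduction across processes.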

Submit your code as a repo, including the timing table and the plot, by Monday, April 10, 2023, 11:59 PM. Do not include the data file in the repo!
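For the scaling analysis, the speedup on \(p\) processes is commonly defined as \(S(p)=t_1/t_p\), where \(t_p\) is the wall-clock time with \(p\) processes, and the parallel efficiency as \(S(p)/p\). A minimal sketch of turning a timing table into these numbers follows. The function name `speedup_table` is hypothetical, and the timings in the example are placeholders, not measurements.

```python
def speedup_table(timings):
    """Map {processes: seconds} to a sorted list of (p, speedup, efficiency).

    Speedup is measured relative to the single-process timing, timings[1].
    """
    t1 = timings[1]
    table = []
    for p, t in sorted(timings.items()):
        speedup = t1 / t
        table.append((p, speedup, speedup / p))
    return table


if __name__ == "__main__":
    # Placeholder values chosen for easy arithmetic; replace with real timings.
    placeholder = {1: 96.0, 2: 48.0, 3: 32.0, 5: 24.0, 9: 16.0, 16: 12.0}
    for p, s, e in speedup_table(placeholder):
        print(f"{p}\t{s:.2f}\t{e:.2f}")
```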