Assignment 10: Managing many serial computations
For this assignment, your task is to (1) perform a parameter sweep using the mzasolve application, which implements a solution of the modified zombie apocalypse equations (see assignment 5), and (2) post-process the results.
The mzasolve application is given to you and should not be changed (even though it is probably slower than your own solution to assignment 5). You can find the mzasolve executable on the Teach cluster in /scinet/course/phy1610/mzasolve/mzasolve.
This application behaves as prescribed in assignment 5, including all parameter values, and accepts the following command-line arguments:
- argument 1: Y(0), i.e., the initial uninfected fraction of the population without zombie-killing knowledge;
- argument 2: Z(0), i.e., the initial fraction of the population that has turned into zombies;
- argument 3: the name of the netcdf file to be written with the time evolution of the populations until convergence.
In addition to the netcdf file, the application also writes output to the console, of the following form:
[PARAMETERS]
alpha=3
beta=2
gamma=1
delta=1.5
X0(regular)=0.978
Y0(killers)=0.018
Z0(zombies)=0.004
filename=timeseries.nc
[OUTCOME]
runtime=25.81
survival=59%
winners=humans
This particular output would be the result of the command line "./mzasolve 0.018 0.004 timeseries.nc".
For step (1) of this assignment, you are to perform a parameter scan over all combinations of 51 values of Y(0) between 0 and 1 (inclusive) and 51 values of Z(0) between 0 and 1 (inclusive), i.e., a total of 2601 cases, each writing a netcdf file and redirecting the console output to a file. These files should be written to a different directory for each parameter combination, so that different runs do not overwrite each other's output.
You should use GNU Parallel for this part, such that it uses all 16 cores of a Teach cluster compute node. The GNU Parallel command should be embedded in a job script to be submitted to the scheduler. Before the GNU Parallel command in this job script, you will likely need some code to create the directories that will contain the output, perhaps using GNU Parallel as well; a sketch of such a job script is given below, after the discussion of seq.
GNU Parallel is a very versatile tool. A few GNU Parallel and bash features that you should consider using:
- The "--joblog" parameter to write a record of how each job went. You will need to use bash redirection to capture the console output in this case.
- As an alternative to "--joblog", you can use "--files --results DIR", which does the output capturing for you into the directory "DIR".
- Use the ":::" syntax to create combinations of parameters .
- Use the "{1}", "{2}", etc replacement string syntax to specify the template command from which GNU Parallel constructs the actual command list.
Consult the GNU Parallel documentation for further explanations: https://www.gnu.org/software/parallel/parallel_tutorial.html or type "man parallel" on the Teach cluster (after loading the gnu-parallel module of course).
Also very useful is the command `seq`, which can create a range of non-integer values (see "man seq"). For instance, you can generate a range of numbers between 0 and 1 with a step of 0.1 with "seq 0 0.1 1", and you can insert this into a bash command with "$(seq 0 0.1 1)", for instance, "echo $(seq 0 0.1 1)".
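Putting these pieces together, here is a minimal sketch of what the job script could look like. The scheduler directives, the time limit, the directory naming scheme run_{1}_{2}, and the file names mza.log and console.txt are assumptions you should adapt, not requirements; only the executable path and the gnu-parallel module come from this handout. Note that a step of 0.02 gives exactly the 51 required values between 0 and 1.

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=16
#SBATCH --time=01:00:00

module load gnu-parallel

MZA=/scinet/course/phy1610/mzasolve/mzasolve

# Pre-create one directory per parameter combination, using GNU Parallel.
parallel mkdir -p run_{1}_{2} ::: $(seq 0 0.02 1) ::: $(seq 0 0.02 1)

# Run all 2601 cases, at most 16 at a time; the joblog records how each
# job went, and the redirection captures the console output.
parallel -j 16 --joblog mza.log \
    "cd run_{1}_{2} && $MZA {1} {2} timeseries.nc > console.txt" \
    ::: $(seq 0 0.02 1) ::: $(seq 0 0.02 1)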
Note that some of the 2601 computations will not be successful, namely those for which Y(0) + Z(0) > 1, as the total population would then exceed 100%.
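If you used "--joblog", such failed cases are easy to spot afterwards: the seventh column of the log file (Exitval) is nonzero for commands that failed. For example, assuming the log was named mza.log as in the sketch above,

awk 'NR > 1 && $7 != 0' mza.log

prints the log entries of all failed runs (the "NR > 1" skips the header line).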
For step (2), the post-processing: after this job has run, the console output of the successful computations should be collected into two ASCII text files, as follows. A script/program should create one file called "humanswin.dat", containing lines with the "Y(0) Z(0)" pairs for which the humans win, and another file called "zombieswin.dat" containing the pairs for which the zombies win. (Note: we are not using the netcdf files here.)
You can write the post-processing script/program in any language you wish, as long as it works on the Teach cluster and produces these two files.
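For example, here is a minimal bash sketch, assuming each run's console output was captured in run_*/console.txt as in the job-script sketch above, and assuming unsuccessful runs print no "winners=" line (check this and adapt as needed); the grep patterns match the sample output shown earlier.

#!/bin/bash
rm -f humanswin.dat zombieswin.dat
for f in run_*/console.txt
do
    # Take the value after the '=' from the relevant console-output lines.
    y0=$(grep '^Y0' "$f" | cut -d= -f2)
    z0=$(grep '^Z0' "$f" | cut -d= -f2)
    winner=$(grep '^winners=' "$f" | cut -d= -f2)
    # Unsuccessful runs leave $winner empty and match neither case.
    case "$winner" in
        humans)  echo "$y0 $z0" >> humanswin.dat ;;
        zombies) echo "$y0 $z0" >> zombieswin.dat ;;
    esac
done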
Continue using git to maintain your job script and your post-processing script. Include the results of the post-processing, i.e., do include "humanswin.dat" and "zombieswin.dat", but do not include the output of the individual computations.
Submit your repo by midnight April 17, 2023.