PHY1610 - Winter 2025

Assignment 9: Parallel histogram computation with MPI

Completion requirements
Opened: Monday, 31 March 2025, 12:00 AM
Due: Monday, 7 April 2025, 11:59 PM

This week's assignment is similar to assignment 8: we wish to compute the distribution of the logarithm of the number of time steps that walkers needed to reach the bottom of a porous medium. However, we now have a 200x larger data set, which you can download below or find on the Teach cluster at /home/l/lcl_uotphy1610s1001/morestepnumbers.dat.

To handle this larger data set, we will distribute the data points over MPI processes. With the data distributed in this way, we will be able to compute histograms in parallel.
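As an illustration of what distributing a batch over processes involves, the following is a minimal sketch of the bookkeeping only, without any MPI calls. The names `BatchLayout` and `split_batch` are hypothetical and not part of the assignment. It computes how many elements of a batch each rank would receive and where each rank's portion starts, which is the kind of information a variable-count scatter such as `MPI_Scatterv` takes as its send counts and displacements. Whether a plain scatter with equal counts is enough depends on whether the batch length is divisible by the number of processes, which may not hold for the final, partial batch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper type: how one batch is divided over the ranks.
struct BatchLayout {
    std::vector<int> counts;  // counts[r]: number of elements sent to rank r
    std::vector<int> displs;  // displs[r]: index of rank r's first element in the batch
};

// Split batch_len elements over nprocs ranks as evenly as possible.
// The first (batch_len % nprocs) ranks receive one extra element.
BatchLayout split_batch(int batch_len, int nprocs)
{
    assert(batch_len >= 0 && nprocs > 0);
    BatchLayout layout{std::vector<int>(nprocs), std::vector<int>(nprocs)};
    const int base  = batch_len / nprocs;
    const int extra = batch_len % nprocs;
    int offset = 0;
    for (int r = 0; r < nprocs; ++r) {
        layout.counts[r] = base + (r < extra ? 1 : 0);
        layout.displs[r] = offset;
        offset += layout.counts[r];
    }
    return layout;
}
```

For example, `split_batch(100000, 8)` assigns 12500 elements to every rank, while `split_batch(10, 4)` gives counts 3, 3, 2, 2 with displacements 0, 3, 6, 8.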

Your task is to write and run an MPI program that performs the following:

1) First, a root process reads the command-line arguments: the log base, the name of the file with the data, and the batch size Z.

2) Next, the root process reads a batch of Z numbers from the file. After each batch is read, its data points are to be distributed to the MPI processes using a scatter. This is repeated until all numbers have been distributed.

3) Once all data points have been distributed over the MPI processes, each process should compute a histogram of its points (using the same log base).

4) The results of the distributed histograms should be collected by the root process. The normalized histogram should then be printed to the console as two columns (the start of each histogram bin in the first column and the fraction of the data points in that bin in the second).

5) Create a second version where, instead of steps (2) and (3), the numbers are read in parallel by the MPI processes.
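The per-process histogram and the collection step can be sketched without MPI as follows. This is an illustration under stated assumptions, not the expected solution: the function names are hypothetical, the number of bins `nbins` is assumed to be fixed and identical on every process so that histograms can be added element by element, and bin k is assumed to hold values x with k <= log_base(x) < k+1. The binning used in assignment 8 may differ. The element-wise sum in `merge_histograms` is the operation that a sum reduction such as `MPI_Reduce` with `MPI_SUM` would perform across processes.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Count the local data points per bin.
// Assumed binning: bin k holds values x with k <= log_base(x) < k+1.
std::vector<long> local_histogram(const std::vector<double>& values,
                                  double logbase, std::size_t nbins)
{
    assert(logbase > 1.0 && nbins > 0);
    std::vector<long> counts(nbins, 0);
    const double inv_log_base = 1.0 / std::log(logbase);
    for (double x : values) {
        if (x < 1.0) continue;  // assumption: values below the first bin are skipped
        auto k = static_cast<std::size_t>(std::floor(std::log(x) * inv_log_base));
        if (k >= nbins) k = nbins - 1;  // assumption: overflow goes into the last bin
        ++counts[k];
    }
    return counts;
}

// Element-wise sum of two histograms with the same number of bins.
std::vector<long> merge_histograms(const std::vector<long>& a,
                                   const std::vector<long>& b)
{
    assert(a.size() == b.size());
    std::vector<long> sum(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) sum[i] = a[i] + b[i];
    return sum;
}

// Convert counts to fractions of the total number of counted points.
std::vector<double> normalize(const std::vector<long>& counts)
{
    long total = 0;
    for (long c : counts) total += c;
    std::vector<double> fractions(counts.size(), 0.0);
    if (total == 0) return fractions;  // nothing counted: all fractions stay zero
    for (std::size_t i = 0; i < counts.size(); ++i)
        fractions[i] = static_cast<double>(counts[i]) / static_cast<double>(total);
    return fractions;
}
```

Because the counts are integers and are simply added, the merged histogram does not depend on how the data points were divided among the processes, which is why runs with different numbers of processes can produce the same histogram.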

Your program should use a batch size of Z=100'000 and a log base of 1.1. Write job scripts to run both versions of this code for P=1, 8, 20, 40 and 80 processes on the Teach cluster, timing the result. Submit these jobs to the queue and save their output. The output of these five runs should be identical except for the timing.

As before, we expect you to use make and git, to have several meaningful commits, and to have added a README file to the project.

Submit your work (code, Makefile, job scripts, README, git directory and job script outputs) in one zip file by April 7th, 11:59 PM. (Do not include the data!) The usual late penalty applies.

  • morestepnumbers_1.7GB.dat
    31 March 2025, 2:44 PM
All content on this website is made available under the Creative Commons Attribution 4.0 International licence, with the exception of all videos which are released under the Creative Commons Attribution-NoDerivatives 4.0 International licence.