A few quick questions regarding jobscript

by Dave Bhullar -

Hi Dr. van Zon,

I have a jobscript with the following headers:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --cpus-per-task=1
#SBATCH --time=1:00:00
#SBATCH --ntasks=16
#SBATCH --output=mpi_output_%j.txt

But when I run it, I get the following errors in my mpi_output file. Is it safe to ignore them? Also, is it OK if we manually copy the timings into a text file table ourselves? Should the text file contain nothing but two columns without headers, holding the process number and the associated timing? Or should we add a header line such as '//process number   timing result'?

Please let me know if possible; I would appreciate it.


In reply to Dave Bhullar

Re: A few quick questions regarding jobscript

by Dave Bhullar -
Update: The timing result I get for mpirun -n 1 is a lot faster than for 2, 3, 5, 9, and 16 processes, which I don't think should be the case. These NetCDF errors are thrown only for mpirun -n 1, not for the others. Does the command 'mpirun -n 1' not work?

In reply to Dave Bhullar

Re: A few quick questions regarding jobscript

by Ramses van Zon -

Yes, that would mean that the -n 1 case is not working. When size=1, i.e., if you have one process, that process is at the same time the first and the last. I could imagine that the logic in the program is not quite prepared for that, so it tries to read a part of the NetCDF file that is out of bounds (i.e., row -1 or row nrow+1 or something like that), and the program aborts without doing any work. That is why it would be fast.
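
To illustrate the kind of boundary logic meant here, below is a minimal Python sketch, not the actual assignment code. It assumes the rows are split into contiguous blocks and that each process also reads one neighbouring row on each side where such a row exists; the function names owned_rows and rows_to_read are hypothetical. The point is that "first" and "last" are tested independently, so a single process, which is both, never asks for row -1 or a row past the end.

```python
def owned_rows(rank, size, nrows):
    """Return the half-open range [start, stop) of rows owned by this rank."""
    base, extra = divmod(nrows, size)
    # The first `extra` ranks each get one additional row.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def rows_to_read(rank, size, nrows):
    """Owned rows plus one neighbouring row on each side, where one exists."""
    start, stop = owned_rows(rank, size, nrows)
    is_first = (rank == 0)
    is_last = (rank == size - 1)
    # Test the two conditions independently: with size == 1 both are true.
    # An if/elif chain would only handle one boundary for that rank.
    read_start = start if is_first else start - 1
    read_stop = stop if is_last else stop + 1
    return read_start, read_stop


if __name__ == "__main__":
    nrows = 10  # hypothetical number of rows in the file
    for size in (1, 2, 3):
        print(size, [rows_to_read(r, size, nrows) for r in range(size)])
```

If the program instead used something like "if first ... else if last ...", the single-process case would still subtract or add a row on one side, which matches the out-of-bounds read described above.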