Assignment 1
Due date: Thursday, September 23rd at 11:55 pm.
Please note that all of the commands and techniques you need to solve this assignment were given in class. No internet searches should be necessary to complete this assignment. If you aren't sure where to start, review the class slides.
The purpose of this assignment is to practise your bash scripting skills on a data set. Before you begin, be sure to create a new directory to hold your assignment, and move into that directory:
[ejspence.mycomp]
[ejspence.mycomp] pwd
/c/Users/ejspence/MSC1090
[ejspence.mycomp]
[ejspence.mycomp] mkdir assignment1
[ejspence.mycomp]
[ejspence.mycomp] cd assignment1
[ejspence.mycomp]
[ejspence.mycomp] pwd
/c/Users/ejspence/MSC1090/assignment1
[ejspence.mycomp]
From within this directory, perform the following steps to download the data for this assignment.
[ejspence.mycomp]
[ejspence.mycomp] curl -O -L tinyurl.com/IMSDataFile
[ejspence.mycomp]
[ejspence.mycomp] ls
IMSDataFile
[ejspence.mycomp]
[ejspence.mycomp] tar -z -x -f IMSDataFile
[ejspence.mycomp]
[ejspence.mycomp] ls
data IMSDataFile
[ejspence.mycomp]
[ejspence.mycomp] cd data
[ejspence.mycomp]
[ejspence.mycomp] pwd
/c/Users/ejspence/MSC1090/assignment1/data
[ejspence.mycomp]
The "curl" command downloads files from the internet. The "-O" flag tells curl to save the file under the name it has on the remote server, and the "-L" flag tells curl to follow redirects (needed here because the short link forwards to the real location of the file).
The "tar" command unbundles an archive file which contains many other files. The "-x" flag tells tar to extract the contents of the archive, "-z" indicates that the archive has been gzipped (a type of compression), and the "-f" flag indicates the file to apply the operation to.
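To see the tar flags in action without touching the assignment data, the following is a minimal sketch on a throwaway archive. The names "sandbox", "bundle", "a.txt" and "b.txt" are made up for illustration. The "-c" (create) and "-t" (list contents) flags are not used in the steps above; they are standard tar options, included here only so that the example can build and inspect its own archive.

```shell
# Work in a temporary directory so nothing in your assignment is touched.
sandbox=$(mktemp -d)
cd "$sandbox"

# Make a small directory with two files in it (hypothetical names).
mkdir bundle
echo "first" > bundle/a.txt
echo "second" > bundle/b.txt

# -c creates an archive, -z gzips it, -f names the archive file.
tar -z -c -f bundle.tar.gz bundle

# Remove the original directory so that we can watch tar restore it.
rm -r bundle

# -t lists the contents of the archive without extracting anything.
tar -z -t -f bundle.tar.gz

# -x extracts the contents, exactly as in the download steps above.
tar -z -x -f bundle.tar.gz
ls bundle
```

Listing an archive with "-t" before extracting it is a quick way to check which files and directories the extraction will create.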
The data are the results of interviews with patients who have undergone auditory surgery. The data were collected by different graduate students, with the data from each student in a different directory. Once your data have been uncompressed, spend a little time examining the contents of the directories, and the contents of the files, so that you understand what is being requested in the assignment.
1) Write a script called 'tenth_smallest.sh', which is to be run from the 'data' directory downloaded above. The script takes the name of one of the directories as an argument. The script is to determine and output the tenth-smallest "CI type" of the subjects in that directory. If there are multiple subjects with the same value, the first one found should be output.
The script will be sourced from the command line, and should produce output similar to the following.
[ejspence.mycomp] pwd
/c/Users/ejspence/MSC1090/assignment1/data
[ejspence.mycomp]
[ejspence.mycomp] source tenth_smallest.sh Lawrence
Lawrence/Data0234:CI type: 6
[ejspence.mycomp]
[ejspence.mycomp] pwd
/c/Users/ejspence/MSC1090/assignment1/data
[ejspence.mycomp]
[ejspence.mycomp] source tenth_smallest.sh gerdal
gerdal/Data0250:CI type: 3
[ejspence.mycomp]
[ejspence.mycomp] source tenth_smallest.sh jamesm
jamesm/data_509.txt:CI type: 5
[ejspence.mycomp]
Note that you should still be in the data directory after the code is run.
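Because the script is sourced, it runs inside your own shell, so any "cd" it performs will still be in effect when it finishes. The following is a minimal sketch of one way to handle this, using a made-up script "visit.sh" and a made-up directory "demo_dir" rather than the assignment data. It also shows how a sourced script reads its first argument through "$1". The variable names "start_dir" and "visited" are illustrative only.

```shell
# Build a throwaway directory tree (hypothetical names).
sandbox=$(mktemp -d)
mkdir "$sandbox/demo_dir"
echo "hello" > "$sandbox/demo_dir/note.txt"
cd "$sandbox"

# Write a tiny script which uses its first argument as a directory name.
cat > visit.sh << 'EOF'
start_dir=$(pwd)     # remember where we started
cd "$1"              # move into the directory given as the argument
visited=$(pwd)       # record where we are, to show that the cd happened
ls                   # do some work there; here this prints note.txt
cd "$start_dir"      # go back, so the caller's directory is unchanged
EOF

# Source the script with an argument, then compare directories.
before=$(pwd)
source visit.sh demo_dir
after=$(pwd)

echo "Before: $before"
echo "After:  $after"
```

The two directories printed at the end are the same, even though the script moved into "demo_dir" while it was running.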
2) Write another script called 'count_Sept_births.sh', also to be run from the 'data' directory, and which also takes a directory name as an argument. This script should count the number of subjects that were born in September, and output that number in a full sentence.
[ejspence.mycomp] pwd
/c/Users/ejspence/MSC1090/assignment1/data
[ejspence.mycomp]
[ejspence.mycomp] source count_Sept_births.sh Lawrence
The number of subjects born in September is 3.
[ejspence.mycomp]
[ejspence.mycomp] pwd
/c/Users/ejspence/MSC1090/assignment1/data
[ejspence.mycomp]
[ejspence.mycomp] source count_Sept_births.sh gerdal
The number of subjects born in September is 4.
[ejspence.mycomp]
[ejspence.mycomp] source count_Sept_births.sh alexander
The number of subjects born in September is 4.
[ejspence.mycomp]
Some points to consider:
- Full points will be awarded for implementations which store the number of subjects born in September in a local variable, before printing the output.
- Similarly, full points will be awarded for solutions which DO NOT use "grep -c". You may, however, use "grep".
- Mac users may find that there is extra white space around the numbers in their output sentences. Do not worry about this white space; extra spaces within the sentences are not important.
- After the script has finished running, you should be in the same directory you were in before the script was run: the 'data' directory.
- The TAs will test your scripts by running them on their computers. DO NOT put anything into your script which will prevent it from being run on somebody else's computer.
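As an illustration of the first three points, the following is a minimal sketch of storing a count in a variable before printing it. It uses made-up files containing a made-up "colour" field, not the assignment data, and the variable name "num_red" is illustrative only. It is one possible approach, not necessarily the one expected for this assignment.

```shell
# Create three small files in a temporary directory (hypothetical data).
sandbox=$(mktemp -d)
printf 'colour: red\n'  > "$sandbox/one.txt"
printf 'colour: blue\n' > "$sandbox/two.txt"
printf 'colour: red\n'  > "$sandbox/three.txt"

# Command substitution, $( ... ), stores the output of a command in a variable.
# grep prints one line per match, and "wc -l" counts those lines.
num_red=$(grep "red" "$sandbox"/*.txt | wc -l)

# Use the variable inside a sentence.
echo "The number of red entries is $num_red."
```

Two of the three files match, so the sentence reports a count of 2. On a Mac, "wc -l" pads its output with spaces, which is the source of the extra white space mentioned above.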
Submit both ".sh" files which contain your code.
Assignments will be graded on a 10-point basis.
The due date is September 23rd, 2021 (midnight), with a 0.5-point penalty per day for late submission until the cut-off date of September 30th, 2021, at 12:00 pm (noon).