Assignment 8
Due date: Thursday, March 26th at midnight.
Be sure to use version control ("git") as you develop your code. Do "git add ...., git commit" repeatedly as you add and edit your code. You will hand in the output of "git log" for your assignment repository as part of the assignment.
In this assignment you will create two functions that each produce a professional-looking plot. Ideally, you could reuse this approach for any figure you need to produce for your papers, thesis, etc. To that end, we invite you to use a representative set of your own data. It does not need to be unpublished or new data, just something that resembles the data you actually deal with in your research. If you don't have any data available, you can use other data close to your interests, either from the R data sets or from other sources, such as Open Data Toronto. If you use one of the R data sets, do *not* use the ones we have presented and discussed in class, and try to avoid overlapping with other students as well!
If you need some inspiration, we invite you to visit our "Visualization Gallery", which is composed entirely of outstanding submissions from students in previous years (maybe next year your plots could be displayed there as well).
Your script, named generatePlots.R or generatePlots.py, should receive two command line arguments. The first will be the data file to work on. The second will indicate which action to perform:
- if the command line argument is "plot1", the script will generate a professional/publication quality plot, preferably using your own data, following the criteria and conventions discussed in class (lecture 12).
- if the command line argument is "plot2", the script will generate a professional/publication quality plot of a different type, preferably using your own data, following the criteria and conventions discussed in class. You may use the same data as for the "plot1" argument.
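The argument handling described above could be sketched as follows in Python. This is only an illustration under stated assumptions: the function name `parse_arguments` and the error messages are hypothetical, not a required interface.

```python
import os


def parse_arguments(argv):
    """Validate the command line and return (data_file, action).

    argv is expected to look like sys.argv: the script name, followed by
    the data file and the action. Raises ValueError on any problem.
    """
    valid_actions = ("plot1", "plot2")  # kept local: no global variables

    # Exactly two arguments are expected after the script name.
    if len(argv) != 3:
        raise ValueError("usage: generatePlots.py <data file> <plot1|plot2>")

    data_file, action = argv[1], argv[2]

    # The first argument must point at an existing file.
    if not os.path.isfile(data_file):
        raise ValueError(f"data file not found: {data_file}")

    # The second argument must be one of the two known actions.
    if action not in valid_actions:
        raise ValueError(f"unknown action '{action}'; expected plot1 or plot2")

    return data_file, action
```

In the driver, this might be called as `data_file, action = parse_arguments(sys.argv)` inside a `try`/`except ValueError` block that prints the message and exits with a non-zero status.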
Please make sure your plots follow the professional-plotting criteria outlined in class. You can use basic plotting tools available in R or Python. You may also use ggplot2 if you wish.
Each plot should contain more than one graphical representation, i.e. it can *not* be just dots representing the data; it should be something like the data points plus a fit. At least two graphical representations, or additional statistical results, should be present! Please select an appropriate file type for saving the plots generated by "plot1" and "plot2", one that preserves the quality of your figure!
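As one possible reading of the "data points + a fit" requirement, the sketch below overlays a least-squares straight line on a scatter plot and saves the result as a PDF, a vector format that stays sharp at any size. It assumes matplotlib and NumPy are installed. The function name `plot_points_with_fit` and the styling choices are hypothetical, and a straight-line fit is just an example of a second graphical representation.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np


def plot_points_with_fit(x, y, x_label, y_label, out_file):
    """Scatter the data, overlay a straight-line fit, and save to out_file.

    Returns the fitted (slope, intercept) so the caller can report them.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Degree-1 least-squares fit: first representation is the points,
    # second is the fitted line.
    slope, intercept = np.polyfit(x, y, 1)
    x_line = np.linspace(x.min(), x.max(), 100)

    fig, ax = plt.subplots(figsize=(5, 4))
    ax.scatter(x, y, color="black", s=15, label="data")
    ax.plot(x_line, slope * x_line + intercept, color="tab:red",
            label=f"linear fit (slope = {slope:.2f})")
    ax.set_xlabel(x_label)  # axis labels should include units
    ax.set_ylabel(y_label)
    ax.legend(frameon=False)
    fig.tight_layout()

    # The output format follows the file extension, e.g. ".pdf".
    fig.savefig(out_file)
    plt.close(fig)
    return slope, intercept
```

A call such as `plot_points_with_fit(x, y, "time (s)", "height (m)", "plot1.pdf")` would write the figure to `plot1.pdf`; the labels and file name here are placeholders.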
Within your script, add comments to briefly describe what data or analysis you are using, and how you are plotting it.
Additionally,
- you will have to create a git-repository
- your script should implement defensive programming strategies for dealing with the command line arguments
- you will have to have at least two modules: a main driver script and a utilities file (named plottingTools.R or plottingTools.py) where the functions used for plotting purposes in the main driver will be defined
- the functions should have arguments for receiving information and return-statements in the cases where you need to communicate further information to the rest of the code
- you must have one or more data-loading functions to load the data, either yours or whatever data you use
- no global variables of any kind, i.e. functions cannot access variables that are not passed to them
- you can also use any of the functions you have been developing in previous assignments, in case you need to perform any statistical analysis in order to generate your plots.
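To illustrate the requirements above on data loading and on passing everything through arguments, a loader in the utilities file might look like the following Python sketch. It assumes the data is a CSV file with a header row. The function name `load_data` and the column-name parameters are hypothetical and would need adapting to your own data format.

```python
import csv


def load_data(filename, x_column, y_column):
    """Read two numeric columns from a CSV file that has a header row.

    Everything the function needs arrives through its arguments, and the
    results are handed back with return; no global variables are used.
    Raises ValueError if the file is empty or a column is missing.
    """
    x_values, y_values = [], []
    with open(filename, newline="") as handle:
        reader = csv.DictReader(handle)

        # fieldnames is None when the file has no header line at all.
        if reader.fieldnames is None:
            raise ValueError(f"{filename} is empty")

        # Check both requested columns before reading any rows.
        for name in (x_column, y_column):
            if name not in reader.fieldnames:
                raise ValueError(f"column '{name}' not found in {filename}")

        # float() raises ValueError on non-numeric entries.
        for row in reader:
            x_values.append(float(row[x_column]))
            y_values.append(float(row[y_column]))

    return x_values, y_values
```

The driver would then import this from the utilities file, for example `from plottingTools import load_data`, and pass the returned lists on to the plotting functions.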
Please submit:
- your driver script and utilities file,
- the final products of your R or Python script, i.e. two plot files,
- your data, so that when the script is run it will run successfully. If your data is too big to submit, contact us so that another means of getting us the data can be arranged. If you download the data from the internet, save the data in a file, whose name will then be the first argument to your driver script. The point is that, however you accomplish it, the script will run successfully on our computers, without modification!
- The output of 'git log' for this assignment.
Submit your main driver script and utilities file, as well as any data set you decided to use, and the output of "git log" from your assignment repository.
To capture the output of 'git log' use redirection, git log > git.log, and hand in the "git.log" file.
Assignments will be graded on a 10 point basis.
Due date is March 26th, 2024 at midnight, with 0.5 point penalty per day for late submission until the cut-off date of April 2nd, 2024 at 10:00am.