DAT112 - Apr 2022

Assignment 1

Completion requirements
Opened: Thursday, 12 May 2022, 12:00
Due: Thursday, 19 May 2022, 23:59

Due date: Thursday, May 19th, 2022 at midnight.

Consider the Dogs vs. Cats data set, a collection of photos of dogs and cats. I've created a modified version of this data set in which every photo is scaled to 50 x 50 pixels, with black borders added wherever the original image was not square. This data can be found here (192MB).

The goal of this assignment is to build the best neural network you can to classify a given image as either a dog or a cat. Your model should overfit as little as possible while still getting the highest score it can on the test data.

Create a Python script, called "dogs_cats_nn.py", which performs the following steps:

  • reads in the dogs vs. cats data set, given in the link above (the numpy function "load" will be helpful here). Note that the data will be returned as a dictionary, with the keys 'images' and 'labels'. You may assume that the data file is in the same directory as the script; the file name may be hard-coded.
  • splits the input and target data into training and testing data sets, with 20% in the test set.
  • builds a neural network, using Keras, to predict the category of the input images.
  • trains the network on the training data, and prints out the final training accuracy.
  • evaluates the network on the test data, and prints out the test accuracy.
  • creates a plot of the model's training loss as a function of epoch.
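
The first two steps above (reading the data and holding out 20% for testing) might be sketched as follows. This is only one possible approach, not the expected solution. The function names load_dogs_cats and split_train_test are hypothetical, the loader assumes the file is a NumPy archive that exposes the 'images' and 'labels' keys, and the demonstration uses an all-zero stand-in array with three colour channels, which is an assumption about the real data.

```python
import numpy as np


def load_dogs_cats(file_name):
    """Return (images, labels) from the data file.

    Assumption: np.load gives back a dictionary-like object with the
    keys 'images' and 'labels', as described in the assignment.
    """
    data = np.load(file_name)
    return data["images"], data["labels"]


def split_train_test(images, labels, test_fraction=0.2, seed=0):
    """Shuffle the samples and hold out test_fraction of them for testing."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(len(images))
    n_test = int(round(test_fraction * len(images)))
    test_idx, train_idx = shuffled[:n_test], shuffled[n_test:]
    return (images[train_idx], images[test_idx],
            labels[train_idx], labels[test_idx])


# Stand-in data for illustration only: 10 blank 50 x 50 images.
# The channel count (3) is an assumption, not taken from the real file.
demo_images = np.zeros((10, 50, 50, 3), dtype=np.uint8)
demo_labels = np.arange(10) % 2

x_train, x_test, y_train, y_test = split_train_test(demo_images, demo_labels)
print(x_train.shape, x_test.shape)
```

With the real file, the call would be load_dogs_cats followed by split_train_test on its two return values. The train_test_split function in scikit-learn (with test_size=0.2) does the same job if that library is available.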

Your script will be tested from the Linux command line, thus:

$ python dogs_cats_nn.py
Reading dogs vs. cats data file.
Building network.
Training network.
The training score is [0.4612, 0.7828]
The test score is [0.4664459792613983, 0.7906]
$

Do your best to address the main problem with this data set: it is too small (even though it contains 25,000 photos), so neural networks trained on it tend to overfit. Explore the following ways of reducing overfitting:

  • Experiment with creating the smallest network you reasonably can.
  • Create new, artificial data using the ImageDataGenerator class, which can be found in the tensorflow.keras.preprocessing.image subpackage. You can read about how to use this subpackage here. Use this enlarged data set to train your model.
  • Experiment with regularization or dropout.
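
As a rough illustration of how the three ideas above can be combined, the sketch below builds a deliberately small convolutional network with dropout and L2 regularization, plus an ImageDataGenerator that produces randomly transformed copies of the training images. Everything specific here is an assumption made for illustration: the function names build_small_cnn and make_augmenter are hypothetical, the input is assumed to have three colour channels, and the layer sizes, dropout rate, L2 strength and augmentation ranges are arbitrary starting points rather than tuned values.

```python
from tensorflow.keras import layers, models, regularizers
from tensorflow.keras.preprocessing.image import ImageDataGenerator


def build_small_cnn(input_shape=(50, 50, 3), dropout_rate=0.5, l2_strength=1e-4):
    """Build a small binary classifier (illustrative sizes, not tuned)."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(8, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(16, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        # Dropout randomly zeroes activations during training only.
        layers.Dropout(dropout_rate),
        # L2 regularization penalizes large weights in this layer.
        layers.Dense(16, activation="relu",
                     kernel_regularizer=regularizers.l2(l2_strength)),
        # One sigmoid output: the two classes are encoded as 0 and 1.
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


def make_augmenter():
    """Create a generator of randomly flipped, shifted and rotated images."""
    return ImageDataGenerator(
        rotation_range=15,       # degrees; illustrative value
        width_shift_range=0.1,   # fraction of image width; illustrative
        height_shift_range=0.1,  # fraction of image height; illustrative
        horizontal_flip=True,
    )


model = build_small_cnn()
augmenter = make_augmenter()
model.summary()
```

Training on augmented batches would then look like model.fit(augmenter.flow(x_train, y_train, batch_size=32), epochs=n_epochs), where x_train, y_train and n_epochs are yours to supply. The test set should be evaluated without augmentation.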

Experiment with your hyperparameters to create the best model you can while minimizing overfitting. You should run the training until the loss stops improving, as demonstrated by your plot. The best model I have found with similar training and test accuracies reaches about 78% on both. See if you can do better.
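
One way to produce the loss plot is sketched below. The function name plot_training_loss and the output file name are hypothetical, and the loss values are placeholders for illustration, not results from a real run. In an actual script the list would come from the History object returned by model.fit, as history.history["loss"].

```python
import matplotlib
matplotlib.use("Agg")  # write to a file; no display is needed on the command line
import matplotlib.pyplot as plt


def plot_training_loss(loss_per_epoch, file_name="training_loss.png"):
    """Save a plot of training loss versus epoch and return the file name."""
    epochs = range(1, len(loss_per_epoch) + 1)
    fig, ax = plt.subplots()
    ax.plot(epochs, loss_per_epoch, marker="o")
    ax.set_xlabel("Epoch")
    ax.set_ylabel("Training loss")
    ax.set_title("Training loss as a function of epoch")
    fig.savefig(file_name)
    plt.close(fig)
    return file_name


# Placeholder values for illustration only, not measured results.
placeholder_losses = [0.69, 0.64, 0.60, 0.57, 0.55]
saved_file = plot_training_loss(placeholder_losses)
```

A curve that has flattened out over the last several epochs suggests that training has run long enough; a curve that is still dropping suggests more epochs are worth trying.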

Submit the script that generates and trains your best model. The script will be graded on functionality, but also on form. This means your script should use meaningful variable names and be well commented.

Submit your dogs_cats_nn.py, and the final plot of your training loss.

Assignments will be graded on a 10 point basis.
The due date is May 19th, 2022 (midnight). Late submissions lose 0.5 points per day until the cut-off date of May 26th at 11:00am.

All content on this website is made available under the Creative Commons Attribution 4.0 International licence, with the exception of all videos which are released under the Creative Commons Attribution-NoDerivatives 4.0 International licence.