BCH2203 - Winter 2024

Assignment 5: Seed classification

Opened: Monday, 1 April 2024, 12:00 AM
Due: Monday, 8 April 2024, 11:59 PM

Get the data file seeds_dataset.txt from the zip file that you can download from https://archive.ics.uci.edu/dataset/236/seeds. The file is in tab-separated values format. Each row is a sample of wheat seeds, and the columns have the following meaning: area, perimeter, compactness, length_kernel, width, asymmetry, length_groove, and label. The first 7 are features, while the last column contains 1, 2, or 3 depending on the variety of wheat, i.e., Kama, Rosa, or Canadian, respectively; this is the target value.
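
As one possible way to read the file, here is a minimal sketch that uses only the standard library. The function names parse_seeds and load_seeds are illustrative and not part of the assignment. Splitting each line on any run of whitespace is an assumption made here so that a stray repeated tab does not produce empty fields.

```python
from pathlib import Path

# Column names as listed in the assignment text.
COLUMNS = ["area", "perimeter", "compactness", "length_kernel",
           "width", "asymmetry", "length_groove", "label"]


def parse_seeds(text):
    """Parse the text of the data file into (features, labels)."""
    features, labels = [], []
    for line in text.splitlines():
        fields = line.split()  # any run of whitespace, tabs included
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, got {len(fields)}")
        features.append([float(v) for v in fields[:-1]])
        labels.append(int(float(fields[-1])))
    return features, labels


def load_seeds(path="seeds_dataset.txt"):
    """Read the file from the current directory (hypothetical helper)."""
    return parse_seeds(Path(path).read_text())
```

Calling load_seeds() with no arguments assumes the file sits in the current directory, as the assignment allows.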

Write a Python script that reads this file (you may assume it lives in the current directory) and builds decision trees to predict the label from only 3 features. Since there are 7 features available, different sets of 3 features can be selected. In fact, this can be done in 35 ways, and your script should try all of them. Use the same maximum depth and minimum number of samples per leaf for all cases.
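
The count of 35 is the number of ways to choose 3 features out of 7, i.e. 7!/(3! 4!) = 35. A short sketch of how these subsets could be enumerated with the standard library follows; the variable names are illustrative.

```python
from itertools import combinations
from math import comb

FEATURES = ["area", "perimeter", "compactness", "length_kernel",
            "width", "asymmetry", "length_groove"]

# Every unordered selection of 3 distinct features.
triples = list(combinations(FEATURES, 3))

# Sanity check against the binomial coefficient "7 choose 3".
assert len(triples) == comb(7, 3)
print(len(triples))  # 35
```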

The script should use the usual split into training and test data to score the accuracy of each tree, pick out the most accurate decision tree, and print the names of the three features used in that tree. In a sense, this tells us which features are the most defining ones for the variety.
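
The selection step could look roughly like the sketch below, which assumes scikit-learn and NumPy are available (the assignment does not name a library). The function name best_feature_triple, the default depth and leaf size, the 30% test fraction, and the fixed random seed are all illustrative choices, not values prescribed by the assignment. One train/test split is made up front and reused for every feature subset, so that the accuracies are compared on the same samples.

```python
from itertools import combinations

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

FEATURES = ["area", "perimeter", "compactness", "length_kernel",
            "width", "asymmetry", "length_groove"]


def best_feature_triple(X, y, names=FEATURES, max_depth=3,
                        min_samples_leaf=5, test_size=0.3, random_state=0):
    """Score one tree per 3-feature subset; return (best_names, all_scores)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Single split shared by all subsets; stratify keeps class proportions.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state, stratify=y)
    scores = {}
    for cols in combinations(range(len(names)), 3):
        idx = list(cols)
        # Same hyperparameters for every subset, as the assignment asks.
        tree = DecisionTreeClassifier(max_depth=max_depth,
                                      min_samples_leaf=min_samples_leaf,
                                      random_state=random_state)
        tree.fit(X_train[:, idx], y_train)
        # score() returns the accuracy on the held-out test data.
        scores[tuple(names[i] for i in cols)] = tree.score(X_test[:, idx], y_test)
    best = max(scores, key=scores.get)
    return best, scores
```

Given a feature matrix X with 7 columns and a label vector y, `best, scores = best_feature_triple(X, y)` followed by `print(best)` would print the three feature names. Note that several subsets can tie for the highest accuracy, in which case max returns the first one encountered, and that the outcome can depend on the particular split.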


All content on this website is made available under the Creative Commons Attribution 4.0 International licence, with the exception of all videos which are released under the Creative Commons Attribution-NoDerivatives 4.0 International licence.