Assignment 6

Completion requirements
Opened: Wednesday, 26 April 2023, 9:30 AM
Due: Wednesday, 3 May 2023, 11:59 PM

Note: late assignments will not be accepted!

Let us consider the following situation: you've bought a half dozen eggs. You're worried about salmonella contamination, so you take DNA samples from the 6 eggs. You decide to test which eggs contain salmonella by aligning the DNA samples against a reference salmonella genome and a reference chicken genome.

For this assignment, we will not get our genomes from GenBank. Instead, we will use the data contained in the file a6data.tar.gz. This compressed archive contains the following files:

  • salmonella.fa, which contains the salmonella genome,
  • chicken.fa, which contains the chicken genome (for the purpose of this assignment, chicken.fa is actually only one tenth of one of the chicken's chromosomes), and
  • eggX-fragmentYY.fa, which contain the sequences sampled from the eggs. There are roughly 40 fragments per egg, each about 150 nucleotides long. Do not hard-code these file names; instead, search for them using dir and startsWith.
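
As a minimal sketch of the file search described above, the following R snippet lists a directory with dir, keeps the names that begin with "egg" using startsWith, and groups the fragment paths by egg. The directory, the file names, and the variable names (data_dir, fragment_files, fragments_by_egg) are made up for illustration; the real files come from a6data.tar.gz.

```r
# Build a throwaway directory with made-up file names so the sketch runs anywhere.
# (Hypothetical stand-in for the real 'data' directory.)
data_dir <- file.path(tempdir(), "a6_demo_data")
dir.create(data_dir, showWarnings = FALSE)
demo_files <- c("salmonella.fa", "chicken.fa",
                "egg1-fragment01.fa", "egg1-fragment02.fa", "egg2-fragment01.fa")
invisible(file.create(file.path(data_dir, demo_files)))

# List everything in the directory, then keep only names that begin with "egg".
all_files <- dir(data_dir)
fragment_files <- all_files[startsWith(all_files, "egg")]

# Recover the egg label (the text before the first "-") for each fragment.
egg_labels <- sub("-.*$", "", fragment_files)

# Group the full fragment paths by egg label.
fragments_by_egg <- split(file.path(data_dir, fragment_files), egg_labels)
```

Grouping by label keeps the later per-egg counting independent of how many eggs or fragments happen to be in the directory.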

Your task is to write a driver function, using as many other functions as you think are necessary to perform modular programming, which will

  • build the BLAST indices for the two reference genomes. You may hard-code the reference files. You may also assume that the 'data' directory is located in the same directory as your driver function.
  • use rBLAST to align the fragments with the references,
  • count the number of matches for each reference,
  • for each egg, use the ratio of the number of hits against the salmonella reference to the number of hits against the chicken reference as a measure of the rottenness of that egg, and
  • print out the rottenness of each egg, sorted from freshest to most rotten.
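
The last two steps can be sketched in base R as follows. This is only an illustration: the hit counts are invented numbers rather than real alignment results, and the names hit_counts and compute_rottenness are hypothetical. The alignment step itself is not shown.

```r
# Hypothetical hit counts per egg (invented numbers, not real BLAST results).
hit_counts <- data.frame(
  egg = c("egg1", "egg2", "egg3"),
  salmonella_hits = c(12, 0, 30),
  chicken_hits = c(30, 40, 10),
  stringsAsFactors = FALSE
)

# Rottenness as defined in the task: salmonella hits divided by chicken hits.
compute_rottenness <- function(salmonella_hits, chicken_hits) {
  salmonella_hits / chicken_hits
}

hit_counts$rottenness <- compute_rottenness(hit_counts$salmonella_hits,
                                            hit_counts$chicken_hits)

# order() returns the row indices that sort rottenness ascending,
# i.e. from freshest to most rotten.
ranked <- hit_counts[order(hit_counts$rottenness), ]

# Print one line per egg.
for (i in seq_len(nrow(ranked))) {
  cat(sprintf("%s: rottenness = %.2f\n", ranked$egg[i], ranked$rottenness[i]))
}
```

An egg with zero chicken hits would give Inf (or NaN if both counts are zero), so a real solution may want to decide how to report that case.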

In this assignment, you'll be using the rBLAST library to perform alignment. You will need BLAST installed on your laptop. Some functions I found helpful were dir, startsWith, and order. There are many warnings and errors generated when this code runs, which is annoying. I found the 'silent' flag in the predict function to be helpful, as well as the suppressWarnings function.
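
As a small illustration of suppressWarnings, the sketch below wraps a stand-in function that emits a warning. The function noisy_count is hypothetical and only plays the role of a noisy alignment call; it does not use rBLAST.

```r
# A stand-in function that emits a warning, playing the role of a noisy call.
# (Hypothetical; not part of rBLAST.)
noisy_count <- function() {
  warning("this warning is only for demonstration")
  42L
}

# suppressWarnings evaluates the expression, discards any warnings it raises,
# and returns the value of the expression unchanged.
hit_count <- suppressWarnings(noisy_count())
```

Note that suppressWarnings hides warnings only; errors still stop execution, so it is safest to wrap just the call that is known to be noisy.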

Keep best practices in mind, i.e., use functions, comment your code, use good names for variables and functions.

Submit your code by 11:59 PM on May 3rd, 2023. Late assignments will not be accepted!

All content on this website is made available under the Creative Commons Attribution 4.0 International licence, with the exception of all videos which are released under the Creative Commons Attribution-NoDerivatives 4.0 International licence.