Methods for Computational Gene Prediction

COURSE MATERIALS
Slides: PDF / PowerPoint
IMPORTANT: You will need some extra fonts if you download the PowerPoint slides.
Additional exercises [coming soon] (click here to suggest additional exercises)
Data sets
Synthetic data: G. simplicans data from chapter 5:
Model parameters from chapter 5:
NOTE TO INSTRUCTORS: You can generate your own synthetic data using this script. It generates separate training and test sets that share the same codon frequencies, signal weight matrices, and GC% (these biases are randomly generated anew at each run of the program). Exon, intron, and intergenic length distributions will be similar to those of the data sets used in the book (G. simplicans, above).
Real data: FASTA and GFF files from various organisms (human, mouse, mosquito, rice, and others) can be found here.