Biol375 2015: Difference between revisions
| | | | ||
# [Do NOT use computer for this part] Compare [[Datafile|these two Ebola VP30 sequences]], one from the 2014 outbreak and the other from the 1994 outbreak. | # [Do NOT use computer for this part] Compare [[Datafile|these two Ebola VP30 sequences]], one from the 2014 outbreak and the other from the 1994 outbreak. | ||
## | ## Calculate the proportion of difference (''p'') between the two sequences | ||
## Calculate Jukes-Cantor distance (''d'') between the two sequences (specify unit) | ## Calculate Jukes-Cantor distance (''d'') between the two sequences (specify unit) | ||
## Count the number of transitions and transversions (arrange in a table, as we did in the class) | ## Count the number of transitions and transversions (arrange in a table, as we did in the class) | ||
## Identify the number of synonymous and nonysynonymous substitutions | ## Identify the number of synonymous and nonysynonymous substitutions | ||
## Assuming that the total number of synonymous sites S=174 and the total number of nonsynonymous sites N=690, calculate <i>d<sub>s</sub> and d<sub>n</sub></i> (with Jukes-Cantor correction) | ## Assuming that the total number of synonymous sites S=174 and the total number of nonsynonymous sites N=690, calculate <i>d<sub>s</sub> and d<sub>n</sub></i> (with Jukes-Cantor correction) | ||
# Calculate genetic distances among the primate mitochondria sequences using R-Studio | # [Computer Exercise] Calculate & compare genetic distances among the primate mitochondria sequences using R-Studio | ||
## Make sure you have a file "mt_primate.fas" in your working directory (e.g., "/Users/john/Documents") | ## Make sure you have a file "mt_primate.fas" in your working directory (e.g., "/Users/john/Documents") [Note: Refer back to Assignment #3 if you couldn't locate the file.] | ||
## Load library: library(ape) | ## Load library: library(ape) | ||
## Read alignment: mt = read.FASTA("mt_primate.fas") | ## Read alignment: mt = read.FASTA("mt_primate.fas") | ||
## Calculate raw distance: mt.raw = dist.dna(mt, model = "raw") | ## Calculate raw distance: mt.raw = dist.dna(mt, model = "raw") | ||
## | ## Apply Juke-Cantor (one-parameter model) correction: mt.jc = dist.dna(mt, model = "JC") | ||
## | ## Apply Kimura(two-parameter model, for Ts and Tv) correction: mt.k80 = dist.dna(mt, model = "K80") | ||
## Plot JC distance vs the raw distance: plot(mt.raw, mt.jc, xlab = "uncorrected distance (diff/site)", ylab = "corrected distance (sub/site)", xlim = c(0,0.4), ylim = c(0,0.5), las =1) | ## Plot JC distance vs the raw distance: plot(mt.raw, mt.jc, xlab = "uncorrected distance (diff/site)", ylab = "corrected distance (sub/site)", xlim = c(0,0.4), ylim = c(0,0.5), las =1) | ||
## Add a 1:1 line: abline(0,1, col = 2) | ## Add a 1:1 line: abline(0,1, col = 2) | ||
## Add K80 distances: points(mt.raw, mt.k80, pch = 3, col = 4) | ## Add K80 distances: points(mt.raw, mt.k80, pch = 3, col = 4) | ||
## Add a legend: legend(0.05, 0.45, legend = c("JC (1-parameter)", "K80 (2-parameter)"), pch = c(1,3), col = c(1,4), bty = "n") | ## Add a legend: legend(0.05, 0.45, legend = c("JC (1-parameter)", "K80 (2-parameter)"), pch = c(1,3), col = c(1,4), bty = "n") | ||
## Explain (1) Why it is necessary to correct for raw distances between two sequences; (2) | ## Explain (1) Why it is necessary to correct for raw distances between two sequences; (2) Why K80 models is more realistic than JC model | ||
|} | |} | ||
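The hand calculations in the first part of the exercise can be checked with a few lines of base R. The sketch below is only an illustration, not the instructor's solution: the function names <code>jc_distance</code> and <code>k80_distance</code> are made up here, the alignment length is assumed to equal S + N, and all substitution counts are hypothetical placeholders rather than the real values for the VP30 sequences. It uses the standard Jukes-Cantor formula, d = -(3/4) ln(1 - 4p/3), and the standard Kimura two-parameter formula, d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of transitions and transversions. Both distances are in substitutions per site.

```r
# Jukes-Cantor correction: d = -(3/4) * ln(1 - 4p/3),
# where p is the proportion of sites that differ.
jc_distance <- function(p) {
  stopifnot(all(p >= 0), all(p < 0.75))  # formula is undefined for p >= 3/4
  -0.75 * log(1 - 4 * p / 3)
}

# Kimura two-parameter correction from the proportion of
# transitions (P) and transversions (Q):
# d = -(1/2) * ln(1 - 2P - Q) - (1/4) * ln(1 - 2Q)
k80_distance <- function(P, Q) {
  stopifnot(1 - 2 * P - Q > 0, 1 - 2 * Q > 0)
  -0.5 * log(1 - 2 * P - Q) - 0.25 * log(1 - 2 * Q)
}

# Site totals given in the assignment
S <- 174
N <- 690

# Hypothetical values for illustration only (NOT the answers)
n_sites  <- S + N  # assumed alignment length
n_ts     <- 30     # assumed number of transitions
n_tv     <- 10     # assumed number of transversions
syn_diff <- 25     # assumed number of synonymous differences
non_diff <- 15     # assumed number of nonsynonymous differences

p     <- (n_ts + n_tv) / n_sites                        # raw proportion of differences
d_jc  <- jc_distance(p)                                 # one-parameter correction
d_k80 <- k80_distance(n_ts / n_sites, n_tv / n_sites)   # two-parameter correction

# d_s and d_n: apply the JC correction to each class of sites separately
d_s <- jc_distance(syn_diff / S)
d_n <- jc_distance(non_diff / N)

print(c(p = p, JC = d_jc, K80 = d_k80, dS = d_s, dN = d_n))
```

Replacing the placeholder counts with the values obtained by hand should reproduce the hand-calculated distances. Note that a corrected distance is always larger than the raw proportion ''p'', which is the pattern the plot in the computer exercise is meant to show.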
* 11/5 (TH). Distance methods (Chapter 6).
Revision as of 14:26, 3 November 2015
Course Description
Molecular evolution is the study of the change of DNA and protein sequences through time. Theories and techniques of molecular evolution are widely used in species classification, biodiversity studies, comparative genomics, and molecular epidemiology. Contents of the course include:
- Population genetics, which is a theoretical framework for understanding mechanisms of sequence evolution through mutation, recombination, gene duplication, genetic drift, and natural selection.
- Molecular systematics, which introduces statistical models of sequence evolution and methods for reconstructing species phylogeny.
- Bioinformatics, which provides hands-on training on data acquisition and the use of software tools for phylogenetic analyses.
This 3-credit course is designed for upper-level biology-major undergraduates. Hunter pre-requisites are BIOL203, and MATH150 or STAT113.
Please note that starting from fall 2015, completing this course no longer counts towards research credits for biology majors.
Textbooks
- (Required) Roderic M. Page and Edward C. Holmes, 1998, Molecular Evolution: A Phylogenetic Approach, Blackwell Science Ltd.
- (Recommended) Baum & Smith, 2013. Tree Thinking: an Introduction to Phylogenetic Biology, Roberts & Company Publishers, Inc.
Learning Goals
- Be able to describe evolutionary relationships using phylogenetic trees
- Be able to use web-based as well as stand-alone software to infer phylogenetic trees
- Understand mechanisms of DNA sequence evolution
- Understand algorithms for building phylogenetic trees
Links for phylogenetic tools
- NCBI sequence databases
- R Tools
- R source: download & install from a mirror site
- R Studio: download & install
- APE package
- A Molecular Phylogeny Web Server
- EvolView: an online tree viewer
Exams & Grading
- Attendance (or a note in case of absence) is required. Bonus for active participation in classroom discussions.
- Assignments. All assignments should be handed in as hard copies only. Email submissions will not be accepted. Late submissions will receive a 10% deduction (of the total grade) per day.
- Three Mid-term Exams (30 pts each)
- Comprehensive Final Exam (50 pts)
Academic Honesty
While students may work in groups and help each other with assignments, duplicated answers in assignments will be flagged and investigated as possible acts of academic dishonesty. To avoid such an investigation, do NOT copy anyone else's work or let others copy yours. At the least, rephrase in your own words. The same rule applies to the use of the textbook and online resources: copied sentences are not acceptable and will be considered plagiarism.
Hunter College regards acts of academic dishonesty (e.g., plagiarism, cheating on examinations, obtaining unfair advantage, and falsification of records and official documents) as serious offenses against the values of intellectual honesty. The College is committed to enforcing the CUNY Policy on Academic Integrity and will pursue cases of academic dishonesty according to the Hunter College Academic Integrity Procedures.
Course Schedule
Part 1. Tree Thinking
- 8/27 (TH). Overview & Introduction. Lecture slides:
Assignment 1 (10 pts; Due: 8/31, Monday) |
---|
Pre-test: Full credits will be given as long as each question is answered with some reasoning. In other words, it will NOT be graded on being right or wrong. It's an assessment tool, to be compared with later test outcomes to show teaching/learning results. |
- 8/31 (M). 1.1. Introduction (Continued). In-class exercise 1. Tutorial: R & R-Studio (Bring your own computer)
- 9/3 (TH). 2.1. Intro to trees
Assignment 2 (10 pts; Due: 9/10, Thursday) |
---|
Watch Origin of Species: Lizards in an Evolutionary Tree. Provide a short answer (1-3 sentences) to each of the following three questions.
|
R exercises
|
- 9/7 (M). Labor Day. No class
- 9/10 (TH). 2.2 & 2.3. Tree Distance. In-class exercise 2.
Assignment 3 (10 pts; Due: 9/17, Th) |
---|
R exercises
|
- 9/14 (M). No class
- 9/17 (TH). 2.4 & 2.5. Species Tree & Lineage Sorting
- 9/21 (M). 2.5. Consensus Tree & Review. Chapter 2 Slides: File:Part-1-tree-thinking.pdf. In-class Exercise 3:
- 9/24 (TH). 4:10 - 5:10pm. Midterm Exam I. Bring pencils, erasers, and a calculator
Part 2. Trait Evolution
- 9/28 (M). Traits & trait matrix
Assignment #4 (5 pts; Due Monday, 10/5) |
---|
|
- 10/1 (TH). Homoplasy & consistency
- 10/5 (M). Parsimony reconstruction (Chapter 5). In-Class Exercise 4:
Assignment #5 (10 pts; Due 10/15) |
---|
|
- 10/8 (TH). Genome & gene structure (Chapter 3)
- 10/12 (M). No Class
- 10/15 (TH). Genome and gene evolution.
- 10/19 (M). Review & Practices. Lecture slides:
- 10/22 (TH). Midterm Exam 2
Part 3. Tree Algorithms
- 10/26 (M). BLAST & Alignments (Chapter 5)
Assignment #6 (5 pts; Due 11/2) |
---|
Based on the NCBI Gene Page for cytochrome C (CYCS), answer the following questions:
|
- 10/29 (TH). Maximum parsimony (Chapter 6). In class exercise #6:
- 11/2 (M). Genetic distances (Chapter 6)
Assignment #7 (5 pts; Due 11/9, Monday) |
---|
|
- 11/5 (TH). Distance methods (Chapter 6).
- 11/9 (M). Likelihood methods (Chapter 6)
Assignment #8 (10 pts; Due 11/16) |
---|
An international team of scientists recently sequenced 99 genomes of Ebola viruses. They reported their work in this recent publication.
|
- 11/12 (TH). Instructor traveling. No class
- 11/16 (M). Tree-testing & Review (Chapter 6). Lecture slides:
- 11/19 (TH). Midterm Exam 3
Part 4. Population Genetics
- 11/23 (M). Mechanism of molecular evolution: Overview & SNP statistics
Assignment #10 (10 pts; Due 12/4, Thursday) |
---|
The figure on the left shows a codon alignment of 38 strains of a bacterium, with an outgroup sequence (which starts with a string of SNPs: "....g...c..ca..", etc.). Answer the following questions (with the outgroup sequence excluded). Do not print the figure directly. Hand-copy the sequences onto a graph sheet, including only the sequences at the two variable codon positions:
|
- 11/30 (M). Genetic Drift
- 12/3 (TH). Instructor traveling. No class
- 12/7 (M). Neutral Theory & Molecular Clock
- 12/10 (TH). Tests of Natural Selection. Lecture Slides:
- 12/14 (M). Review & Course evaluations. Review slides: . Submit your Teacher's Evaluation, using either:
- Personal computer at www.hunter.cuny.edu/te; or,
- Smartphone at www.hunter.cuny.edu/mobilete
- 12/17 (TH) Comprehensive Final Exam (Regular class hours & Room)
- 12/31 (Wed). Grades Submitted to Registrar Offices (Hunter and Graduate Center)