Biol375 2014: Difference between revisions

Latest revision as of 01:43, 16 December 2014

Molecular Evolution (BIOL 375.00/790.64/793.03, Fall 2014)
Instructor: Dr Weigang Qiu, Associate Professor, Department of Biological Sciences
Teaching Assistant: Ms Saymon Akther <saymon.akther@gmail.com>
Room: 926 HN (Seminar Room, North Building)
Hours: Mon. & Thur 4:10-5:25 pm
Office Hours: Room 839 HN; Wed 5-7 pm or by appointment
Course Website: http://diverge.hunter.cuny.edu/labwiki/Biol375_2014


Course Description

Molecular evolution is the study of the change of DNA and protein sequences through time. Theories and techniques of molecular evolution are widely used in species classification, biodiversity studies, comparative genomics, and molecular epidemiology. Contents of the course include:

  • Population genetics, which is a framework for understanding mechanisms of sequence evolution through mutation, recombination, gene duplication, genetic drift, and natural selection.
  • Molecular systematics, which introduces statistical models of sequence evolution and methods of reconstructing species phylogeny.
  • Bioinformatics, which provides hands-on training on data acquisition and the use of software tools for phylogenetic analyses.

This 3-credit course is designed for upper-level biology-major undergraduates. Hunter pre-requisites are BIOL203, and MATH150 or STAT113.

Textbooks

  • (Required) Roderic M. Page and Edward C. Holmes, 1998, Molecular Evolution: A Phylogenetic Approach, Blackwell Science Ltd.
  • (Recommended) Baum & Smith, 2013. Tree Thinking: an Introduction to Phylogenetic Biology, Roberts & Company Publishers, Inc.

Learning Goals

  • Understand mechanisms of DNA sequence evolution
  • Be able to describe evolutionary relationships using phylogenetic trees
  • Understand the computational algorithms for building phylogenetic trees
  • Be able to use web-based as well as stand-alone software to infer phylogenetic trees

Links for phylogenetic tools

Exams & Grading

  • Assignments. All assignments should be handed in as hard copies only. Email submissions will not be accepted. Late submissions will receive a 10% deduction (of the total grade) per day.
  • Three Mid-term Exams (30 pts each)
  • Comprehensive Final Exam (50 pts)

Bonus for active participation in classroom discussions

Academic Honesty

While students may work in groups and help each other with assignments, duplicated answers in assignments will be flagged and investigated as possible acts of academic dishonesty. To avoid being investigated as such, do NOT copy anyone else's work, or let others copy your work. At the least, rephrase using your own words. Note that the same rule applies to the use of textbook and online resources: copied sentences are not acceptable and will be considered plagiarism.

Hunter College regards acts of academic dishonesty (e.g., plagiarism, cheating on examinations, obtaining unfair advantage, and falsification of records and official documents) as serious offenses against the values of intellectual honesty. The College is committed to enforcing the CUNY Policy on Academic Integrity and will pursue cases of academic dishonesty according to the Hunter College Academic Integrity Procedures.

Course Schedule

Part 1. Tree Thinking

Assignment 1 (10 pts; Due: 9/4, Thursday)
  • 9/1 (M). Labor Day. No class
  • 9/4 (TH). 1.1. Introduction (Continued). In-class exercise 1.
Assignment 2 (5 pts; Due: 9/8, Monday)
Watch Origin of Species: Lizards in an Evolutionary Tree. Provide a short answer (1-3 sentences) to each of the following three questions.
  1. What are the two hypotheses explaining the origin of different ecomorphs of lizards on Caribbean Islands?
  2. What is the expected phylogeny under each hypothesis?
  3. Which hypothesis is supported by the phylogeny of actual DNA sequences?
  • 9/8 (M). 2.1. Intro to trees
  • 9/11 (TH). 2.2 & 2.3. Tree Distance. In-class exercise 2.
Assignment 3 (5 pts; Due: 9/15, Monday)
Computer exercise. Obtain an account on EvolView. Once logged in, under the "Basic" tab, click the first icon & copy and paste the following NEWICK string: "(monkey:0.09672,((tarsier:0.18996,lemur:0.14790)0.999:0.09005,(macaque:0.18524,(gibbon:0.10388,(orang-utan:0.09481,(human:0.03391,(gorilla:0.06135,chimpanzee:0.05141):0.01580)0.316:0.05381)1.000:0.03019)0.978:0.05616)0.997:0.05042)0.965:0.09672);". Name your project "Assignment 3" and the tree "primate". Render the tree in all five available formats. Use the "Export" tab to download all tree graphs (in "jpeg" or "png" format). Copy and paste your tree graphs into a single page of Microsoft Word or PowerPoint. Turn in a printed hard copy.
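Before pasting a NEWICK string into a tree viewer, it can help to check that it was copied completely. The following is a minimal Python sketch, not part of the assignment: the helper names `leaf_names` and `looks_complete` are made up for illustration, and the regular expression assumes simple unquoted labels like the ones in the string above.

```python
import re

# NEWICK string copied from the assignment text (split only for readability).
newick = (
    "(monkey:0.09672,((tarsier:0.18996,lemur:0.14790)0.999:0.09005,"
    "(macaque:0.18524,(gibbon:0.10388,(orang-utan:0.09481,"
    "(human:0.03391,(gorilla:0.06135,chimpanzee:0.05141):0.01580)0.316:0.05381)"
    "1.000:0.03019)0.978:0.05616)0.997:0.05042)0.965:0.09672);"
)

def leaf_names(tree):
    """Return tip labels: names that directly follow '(' or ',' and precede ':'."""
    return re.findall(r"[(,]([^():,;]+):", tree)

def looks_complete(tree):
    """Rough copy-paste check: balanced parentheses and a terminating ';'."""
    return tree.count("(") == tree.count(")") and tree.rstrip().endswith(";")

leaves = leaf_names(newick)
print(len(leaves), "tips:", ", ".join(leaves))
print("string looks complete:", looks_complete(newick))
```

Internal-node support values such as 0.999 follow a closing parenthesis, so they are not mistaken for tip labels.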

Part 2. Trait Evolution

  • 9/25 (TH). Holiday Recess. No Class
  • 9/29 (M). Traits & trait matrix
Assignment #4 (5 pts; Due 10/6)
Based on the lizard card, construct a character-state matrix for all lizard species. For each species, list its character state for each of the following two characters (as columns): (1) Geographic origin, and (2) Habitat. Re-watching the video may help with this assignment. Hint: use Excel & hand in a printout of your Excel sheet.
  • 10/2 (TH). Homoplasy & consistency
  • 10/6 (M). Parsimony reconstruction (Chapter 5). In-Class Exercise 4:
Assignment #5 (5 pts; Due 10/9)
Use EvolView to display the following tree of Caribbean lizards: "((Anolis_chlorocyanus:0.15297,(Anolis_evermanni:0.09207,(Anolis_cristatellus:0.14363,Anolis_pulchellus:0.07962)0.931:0.02884)0.997:0.04280)0.897:0.02232,(Anolis_cybotes:0.17149,Anolis_olssoni:0.12747)0.974:0.03034,(((Anolis_ophiolepis:0.06969,Anolis_sagrei:0.06284)1.000:0.09480,(Anolis_valencienni:0.10249,(Anolis_grahami:0.10016,Anolis_lineatopus:0.10064)0.613:0.01700)0.999:0.04077)0.997:0.04169,((Leiocephalus_barahonensis:0.24783,Anolis_occultus:0.15489)0.978:0.05261,(Anolis_alutaceus:0.14271,(Anolis_porcatus:0.10377,(Anolis_sheplani:0.15083,Anolis_angusticeps:0.12285)0.943:0.02748)0.898:0.01870)0.989:0.03278)0.514:0.01385)0.404:0.01061);" Note:
  1. Show the "rectangular phylogram" format (2nd tree option). Show species names (aligned, use the button "align/unalign leaf node labels"), scale bar and bootstrap values.
  2. Mouse over the branch "Leiocephalus_barahonensis" and click "reroot here"
  3. Make a printout of the tree & then hand-draw a tree without unsupported (support value < 0.8) branches (i.e., making polytomies).
  • 10/9 (TH). Genome & gene structure (Chapter 3)
Assignment #6 (10 pts; Due 10/16)
Anolis-tree.png
  1. Match the character matrix from Assignment #4 and the tree from Assignment #5 (you may use the tree on the right). Hand-draw a diagram with the tree on the left and the matrix on the right (use 1-letter codes for character states & include a legend of your codes).
  2. Reconstruct ancestral locations and habitats of Caribbean lizards. Pick an arbitrary ancestral state if the ancestral state cannot be resolved.
  3. Based on your reconstructed trait evolution, count the number of character-state changes and calculate the consistency index for each trait.
  4. Use the difference between the two consistency indices to explain why the molecular phylogeny supports convergent evolution in habitat.
  5. Bonus (+5): Use the EvolView ColorStrip feature to automatically generate the combined diagram with tree and character-state matrix. Notes: (1) you can't have species names with spaces, e.g., "Anolis sagrei" needs to be written as "Anolis_sagrei"; (2) you don't need to remove unsupported branches; (3) the outgroup species ("Leiocephalus barahonensis") has unknown character states. Assign "gray" colors to both characters.
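As a reminder of how the consistency index in item 3 is computed, here is a minimal Python sketch. It assumes the standard single-character definition, CI = (minimum possible number of changes) / (number of changes observed on the tree), where the minimum for a character with k states is k - 1. The function name and the example numbers are hypothetical and are not the answers to the assignment.

```python
def consistency_index(n_states, observed_changes):
    """CI for one character: minimum possible changes / observed changes.

    n_states         -- number of distinct character states among the species
    observed_changes -- state changes counted on the reconstructed tree
    """
    if n_states < 2:
        raise ValueError("a character needs at least two states to change")
    min_changes = n_states - 1
    if observed_changes < min_changes:
        raise ValueError("observed changes cannot be fewer than the minimum")
    return min_changes / observed_changes

# Hypothetical counts, for illustration only.
ci_a = consistency_index(n_states=4, observed_changes=4)   # little homoplasy
ci_b = consistency_index(n_states=6, observed_changes=15)  # much homoplasy
print(f"character A: CI = {ci_a:.2f}")
print(f"character B: CI = {ci_b:.2f}")
```

A CI near 1 means the character changed about as few times as possible on the tree; a low CI means the same state arose repeatedly, which is the pattern expected under convergent evolution.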
  • 10/13 (M). No Class
  • 10/16 (TH). Genome and gene evolution. Lecture slides (with answer keys to assignments & in-class exercises):
  • 10/20 (M). Review & Practices
  • 10/23 (TH). Midterm Exam 2

Part 3. Tree Algorithms

  • 10/27 (M). BLAST & Alignments (Chapter 5)
Assignment #7 (5 pts; Due 11/3)
Based on the NCBI Gene Page for cytochrome C (CYCS), answer the following questions:
  • What is the molecular function of CYCS?
  • Describe its chromosomal location and gene structure (number of introns and exons, length of protein)
  • Click the link "HomoloGene" and then in the section "Pairwise alignments generated using BLAST", run BLAST between Human and Mouse protein sequences. Show BLAST report.
  • Pick another species and generate a BLAST alignment between the Human and this species. Show BLAST report.
  • 10/30 (TH). Maximum parsimony (Chapter 6). In class exercise #6:
  • 11/3 (M). Genetic distances (Chapter 6)
Assignment #8 (5 pts; Due 11/10)
An international team of scientists recently sequenced 99 genomes of Ebola viruses. They reported their work in this recent publication (http://www.sciencemag.org/content/345/6202/1369.full?sid=6b4ac53f-18af-4b71-8f41-87e3a51c2105).
  1. Go to the phylogeny.fr website (http://phylogeny.lirmm.fr/phylo_cgi/index.cgi) and select "Phylogenetic Analysis" and then "One Click" analysis
  2. Copy and paste these VP30 sequences into the text box and click on "Submit"
  3. When the analysis is finished, you should see a phylogenetic tree. Re-root the tree using three strains isolated in 1976 or 1977 as the outgroup. Save and print the tree. Answer the following questions with explanations.
    1. Name the alignment program and the phylogenetic method (distance, parsimony, likelihood, or another method) used to produce your tree
    2. Are isolates collected from different years all monophyletic, all paraphyletic, or some monophyletic and some paraphyletic?
    3. Are outbreaks in different years independent of each other, or does one outbreak lead to another?
    4. What would you conclude based on your tree regarding the reservoir source of the ebolavirus: are ebolaviruses more likely to have a human or a non-human reservoir?
  • 11/6 (TH). Distance methods (Chapter 6).
  • 11/10 (M). Likelihood methods (Chapter 6)
Assignment #9 (5 pts; Due 11/13, Thursday)
Compare these two Ebola VP30 sequences, one from the 2014 outbreak and the other from the 1994 outbreak.
  1. Calculate Jukes-Cantor distance between the two sequences (specify unit)
  2. Identify the number of transitions and transversions
  3. Identify the number of synonymous and nonsynonymous substitutions
  4. Assuming that the total number of synonymous sites S=174 and the total number of nonsynonymous sites N=690, calculate d_s and d_n (with Jukes-Cantor correction)
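The calculations above can be sketched in Python as follows. This is only an illustration under stated assumptions: the two short sequences and the substitution counts `syn_diffs` and `nonsyn_diffs` are made up (they are not the VP30 data), the function names are hypothetical, and the correction used is the standard Jukes-Cantor formula d = -(3/4) ln(1 - 4p/3), where p is the proportion of differing sites and d is in expected substitutions per site.

```python
import math

PURINES = {"A", "G"}

def count_differences(seq1, seq2):
    """Return (transitions, transversions, compared_sites) for aligned sequences."""
    transitions = transversions = sites = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1    # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    return transitions, transversions, sites

def jukes_cantor(p):
    """Jukes-Cantor distance (substitutions per site) from proportion p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the correction to exist")
    return -0.75 * math.log(1 - 4 * p / 3)

# Made-up aligned sequences, for illustration only.
seq_a = "ATGGCCAAGTTT"
seq_b = "ATGGCTAAATTA"
ts, tv, n = count_differences(seq_a, seq_b)
d = jukes_cantor((ts + tv) / n)
print(f"transitions={ts}, transversions={tv}, sites={n}, d={d:.4f} subs/site")

# d_s and d_n: S and N are from the assignment; the counts are hypothetical.
S, N = 174, 690
syn_diffs, nonsyn_diffs = 10, 5
d_s = jukes_cantor(syn_diffs / S)
d_n = jukes_cantor(nonsyn_diffs / N)
print(f"d_s={d_s:.4f}, d_n={d_n:.4f}")
```

Note that the same correction is applied three times, each with a different denominator: all compared sites for the overall distance, S for d_s, and N for d_n.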

Part 4. Population Genetics

  • 11/20 (TH). Instructor traveling. No class
  • 11/24 (M). Mechanism of molecular evolution: Overview & SNP statistics
Assignment #10 (10 pts; Due 12/4, Thursday)
Snp-pa1.png

The figure on the left shows a codon alignment of 38 strains of a bacterium, together with an outgroup sequence (which starts with a string of SNPs: "....g...c..ca..", etc.). Answer the following questions with the outgroup sequence excluded. Do not print the figure directly; hand-copy the sequences to a graph sheet, including only the sequences at the two variable codon positions:

  1. There are two SNP sites. For each SNP, determine whether it is a synonymous or nonsynonymous change (could be both if more than 2 states). You may simply list the codons and their corresponding amino acids, at each aligned codon site.
  2. Calculate allele frequencies at each SNP site (for 3 SNP states, calculate frequencies of all three separately)
  3. List all haplotypes using the 2 SNP sites
  4. Calculate frequencies of all haplotypes
  5. Using the outgroup sequence, determine the ancestral and derived SNP, codon, and amino-acid states at each codon site. Without the outgroup sequence, could derived and ancestral states be determined (e.g., by majority)? Explain with a tree including the outgroup sequence.
  6. (Bonus: +2) For sites that are fixed differences between the outgroup sequences and others (e.g., the 5th nucleotide site), could one determine which is the ancestral and which is the derived state? Explain with a tree.
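Allele and haplotype frequencies (items 2 to 4) are simple proportions, as the following minimal Python sketch shows. The eight two-SNP haplotypes are made up for illustration and are not the data in the figure; the helper name `frequencies` is hypothetical.

```python
from collections import Counter

def frequencies(items):
    """Return {state: proportion} for an iterable of observed states."""
    counts = Counter(items)
    total = sum(counts.values())
    return {state: count / total for state, count in counts.items()}

# Made-up sample: each string is one strain's bases at SNP site 1 and site 2.
haplotypes = ["AC", "AC", "AT", "GC", "GC", "GC", "AC", "GT"]

site1 = frequencies(h[0] for h in haplotypes)  # allele frequencies, SNP 1
site2 = frequencies(h[1] for h in haplotypes)  # allele frequencies, SNP 2
hap = frequencies(haplotypes)                  # haplotype frequencies

print("SNP 1:", site1)
print("SNP 2:", site2)
print("haplotypes:", hap)
```

The frequencies at each site, and across all haplotypes, should each sum to 1, which is a quick check on hand calculations.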
  • 12/1 (M). Instructor traveling. No class
  • 12/4 (TH). Genetic Drift
  • 12/8 (M). Neutral Theory & Molecular Clock
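  • 12/11 (TH). Tests of Natural Selection. Lecture slides: Part-4-evol-mechanisms.pdf
  • 12/15 (M). Review & Course evaluations. Review slides: Final-review-slides.pdf. Submit your Teacher's Evaluation, using either a personal computer at www.hunter.cuny.edu/te or a smartphone at www.hunter.cuny.edu/mobilete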
  • 12/18 (TH) Comprehensive Final Exam (Regular class hours & Room)
  • 12/31 (Wed). Grades Submitted to Registrar Offices (Hunter and Graduate Center)