BIOL200 2013: Difference between revisions

Revision as of 19:01, 4 March 2013

EXPERIMENT # 4

BIOL 200 Cell Biology II LAB, Spring 2013

Hunter College of the City University of New York

Course information

Instructors: TBD

Class Hours: Room TBD HN; TBD

Office Hours: Room 830 HN; Thursdays 2-4pm or by appointment

Contact information:

Dr. Weigang Qiu: weigang@genectr.hunter.cuny.edu, 1-212-772-5296

Experiment #4

The Tree of Life and Molecular Identification of Microorganisms

Objective

To classify microorganisms and determine their relatedness using molecular sequences.

LAB REPORT GRADING GUIDE

CELL BIO II Experiment #4:

Introduction 1 point :

 Statement of objectives or aims of the experiment in the student’s own words.
 (not to be copied from the Lab Manual)

MATERIALS AND METHODS 0 points :

 This should be a brief synopsis and must include any changes or deviations 
 from the procedures outlined in the Lab Manual. Specify which organisms were 
 used to create the phylogram.

RESULTS 4 points :

 A print out of the phylogram will suffice.

DISCUSSION 4 points :

 Responses to discussion questions.

SUMMARY / CONCLUSION 1 point :

 Two sentence summary of your findings.

REFERENCES 1 point :

 Credit is given for pertinent references obtained from sources other than the Lab Manual.
 This point is in addition to the 10 for the lab report.

INTRODUCTION

Introduction
Evolution can be defined as descent with modification. In other words, changes in the nucleotide sequence of an organism’s genomic DNA are inherited by the next generation. It follows that all organisms are related through descent from a common ancestor that lived in the distant past. Since that time, about 4 billion years ago, life has undergone an extensive process of change as new kinds of organisms arose from earlier ones. The evolutionary history of a group is called a phylogeny, and it can be represented by a phylogram (Figure 1). A major goal of evolutionary analysis is to understand this history. We have no direct knowledge of the path of evolution because the ancestral organisms are extinct, so phylogeny must be inferred indirectly. Originally, evolutionary analysis was based on the organisms’ morphology and metabolism. This is the basis for traditional schemes such as the Linnaean classification and the “Five Kingdoms” scheme. However, this method can lead to mistaken relationships. Unrelated species living in the same environment may evolve similar morphologies in response to the same environmental factors (convergent evolution). Such similarities say nothing about how closely related the organisms are; they are a direct result of shared surroundings. With the advent of genomics, organisms can instead be grouped by sequence relatedness. Since evolution is a process of inherited nucleotide change, analyzing DNA sequence differences allows a more reliable reconstruction of phylogenetic history.
Figure 1. Tree of life based on 16S ribosomal RNA (image credit: NR Pace, Science 1997)
Of course, when comparing DNA sequences, the question arises of which genes to use. The most widely used are the 16S rRNA gene in prokaryotes and the 18S rRNA gene in eukaryotes. These genes code for small subunit ribosomal RNA and are used for evolutionary analysis because they 1) are found in all organisms, 2) are functionally conserved, 3) vary only slightly between organisms (their nucleotide sequences have changed slowly throughout evolution), and 4) have adequate length. In this lab, you will perform an evolutionary analysis by constructing a phylogram of 15 microbes spanning bacteria, archaea and eukarya. You will find and download rRNA sequences, align them, and use that alignment to create a phylogram.
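To make the idea of "sequence differences" concrete, here is a minimal sketch of the simplest distance measure: the proportion of aligned sites at which two sequences differ. This is only an illustration. The function name, the taxon labels and the short sequences are made up for this example (they are not real rRNA data), and the alignment program used in this lab may compute distances differently.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which two sequences differ.

    Both sequences must come from the same alignment (equal length).
    Sites where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return differing / compared


# Made-up toy alignment for illustration only (NOT real 16S rRNA data).
toy_alignment = {
    "taxon_X": "ACGTACGTAC",
    "taxon_Y": "ACGTACGTTC",
    "taxon_Z": "ACGAAC-TTG",
}

if __name__ == "__main__":
    names = sorted(toy_alignment)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            d = p_distance(toy_alignment[first], toy_alignment[second])
            print(first, second, round(d, 3))
```

A smaller distance suggests a more recent common ancestor; tree-building programs turn a table of such pairwise distances into branch lengths.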

MATERIALS

Required hardware: Computer

Procedure
1. Examine Table 1 and select representative species from Bergey’s Manual. Select 2 prokaryotic species from each group, giving 14 prokaryotic species in total. Also select the eukaryotic representative, Saccharomyces cerevisiae.
2. Access the NCBI website: http://www.ncbi.nlm.nih.gov/
3. Under the “Search” category, select “Nucleotide”.
4. Under the “for” category, type the accession number for your first organism, and hit the “Go” button. This takes you to the record for the 16S rRNA of your organism.
5. Download the 16S rRNA sequence for your first organism by choosing “FASTA” under the “Display” category.
6. Copy and paste the entire output into a Microsoft Word file.
7. Edit the sequence ID to match the format “Genus_Species_Genbank#” (e.g., >Escherichia_coli_174375).
8. Repeat the process for all of your organisms, pasting the sequences into the same Microsoft Word file. (Note: be sure to place a blank line between sequence entries.)
9. Access the EMBL CLUSTALW alignment website: http://www.ebi.ac.uk/Tools/clustalw/, and copy and paste the entire contents of your Microsoft Word file into the area that asks you to “Enter or paste a set of sequences in any supported format”.
10. Click “Run”. This program will make an alignment of all of your sequences.
11. Click “Show as Phylogram Tree” to create a tree showing the relatedness of your organisms based on their 16S rRNA sequences.
12. To print your phylogram tree:
 a. hit the “Print Screen” button on your keyboard
 b. open the Paint program from the “Accessories” menu on your computer
 c. hit paste to paste your screen
 d. “select” your phylogram tree
 e. copy and paste it into a new Paint file
 f. print your tree and email it to yourself
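The header-editing and pasting steps of the procedure can also be done with a short script instead of by hand. The sketch below is optional and is not part of the lab manual. The function names are hypothetical, and the header text and sequence in the example are placeholders rather than real NCBI output.

```python
def relabel_fasta(fasta_text: str, genus: str, species: str, genbank_id: str) -> str:
    """Replace the header of a single-record FASTA entry with
    '>Genus_Species_GenbankID' and drop any blank lines inside the record."""
    lines = [ln.strip() for ln in fasta_text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA record starting with '>'")
    header = f">{genus}_{species}_{genbank_id}"
    return "\n".join([header] + lines[1:])


def combine_records(records: list[str]) -> str:
    """Join relabelled records with one blank line between entries."""
    return "\n\n".join(records) + "\n"


if __name__ == "__main__":
    # Placeholder text standing in for what you would copy from NCBI.
    copied = ">placeholder header copied from NCBI\nACGTACGTAC\nGGTTAACC\n"
    record = relabel_fasta(copied, "Escherichia", "coli", "174375")
    print(combine_records([record]))
```

Saving the combined text as a plain-text file (rather than a word-processor document) avoids hidden formatting characters when you paste it into the alignment website.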

Table 1

Volume 1A (Gram-negative bacteria)
Escherichia coli	ACCESSION #174375
Helicobacter pylori	ACCESSION #402670
Salmonella typhi	ACCESSION #2826789
Serratia marcescens	ACCESSION #4582213
Treponema pallidum	ACCESSION #176249
Additional species: Agrobacterium tumefaciens, Bordetella pertussis, Thermus aquaticus, Yersinia pestis, Borrelia burgdorferi. (Note: To search for unlisted 16S sequences, type key words such as “yersinia AND 16S [gene]” in the NCBI GenBank search box.)
Volume 1B (Rickettsias and endosymbionts)
Bartonella bacilliformis	ACCESSION #173825
Chlamydia trachomatis	ACCESSION #2576240
Rickettsia rickettsii	ACCESSION #538436
Additional species: Coxiella burnetii, Thermoplasma acidophilum
Volume 2A (Gram-positive bacteria)
Bacillus subtilis	ACCESSION #8980302
Deinococcus radiodurans	ACCESSION #145033
Staphylococcus aureus	ACCESSION #576603
Additional species: Bacillus anthracis, Clostridium botulinum, Lactobacillus acidophilus, Streptococcus pyogenes
Volume 2B (Mycobacteria and nocardia)
Mycobacterium haemophilum	ACCESSION #406086
Mycobacterium tuberculosis	ACCESSION #3929878
Additional species: Mycobacterium bovis, Nocardia orientalis
Volume 3A (Phototrophs, chemolithotrophs, sheathed bacteria, gliding bacteria)
Anabaena sp.	ACCESSION #39010
Cytophaga latercula	ACCESSION #37222646
Nitrobacter winogradskyi	ACCESSION #402722
Additional species: Heliothrix oregonensis, Myxococcus fulvus, Thiobacillus ferrooxidans
Volume 3B (Archaeobacteria)
Methanococcus jannaschii	ACCESSION #175446
Thermotoga subterranea	ACCESSION #915213
Additional species: Desulfurococcus mucosus, Halobacterium salinarium, Pyrococcus woesei
Volume 4 (Actinomycetes)
Actinomyces bowdenii	ACCESSION #6456800
Actinomyces neuii	ACCESSION #433527
Actinomyces turicensis	ACCESSION #642970
Eukaryotic representative (used as outgroup for rooting the phylogenetic tree)
Saccharomyces cerevisiae	ACCESSION #172403

ANALYSIS

Analyzing your phylogram
A phylogram is composed of nodes and branches (Figure 2). The internal nodes represent extinct ancestors, and the tips of the branches, also called nodes, are individual strains of microorganisms that exist now and from which the sequence data were obtained. The internal nodes are points in evolution where an extinct ancestor diverged into two new entities, each of which began to accumulate differences during its subsequent independent evolution. The branches define the order of descent and the ancestry of the nodes. The branch length represents the number of changes that have occurred along that branch. Thus, the more recently two organisms shared a common ancestor, the more closely related they are. Trees can be either “unrooted” or “rooted”. Unrooted trees show the relationships among the microorganisms under study, but not the evolutionary path leading from an ancestor to a strain.
Figure 2. Phylogram with internal nodes (a, b, c, d) and tips (1, 2, 3, 4, 5). Nodes at the tips are species that exist today, and internal nodes are extinct ancestors.
A rooted tree shows the unique path from an ancestor (internal node) to each strain. Trees are rooted by inclusion of an outgroup in the analysis. An outgroup is an organism that is less closely related to the other organisms under study than the organisms are to each other.
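The vocabulary above (tips, internal nodes, branch lengths, outgroup) can be illustrated with a small rooted tree stored as a table of parents. This is a sketch under stated assumptions: the topology, node names and branch lengths are invented for illustration and do not correspond to Figure 2 or to any real organisms.

```python
# Hypothetical rooted tree: node -> (parent, length of the branch to the parent).
# The root has no parent. The outgroup attaches directly to the root.
toy_tree = {
    "root": (None, 0.0),
    "n1": ("root", 0.1),
    "outgroup": ("root", 0.5),
    "n2": ("n1", 0.2),
    "tipC": ("n1", 0.3),
    "tipA": ("n2", 0.1),
    "tipB": ("n2", 0.15),
}


def path_to_root(tree, node):
    """List the node and all of its ancestors, ending at the root."""
    path = [node]
    while tree[node][0] is not None:
        node = tree[node][0]
        path.append(node)
    return path


def mrca(tree, a, b):
    """Most recent common ancestor of nodes a and b."""
    ancestors_a = set(path_to_root(tree, a))
    for node in path_to_root(tree, b):
        if node in ancestors_a:
            return node
    raise ValueError("nodes are not in the same tree")


def patristic_distance(tree, a, b):
    """Sum of branch lengths along the path connecting a and b."""
    ancestor = mrca(tree, a, b)
    total = 0.0
    for start in (a, b):
        node = start
        while node != ancestor:
            parent, length = tree[node]
            total += length
            node = parent
    return total


if __name__ == "__main__":
    for pair in [("tipA", "tipB"), ("tipA", "tipC"), ("tipA", "outgroup")]:
        print(pair, mrca(toy_tree, *pair), round(patristic_distance(toy_tree, *pair), 3))
```

Two tips whose most recent common ancestor lies closer to the tips are more closely related than two tips whose common ancestor lies closer to the root; the summed branch lengths give the amount of change separating them.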

DISCUSSION

Discussion Questions
1. Answer the following questions based on the Tree of Life shown in Figure 1.
 a. What do internal and terminal nodes represent?
 b. What do branch lengths represent? What are the unit and meaning of the scale bar?
 c. Identify the positions of humans (Homo), corn (Zea), E. coli, and Bacillus on the tree. Use the scale bar to estimate which pair is evolutionarily more distant: human/corn or E. coli/Bacillus?
2. In Figure 2, which two species are more closely related: 1 and 2, 2 and 3, or 1 and 4? Which are more distantly related? How did you determine this?
3. In Figure 2, is 1 more, less, or equally related to 4 and 5? Explain your rationale.
4. List and describe the key steps of constructing a phylogenetic tree.
5. Why do we use 18S rRNA information for yeast and 16S for prokaryotes? Could we use other molecules as phylogenetic markers? What constitutes a “good” phylogenetic marker for building a tree of life?

Bonus Question
Define 16S “phylo-species” and “metagenomics”. Describe how PCR amplification and sequencing of 16S rRNA molecules from environmental microbial samples (e.g., sea water, soil, human gut, hot springs) can be used to define the species composition of an environment.

References

Reference & Resource
Jungck, J. R.; Fass, M. F.; Stanley, E. D. (ed.). 2003 (2006 Revision). Microbes Count! Problem Posing, Problem Solving, and Peer Persuasion in Microbiology. BioQUEST Curriculum Consortium. (Chapter 6, pg 191)
Holt, J. G., Editor-in-Chief (1984). Bergey’s Manual of Systematic Bacteriology, Volume 1-4. Williams & Wilkins: Baltimore. http://www.cme.msu.edu/bergeys/pubinfo.html

@@ Line 336: / Line 336: @@
 #Holt. J. G. Editor-in-Chief (1984). Bergey’s Manual of Systematic Bacteriology, Volume 1-4. Williams & Wilkins: Baltimore. http://www.cme.msu.edu/bergeys/pubinfo.html
 |}
-===March 12===
-*Chapter 3. Molecular Evolution [[Media:CH3.pdf|Lecture Slides Ch.3-Che]]
-* '''Homework:''' (TBA)
-===March 19===
-*REVIEW Session for MID-TERM EXAMS
-<!--*Assignment #7. '''(To be posted)'''
-Questions & Problems (pg.54-55): 2.1, 2.2, 2.3, 2.4-->
-===March 26===
-*MID-TERM
-<!--*Assignment #8. '''(To be posted)'''
-Questions & Problems (pg.75-76): 3.1, 3.2, 3.3 (use first ten codons), 3.4, 3.5, 3.7-->
-===April 2===
-*'''Chapter 4.''' Phylogenetics I. Distance Methods  [[Media:CH4.pdf|Lecture Slides Ch.4-Che]]
-*"Tree Thinking" Puzzles - ([http://diverge.hunter.cuny.edu/~weigang/lab-website/SummerWorkshop/Baum_etal05_sup_part1.pdf Download])
-*'''Tutorial:''' PROTDIST and NEIGHBOR using [http://mobyle.pasteur.fr/cgi-bin/portal.py#welcome Mobyle Pasteur]
-{| class="collapsible collapsed wikitable"
-|- style="background-color:lightsteelblue;"
-! Assignment #6
-|-style="background-color:powderblue;"
-| '''Chapter 4 ''' Questions & Problems (pg.95-96): 4.1, 4.3, 4.4, 4.7, 4.8
-|}
-===April 9===
-*'''Chapter 5.''' Phylogenetics II. Character-Based Methods  [[Media:CH4.pdf|Lecture Slides Ch.5-Che]]
-*'''Tutorial:''' DNAML and bootstrap analysis using [http://mobyle.pasteur.fr/cgi-bin/portal.py#welcome Mobyle Pasteur]
-<!--*Assignment #10. '''(To be posted)'''
-Questions & Problems (pg.115-116): 5.1, 5.2, 5.3, 5.4-->
-===April 16===
-*'''Topic:''' Relational Database and SQL
-*'''Tutorial:''' the Borrelia Genome Database
-*'''Homework:''' SQL-embedded PERL
-{| class="collapsible collapsed wikitable"
-|- style="background-color:lightsteelblue;"
-! Assignment #7
-|- style="background-color:powderblue;"
-| '''SQL-embedded PERL'''<br />
-Continue work on the assignment we began in class. It is reproduced below, with some added functionality.
-Your script will:
-# Retrieve TEN orfs from the orf table that belong to the strain Pko.
-# Find and store the sequences described by those orfs and their lengths.
-# Determine if the orf is on the reference or reverse complement strand, and use that information to print the correct sequence.
-# Print the orf name, sequence, and the length for each orf.
-# '''In addition to printing the above information to the screen,''' write out the sequence information '''(in FASTA format)''' to a file
-called "Pko_orfs.fasta". The sequence ID should be of the form:
- Pko_orfname
-Note that the above will require the use of BioPerl.
-For those looking for extra challenges, you can try adding the following:
-* Ask the user for the strain and contig *names* that they want orfs from, and only retrieve those rows. This means you must find a way
-of obtaining their respective IDs from just their names. Make sure the sequence IDs are informative. They should look like this:
- strainname_contigname_orfname
-* If asking users for input, fail if they gave a strain or contig name which does not exist in the database.
-* Also if asking users for input, the output file's name should be changed to reflect the chosen strain.
-* Ask the user the minimum length the orf is allowed to be, and only print orfs as long, or longer, than what the user specifies.
-Sample scripts will go up slowly, over time, including example SQL statements.
-|-style="background-color:powderblue;"
-| '''Questions from Text''' <br /> (pg.115-116): 5.1, 5.3
-|}
-===April 23===
-'''NO CLASSES''' (Spring recess)
-===April 30===
-*'''Topic:''' Statistics
-*'''In-class exercise:''' [https://docs.google.com/document/d/1wq-s8WpqyURVeGiLUxhEyBvHRDrK__Cr7XjkuLicP-c/edit?hl=en&authkey=CJ2g4qsI R basics and short demonstration of a simple boxplot]
-*'''Tutorial:''' Statistical Visualization using R  [[Media:R-implementations.pdf|Lecture Slides-Che]]
-<!--*Assignment #12. '''(To be posted)'''
-R Exercises-->
-===May 7===
-*'''Chapter 6''' (Gene Expression) & '''Chapter 8''' (Proteomics)
-*'''Tutorial:''' Array Data Visualization and Analysis ([[Media:Array_Data_Visualization_and_Analysis.pdf| Micro-Array Analysis Slides]])
-*'''Homework:'''Data Analysis using R
-{| class="collapsible collapsed wikitable"
-|- style="background-color:lightsteelblue;"
-! Assignment #8
-|-style="background-color:powderblue;"
-| '''Part 1 Data Analysis:'''
-For this assignment, you will use sample data to answer the question: '''Do men and women have different body temperatures?'''
-The file '''temps.txt''' located in ../bio425_2011/data on eniac, contains body temperature data for a sample of adults.
-Use a hypotheses test with α = .05 to answer the above question of interest.
-NOTE: For this part of the assignment you will need to turn in your answer to the question with p-values in addition to the R syntax used. '''Indicate your null hypothesis'''.
-'''Part 2 Gene Expression Data Analysis:'''
-Using the files '''GSM129276_cy3.txt''' & '''GSM129276_cy5.txt''' located in ./bio425_2011/data on eniac, conduct an analysis to produce a histogram of fold changes.
-In addition to the histogram, you will need to turn in the R syntax used in every step of the analysis in R, along with an explanation as to why the step was necessary.
-|-style="background-color:powderblue;"
-| '''Read'''
-'''For next class, read CH 7'''
-|}
-===May 14===
-*'''Chapter 7.''' Protein Structure Prediction
-<!--*Assignment #14 (Final Comprehensive Project). '''(To be posted)'''-->
-===May 21===
-*Final Project Due (TBA)
-==Useful Links==
-===Unix Tutorials===
-*A very nice [http://www.ee.surrey.ac.uk/Teaching/Unix/ UNIX tutorial] (you will only need up to, and including, tutorial 4).
-*FOSSWire's [http://files.fosswire.com/2007/08/fwunixref.pdf Unix/Linux command reference] (PDF). Of use to you: "File commands", "SSH", "Searching" and "Shortcuts".
-===Perl Help===
-* Professor Stewart Weiss has taught CSCI132, a UNIX and Perl class. His slides go into much greater detail and are an invaluable resource. They can be found on his course page [http://compsci.hunter.cuny.edu/~sweiss/course_materials/csci132/csci132_f10.php here].
-* Perl documentation at [http://perldoc.perl.org perldoc.perl.org]. Besides that, running the perldoc command before either a function (with the -f option ie, perldoc -f substr) or a perl module (ie, perldoc Bio::Seq) can get you similar results without having to leave the terminal.
-===Bioperl===
-* BioPerl's [http://www.bioperl.org/wiki/HOWTOs HOWTOs page].
-* BioPerl-live [http://doc.bioperl.org/bioperl-live developer documentation]. (We use bioperl-live in class.)
-* Yozen's tutorial on [http://diverge.hunter.cuny.edu/wiki/HOWTO:Bioperl-live_on_Mac_OS_X installing bioperl-live on your own Mac OS X machine]. (Let me know if there are any issues!).
-* [https://spreadsheets.google.com/pub?key=0AjfPzjrqY7BndHpyRHlDZUlGcktINm1IbXVzX1QzMXc&single=true&gid=0&output=html A small table] showing some methods for BioPerl modules with usage and return values.
-===SQL===
-* [https://docs.google.com/document/d/1zYLPeenwsqPYchkpXnndzphBbTKqX2GjjLHDxlBnt78/edit?hl=en&authkey=CLnh_88K SQL Primer], written by Yozen.
-===R Project===
-* Install location and instructions for [http://lib.stat.cmu.edu/R/CRAN/bin/windows/base/ Windows]
-* Install location and instructions for [http://lib.stat.cmu.edu/R/CRAN/ Mac OS X]
-* For users of Ubuntu/Debian:
- sudo apt-get install r-base-core
-* For users of Fedora/Red Hat:
- su -
- yum install R
-===Utilities===
-*An [https://chrome.google.com/webstore/detail/nlbjncdgjeocebhnmkbbbdekmmmcbfjd RSS button extension] for chrome. Can add feeds to Google Reader and others.
-*A [https://chrome.google.com/webstore/detail/hcamnijgggppihioleoenjmlnakejdph similar extension] which adds a "Live bookmarks"-like feature to Chrome (like Firefox's RSS bookmarks).
-===Other Resources===
-* [http://www.ccrnp.ncifcrf.gov/~toms/papers/primer/primer.pdf Information Theory Primer] by Thomas D. Schneider. Useful in understanding sequence logo maps.
 © Weigang Qiu, Hunter College, Last Update Jan 2013