Population Genomics Course
Latest revision as of 04:54, 24 June 2013
Learning Goals
- Identify lineage-specific genomic changes in pathogens
- Estimate recombination, mutation, and selection in natural pathogen populations
Learning outcomes
- Be able to construct genome trees using genome-wide SNPs
- Use genome trees to identify orthologs and paralogs, and gene gains and losses
- Detect recombination among bacterial genomes
- Use coalescent trees to describe the process of microbial genome evolution
Syllabus
Part 1. Introduction & Overview
- Lecture: 8:30-9:30
- Population processes
- Recombination: Muller's Ratchet; Hill-Robertson effect
- Recombination and natural selection: Background selection & selective sweeps
- Applications
- GWAS
- Population history: phylogeny, structuring, gene flow, and selective sweeps (e.g., Neandertal genomes; Borrelia burgdorferi in Northeast US)
- Genomic surveillance of infectious diseases
- Bioinformatics pipeline/protocol
- In-Class Exercise: Software setup & data download
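As a rough illustration of the Muller's Ratchet topic listed above, the following is a minimal sketch of an asexual Wright-Fisher population that accumulates deleterious mutations without recombination or back mutation. The function name, parameter names, and default values are hypothetical choices made for illustration and are not part of the course materials.

```python
import random


def mullers_ratchet(pop_size=200, mut_rate=0.3, sel_coef=0.02,
                    generations=200, seed=1):
    """Return the minimum mutation load in each generation.

    All parameter values are illustrative assumptions, not course settings.
    """
    rng = random.Random(seed)
    loads = [0] * pop_size  # deleterious mutations carried by each individual
    min_load_history = []
    for _ in range(generations):
        # Fitness declines multiplicatively with each deleterious mutation.
        weights = [(1.0 - sel_coef) ** k for k in loads]
        # Asexual reproduction: each offspring copies one parent's load.
        parents = rng.choices(loads, weights=weights, k=pop_size)
        # Each offspring gains one new mutation with probability mut_rate
        # (no back mutation, no recombination).
        loads = [k + (1 if rng.random() < mut_rate else 0) for k in parents]
        min_load_history.append(min(loads))
    return min_load_history


if __name__ == "__main__":
    history = mullers_ratchet()
    print("least-loaded class at the end:", history[-1])
```

Because offspring can never carry fewer mutations than their parent, the least-loaded class can be lost but never regained, which is the "click" of the ratchet.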
Part 2. Building genome phylogeny/Geographic structuring/Population growth?
- In-class exercise: 10:00-11:30
- Data set: cp26 plasmids from 23 B. burgdorferi sensu lato genomes
- Genome alignment: MUGSY & Alignment viewer: Gmaj
- Genome tree: FastTree
- Tree re-rooting: R package APE (see syllabus)
- Interactive tree viewer: trexonline
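To make the link between a genome alignment and a genome tree more concrete, here is a minimal sketch that computes pairwise SNP proportions (p-distances) from aligned sequences. This is not how FastTree works internally; it only shows the kind of pairwise signal that a tree summarizes. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, skip_chars="-N"):
    """Proportion of differing sites, ignoring gaps and ambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue  # skip columns with a gap or N in either sequence
        compared += 1
        if a != b:
            diffs += 1
    return diffs / compared if compared else float("nan")


def distance_matrix(alignment):
    """Return a nested dict of pairwise p-distances for {name: sequence}."""
    names = list(alignment)
    dist = {name: {name: 0.0} for name in names}
    for x, y in combinations(names, 2):
        d = p_distance(alignment[x], alignment[y])
        dist[x][y] = d
        dist[y][x] = d
    return dist


if __name__ == "__main__":
    # Toy alignment (made-up sequences, not the cp26 data set).
    toy = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "AC-TTCGA"}
    for name, row in distance_matrix(toy).items():
        print(name, row)
```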
Part 3. Estimation of recombination rate
- In-class exercise: 2:00-3:00
- Data set: three pairs of sister-group cp26 plasmids
- MUGSY visualization
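- LDhat: Instructions (http://ldhat.sourceforge.net/instructions.shtml); PDF Manual (http://ldhat.sourceforge.net/manual.pdf); based on the "Four-Gamete Test" by Hudson & Kaplan, 1985 (http://www.genetics.org/content/111/1/147.full.pdf+html)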
- Download and Install:
svn checkout https://ldhat.svn.sourceforge.net/svnroot/ldhat; make; make clean
- Data: cp26.ss.fas
- Convert to LDhat file format:
../ldhat/convert -seq cp26.ss.fas -2only -prefix cp26
- Generate likelihood lookup table:
../ldhat/lkgen -lk lk_n100_t0.01 -nseq 14; mv new_lk.txt lk_n14_t0.01
- Estimate recombination rates:
../ldhat/pairwise -lk lk_n14_t0.01 -seq cp26sites.txt -loc cp26locs.txt
- Identify hotspots:
../ldhat/interval -lk lk_n14_t0.01 -seq cp26sites.txt -loc cp26locs.txt
- Summarize results:
../ldhat/stat -input bounds.txt -burn 500 -loc cp26locs.txt
- Visualize in RStudio
source("http://ldhat.sourceforge.net/R/coalescent.r")
- Plot "outfile.txt"
- Plot "fit.txt"
- Plot "res.txt"
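The four-gamete test cited above can be sketched in a few lines. Under the infinite-sites model, a pair of biallelic sites that shows all four haplotype combinations implies at least one recombination event between them. The code below is an illustrative sketch with hypothetical function names and a toy data set; it is not LDhat's implementation and does not compute the Hudson & Kaplan minimum number of recombination events.

```python
from itertools import combinations


def four_gamete_incompatible(site_i, site_j):
    """True if two biallelic sites show all four allele combinations."""
    return len(set(zip(site_i, site_j))) == 4


def incompatible_pairs(haplotypes):
    """Return index pairs of biallelic sites that fail the four-gamete test.

    haplotypes: list of equal-length strings, one per sampled sequence.
    """
    n_sites = len(haplotypes[0])
    # Transpose the alignment into one string per site (column).
    columns = ["".join(h[k] for h in haplotypes) for k in range(n_sites)]
    # Keep only sites with exactly two alleles.
    biallelic = [k for k, col in enumerate(columns) if len(set(col)) == 2]
    return [(i, j) for i, j in combinations(biallelic, 2)
            if four_gamete_incompatible(columns[i], columns[j])]


if __name__ == "__main__":
    # Toy haplotypes (made up for illustration, not the cp26 data).
    toy = ["AAT", "AGT", "CAT", "CGC"]
    print(incompatible_pairs(toy))
```

In the toy data, sites 0 and 1 carry the combinations AA, AG, CA, and CG, so that pair is reported, while the pairs involving site 2 show only three combinations.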
Part 4. Simulation of natural selection & Summary
- In-class exercise: 3:30-5:00
- ms, seq-gen; Genomes
- BacSim
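For a sense of what a simulation of natural selection involves, the following is a minimal forward-in-time sketch of a haploid Wright-Fisher population with one favored allele. It does not reproduce ms, seq-gen, or BacSim; the function name, parameters, and default values are hypothetical and chosen only for illustration.

```python
import random


def wright_fisher_selection(pop_size=500, sel_coef=0.05, init_freq=0.1,
                            generations=200, seed=42):
    """Return the frequency trajectory of a favored allele.

    Assumes a haploid population of constant size with relative fitness
    1 + sel_coef for the favored allele (illustrative assumptions).
    """
    rng = random.Random(seed)
    freq = init_freq
    trajectory = [freq]
    for _ in range(generations):
        # Selection: expected frequency after weighting by relative fitness.
        expected = freq * (1.0 + sel_coef) / (1.0 + sel_coef * freq)
        # Drift: binomial sampling of pop_size offspring.
        count = sum(1 for _ in range(pop_size) if rng.random() < expected)
        freq = count / pop_size
        trajectory.append(freq)
        if freq in (0.0, 1.0):
            break  # allele lost or fixed; nothing more can change
    return trajectory


if __name__ == "__main__":
    traj = wright_fisher_selection()
    print("generations simulated:", len(traj) - 1)
    print("final frequency:", traj[-1])
```

Changing `seed` gives different trajectories for the same parameters, which shows how drift can lose even a favored allele when it starts at low frequency.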