Pseudomonas population genomics: Difference between revisions

Revision as of 21:23, 18 June 2013

Projects

Build a local genome database
1. Database schema:
  1. "genome": genome_id, strain_name, ncbi_taxid
  2. "orf": genome_id, locus_tag, start, stop, strand, genome_name, product_name
  3. "orth_orf": orth_orf_id, locus_name, genome_id, orth_class
2. Parsing scripts
  1. Rayees's parsing code. It requires that you first remove columns 9-27 from the input using the bash command: cut -f 1-8 (keeps tab-delimited columns 1-8; I will write a bash script that does this and runs the program) https://www.dropbox.com/s/lpxxbkxeyw7frrn/parser.pl
3. Database loading scripts
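
The three tables listed under "Database schema" can be sketched as a small SQLite database. This is only an illustration: the table and column names come from the list above, but the choice of SQLite, the column types, the key constraints, and the helper names `create_database` and `table_columns` are all assumptions, not part of the project's actual loading scripts.

```python
import sqlite3

# Table and column names follow the schema list above. The SQLite backend,
# the column types and the key constraints are assumptions for illustration.
SCHEMA = """
CREATE TABLE IF NOT EXISTS genome (
    genome_id   INTEGER PRIMARY KEY,
    strain_name TEXT,
    ncbi_taxid  INTEGER
);
CREATE TABLE IF NOT EXISTS orf (
    genome_id    INTEGER REFERENCES genome(genome_id),
    locus_tag    TEXT,
    "start"      INTEGER,
    "stop"       INTEGER,
    strand       TEXT,
    genome_name  TEXT,
    product_name TEXT
);
CREATE TABLE IF NOT EXISTS orth_orf (
    orth_orf_id INTEGER PRIMARY KEY,
    locus_name  TEXT,
    genome_id   INTEGER REFERENCES genome(genome_id),
    orth_class  TEXT
);
"""


def create_database(path=":memory:"):
    """Open (or create) the database at `path` and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def table_columns(conn, table):
    """Return the column names of `table` in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


if __name__ == "__main__":
    conn = create_database()
    for name in ("genome", "orf", "orth_orf"):
        print(name, table_columns(conn, name))
```

Running the script against the default in-memory database is a quick way to check the schema before pointing `create_database` at a file on disk.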
Molecular Evolution of flagellum genes
1. Download orthologs
2. Reconstruct phylogenetic tree
3. Run PAML tests

Update: June 18, 2013

Ortholog Aligning and Phylogenetic Tree Material and Methods

Protein ortholog data: fleN, fleQ, flhF
Protocol:
1. FASTA headers are too long for the tree, so run rename.pl to shorten the names. Usage: >rename.pl <FASTA_file> > <OUTPUTfilename.fas> (the script will be rewritten to create the output file automatically)
2. To align, use muscle. Usage: >muscle -in <FASTA_File> -out <OUTPUTfilename> -clwstrict
3. To create the tree file, run clustalW2. Usage: >clustalw2 -infile=<Aligned_file> -output=Phylip
4. To create the tree, run R using the following commands:
  1. setwd("<directory containing phylip files>")
  2. library ("ape")
  3. library ("phangorn")
  4. <gene_name> = read.tree("<file_gene_name.phy>")
  5. <gene_name> = midpoint(<gene_name>)
  6. plot(<gene_name>)
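
Step 1 of the protocol (shortening FASTA headers) can be illustrated with a short Python sketch. It is not a port of rename.pl, whose exact renaming rule is not described here. The function name `shorten_headers`, the rule of keeping the first whitespace-delimited token, the 10-character default limit (the name width of strict PHYLIP files), and the sample records are all assumptions for illustration.

```python
def shorten_headers(fasta_text, max_len=10):
    """Return FASTA text whose header names are at most `max_len` characters.

    Assumed rule (not taken from rename.pl): keep the first whitespace-
    delimited token of each header and truncate it to `max_len`. A repeated
    short name gets a numeric suffix; this simple scheme does not check the
    suffixed name against names that already exist in the file.
    """
    seen = {}
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            base = tokens[0][:max_len] if tokens else "seq"
            count = seen.get(base, 0)
            seen[base] = count + 1
            if count == 0:
                name = base
            else:
                suffix = "_" + str(count)
                name = base[: max_len - len(suffix)] + suffix
            out.append(">" + name)
        else:
            # Sequence lines are passed through unchanged.
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical records, made up for this example.
    example = (
        ">strainA_fleN_protein some long description\n"
        "MKVAAL\n"
        ">strainA_fleN_protein another description\n"
        "MKVAAI\n"
    )
    print(shorten_headers(example), end="")
```

The shortened file can then be passed to the alignment step in place of the original FASTA file.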

Gene	Alignment	Notes
fleN	fleN pep alignment	Conserved Domain
fleQ	fleQ pep alignment	Conserved Domain
flhF	flhF pep alignment	Conserved Domain

Benchmark: June 11, 2013

Finish parsing the genome files to upload the "orf" table (Raymond & Rayees)
1. Rayees's parsed genome files: https://www.dropbox.com/sh/k0zktvvmv39op9i/1zBercEky8
Parsing the ortholog file to upload the "orth_orf" table (Raymond)
Identify and download fleN, fleQ, and flhF orthologs & align them (Rayees)

@@ Line 14: / Line 14: @@
 ==Update: June 18, 2013==
-# Material and Methods
+Ortholog Aligning and Phylogenetic Tree
-## Protein ortholog data: [http://pseudomonas.com/alignPolymorphicGeneSequencesStep1.do?feature_id_parent=105676&start=1583956&stop=1584798&replicon_id_reference=136&alphabet=protein&limit_to_species=false fleN], [http://pseudomonas.com/alignPolymorphicGeneSequencesStep1.do?feature_id_parent=104960&start=1187587&stop=1189059&replicon_id_reference=136&alphabet=protein&limit_to_species=false fleQ], [http://pseudomonas.com/alignPolymorphicGeneSequencesStep1.do?feature_id_parent=105674&start=1582528&stop=1583817&replicon_id_reference=136&alphabet=protein&limit_to_species=false flhF]
+Material and Methods
-## Protocol:
+# Protein ortholog data: [http://pseudomonas.com/alignPolymorphicGeneSequencesStep1.do?feature_id_parent=105676&start=1583956&stop=1584798&replicon_id_reference=136&alphabet=protein&limit_to_species=false fleN], [http://pseudomonas.com/alignPolymorphicGeneSequencesStep1.do?feature_id_parent=104960&start=1187587&stop=1189059&replicon_id_reference=136&alphabet=protein&limit_to_species=false fleQ], [http://pseudomonas.com/alignPolymorphicGeneSequencesStep1.do?feature_id_parent=105674&start=1582528&stop=1583817&replicon_id_reference=136&alphabet=protein&limit_to_species=false flhF]
-### Fasta headers are too long for tree, run: [https://www.dropbox.com/s/x2p4joeqg7omfub/rename.pl rename.pl] to shorten names. Usage:<code> >rename.pl <FASTA_file> > <OUTPUTfilename.fas> </code> (will rewrite script to create automatic output file)
+# Protocol:
-### To align use [http://www.drive5.com/muscle/Alignment muscle] Usage: <code> >muscle -in <FASTA_File> -out <OUTPUTfilename> -clwstrict </code>
+## Fasta headers are too long for tree, run: [https://www.dropbox.com/s/x2p4joeqg7omfub/rename.pl rename.pl] to shorten names. Usage:<code> >rename.pl <FASTA_file> > <OUTPUTfilename.fas> </code> (will rewrite script to create automatic output file)
-### To create tree file run [http://www.clustal.org/clustal2/ clustalW2] Usage <code>>clustalw2 -infile= <Aligned_file> -output= Phylip </code>
+## To align use [http://www.drive5.com/muscle/Alignment muscle] Usage: <code> >muscle -in <FASTA_File> -out <OUTPUTfilename> -clwstrict </code>
-### To create tree run [http://www.r-project.org/ R] using following commands:
+## To create tree file run [http://www.clustal.org/clustal2/ clustalW2] Usage <code>>clustalw2 -infile= <Aligned_file> -output= Phylip </code>
-####<code> setwd("<directory containing phylip files>")</code>
+## To create tree run [http://www.r-project.org/ R] using following commands:
-####<code> library ("ape")</code>
+###<code> setwd("<directory containing phylip files>")</code>
-####<code> library ("phangorn")</code>
+###<code> library ("ape")</code>
-####<code> <gene_name> = read.tree("<file_gene_name.phy>")</code>
+###<code> library ("phangorn")</code>
-####<code> <gene_name> = midpoint(gene_name)</code>
+###<code> <gene_name> = read.tree("<file_gene_name.phy>")</code>
-####<code> plot(<gene_name>)</code>
+###<code> <gene_name> = midpoint(gene_name)</code>
+###<code> plot(<gene_name>)</code>
 {| class="wikitable"
 ! Gene !! Alignment !! Tree !! Notes
