Annotate-a-genome

From QiuLab
Revision as of 22:08, 13 February 2014 by imported>Weigang (→‎Protocol)

Project Goals

  • Annotate and add newly sequenced Borrelia genomes to BorreliaBase
  • Build an informatics pipeline for gene prediction, ortholog calls, databasing, and synteny analysis

Download genome sequences from GenBank

Strain   Species                            Group            Genome Sequences  Notes
B31      B. burgdorferi (reference genome)  Lyme Disease     Example
CA382    B. burgdorferi (California)        Lyme Disease     Example
CA8      B. burgdorferi (California)        Lyme Disease     Example
BgVir    B. garinii (Russia)                Lyme Disease     Example
NMJW1    B. garinii (China)                 Lyme Disease     Example
HLJ01    B. afzelii (China)                 Lyme Disease     Example
Ly       B. duttonii (Tanzania)             Relapsing Fever  Example
A1       B. recurrentis (Ethiopia)          Relapsing Fever  Example
DAH      B. hermsii (Washington State)      Relapsing Fever  Example
91E135   B. turicatae (Texas)               Relapsing Fever  Example
Achema   B. crocidurae (Mauritania)         Relapsing Fever  Example
HR1      B. parkeri (??)                    Relapsing Fever  Example
LB-2001  B. miyamotoi (Northeast US)        Relapsing Fever  Example

Protocol

Fetch genome sequences

./bioseq -z 'b31_accession' -o 'genbank' > b31.gb # Reference genome for ortholog identification; use the accession of the main chromosome, cp26, or lp54
./bioseq -z 'gb_accession' -o 'genbank' > new.gb # New genome to be annotated
./gb2fas -n b31.gb > b31.nuc # Extract CDS nucleotide sequences
./gb2fas -n new.gb > new.nuc
./bioseq -t b31.nuc > b31.pep # Translate CDS to protein (sequences with internal stop codons are removed)
./bioseq -t new.nuc > new.pep
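
Because the translation step drops sequences with internal stop codons, a .pep file may hold fewer records than its .nuc file. A quick way to see how many were dropped is to count FASTA headers in each file. The helper below is a minimal sketch; the function name count_fasta is hypothetical and is not part of bioseq or gb2fas.

```shell
# Count FASTA records (lines starting with ">") in each file given.
# Prints one line per file: <file><TAB><count>
count_fasta() {
    for f in "$@"; do
        # grep -c prints the number of matching lines; it exits non-zero
        # when the count is 0, so the exit status is ignored here.
        n=$(grep -c '^>' "$f" || true)
        printf '%s\t%s\n' "$f" "$n"
    done
}
```

For example, `count_fasta b31.nuc b31.pep new.nuc new.pep` prints one count per file; a large gap between a .nuc and its .pep count may point to frameshifts or pseudogenes worth inspecting by hand.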

Predict orthologs with reciprocal BLAST

makeblastdb -in b31.pep -dbtype prot -parse_seqids # Prepare the reference DB
makeblastdb -in new.pep -dbtype prot -parse_seqids # Prepare the new genome DB
blastp -query new.pep -db b31.pep -outfmt 6 -evalue 1e-3 -out forward_blast.out # Forward BLAST (new genome against reference)
blastp -query b31.pep -db new.pep -outfmt 6 -evalue 1e-3 -out reverse_blast.out # Reverse BLAST (reference against new genome)
./check-reciprocal.pl forward_blast.out reverse_blast.out > new.orthologs 2> new.not-orthologs # Identify orthologs
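
The internals of check-reciprocal.pl are not shown on this page. As an illustration of the general idea only, the sketch below applies one common criterion, reciprocal best hit by bit score, to the two tabular (-outfmt 6) files. It assumes the default 12-column layout (query ID in column 1, subject ID in column 2, bit score in column 12) and that sequence IDs are printed identically in the query and subject columns. The function name reciprocal_best_hits is hypothetical, and the actual script may apply different or additional rules.

```shell
# reciprocal_best_hits FORWARD REVERSE
#   FORWARD: new genome vs. reference, BLAST tabular output (-outfmt 6)
#   REVERSE: reference vs. new genome, BLAST tabular output (-outfmt 6)
# Prints <new_id><TAB><ref_id> for each pair that are each other's top hit.
reciprocal_best_hits() {
    awk -F'\t' '
        # Forward file: keep the highest-bit-score subject for each query.
        FILENAME == ARGV[1] {
            if (!($1 in fbest) || $12 + 0 > fscore[$1]) {
                fbest[$1] = $2; fscore[$1] = $12 + 0
            }
            next
        }
        # Reverse file: same bookkeeping, keyed by the reference protein.
        {
            if (!($1 in rbest) || $12 + 0 > rscore[$1]) {
                rbest[$1] = $2; rscore[$1] = $12 + 0
            }
        }
        # A pair is reported only if each protein is the best hit of the other.
        END {
            for (q in fbest) {
                s = fbest[q]
                if ((s in rbest) && rbest[s] == q) print q "\t" s
            }
        }
    ' "$1" "$2" | sort
}
```

Running `reciprocal_best_hits forward_blast.out reverse_blast.out` gives a pair list that can be compared against the output of check-reciprocal.pl as a cross-check. Ties in bit score are resolved in favor of the first hit listed, which is a simplification.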

Verification with synteny browser

./gb2fas -t new.gb > new-to-orf-table.txt