Borrelia codon usage

From QiuLab
Revision as of 21:03, 31 July 2013 by Rayrah

Data

Report for BBA15

Codon Adaptation Index (CAI) for sequence : 0.47

GC percentage for sequence : 33.58%

GENETIC CODE USED : 1 (more about genetic code: http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi)

CODON USAGE

CODON     GCC  GCA  GCG  GCT  CGA  AGG  CGT  CGG  AGA  CGC  AAC  AAT  GAC  GAT  TGC  TGT  CAA  CAG  GAG  GAA  GGA  GGT  GGG  GGC  CAC  CAT  ATT  ATC  ATA  CTT  CTG  TTA  CTA  CTC  TTG  AAG  AAA  ATG  TTT  TTC  CCG  CCC  CCA  CCT  AGT  TCA  AGC  TCG  TCC  TCT  TAG  TGA  TAA  ACT  ACA  ACC  ACG  TGG  TAT  TAC  GTC  GTG  GTT  GTA
AMINOACID Ala  Ala  Ala  Ala  Arg  Arg  Arg  Arg  Arg  Arg  Asn  Asn  Asp  Asp  Cys  Cys  Gln  Gln  Glu  Glu  Gly  Gly  Gly  Gly  His  His  Ile  Ile  Ile  Leu  Leu  Leu  Leu  Leu  Leu  Lys  Lys  Met  Phe  Phe  Pro  Pro  Pro  Pro  Ser  Ser  Ser  Ser  Ser  Ser  Stop Stop Stop Thr  Thr  Thr  Thr  Trp  Tyr  Tyr  Val  Val  Val  Val
COUNT     1    5    0    7    0    0    0    0    2    0    6    7    11   7    0    1    4    0    4    19   10   4    2    6    0    0    8    0    5    11   0    9    5    0    3    4    39   2    1    2    0    0    0    1    4    10   6    0    1    6    0    0    0    13   14   3    0    1    2    3    0    2    11   11
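
The COUNT row is a tally of codons read in frame from the first base of the sequence. A minimal Python sketch of such a tally follows. The function name `count_codons` and the short example sequence are hypothetical, and how the reporting tool handles incomplete or ambiguous codons is not stated on this page, so the sketch simply skips them.

```python
from collections import Counter

def count_codons(seq):
    """Count in-frame codons of a DNA sequence, reading from position 0."""
    seq = seq.upper()
    counts = Counter()
    # Step through the sequence three bases at a time; a trailing
    # partial codon is ignored.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        # Assumption: codons containing anything other than A, C, G, T
        # are skipped rather than counted.
        if set(codon) <= set("ACGT"):
            counts[codon] += 1
    return counts

# Hypothetical example sequence, not taken from the report.
example = count_codons("ATGGCTGCTGCAAAATAA")
print(example["GCT"], example["GCA"], example["AAA"])  # prints: 2 1 1
```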


Relative Synonymous Codon Usage (RSCU)

CODON     GCC  GCA  GCG  GCT  CGA  AGG  CGT  CGG  AGA  CGC  AAC  AAT  GAC  GAT  TGC  TGT  CAA  CAG  GAG  GAA  GGA  GGT  GGG  GGC  CAC  CAT  ATT  ATC  ATA  CTT  CTG  TTA  CTA  CTC  TTG  AAG  AAA  ATG  TTT  TTC  CCG  CCC  CCA  CCT  AGT  TCA  AGC  TCG  TCC  TCT  TAG  TGA  TAA  ACT  ACA  ACC  ACG  TGG  TAT  TAC  GTC  GTG  GTT  GTA
AMINOACID Ala  Ala  Ala  Ala  Arg  Arg  Arg  Arg  Arg  Arg  Asn  Asn  Asp  Asp  Cys  Cys  Gln  Gln  Glu  Glu  Gly  Gly  Gly  Gly  His  His  Ile  Ile  Ile  Leu  Leu  Leu  Leu  Leu  Leu  Lys  Lys  Met  Phe  Phe  Pro  Pro  Pro  Pro  Ser  Ser  Ser  Ser  Ser  Ser  Stop Stop Stop Thr  Thr  Thr  Thr  Trp  Tyr  Tyr  Val  Val  Val  Val
RSCU      0.31 1.54 0.00 2.15 0.00 0.00 0.00 0.00 6.00 0.00 0.92 1.08 1.22 0.78 0.00 2.00 2.00 0.00 0.35 1.65 1.82 0.73 0.36 1.09 0.00 0.00 1.85 0.00 1.15 2.36 0.00 1.93 1.07 0.00 0.64 0.19 1.81 1.00 0.67 1.33 0.00 0.00 0.00 4.00 0.89 2.22 1.33 0.00 0.22 1.33 0.00 0.00 0.00 1.73 1.87 0.40 0.00 1.00 0.80 1.20 0.00 0.33 1.83 1.83


Relative Adaptiveness of Codon

CODON     GCC  GCA  GCG  GCT  CGA  AGG  CGT  CGG  AGA  CGC  AAC  AAT  GAC  GAT  TGC  TGT  CAA  CAG  GAG  GAA  GGA  GGT  GGG  GGC  CAC  CAT  ATT  ATC  ATA  CTT  CTG  TTA  CTA  CTC  TTG  AAG  AAA  ATG  TTT  TTC  CCG  CCC  CCA  CCT  AGT  TCA  AGC  TCG  TCC  TCT  TAG  TGA  TAA  ACT  ACA  ACC  ACG  TGG  TAT  TAC  GTC  GTG  GTT  GTA
AMINOACID Ala  Ala  Ala  Ala  Arg  Arg  Arg  Arg  Arg  Arg  Asn  Asn  Asp  Asp  Cys  Cys  Gln  Gln  Glu  Glu  Gly  Gly  Gly  Gly  His  His  Ile  Ile  Ile  Leu  Leu  Leu  Leu  Leu  Leu  Lys  Lys  Met  Phe  Phe  Pro  Pro  Pro  Pro  Ser  Ser  Ser  Ser  Ser  Ser  Stop Stop Stop Thr  Thr  Thr  Thr  Trp  Tyr  Tyr  Val  Val  Val  Val
RAC       0.14 0.71 0.00 1.00 0.00 0.00 0.00 0.00 1.00 0.00 0.86 1.00 1.00 0.64 0.00 1.00 1.00 0.00 0.21 1.00 1.00 0.40 0.20 0.60 0.00 0.00 1.00 0.00 0.62 1.00 0.00 0.82 0.45 0.00 0.27 0.10 1.00 1.00 0.50 1.00 0.00 0.00 0.00 1.00 0.40 1.00 0.60 0.00 0.10 0.60 0.00 0.00 0.00 0.93 1.00 0.21 0.00 1.00 0.67 1.00 0.00 0.18 1.00 1.00
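
The RSCU and relative adaptiveness rows can be reproduced from the codon counts. RSCU is a codon's count divided by the mean count of the synonymous codons for the same amino acid, and relative adaptiveness is the count divided by the count of the most used synonym. The Python sketch below applies both definitions to the four alanine counts that open the BBA15 COUNT row (1, 5, 0, 7). The function names are hypothetical, and returning 0 for an amino acid that never occurs is an assumption chosen to match the 0.00 entries in the report.

```python
def rscu(counts):
    """RSCU for the synonymous codons of one amino acid.

    counts maps codon -> observed count. Each count is divided by the
    mean count over the synonymous codons.
    """
    total = sum(counts.values())
    if total == 0:
        # Amino acid absent from the sequence (assumption: report 0).
        return {codon: 0.0 for codon in counts}
    mean = total / len(counts)
    return {codon: n / mean for codon, n in counts.items()}

def relative_adaptiveness(counts):
    """Each count divided by the count of the most used synonymous codon."""
    top = max(counts.values())
    if top == 0:
        return {codon: 0.0 for codon in counts}
    return {codon: n / top for codon, n in counts.items()}

# Alanine counts read from the BBA15 COUNT row.
ala = {"GCC": 1, "GCA": 5, "GCG": 0, "GCT": 7}
print({c: round(v, 2) for c, v in rscu(ala).items()})
# {'GCC': 0.31, 'GCA': 1.54, 'GCG': 0.0, 'GCT': 2.15}
print({c: round(v, 2) for c, v in relative_adaptiveness(ala).items()})
# {'GCC': 0.14, 'GCA': 0.71, 'GCG': 0.0, 'GCT': 1.0}
```

Both printed rows agree with the alanine columns of the BBA15 RSCU and RAC tables.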


Monomers

A: 352
T: 192
G: 154
C: 121
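
The GC percentage reported for BBA15 follows directly from these four base counts. A short Python check, with a hypothetical function name:

```python
def gc_percent(a, t, g, c):
    """GC percentage computed from the four base counts."""
    total = a + t + g + c
    return 100.0 * (g + c) / total

# BBA15 monomer counts from the report.
print(f"{gc_percent(a=352, t=192, g=154, c=121):.2f}%")  # prints: 33.58%
```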


Source code is available: http://search.cpan.org/~shardiwal/Bio-Tools-CodonOptTable-1.05/lib/Bio/Tools/CodonOptTable.pm

Report for BBA24


Codon Adaptation Index (CAI) for sequence : 0.50

GC percentage for sequence : 32.46%

GENETIC CODE USED : 1

CODON USAGE

CODON     GCC  GCA  GCG  GCT  CGA  AGG  CGT  CGG  AGA  CGC  AAC  AAT  GAC  GAT  TGC  TGT  CAA  CAG  GAG  GAA  GGA  GGT  GGG  GGC  CAC  CAT  ATT  ATC  ATA  CTT  CTG  TTA  CTA  CTC  TTG  AAG  AAA  ATG  TTT  TTC  CCG  CCC  CCA  CCT  AGT  TCA  AGC  TCG  TCC  TCT  TAG  TGA  TAA  ACT  ACA  ACC  ACG  TGG  TAT  TAC  GTC  GTG  GTT  GTA
AMINOACID Ala  Ala  Ala  Ala  Arg  Arg  Arg  Arg  Arg  Arg  Asn  Asn  Asp  Asp  Cys  Cys  Gln  Gln  Glu  Glu  Gly  Gly  Gly  Gly  His  His  Ile  Ile  Ile  Leu  Leu  Leu  Leu  Leu  Leu  Lys  Lys  Met  Phe  Phe  Pro  Pro  Pro  Pro  Ser  Ser  Ser  Ser  Ser  Ser  Stop Stop Stop Thr  Thr  Thr  Thr  Trp  Tyr  Tyr  Val  Val  Val  Val
COUNT     1    7    1    7    2    1    0    0    2    0    4    8    2    7    2    2    4    0    3    15   5    2    1    1    0    1    5    0    7    7    0    5    3    1    1    3    24   5    6    2    0    0    3    1    3    5    1    0    0    1    0    0    0    8    8    1    1    0    1    1    2    2    2    4


Relative Synonymous Codon Usage (RSCU)

CODON     GCC  GCA  GCG  GCT  CGA  AGG  CGT  CGG  AGA  CGC  AAC  AAT  GAC  GAT  TGC  TGT  CAA  CAG  GAG  GAA  GGA  GGT  GGG  GGC  CAC  CAT  ATT  ATC  ATA  CTT  CTG  TTA  CTA  CTC  TTG  AAG  AAA  ATG  TTT  TTC  CCG  CCC  CCA  CCT  AGT  TCA  AGC  TCG  TCC  TCT  TAG  TGA  TAA  ACT  ACA  ACC  ACG  TGG  TAT  TAC  GTC  GTG  GTT  GTA
AMINOACID Ala  Ala  Ala  Ala  Arg  Arg  Arg  Arg  Arg  Arg  Asn  Asn  Asp  Asp  Cys  Cys  Gln  Gln  Glu  Glu  Gly  Gly  Gly  Gly  His  His  Ile  Ile  Ile  Leu  Leu  Leu  Leu  Leu  Leu  Lys  Lys  Met  Phe  Phe  Pro  Pro  Pro  Pro  Ser  Ser  Ser  Ser  Ser  Ser  Stop Stop Stop Thr  Thr  Thr  Thr  Trp  Tyr  Tyr  Val  Val  Val  Val
RSCU      0.25 1.75 0.25 1.75 2.40 1.20 0.00 0.00 2.40 0.00 0.67 1.33 0.44 1.56 1.00 1.00 2.00 0.00 0.33 1.67 2.22 0.89 0.44 0.44 0.00 2.00 1.25 0.00 1.75 2.47 0.00 1.76 1.06 0.35 0.35 0.22 1.78 1.00 1.50 0.50 0.00 0.00 3.00 1.00 1.80 3.00 0.60 0.00 0.00 0.60 0.00 0.00 0.00 1.78 1.78 0.22 0.22 0.00 1.00 1.00 0.80 0.80 0.80 1.60


Relative Adaptiveness of Codon

CODON     GCC  GCA  GCG  GCT  CGA  AGG  CGT  CGG  AGA  CGC  AAC  AAT  GAC  GAT  TGC  TGT  CAA  CAG  GAG  GAA  GGA  GGT  GGG  GGC  CAC  CAT  ATT  ATC  ATA  CTT  CTG  TTA  CTA  CTC  TTG  AAG  AAA  ATG  TTT  TTC  CCG  CCC  CCA  CCT  AGT  TCA  AGC  TCG  TCC  TCT  TAG  TGA  TAA  ACT  ACA  ACC  ACG  TGG  TAT  TAC  GTC  GTG  GTT  GTA
AMINOACID Ala  Ala  Ala  Ala  Arg  Arg  Arg  Arg  Arg  Arg  Asn  Asn  Asp  Asp  Cys  Cys  Gln  Gln  Glu  Glu  Gly  Gly  Gly  Gly  His  His  Ile  Ile  Ile  Leu  Leu  Leu  Leu  Leu  Leu  Lys  Lys  Met  Phe  Phe  Pro  Pro  Pro  Pro  Ser  Ser  Ser  Ser  Ser  Ser  Stop Stop Stop Thr  Thr  Thr  Thr  Trp  Tyr  Tyr  Val  Val  Val  Val
RAC       0.14 1.00 0.14 1.00 1.00 0.50 0.00 0.00 1.00 0.00 0.50 1.00 0.29 1.00 1.00 1.00 1.00 0.00 0.20 1.00 1.00 0.40 0.20 0.20 0.00 1.00 0.71 0.00 1.00 1.00 0.00 0.71 0.43 0.14 0.14 0.13 1.00 1.00 1.00 0.33 0.00 0.00 1.00 0.33 0.60 1.00 0.20 0.00 0.00 0.20 0.00 0.00 0.00 1.00 1.00 0.13 0.13 0.00 1.00 1.00 0.50 0.50 0.50 1.00


Monomers

A: 248
T: 139
G: 102
C: 84



Report for BBA68

Codon Adaptation Index (CAI) for sequence : 0.48

GC percentage for sequence : 27.62%

GENETIC CODE USED : 1

CODON USAGE

CODON     GCC  GCA  GCG  GCT  CGA  AGG  CGT  CGG  AGA  CGC  AAC  AAT  GAC  GAT  TGC  TGT  CAA  CAG  GAG  GAA  GGA  GGT  GGG  GGC  CAC  CAT  ATT  ATC  ATA  CTT  CTG  TTA  CTA  CTC  TTG  AAG  AAA  ATG  TTT  TTC  CCG  CCC  CCA  CCT  AGT  TCA  AGC  TCG  TCC  TCT  TAG  TGA  TAA  ACT  ACA  ACC  ACG  TGG  TAT  TAC  GTC  GTG  GTT  GTA
AMINOACID Ala  Ala  Ala  Ala  Arg  Arg  Arg  Arg  Arg  Arg  Asn  Asn  Asp  Asp  Cys  Cys  Gln  Gln  Glu  Glu  Gly  Gly  Gly  Gly  His  His  Ile  Ile  Ile  Leu  Leu  Leu  Leu  Leu  Leu  Lys  Lys  Met  Phe  Phe  Pro  Pro  Pro  Pro  Ser  Ser  Ser  Ser  Ser  Ser  Stop Stop Stop Thr  Thr  Thr  Thr  Trp  Tyr  Tyr  Val  Val  Val  Val
COUNT     2    6    0    3    0    0    0    0    3    0    4    16   1    12   2    0    13   0    4    21   3    2    1    1    2    1    12   5    14   7    2    9    5    0    3    4    33   4    7    2    1    0    1    4    1    5    3    0    2    4    0    0    0    8    0    4    1    1    3    6    0    0    2    1


Relative Synonymous Codon Usage (RSCU)

CODON     GCC  GCA  GCG  GCT  CGA  AGG  CGT  CGG  AGA  CGC  AAC  AAT  GAC  GAT  TGC  TGT  CAA  CAG  GAG  GAA  GGA  GGT  GGG  GGC  CAC  CAT  ATT  ATC  ATA  CTT  CTG  TTA  CTA  CTC  TTG  AAG  AAA  ATG  TTT  TTC  CCG  CCC  CCA  CCT  AGT  TCA  AGC  TCG  TCC  TCT  TAG  TGA  TAA  ACT  ACA  ACC  ACG  TGG  TAT  TAC  GTC  GTG  GTT  GTA
AMINOACID Ala  Ala  Ala  Ala  Arg  Arg  Arg  Arg  Arg  Arg  Asn  Asn  Asp  Asp  Cys  Cys  Gln  Gln  Glu  Glu  Gly  Gly  Gly  Gly  His  His  Ile  Ile  Ile  Leu  Leu  Leu  Leu  Leu  Leu  Lys  Lys  Met  Phe  Phe  Pro  Pro  Pro  Pro  Ser  Ser  Ser  Ser  Ser  Ser  Stop Stop Stop Thr  Thr  Thr  Thr  Trp  Tyr  Tyr  Val  Val  Val  Val
RSCU      0.73 2.18 0.00 1.09 0.00 0.00 0.00 0.00 6.00 0.00 0.40 1.60 0.15 1.85 2.00 0.00 2.00 0.00 0.32 1.68 1.71 1.14 0.57 0.57 1.33 0.67 1.16 0.48 1.35 1.62 0.46 2.08 1.15 0.00 0.69 0.22 1.78 1.00 1.56 0.44 0.67 0.00 0.67 2.67 0.40 2.00 1.20 0.00 0.80 1.60 0.00 0.00 0.00 2.46 0.00 1.23 0.31 1.00 0.67 1.33 0.00 0.00 2.67 1.33


Relative Adaptiveness of Codon

CODON     GCC  GCA  GCG  GCT  CGA  AGG  CGT  CGG  AGA  CGC  AAC  AAT  GAC  GAT  TGC  TGT  CAA  CAG  GAG  GAA  GGA  GGT  GGG  GGC  CAC  CAT  ATT  ATC  ATA  CTT  CTG  TTA  CTA  CTC  TTG  AAG  AAA  ATG  TTT  TTC  CCG  CCC  CCA  CCT  AGT  TCA  AGC  TCG  TCC  TCT  TAG  TGA  TAA  ACT  ACA  ACC  ACG  TGG  TAT  TAC  GTC  GTG  GTT  GTA
AMINOACID Ala  Ala  Ala  Ala  Arg  Arg  Arg  Arg  Arg  Arg  Asn  Asn  Asp  Asp  Cys  Cys  Gln  Gln  Glu  Glu  Gly  Gly  Gly  Gly  His  His  Ile  Ile  Ile  Leu  Leu  Leu  Leu  Leu  Leu  Lys  Lys  Met  Phe  Phe  Pro  Pro  Pro  Pro  Ser  Ser  Ser  Ser  Ser  Ser  Stop Stop Stop Thr  Thr  Thr  Thr  Trp  Tyr  Tyr  Val  Val  Val  Val
RAC       0.33 1.00 0.00 0.50 0.00 0.00 0.00 0.00 1.00 0.00 0.25 1.00 0.08 1.00 1.00 0.00 1.00 0.00 0.19 1.00 1.00 0.67 0.33 0.33 1.00 0.50 0.86 0.36 1.00 0.78 0.22 1.00 0.56 0.00 0.33 0.12 1.00 1.00 1.00 0.29 0.25 0.00 0.25 1.00 0.20 1.00 0.60 0.00 0.40 0.80 0.00 0.00 0.00 1.00 0.00 0.50 0.13 1.00 0.50 1.00 0.00 0.00 1.00 0.50


Monomers

A: 346
T: 199
G: 97
C: 111



Report for BBB18


Codon Adaptation Index (CAI) for sequence : 0.50

GC percentage for sequence : 29.17%

GENETIC CODE USED : 1

CODON USAGE

CODON     GCC  GCA  GCG  GCT  CGA  AGG  CGT  CGG  AGA  CGC  AAC  AAT  GAC  GAT  TGC  TGT  CAA  CAG  GAG  GAA  GGA  GGT  GGG  GGC  CAC  CAT  ATT  ATC  ATA  CTT  CTG  TTA  CTA  CTC  TTG  AAG  AAA  ATG  TTT  TTC  CCG  CCC  CCA  CCT  AGT  TCA  AGC  TCG  TCC  TCT  TAG  TGA  TAA  ACT  ACA  ACC  ACG  TGG  TAT  TAC  GTC  GTG  GTT  GTA
AMINOACID Ala  Ala  Ala  Ala  Arg  Arg  Arg  Arg  Arg  Arg  Asn  Asn  Asp  Asp  Cys  Cys  Gln  Gln  Glu  Glu  Gly  Gly  Gly  Gly  His  His  Ile  Ile  Ile  Leu  Leu  Leu  Leu  Leu  Leu  Lys  Lys  Met  Phe  Phe  Pro  Pro  Pro  Pro  Ser  Ser  Ser  Ser  Ser  Ser  Stop Stop Stop Thr  Thr  Thr  Thr  Trp  Tyr  Tyr  Val  Val  Val  Val
COUNT     1    6    0    9    0    2    1    1    11   1    5    24   7    15   3    4    21   1    3    36   15   5    3    7    1    6    29   7    30   10   0    20   11   2    10   10   42   9    19   4    0    1    6    11   7    9    4    0    1    20   0    0    0    10   9    4    0    3    17   5    1    2    13   14


Relative Synonymous Codon Usage (RSCU)

CODON     GCC  GCA  GCG  GCT  CGA  AGG  CGT  CGG  AGA  CGC  AAC  AAT  GAC  GAT  TGC  TGT  CAA  CAG  GAG  GAA  GGA  GGT  GGG  GGC  CAC  CAT  ATT  ATC  ATA  CTT  CTG  TTA  CTA  CTC  TTG  AAG  AAA  ATG  TTT  TTC  CCG  CCC  CCA  CCT  AGT  TCA  AGC  TCG  TCC  TCT  TAG  TGA  TAA  ACT  ACA  ACC  ACG  TGG  TAT  TAC  GTC  GTG  GTT  GTA
AMINOACID Ala  Ala  Ala  Ala  Arg  Arg  Arg  Arg  Arg  Arg  Asn  Asn  Asp  Asp  Cys  Cys  Gln  Gln  Glu  Glu  Gly  Gly  Gly  Gly  His  His  Ile  Ile  Ile  Leu  Leu  Leu  Leu  Leu  Leu  Lys  Lys  Met  Phe  Phe  Pro  Pro  Pro  Pro  Ser  Ser  Ser  Ser  Ser  Ser  Stop Stop Stop Thr  Thr  Thr  Thr  Trp  Tyr  Tyr  Val  Val  Val  Val
RSCU      0.25 1.50 0.00 2.25 0.00 0.75 0.38 0.38 4.13 0.38 0.34 1.66 0.64 1.36 0.86 1.14 1.91 0.09 0.15 1.85 2.00 0.67 0.40 0.93 0.29 1.71 1.32 0.32 1.36 1.13 0.00 2.26 1.25 0.23 1.13 0.38 1.62 1.00 1.65 0.35 0.00 0.22 1.33 2.44 1.02 1.32 0.59 0.00 0.15 2.93 0.00 0.00 0.00 1.74 1.57 0.70 0.00 1.00 1.55 0.45 0.13 0.27 1.73 1.87


Relative Adaptiveness of Codon

CODON     GCC  GCA  GCG  GCT  CGA  AGG  CGT  CGG  AGA  CGC  AAC  AAT  GAC  GAT  TGC  TGT  CAA  CAG  GAG  GAA  GGA  GGT  GGG  GGC  CAC  CAT  ATT  ATC  ATA  CTT  CTG  TTA  CTA  CTC  TTG  AAG  AAA  ATG  TTT  TTC  CCG  CCC  CCA  CCT  AGT  TCA  AGC  TCG  TCC  TCT  TAG  TGA  TAA  ACT  ACA  ACC  ACG  TGG  TAT  TAC  GTC  GTG  GTT  GTA
AMINOACID Ala  Ala  Ala  Ala  Arg  Arg  Arg  Arg  Arg  Arg  Asn  Asn  Asp  Asp  Cys  Cys  Gln  Gln  Glu  Glu  Gly  Gly  Gly  Gly  His  His  Ile  Ile  Ile  Leu  Leu  Leu  Leu  Leu  Leu  Lys  Lys  Met  Phe  Phe  Pro  Pro  Pro  Pro  Ser  Ser  Ser  Ser  Ser  Ser  Stop Stop Stop Thr  Thr  Thr  Thr  Trp  Tyr  Tyr  Val  Val  Val  Val
RAC       0.11 0.67 0.00 1.00 0.00 0.18 0.09 0.09 1.00 0.09 0.21 1.00 0.47 1.00 0.75 1.00 1.00 0.05 0.08 1.00 1.00 0.33 0.20 0.47 0.17 1.00 0.97 0.23 1.00 0.50 0.00 1.00 0.55 0.10 0.50 0.24 1.00 1.00 1.00 0.21 0.00 0.09 0.55 1.00 0.35 0.45 0.20 0.00 0.05 1.00 0.00 0.00 0.00 1.00 0.90 0.40 0.00 1.00 1.00 0.29 0.07 0.14 0.93 1.00


Monomers

A: 626
T: 496
G: 248
C: 214



Report for Whole Genome

Codon Adaptation Index (CAI) for sequence : 0.57

GC percentage for sequence : 28.86%

GENETIC CODE USED : 1

CODON USAGE

CODON     GCC   GCA   GCG   GCT   CGA   AGG   CGT   CGG   AGA   CGC   AAC   AAT   GAC   GAT   TGC   TGT   CAA   CAG   GAG   GAA   GGA   GGT   GGG   GGC   CAC   CAT   ATT   ATC   ATA   CTT   CTG   TTA   CTA   CTC   TTG   AAG   AAA   ATG   TTT   TTC   CCG   CCC   CCA   CCT   AGT   TCA   AGC   TCG   TCC   TCT   TAG   TGA   TAA   ACT   ACA   ACC   ACG   TGG   TAT   TAC   GTC   GTG   GTT   GTA
AMINOACID Ala   Ala   Ala   Ala   Arg   Arg   Arg   Arg   Arg   Arg   Asn   Asn   Asp   Asp   Cys   Cys   Gln   Gln   Glu   Glu   Gly   Gly   Gly   Gly   His   His   Ile   Ile   Ile   Leu   Leu   Leu   Leu   Leu   Leu   Lys   Lys   Met   Phe   Phe   Pro   Pro   Pro   Pro   Ser   Ser   Ser   Ser   Ser   Ser   Stop  Stop  Stop  Thr   Thr   Thr   Thr   Trp   Tyr   Tyr   Val   Val   Val   Val
COUNT     2000  7080  1187  8110  1277  3427  913   512   11465 584   6857  25195 3762  15623 2397  3330  9874  2224  6834  19297 8360  5151  3452  3285  1808  4294  23760 4165  17140 11759 1426  17941 4742  1385  8043  11148 37595 7794  24077 3813  648   1607  3746  4146  6834  7372  5162  1072  1834  9876  2607  3902  6838  6995  8161  2444  1195  3298  15345 4760  1204  2398  11350 6016


Relative Synonymous Codon Usage (RSCU)

CODON     GCC  GCA  GCG  GCT  CGA  AGG  CGT  CGG  AGA  CGC  AAC  AAT  GAC  GAT  TGC  TGT  CAA  CAG  GAG  GAA  GGA  GGT  GGG  GGC  CAC  CAT  ATT  ATC  ATA  CTT  CTG  TTA  CTA  CTC  TTG  AAG  AAA  ATG  TTT  TTC  CCG  CCC  CCA  CCT  AGT  TCA  AGC  TCG  TCC  TCT  TAG  TGA  TAA  ACT  ACA  ACC  ACG  TGG  TAT  TAC  GTC  GTG  GTT  GTA
AMINOACID Ala  Ala  Ala  Ala  Arg  Arg  Arg  Arg  Arg  Arg  Asn  Asn  Asp  Asp  Cys  Cys  Gln  Gln  Glu  Glu  Gly  Gly  Gly  Gly  His  His  Ile  Ile  Ile  Leu  Leu  Leu  Leu  Leu  Leu  Lys  Lys  Met  Phe  Phe  Pro  Pro  Pro  Pro  Ser  Ser  Ser  Ser  Ser  Ser  Stop Stop Stop Thr  Thr  Thr  Thr  Trp  Tyr  Tyr  Val  Val  Val  Val
RSCU      0.44 1.54 0.26 1.77 0.42 1.13 0.30 0.17 3.78 0.19 0.43 1.57 0.39 1.61 0.84 1.16 1.63 0.37 0.52 1.48 1.65 1.02 0.68 0.65 0.59 1.41 1.58 0.28 1.14 1.56 0.19 2.38 0.63 0.18 1.07 0.46 1.54 1.00 1.73 0.27 0.26 0.63 1.48 1.63 1.28 1.38 0.96 0.20 0.34 1.84 0.59 0.88 1.54 1.49 1.74 0.52 0.25 1.00 1.53 0.47 0.23 0.46 2.17 1.15


Relative Adaptiveness of Codon

CODON     GCC  GCA  GCG  GCT  CGA  AGG  CGT  CGG  AGA  CGC  AAC  AAT  GAC  GAT  TGC  TGT  CAA  CAG  GAG  GAA  GGA  GGT  GGG  GGC  CAC  CAT  ATT  ATC  ATA  CTT  CTG  TTA  CTA  CTC  TTG  AAG  AAA  ATG  TTT  TTC  CCG  CCC  CCA  CCT  AGT  TCA  AGC  TCG  TCC  TCT  TAG  TGA  TAA  ACT  ACA  ACC  ACG  TGG  TAT  TAC  GTC  GTG  GTT  GTA
AMINOACID Ala  Ala  Ala  Ala  Arg  Arg  Arg  Arg  Arg  Arg  Asn  Asn  Asp  Asp  Cys  Cys  Gln  Gln  Glu  Glu  Gly  Gly  Gly  Gly  His  His  Ile  Ile  Ile  Leu  Leu  Leu  Leu  Leu  Leu  Lys  Lys  Met  Phe  Phe  Pro  Pro  Pro  Pro  Ser  Ser  Ser  Ser  Ser  Ser  Stop Stop Stop Thr  Thr  Thr  Thr  Trp  Tyr  Tyr  Val  Val  Val  Val
RAC       0.25 0.87 0.15 1.00 0.11 0.30 0.08 0.04 1.00 0.05 0.27 1.00 0.24 1.00 0.72 1.00 1.00 0.23 0.35 1.00 1.00 0.62 0.41 0.39 0.42 1.00 1.00 0.18 0.72 0.66 0.08 1.00 0.26 0.08 0.45 0.30 1.00 1.00 1.00 0.16 0.16 0.39 0.90 1.00 0.69 0.75 0.52 0.11 0.19 1.00 0.38 0.57 1.00 0.86 1.00 0.30 0.15 1.00 1.00 0.31 0.11 0.21 1.00 0.53


Monomers

A: 524241
T: 440303
G: 225741
C: 165502



Charts

BBA15 Hist 1.png
BBA24 Hist 2.png
BBA68 Hist 3.png
BBB 18 Hist 1.png
Whole genome File:Hist 5

Hypothesis & Background

  • Central Hypothesis: Borrelia host-interacting genes show optimal codon usage
  • Background/Rationale:
    • Borrelia is an obligate, non-free-living parasite of vertebrates. A large number of genes are devoted to host invasion and to surviving the host immune defense.
    • Evolutionary theory predicts that highly expressed genes preferentially use codons recognized by the most abundant tRNAs in the cell and, as a result, tend to show strong codon usage bias.
    • We expect Borrelia host-interacting, virulence-conferring genes to use more optimal codons than housekeeping genes.
  • Importance: If the hypothesis is supported, it would establish a new computational method to identify host-interacting virulence genes based on genome analysis.
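
One common way to quantify how "optimal" a gene's codons are is the Codon Adaptation Index of Sharp and Li (1987): the geometric mean of the relative adaptiveness values of the codons in the gene. The Python sketch below is a minimal illustration under stated assumptions, not necessarily the calculation behind the CAI values reported above. The function name, the example weights, and the example sequence are hypothetical. Leaving out ATG, TGG, and stop codons follows a common convention, and skipping codons with zero or missing weight is a simplification.

```python
import math

# Codons commonly left out of CAI: single-codon amino acids and stops.
EXCLUDED = {"ATG", "TGG", "TAA", "TAG", "TGA"}

def cai(seq, weights):
    """Geometric mean of relative adaptiveness values over a gene's codons.

    weights maps codon -> relative adaptiveness w in (0, 1], derived from
    a reference gene set (the choice of reference set is an assumption).
    """
    seq = seq.upper()
    log_sum = 0.0
    used = 0
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        w = weights.get(codon, 0.0)
        # Skip excluded codons and codons without a positive weight.
        if codon in EXCLUDED or w <= 0.0:
            continue
        log_sum += math.log(w)
        used += 1
    # No scorable codons: return NaN rather than divide by zero.
    return math.exp(log_sum / used) if used else float("nan")

# Hypothetical weights for two alanine and two lysine codons.
w = {"GCT": 1.0, "GCA": 0.5, "AAA": 1.0, "AAG": 0.25}
print(round(cai("ATGGCTGCAAAAAAGTAA", w), 3))  # prints: 0.595
```

A gene built only from the most used synonym of every amino acid scores 1.0; each less favored codon pulls the geometric mean down.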

Data & Overview

  1. Data Set: The Borrelia burgdorferi B31 genome sequences, N=1500 genes
  2. Identify host-interacting genes and a set of housekeeping genes
  3. Calculate the Codon Adaptation Index (CAI) for each gene
  4. Test whether codon usage bias differs significantly between the virulence genes and the housekeeping genes
  5. Presentation, Report, & Future directions
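
Step 4 leaves the choice of statistical test open, and the project lists R as its statistical language. Purely as an illustration, the Python sketch below compares two groups of per-gene CAI values with a two-sided permutation test on the difference in means. The permutation test is one reasonable option chosen for this example, not a method stated in the plan; the function name is hypothetical and the CAI values are invented.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns the observed difference (mean of a minus mean of b) and an
    estimated p-value.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_perm):
        # Reassign genes to the two groups at random.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add one to both counts so the estimate is never exactly zero.
    return observed, (extreme + 1) / (n_perm + 1)

# Invented CAI values for illustration only.
virulence = [0.60, 0.62, 0.58, 0.61, 0.63]
housekeeping = [0.45, 0.47, 0.50, 0.44, 0.46]
diff, p_value = permutation_test(virulence, housekeeping, n_perm=2000)
print(f"difference in mean CAI = {diff:.3f}, p = {p_value:.4f}")
```

A permutation test makes no assumption about the distribution of CAI values, which is convenient when one group is small; a rank-based test such as the Wilcoxon rank-sum test would be a standard alternative.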

Essential Computing skills

  • Operating System: Linux/Ubuntu
  • Programming Languages: BASH, Perl/BioPerl
  • Database Language: SQL
  • Statistical Language: R

Essential Readings

Weigang 11:38, 10 May 2013 (EDT)