Summer 2021

Group meeting/trip schedule

June 3, 2021 (Thursday). Summer research kickoff
June 8, 2021 (Tuesday). 11-2
- Algorithm development (Brian)
- NLP models of protein structure (Eamen, Roman, Edgar)
- Bb transcriptomics (Niemah & Jackie)
- HIV compartmentalization (Lily)
June 10, 2021 (Thursday). No meeting. Field day
- Thursday: Desiree, Lily, Lia, Weigang
- Friday: John, Weigang
- Saturday: Weigang
June 15, 2021 (Tuesday). 11-2

Participants: Niemah, Jackie
Questions & Goals:
- Upgrade database, genome pipeline, and website (Lia)
- Phylogeography & evolutionary maintenance of divided genome (Saymon)
- vls evolution (with simulation) & development of immunofluorescence microscopy methods (Lily). Live imaging.
Reading list
- Latest review book: Lyme Disease and Relapsing Fever Spirochetes: Genomics, Molecular Biology, Host Interactions and Disease Pathogenesis. Read the chapter on gene regulation and transcriptomics (note Fig 1, Fig 2, and Table 1)
- Schwartz et al (2021). Multipartite Genome of Lyme Disease Borrelia: Structure, Variation and Prophages
- Stevenson & Seshu (2018). Regulation of Gene and Protein Expression in the Lyme Disease Spirochete

Participants: Dr Saad Mneimneh (CS Department), Brian
Questions & Goals:
- Generalized algorithms for antigen with arbitrary tree shape
  - Data set 1. Neutral evolution (with exponentially distributed branch lengths). Binary strings (L=100 bits) evolved on a coalescent tree of 20 leaves. Simulated with rcoal(20), rTraitDisc(), and simSeq(); code from previous work
  - Data set 2. Two major clades. HA sequences from flu B
  - Data set 3. Four major clades. Dengue
  - Data set 4. Star-shaped tree, driven by recombination. OspC
  - Data set 5. Multiple major clades. vls cassette in Lyme species
- Combination algorithms
- Naive Bayes models to integrate immunogenicity data
- Natural language models to improve structural stability (see Project 4 below)
Reading list
- Di et al (2021). Maximum antigen divergence in Lyme bacterial population
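
A minimal sketch of the Data set 1 simulation, written in plain Python rather than with the R functions named above (rcoal, rTraitDisc, simSeq). It assumes a standard coalescent (waiting times exponential with rate k(k-1)/2 for k lineages) and a symmetric two-state model for bit flips. The function names and the `rate` parameter are hypothetical and chosen for illustration; this is not the code from previous work.

```python
import math
import random

def simulate_coalescent(n_leaves, rng):
    """Return (root, children, branch_length) for a random coalescent tree."""
    # Nodes 0..n_leaves-1 are leaves; internal nodes get ids n_leaves, n_leaves+1, ...
    height = {i: 0.0 for i in range(n_leaves)}
    children = {}
    branch_length = {}
    active = list(range(n_leaves))
    now = 0.0
    next_id = n_leaves
    while len(active) > 1:
        k = len(active)
        # Waiting time to the next coalescence among k lineages.
        now += rng.expovariate(k * (k - 1) / 2)
        a, b = rng.sample(active, 2)
        parent = next_id
        next_id += 1
        height[parent] = now
        children[parent] = (a, b)
        for child in (a, b):
            branch_length[child] = now - height[child]
            active.remove(child)
        active.append(parent)
    return active[0], children, branch_length

def evolve_bits(root, children, branch_length, n_bits, rate, rng):
    """Evolve a binary string from the root to the leaves (symmetric two-state model)."""
    seqs = {root: [rng.randint(0, 1) for _ in range(n_bits)]}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            t = branch_length[child]
            # Probability that a bit differs after time t (assumed model).
            p_flip = 0.5 * (1.0 - math.exp(-2.0 * rate * t))
            seqs[child] = [bit ^ (rng.random() < p_flip) for bit in seqs[node]]
            stack.append(child)
    return seqs

rng = random.Random(1)
root, children, branch_length = simulate_coalescent(20, rng)
seqs = evolve_bits(root, children, branch_length, n_bits=100, rate=0.1, rng=rng)
leaves = {i: "".join(map(str, seqs[i])) for i in range(20)}
```

`leaves` maps each of the 20 leaf ids to a 100-bit string. The value `rate=0.1` is arbitrary and would need tuning to give a realistic level of divergence.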

Participants: Lily
Questions & Goals:
- Does HIV evolve cell-type tropisms within the host? Specifically, neural (N) tropism vs T-cell (T) tropism?
- Build a classifier of N-tropism HIV subtypes
- A presentation for an HIV conference in October
Reading list
- HIV compartmentalized evolution: Evering et al (2014)
Data sets
- ~500 sequences of env genes from 15 patients
- 2nd time point single-cell genome sequences for some of the patients
- Experimentally verified N-tropism subtypes
Approach
- Evolutionary mechanisms: mutation, recombination, and adaptive selection
- Homoplasy index (HI) as a measure of compartmentalization? Randomization of compartment labels to obtain p-values for HI.
- Evolutionary rates & signature (BEAST)
- Tests of natural selection (PAML site models, branch-site models & MK analysis)
- Phylogenetic analysis: tree per individual; supertree; haplotype networks (per individual)
- Simulated compartmentalization
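
One possible way to set up the randomization test for HI, as a rough Python sketch. It treats the compartment (N or T) as a single character on a fixed rooted binary tree, counts the minimum number of state changes with Fitch parsimony, and computes HI = 1 - (minimum possible changes) / (observed changes). The p-value comes from shuffling compartment labels among the leaves. The nested-tuple tree format, the toy leaf names, and all function names are assumptions made for illustration; a real analysis would read the per-individual trees from file.

```python
import random

def fitch_steps(tree, labels):
    """Minimum number of state changes for leaf labels on a rooted binary tree.

    `tree` is a nested tuple of leaf names, e.g. (("a", "b"), ("c", "d")).
    """
    steps = 0

    def state_set(node):
        nonlocal steps
        if not isinstance(node, tuple):
            return {labels[node]}
        left = state_set(node[0])
        right = state_set(node[1])
        if left & right:
            return left & right
        steps += 1  # disjoint child sets imply one change
        return left | right

    state_set(tree)
    return steps

def homoplasy_index(tree, labels):
    """HI = 1 - (number of states - 1) / (parsimony steps on the tree)."""
    steps = fitch_steps(tree, labels)
    min_steps = len(set(labels.values())) - 1
    return 1.0 - min_steps / steps if steps else 0.0

def permutation_p_value(tree, labels, n_perm, rng):
    """Fraction of label shuffles with as few or fewer changes than observed."""
    observed = fitch_steps(tree, labels)
    names = list(labels)
    values = [labels[name] for name in names]
    hits = 0
    for _ in range(n_perm):
        shuffled = values[:]
        rng.shuffle(shuffled)
        if fitch_steps(tree, dict(zip(names, shuffled))) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy example: a fully compartmentalized tree (hypothetical leaf names).
tree = ((("n1", "n2"), ("n3", "n4")), (("t1", "t2"), ("t3", "t4")))
labels = {name: name[0].upper() for name in ("n1", "n2", "n3", "n4", "t1", "t2", "t3", "t4")}
hi = homoplasy_index(tree, labels)
p = permutation_p_value(tree, labels, n_perm=999, rng=random.Random(1))
```

A low HI with a small p-value suggests that compartment labels cluster on the tree more than expected by chance. This sketch ignores tree uncertainty and unequal sampling between compartments, both of which would matter for the patient data.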

Learn, implement, and compare the existing tools
Fine-tuning for OspC, to be integrated with the centroid algorithm
2nd-generation centroid design: k-means algorithm (with applications to vls, Dengue, flu B)
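
As a starting point for the k-means idea, here is a small Python sketch of Lloyd-style clustering on equal-length binary strings. Because the data are discrete, it swaps in Hamming distance and per-position majority vote for the Euclidean distance and mean of standard k-means (strictly speaking this is closer to k-modes). It is not the lab's centroid algorithm; the function names and the tie-breaking rules are assumptions made for illustration.

```python
import random

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def majority_string(strings):
    """Per-position majority vote over binary strings; ties resolve to '1'."""
    n = len(strings)
    length = len(strings[0])
    return "".join(
        "1" if 2 * sum(s[i] == "1" for s in strings) >= n else "0"
        for i in range(length)
    )

def k_centroids(strings, k, rng, max_iter=100):
    """Alternate nearest-centroid assignment and majority-vote centroid updates."""
    centroids = rng.sample(strings, k)  # initialize from k random members
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(k), key=lambda j: hamming(s, centroids[j]))
            for s in strings
        ]
        if new_assignment == assignment:
            break  # converged: no string changed cluster
        assignment = new_assignment
        for j in range(k):
            members = [s for s, a in zip(strings, assignment) if a == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = majority_string(members)
    return centroids, assignment

strings = ["0000", "0001", "0010", "1111", "1110", "1101"]
centroids, assignment = k_centroids(strings, k=2, rng=random.Random(1))
```

Each returned centroid is a synthetic string that need not occur in the input, which is the property of interest for centroid design. Results depend on the random initialization, so several restarts would normally be compared.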