Summer 2021: Difference between revisions
imported>Weigang
Line 30:
* Questions & Goals
# Learn, implement, and compare the existing tools
# Fine-tuning for OspC, to be integrated with the centroid algorithm
# 2nd-generation centroid design: k-means algorithm (with applications to vls, Dengue, flu B)
* Reading list
** Strodthoff et al (2020). Bioinformatics. [https://academic.oup.com/bioinformatics/article/36/8/2401/5698270 UDSMProt: universal deep sequence models for protein classification]. [https://github.com/nstrodt/UDSMProt Source code on GitHub]
** [https://www.pnas.org/content/118/15/e2016239118 Rives et al (2021). PNAS. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences.] [https://github.com/facebookresearch/esm GitHub repository]
Revision as of 03:23, 2 June 2021
Project 1. Borrelia genomics
- Participants
- Questions & Goals
- Upgrade database, genome pipeline, and website (Lia)
- Phylogeography & evolutionary maintenance of divided genome (Saymon)
- vls locus evolution (with simulation) (Lily)
- Reading list
- Schwartz et al (2021). Multipartite Genome of Lyme Disease Borrelia: Structure, Variation and Prophages
- Stevenson & Seshu (2018). Regulation of Gene and Protein Expression in the Lyme Disease Spirochete
Project 2. HIV compartmentalized evolution
- Participants
- Lily
- Questions and goals
- Does HIV evolve cell-type tropism within the host, specifically neural (N) tropism vs. T-cell (T) tropism?
- Build a classifier of N-tropism HIV subtypes
- Prepare a presentation for an HIV conference in October
- Reading list
- HIV compartmentalized evolution: Evering et al (2014)
- Data sets
- ~500 sequences of env genes from 15 patients
- Single-cell genome sequences from a second time point for some of the patients
- Experimentally verified N-tropism subtypes
- Approach
- Evolutionary mechanisms: mutation, recombination, and tests of adaptive selection
- Evolutionary rates & signature (BEAST)
Project 3. Natural language models of proteins
- Participants
- Questions & Goals
- Learn, implement, and compare the existing tools
- Fine-tuning for OspC, to be integrated with the centroid algorithm
- 2nd-generation centroid design: k-means algorithm (with applications to vls, Dengue, flu B)
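The k-means idea in the last goal could be prototyped roughly as follows. This is only a minimal sketch under stated assumptions, not the project's actual centroid algorithm, which this page does not describe. It assumes the input is a set of aligned, equal-length protein sequences, encodes them as one-hot vectors, and takes the "centroid" of each cluster to be the observed sequence closest to the cluster mean. All function names (`one_hot`, `kmeans`, `pick_representatives`) are hypothetical, and the deterministic farthest-point initialization is a simple stand-in for random or k-means++ seeding.

```python
# Assumed alphabet: 20 amino acids plus the alignment gap character.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"

def one_hot(seq):
    """Encode an aligned sequence as a flat 0/1 vector, one block per column.
    Characters outside ALPHABET are encoded as all zeros."""
    vec = []
    for ch in seq:
        vec.extend(1.0 if ch == a else 0.0 for a in ALPHABET)
    return vec

def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def init_centers(vectors, k):
    """Farthest-point initialization: start from the first vector, then
    repeatedly add the vector farthest from all centers chosen so far."""
    centers = [list(vectors[0])]
    while len(centers) < k:
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centers))
        centers.append(list(far))
    return centers

def kmeans(vectors, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and mean-update steps
    until the cluster labels stop changing."""
    centers = init_centers(vectors, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(v, centers[j]))
                      for v in vectors]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = mean_vector(members)
    return labels, centers

def pick_representatives(seqs, k):
    """Cluster sequences and return, per non-empty cluster, the observed
    sequence nearest to the cluster mean (the assumed 'centroid')."""
    vectors = [one_hot(s) for s in seqs]
    labels, centers = kmeans(vectors, k)
    reps = []
    for j in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == j]
        if idx:
            best = min(idx, key=lambda i: sq_dist(vectors[i], centers[j]))
            reps.append(seqs[best])
    return labels, reps
```

Calling `pick_representatives(aligned_seqs, k)` returns one cluster label per input sequence and up to `k` representative sequences. In practice the one-hot encoding could be replaced by embeddings from a protein language model, and the pure-Python loops by a library implementation of k-means.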