Summer 2018
Revision as of 17:48, 21 June 2018
Rules of Conduct
- No eating, drinking, or loud talking in the lab. Socialize in the lobby only.
- Be respectful of each other, regardless of level of study.
- Be on time and responsible. Communicate in advance with the PI if you will be late or absent.
Participants
- Dr Oliver Attie, Research Associate
- Brian Sulkow, Research Associate
- Saymon Akther, CUNY Graduate Center, EEB Program
- Lily Li, CUNY Graduate Center, EEB Program
- Mei Wu, Bioinformatics Research Assistant
- Yinheng Li, Informatics Research Assistant
- Christopher Panlasigui, Hunter Biology
- Dr Lia Di, Senior Scientist
- Dr Weigang Qiu, Principal Investigator
- Summer Interns: Mahmad, Pavan, Roman, Benjamin, Andrew, Michelle, Hannah
Journal Club
- A Unix & Perl tutorial
- A short introduction to molecular phylogenetics: http://www.ncbi.nlm.nih.gov/pubmed/12801728
- A review on Borrelia genomics: https://www.ncbi.nlm.nih.gov/pubmed/24704760
- ospC epitope mapping: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0067445
Projects
Borrelia genome evolution (Led by Saymon)
- Goal 1. Estimate time of cross-Atlantic dispersal using core-genome sequences
- Goal 2. Investigate codon biases with respect to levels of gene expression. Data file:
Identification of host species from ticks (Led by Lily [after first-level])
- Goal 1. Protocol optimization for PCR amplification of host DNA from ticks
- Goal 2. Protocol development: library construction for MiSeq
- Goal 3. Development of bioinformatics protocols and sequence database
Pseudomonas Genome-wide Association Studies (GWAS) (Led by Mei & Yinheng, in collaboration with Dr Xavier of MSKCC)
- Goal 1. Association of genes/SNPs with biofilm formation and c-di-GMP levels: Manuscript preparation
- Goal 2. Association of genome diversity with metabolic diversity
Machine learning approaches to evolution (Led by Oliver & Brian)
- Goal 1. Implement Hopfield network for optimization of protein structure
- Goal 2. Neural-net models of OspC. Structural alignment (S2 from Baum et al 2013):
- Goal 3. K-mer-based pipeline for genome classification
Weekly Schedule
- Summer kickoff (June 1, 2018, Friday): Introduction & schedules
- Week 1 (June 4-8):
- Monday: the Unix & Perl Tutorial, Part 1
- Tuesday: Unix Part 2. Explore the "iris" data set using R, following the Monte Carlo Club Week 1 (1 & 2) and Week 2. Read MacKay (2003), Chapters 38 & 39
- Thursday: 1st field day (Caumsett State Park); Participants: John, Mahamad, Pavan, Andrew, Dr Sun, Weigang, with 3 members of Moses team from Suffolk County Vector Control. Got ~110 deer tick nymphs
- Friday: meeting with MSKCC group at 11am; BBQ afterwards
- Week 2 (June 11-15):
- Monday: Lab meeting, projects assigned
- Tuesday: neural net tutorial (by Brian)
- Thursday: 2nd field day (Fire Island National Seashore). Participants: John, Brian, Mei, Mahamad, Pavan, Benjamin, and Weigang. Got ~100 lone-star ticks and 4 deer tick nymphs
- Week 3 (June 18-22):
- Monday: Lab meeting, 1st project reports
- Codon Bias: Theory, Coding, and Data (Andrew, Pavan, Saymon)
- OspC epitope identification: Serum correlation, sequence correlation, immunity-sequence correction (Mahamad, Roman, Brian)
- Pseudomonas metabolomics: parsing intensity file; theory & parsing SBML file (Chris & Benjamin)
- Tuesday: working groups
- Wed: working groups
- Thursday: Big Data Workshop
- Friday: working groups
- Week 4 (June 25-29):
- Monday: Lab meeting
Lab notes for Summer HS Interns
- NCI Cloud: Seven Bridges Cloud Platform. Create a user account
- Read documentation & tutorials: Documentation
Notes & Scripts
- (Weigang) A sample R script to parse Table S2 from Baum et al 2013, sera-antigen reactivity measurements
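# preliminaries: save as TSV; replace "\r" line endings if necessary;
# replace "N/A" with "NA"; remove extra columns
setwd("Downloads/")
# stringsAsFactors=TRUE keeps levels() working in R >= 4.0
x <- read.table("table-s2.txt", sep="\t", header=T, stringsAsFactors=TRUE)
View(x)
colnames(x)
which(x[,8]=="A") # row indices for ospC allele type A
x[which(x[,8]=="A"),12] # column-12 values for allele type A
x[which(x[,8]=="A3"),12] # column-12 values for allele type A3
# store the test result so its fields can be read afterwards
x.cor <- cor.test(x[which(x[,8]=="A3"),12], x[which(x[,8]=="A"),12], method = "pearson")
x.cor$estimate
levels(x[,8]) # obtain ospC allele types; to be looped through in pairwise
# template for the pairwise loop (bounds left to fill in):
# for (i in 1:?) { for (j in ?:?) {cor.test(....)}}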
- (Muhammad) Generates a data frame of pairwise correlation coefficients and p-values for the 23 OspC allele types
setwd("C:/R_OspC")
# stringsAsFactors=TRUE keeps levels() working in R >= 4.0
x <- read.table("Table-S2.txt", sep="\t", header=T, stringsAsFactors=TRUE)
a <- levels(x[,8]) # ospC allele types (23 in this table)
n <- length(a)
output <- data.frame(i=character(), j=character(), cor=numeric(), p=numeric())
for (i in 1:(n-1)) { # visit each unordered pair (i < j) once
  allele.i <- a[i]
  vect.i <- x[which(x[,8]==allele.i),12]
  for (j in (i+1):n) {
    allele.j <- a[j]
    vect.j <- x[which(x[,8]==allele.j),12]
    ct <- cor.test(vect.i, vect.j, method="pearson") # named ct so it does not mask cor()
    output <- rbind(output, data.frame(i=allele.i, j=allele.j, cor=ct$estimate, p=ct$p.value))
  }
}
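- A self-contained sketch for trying the pairwise approach without the real file. Everything in it is an assumption made for illustration: the toy table, its values, and the column names serum, allele and reactivity are invented and are not Table S2. It also shows that the na.strings argument of read.table can stand in for the manual "N/A" to "NA" substitution, and that combn() can enumerate the allele pairs. Like the scripts above, it assumes that rows for each allele appear in the same serum order.

```r
# Toy stand-in for Table S2 (made-up values, NOT the real data).
# One row per serum/allele combination; "N/A" marks a missing reading.
toy <- "serum\tallele\treactivity
s1\tA\t1.2
s2\tA\t2.9
s3\tA\t3.1
s4\tA\t4.8
s1\tB\t2.1
s2\tB\t4.2
s3\tB\tN/A
s4\tB\t8.3
s1\tC\t4.0
s2\tC\t3.1
s3\tC\t2.2
s4\tC\t0.9"

# na.strings converts "N/A" to NA while reading, so no manual edit is needed
x <- read.table(text = toy, sep = "\t", header = TRUE,
                na.strings = c("NA", "N/A"), stringsAsFactors = TRUE)

alleles <- levels(x$allele)
pairs <- combn(alleles, 2)  # one column per unordered pair of alleles

# run cor.test() on every pair and stack the results into one data frame
output <- do.call(rbind, lapply(seq_len(ncol(pairs)), function(k) {
  vi <- x$reactivity[x$allele == pairs[1, k]]
  vj <- x$reactivity[x$allele == pairs[2, k]]
  ct <- cor.test(vi, vj, method = "pearson")  # incomplete pairs are dropped
  data.frame(i = pairs[1, k], j = pairs[2, k],
             cor = unname(ct$estimate), p = ct$p.value)
}))
print(output)
```

With three toy alleles, output has three rows, one per pair. On the real table the same code would loop over however many levels column 8 contains, provided the column references are changed back to x[,8] and x[,12].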