QuBi/modules/biol203-geno-pheno-association: Difference between revisions
imported>Weigang |
Revision as of 02:05, 4 April 2016
- BIOL 203 Bioinformatics Exercises for Lab 13
Test phenotype-genotype association
Introduction: GWAS & Contingency Test
A genome-wide association study (GWAS) is a method for mapping phenotypes to genotypes. In a typical GWAS, frequencies of alleles (e.g., C or T at position 785) are determined in a sample of affected individuals (the "cases") as well as in a sample of unaffected individuals (the "controls"). For example, the following table shows the results of a hypothetical case-control study at a locus segregating two alleles (C and T):
Table 1. Sample Genotype Counts
Group | T/T | T/C | C/C | Total |
---|---|---|---|---|
Case | 0 | 24 | 127 | ? |
Control | 9 | 68 | 114 | ? |
Total | ? | ? | ? | ? |
Association between the genotype and the phenotype can be assessed with a contingency table analysis. In this case, Χ² = 26.4 with 2 degrees of freedom, p ≈ 2 × 10⁻⁶, suggesting a significant association between genotype and disease. (By comparing the expected and observed counts, one could conclude that the C/C genotype is over-represented in disease cases.)
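The contingency table analysis described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the online calculator used in class; the function name `chi_square` is an assumption made for this example. Expected counts are computed as (row total × column total) / grand total, and the p-value uses the closed form that holds only for 2 degrees of freedom.

```python
import math

# Observed genotype counts from Table 1 (columns: T/T, T/C, C/C).
observed = {
    "Case": [0, 24, 127],
    "Control": [9, 68, 114],
}

def chi_square(table):
    """Return (statistic, degrees of freedom, expected counts) for a list of rows.

    Hypothetical helper written for this sketch.
    """
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / total for c in col_sums] for r in row_sums]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_sums) - 1) * (len(col_sums) - 1)
    return stat, df, expected

stat, df, expected = chi_square(list(observed.values()))

# For exactly 2 degrees of freedom, the chi-square tail probability is exp(-x/2).
p_value = math.exp(-stat / 2)

print(f"chi-square = {stat:.1f}, df = {df}, p = {p_value:.2g}")
for group, exp_row in zip(observed, expected):
    print(group, [round(e, 1) for e in exp_row])
```

Comparing each printed expected count with the observed count in Table 1 shows which genotypes are over- or under-represented in each group.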
- Perform an online contingency table analysis using the hypothetical data in Table 1.
- Using the genotype counts in Table 1, fill in the following table with allele counts (each individual carries two alleles). Then perform a 2-by-2 contingency table analysis using the link above. Is there a statistically significant association between alleles and disease phenotype? Which allele (C or T) is over-represented in (i.e., statistically associated with) disease cases?
Table 2. Sample Allele Counts
Group | T | C | Total |
---|---|---|---|
Case | ? | ? | ? |
Control | ? | ? | ? |
Total | ? | ? | ? |
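As a way to check hand calculations, the conversion from genotype counts to allele counts and the 2-by-2 test can be sketched as follows. The function names are assumptions made for this example. The statistic uses the standard shortcut formula for 2-by-2 tables without a continuity correction, so an online calculator that applies Yates' correction may report a slightly different value.

```python
import math

def genotype_to_allele_counts(n_tt, n_tc, n_cc):
    """Convert genotype counts to (T count, C count).

    Each individual carries two alleles: a homozygote contributes two
    copies of one allele, a heterozygote one copy of each.
    """
    return 2 * n_tt + n_tc, 2 * n_cc + n_tc

def chi_square_2x2(row1, row2):
    """Pearson chi-square for a 2-by-2 table (no continuity correction)."""
    (a, b), (c, d) = row1, row2
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Genotype counts from Table 1 (T/T, T/C, C/C).
case_alleles = genotype_to_allele_counts(0, 24, 127)
control_alleles = genotype_to_allele_counts(9, 68, 114)

stat, p_value = chi_square_2x2(case_alleles, control_alleles)
print("Case (T, C):", case_alleles)
print("Control (T, C):", control_alleles)
print(f"chi-square = {stat:.2f}, p = {p_value:.2g}")
```

A quick sanity check: the allele total in each row should be exactly twice the number of individuals in that row of Table 1.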
Test association at Locus A
Following the above two examples, perform both the genotype and allele association tests using the class data.
Table 3a. Genotype counts at Locus A
Phenotype | A1/A1 | A1/A2 | A2/A2 | Row Sum |
---|---|---|---|---|
Taster | ? | ? | ? | ? |
Non-Taster | ? | ? | ? | ? |
Column Sum | ? | ? | ? | ? |
Calculate allele counts & then test for association
Table 3b. Allele counts at Locus A
Phenotype | A1 | A2 | Row Sum |
---|---|---|---|
Taster | ? | ? | ? |
Non-Taster | ? | ? | ? |
Column Sum | ? | ? | ? |
Test association at Locus B
Table 4a. Genotype counts at Locus B for each phenotype
Phenotype | B1/B1 | B1/B2 | B1/B3 | B2/B2 | B2/B3 | B3/B3 | Row Sum |
---|---|---|---|---|---|---|---|
Taster | ? | ? | ? | ? | ? | ? | ? |
Non-Taster | ? | ? | ? | ? | ? | ? | ? |
Column Sum | ? | ? | ? | ? | ? | ? | ? |
Calculate allele counts & then test for association
Table 4b. Allele counts at Locus B
Phenotype | B1 | B2 | B3 | Row Sum |
---|---|---|---|---|
Taster | ? | ? | ? | ? |
Non-Taster | ? | ? | ? | ? |
Column Sum | ? | ? | ? | ? |
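With three alleles and six genotypes, tallying allele counts by hand is error-prone. The sketch below shows one way to do it for any number of alleles, assuming genotypes are labeled in the "X/Y" style used in the tables; the function name `allele_counts` is an assumption made for this example. Because the class data are not part of this page, the demonstration uses the case row of Table 1; the same call works with keys such as "B1/B2".

```python
from collections import Counter

def allele_counts(genotype_counts):
    """Tally alleles from a mapping of 'X/Y' genotype labels to individual counts.

    Each individual with genotype 'X/Y' contributes one X and one Y
    (so 'X/X' contributes two copies of X).
    """
    tally = Counter()
    for genotype, n_individuals in genotype_counts.items():
        for allele in genotype.split("/"):
            tally[allele] += n_individuals
    return dict(tally)

# Demonstration with the case row of Table 1.
case_genotypes = {"T/T": 0, "T/C": 24, "C/C": 127}
case_alleles = allele_counts(case_genotypes)
print(case_alleles)

# The allele total must equal twice the number of individuals.
assert sum(case_alleles.values()) == 2 * sum(case_genotypes.values())
```

Running the function once per phenotype row gives the rows of the allele table, which can then be tested for association in the same way as the genotype table.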
Exit Questions
- State the null hypothesis and the alternative hypothesis of a chi-square test.
- Explain what probability is represented by the p-value.
- What can you conclude when the p-value is below the significance threshold (e.g., α = 0.05)?
- What would you conclude when the p-value is above the significance threshold?
- Which of the two genes shows significant genotype association with the PTC Taster/Non-Taster phenotype?
- Is there a statistically significant association between the alleles and the Taster phenotype?
- Which genotype is over-represented in the Non-Tasters?
- Which allele is over-represented in the Non-Tasters?
- Are there exceptions? What are possible causes for exceptions?
Web Exercise. Search for gene information using NCBI online databases
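- Point your browser to the NCBI Human Genome Resource page (http://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch&BLAST_SPEC=OGP__9606__9558&LINK_LOC=blasthome)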
- Copy and paste sequence at Locus A into the first text box (add a FASTA heading, e.g., ">Locus_A")
- Expand the "Algorithm parameters" tab and change "Expect threshold" to 0.00001 (1e-5). Define "expect value" in your own words after watching the linked YouTube video.
- Press "BLAST". Copy and paste the top hit into your final lab report.
- Repeat the above for the sequence at Locus B. Copy and paste the top hit into your final lab report.
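To build intuition for the expect value, the sketch below uses the standard relation between a normalized bit score S' and the expected number of chance hits, E = m × n × 2^(−S'), where m and n are the query and database lengths. The function names and the numbers in the demonstration are hypothetical, chosen only for illustration; BLAST itself uses effective lengths with edge corrections, so its reported E-values will not match this simplified calculation exactly.

```python
def expect_value(bit_score, query_length, database_length):
    """Expected number of chance alignments scoring at least `bit_score`.

    Simplified form E = m * n * 2**(-S'); ignores BLAST's effective-length
    corrections (an assumption of this sketch).
    """
    return query_length * database_length * 2.0 ** (-bit_score)

def passes_threshold(e_value, threshold=1e-5):
    """True if a hit would be kept under the given expect threshold."""
    return e_value <= threshold

# Hypothetical numbers for illustration only (not from the lab data).
query_length = 500                 # bases in the query
database_length = 3_000_000_000    # roughly the size of a human genome
for bit_score in (40, 60, 100):
    e = expect_value(bit_score, query_length, database_length)
    print(f"bit score {bit_score}: E = {e:.2g}, kept = {passes_threshold(e)}")
```

Each additional bit of score halves the expect value, which is why strong matches have E-values extremely close to zero.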