QuBi/modules/biol203-geno-pheno-association-2020

From QiuLab
Revision as of 01:03, 7 May 2020 by imported>Lab
BIOL 203 Bioinformatics Exercises for Lab 13

Test phenotype-genotype association

Introduction: GWAS & Contingency Test

A Genome-Wide Association Study (GWAS) is a method for mapping phenotypes to genotypes. In a typical GWAS, allele frequencies (e.g., C or T at position 785) are determined in a sample of affected individuals (the "cases", e.g., people with a disease) and in a sample of unaffected individuals (the "controls"). For example, the following table shows the results of a hypothetical case-control study at a locus segregating two alleles (C and T):

Table 1. Sample Genotype Counts

Group      T/T   T/C   C/C   Total
Case         0    24   127       ?
Control      9    68   114       ?
Total        ?     ?     ?       ?

The association between genotype and phenotype can be assessed with a contingency table analysis (a chi-square test of independence). In this case, χ² = 26.4 with 2 degrees of freedom, p < 0.0005, indicating a significant association between genotype and disease status. (Comparing the expected and observed counts shows that the C/C genotype is over-represented among disease cases.)
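As a cross-check on the online calculator, the statistic quoted above can be reproduced by hand. The following is a minimal Python sketch and is not part of the original exercise; the function name `chi_square_test` is hypothetical. It computes each expected count as row total × column total / grand total and sums (observed − expected)² / expected over all cells. The p-value shortcut used here is valid only for 2 degrees of freedom, which is the case for a 2-by-3 table.

```python
import math

def chi_square_test(observed):
    """Pearson chi-square test on a table given as a list of rows.

    Returns (chi2, df, expected), where expected has the same shape
    as observed.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected count under independence: row total * column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected

# Table 1, columns ordered T/T, T/C, C/C
table1 = [
    [0, 24, 127],  # cases
    [9, 68, 114],  # controls
]

chi2, df, expected = chi_square_test(table1)
# For df = 2 only, the chi-square upper-tail probability is exp(-chi2 / 2).
p_value = math.exp(-chi2 / 2)

print(f"chi2 = {chi2:.1f}, df = {df}, p = {p_value:.1e}")
print("expected C/C among cases:", round(expected[0][2], 1))
```

The observed 127 C/C cases exceed the expected count of about 106, which is what "over-represented" means here.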

1. Perform an online contingency table analysis using the hypothetical data in Table 1. Click on "other contingency tables" and run a test with 2 rows and 3 columns using the data above. Your χ² should be 26.4.

2. Using Table 1, fill in the following table with allele counts. Then perform a 2-by-2 contingency table analysis using the link above. For example, in the controls, the number of T alleles is 2 × 9 + 68 = 86, because each T/T homozygote carries two T alleles and each T/C heterozygote carries one.

Is there a statistically significant association between alleles and disease phenotype? Which allele (C or T) is over-represented in (i.e., statistically associated with) disease cases?

Table 2. Sample Allele Counts

Group      T     C     Total
Case       ?     ?         ?
Control    ?     ?         ?
Total      ?     ?         ?
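The allele-counting rule from step 2 can be written as a small helper. This is an illustrative Python sketch only; the function name `allele_counts` is hypothetical. It is shown on the control row, where the text already gives the T-allele count.

```python
def allele_counts(hom_first, het, hom_second):
    """Convert genotype counts at a two-allele locus into allele counts.

    Each homozygote carries two copies of one allele; each heterozygote
    carries one copy of each allele.
    """
    first = 2 * hom_first + het
    second = 2 * hom_second + het
    return first, second

# Controls from Table 1: T/T = 9, T/C = 68, C/C = 114
t_count, c_count = allele_counts(9, 68, 114)
print("control T alleles:", t_count)  # 86, matching the worked example

# Sanity check: total alleles must be twice the number of individuals.
assert t_count + c_count == 2 * (9 + 68 + 114)
```

The sanity check is a useful habit when filling in Table 2: every row total in the allele table should be exactly twice the corresponding row total in the genotype table.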

Test association with Locus A

Following the above two examples, perform both the genotype and allele association tests using the class data.

Table 3a. Genotype counts at Locus A

Phenotype     A1/A1   A1/A2   A2/A2   Row Sum
Taster            ?       ?       ?         ?
Non-Taster        ?       ?       ?         ?
Column Sum        ?       ?       ?         ?

Calculate allele counts & then test for association

Table 3b. Allele counts at Locus A

Phenotype     A1   A2   Row Sum
Taster         ?    ?         ?
Non-Taster     ?    ?         ?
Column Sum     ?    ?         ?
  1. Record your results for the Locus A contingency tests on the lab report sheet, including the chi-square statistic, degrees of freedom, and p-value.
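For checking the numbers reported by the online calculator, the degrees of freedom and the p-value can also be computed directly. The sketch below is an illustration, not part of the original exercise; both function names are hypothetical. It uses only the Python standard library and evaluates the chi-square upper-tail probability for integer degrees of freedom through the recurrence Q(a + 1, z) = Q(a, z) + z^a e^(−z) / Γ(a + 1).

```python
import math

def degrees_of_freedom(n_rows, n_cols):
    """Degrees of freedom for a contingency table: (rows - 1) * (cols - 1)."""
    return (n_rows - 1) * (n_cols - 1)

def chi2_p_value(chi2, df):
    """Upper-tail probability P(X >= chi2) for a chi-square variable
    with a positive integer number of degrees of freedom."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    z = chi2 / 2.0
    if df % 2 == 0:
        a = 1.0
        tail = math.exp(-z)              # exact tail for df = 2
    else:
        a = 0.5
        tail = math.erfc(math.sqrt(z))   # exact tail for df = 1
    # Step up two degrees of freedom at a time until a == df / 2.
    while a < df / 2.0:
        tail += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return tail

# A 2-by-3 genotype table has 2 degrees of freedom; a 2-by-2 allele table has 1.
print(degrees_of_freedom(2, 3), degrees_of_freedom(2, 2))
# The familiar 5% critical value for df = 1 is about 3.841.
print(round(chi2_p_value(3.841, 1), 3))
```

If a statistics package is available, its built-in chi-square routines are preferable; this hand-rolled version is meant only to make the calculation transparent.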

Test association with Locus B

Table 4a. Genotype counts at Locus B for each phenotype

Phenotype     B1/B1   B1/B2   B1/B3   B2/B2   B2/B3   B3/B3   Row Sum
Taster            ?       ?       ?       ?       ?       ?         ?
Non-Taster        ?       ?       ?       ?       ?       ?         ?
Column Sum        ?       ?       ?       ?       ?       ?         ?

Calculate allele counts & then test for association

Table 4b. Allele counts at Locus B

Phenotype     B1   B2   B3   Row Sum
Taster         ?    ?    ?         ?
Non-Taster     ?    ?    ?         ?
Column Sum     ?    ?    ?         ?
  1. Record your results for the Locus B contingency tests on the lab report sheet, including the chi-square statistic, degrees of freedom, and p-value.
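With three alleles and six genotypes, tallying alleles by hand is error-prone. The sketch below shows one way to automate the conversion from Table 4a to Table 4b; it is an illustration under stated assumptions, the function name `allele_counts_from_genotypes` is hypothetical, and the counts are made up for demonstration and are NOT the class data.

```python
from collections import Counter

def allele_counts_from_genotypes(genotype_counts):
    """Tally alleles from a mapping such as {"B1/B2": n, ...}.

    Each individual contributes one count per allele in its genotype,
    so a homozygote such as B1/B1 adds two to B1.
    """
    totals = Counter()
    for genotype, n in genotype_counts.items():
        for allele in genotype.split("/"):
            totals[allele] += n
    return dict(totals)

# Made-up counts for illustration only; NOT the class data.
example_tasters = {
    "B1/B1": 2, "B1/B2": 3, "B1/B3": 1,
    "B2/B2": 0, "B2/B3": 4, "B3/B3": 1,
}
counts = allele_counts_from_genotypes(example_tasters)
print(counts)

# Sanity check: total alleles must be twice the number of individuals.
assert sum(counts.values()) == 2 * sum(example_tasters.values())
```

Running the same function on each phenotype row gives the two rows of the allele table, which then form a 2-by-3 contingency table.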

Web Exercise. Search for gene information using NCBI online databases

  1. Point your browser to the NCBI Human Genome Resource page
  2. Copy and paste the sequence provided on Blackboard. This is the sequence of the gene associated with the taster phenotype.
  3. Press "BLAST". Copy and paste the top hit into your final lab report.
  4. Briefly describe the function of the gene based on the information found on the locus page.

Additional questions

Answer briefly (1-2 sentences):

  1. State the null hypothesis and the alternative hypothesis in a chi-square test.
  2. Explain what probability is represented by the p-value.
  3. What can you conclude when the p-value is below the significance threshold (e.g., 0.05)?
  4. What would you conclude when the p-value is above the significance threshold?
  5. Is there a statistically significant association between one of the alleles tested and the Taster phenotype?
  6. Which genotype is over-represented in the Non-Tasters?
  7. Which allele is over-represented in the Non-Tasters?