BIOL 302 Lab (Bioinformatics Exercises)
Research in molecular genetics requires effective use of bioinformatics tools to analyze and understand the genetic material under study. The following exercises present real-world scenarios and introduce the methods and tools you can use to solve them.
Identification of mdm2 Splice Variants Using BLAST
Objectives
- Learn to use the GenBank database and the BLAST tool to analyze nucleotide sequences
- Use BLAST to identify mdm2 splice variants
Key Concepts
BLAST
Alternative Splicing
Exercise
GenBank Accession # | cDNA Clone | Description | Cell Line | Length (bp) |
---|---|---|---|---|
AF527840 | | Genomic DNA | | 34,088 |
EU076746 | P2-MDM2-C1 | cDNA missing exons 5-9 & 11 | MANCA | 427 |
EU076747 | P2-MDM2-10 | cDNA missing exon 10 | ML-1 | 842 |
EU076748 | P2-MDM2-C | cDNA missing exons 5-9 | A876 | 505 |
EU076749 | P2-MDM2-FL | Full-length cDNA | SJSA-1 | 845 |
- Explore the gene annotation for AF527840.
Sequences in GenBank carry both basic reference information (such as what the sequence is, what organism it came from, and bibliographic information) and sequence annotations. Some sequences are more richly annotated than others. For this exercise you will be working with a well-annotated sequence: accession number AF527840. Explore its annotation and use it to complete the following set of tasks.
1) DRAW a diagram of this gene, including its introns/exons, 3'/5' UTRs, and the +1 site. (Note: this diagram is going to be very handy for the last set of questions.)
- Label each feature with its coordinates. For example, if an exon starts at 500 bp and ends at 1000 bp, label it as such.
- Label the diagram with basic information, such as the gene's name and the organism's species.
- The drawing does not have to be exactly to scale, but a reasonable effort should be made to do so. Put length markers on your drawing (for example, every 5000 bp).
- The mRNA annotation states which segments are used to create the mRNA, and the CDS annotation states which parts actually code for amino acids (CDS = coding sequence).
2) How does the sequence vary at positions X, X, and X for this gene?
3) What kinds of repeat regions can be found in this gene?
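For the diagram in task 1, it can help to turn the mRNA and CDS annotations into explicit coordinate lists before drawing. The following is a minimal sketch, not part of the original exercise. It assumes the annotations are written as GenBank `join(...)` location strings on the forward strand. The coordinates below are hypothetical placeholders, not the real values for AF527840, and the function names are invented for illustration. Copy the actual location strings from the GenBank record before relying on the results.

```python
import re

def parse_location(location):
    """Parse a GenBank location such as 'join(101..250,1001..1200)'
    into a list of (start, end) pairs (1-based, inclusive).
    Single-base positions and complement() are not handled in this sketch."""
    pairs = re.findall(r"<?(\d+)\.\.>?(\d+)", location)
    return [(int(start), int(end)) for start, end in pairs]

def introns_between(exons):
    """Return the gaps between consecutive exons as (start, end) pairs."""
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])]

def utr_segments(exons, cds):
    """Return (five_prime, three_prime) UTR segments: the parts of the
    mRNA exons that lie before the first and after the last CDS base."""
    cds_start, cds_end = cds[0][0], cds[-1][1]
    five_prime = [(s, min(e, cds_start - 1)) for s, e in exons if s < cds_start]
    three_prime = [(max(s, cds_end + 1), e) for s, e in exons if e > cds_end]
    return five_prime, three_prime

# Hypothetical location strings (placeholders, NOT taken from AF527840).
mrna_location = "join(101..250,1001..1200,3001..3400)"
cds_location = "join(201..250,1001..1200,3001..3100)"

exons = parse_location(mrna_location)
cds = parse_location(cds_location)
five_utr, three_utr = utr_segments(exons, cds)

print("Exons:  ", exons)
print("Introns:", introns_between(exons))
print("5' UTR: ", five_utr)
print("3' UTR: ", three_utr)
```

Each printed pair can be written onto the drawing as a feature label. The first coordinate of the first mRNA segment marks where the annotated transcript begins.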