Biol20N02 2016
Course Description
With rapid accumulation of genome sequences and digitalized health data, biomedicine is becoming a data-intensive science. This course is a hands-on, computer-based workshop on how to visualize and analyze large quantities of biological data. The course introduces R, a modern statistical computing language and platform. Students will learn to use R to make scatter plots, bar plots, box plots, and other commonly used data-visualization techniques. The course will review statistical methods including hypothesis testing, analysis of frequencies, and correlation analysis. Student will apply these methods to the analysis of genomic and health data such as whole-genome gene expressions and SNP (single-nucleotide polymorphism) frequencies.
This 3-credit experimental course fulfills elective requirements for Biology Major I. Hunter pre-requisites are BIOL100, BIOL102 and STAT113.
Learning Goals
- Be able to use R as a plotting tool to visualize large-scale biological data sets
- Be able to use R as a statistical tool to summarize data and make biological inferences
- Be able to use R as a programming language to automate data analysis
Textbooks
- R Studio (Required): Learning RStudio for R Statistical Computing
- Digital textbook (Required): Data Analysis for the Life Sciences
Exams & Grading
- Attendance (or a note in case of absence) is required
- In-Class Exercises (50 pts).
- Assignments. All assignments should be handed in as hard copies only. Email submission will not be accepted. Late submissions will receive 10% deduction (of the total grade) per day.
- Three Mid-term Exams (3 X 30 pts each = 90 pts)
- Comprehensive Final Exam (50 pts)
- Bonus for active participation in classroom discussions
Course Outline
Feb 2. Introduction & tutorials for R/R studio
- Course overview
- Install R & RStudio on your home computers (Chapter 1. pg. 9)
- Tutorial 1: First R Session (pg. 12)
- Import abalone data set: http://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.data
- Assign column names: colnames(abalone) <- c("Sex", "Length", "Diameter", "Height", "Whole_Weight", "Shucked_weight", "Viscera_weight", "Shell_weight", "Rings")
- Tutorial 2. Writing R Scripts (Chapter 2. pg. 21)
- Tutorial 3. Vector
Assignment #1. Due 2/16, Tuesday |
---|
to be posted |