Biol20N02 2016: Difference between revisions

Revision as of 04:56, 2 February 2016

Analysis of Biological Data (BIOL 20N02, Spring 2016)
Instructor: Dr Weigang Qiu, Associate Professor, Department of Biological Sciences
Room: 1001B HN (North Building, 10th Floor, Mac Computer Lab)
Hours: Tuesdays 10-1
Office Hours: Belfer Research Building, BB-402; Wed 5-7 pm or by appointment
Course Website: http://diverge.hunter.cuny.edu/labwiki/Biol20N2_2016

Course Description

With the rapid accumulation of genome sequences and digitized health data, biomedicine is becoming a data-intensive science. This course is a hands-on, computer-based workshop on how to visualize and analyze large quantities of biological data. The course introduces R, a modern statistical computing language and platform. Students will learn to use R to make scatter plots, bar plots, box plots, and other commonly used data visualizations. The course will review statistical methods including hypothesis testing, analysis of frequencies, and correlation analysis. Students will apply these methods to the analysis of genomic and health data such as whole-genome gene expression and SNP (single-nucleotide polymorphism) frequencies.

This 3-credit experimental course fulfills elective requirements for Biology Major I. Hunter prerequisites are BIOL100, BIOL102, and STAT113.

Learning Goals

Be able to use R as a plotting tool to visualize large-scale biological data sets
Be able to use R as a statistical tool to summarize data and make biological inferences
Be able to use R as a programming language to automate data analysis

Textbooks

RStudio (Required): Learning RStudio for R Statistical Computing
Digital textbook (Required): Data Analysis for the Life Sciences

Exams & Grading

Attendance (or a note in case of absence) is required
In-Class Exercises (50 pts)
Assignments. All assignments should be handed in as hard copies only. Email submissions will not be accepted. Late submissions will receive a 10% deduction (of the total grade) per day.
Three Mid-term Exams (3 X 30 pts each = 90 pts)
Comprehensive Final Exam (50 pts)
Bonus for active participation in classroom discussions

Course Outline

Feb 2. Introduction & tutorials for R/RStudio

Course overview
Install R & RStudio on your home computers (Chapter 1, pg. 9)
Tutorial 1: First R Session (pg. 12)
Tutorial 2: Writing R Scripts (Chapter 2, pg. 21)

Assignment #1
Unix Text Filters (10 pts) Show both commands and outputs for the following questions:
1. Without changing directory (i.e., remain in your home directory), locate and long-list the GenBank file named "GBB.gb" in the course data directory
2. Count the total number of lines, show the first and last 10 lines of the file. Using a combination of `head` and `tail` commands, show only the lines containing the translated protein sequence of the first gene
3. Count the total number of replicons by extracting lines containing "LOCUS" (case sensitive); sort them by the total number of bases ("bp")
4. Remove the string "(plasmid" from the above output
5. Extract the second column (replicon names) from the above output. [Hint: these fields are delimited by an unequal number of spaces, not by tabs. Use `tr -s` to first squeeze to single space]
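
As a minimal sketch of the two idioms this assignment relies on (a `head`/`tail` pipeline to pick a range of lines, and `tr -s` followed by `cut` to pull a column out of space-padded text), the snippet below runs on a small made-up sample. The sample lines and the names `repA` to `repD` are hypothetical and are not taken from GBB.gb; the real file will need different line counts and may need extra filtering.

```shell
# Hypothetical sample (NOT from GBB.gb): fields are padded with runs of
# spaces of unequal length, as the hint in the assignment describes.
sample='LOCUS       repA      1200 bp    DNA
LOCUS       repB        30 bp    DNA
LOCUS       repC     45000 bp    DNA
LOCUS       repD       910 bp    DNA'

# Select lines 2-3: keep the first 3 lines, then the last 2 of those.
middle=$(printf '%s\n' "$sample" | head -n 3 | tail -n 2)
printf '%s\n' "$middle"

# Squeeze every run of spaces to one space, then cut out field 2.
names=$(printf '%s\n' "$sample" | tr -s ' ' | cut -d ' ' -f 2)
printf '%s\n' "$names"
```

Without the `tr -s ' '` step, `cut -d ' ' -f 2` would return empty strings here, because `cut` treats every single space as a field boundary and so sees an empty field between two adjacent spaces.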

Feb 9. No class (Friday Schedule)

Contents

Course Description

Learning Goals

Textbooks

Exams & Grading

Course Outline

Feb 2. Introduction & tutorials for R/RStudio

Feb 9. No class (Friday Schedule)

Feb 16. Introduction & tutorials for R/RStudio

Feb 23. Statistics & samples

March 1. Displaying data

March 8. Describing data; Exam 1.

March 15. Probability and hypothesis testing

March 22. Analysis of proportions

March 29. Analysis of frequencies

April 5. Contingency tests; Exam 2

April 12. Normal distribution and controls

April 19. Comparing two means

April 26. No Class (Spring break)

May 3. Designing experiments

May 10. Comparing more than two groups; Exam 3

May 17. Correlation analysis

May 24. Final Exam (Comprehensive)

May 31. Grades submitted to Registrar's Office
