Modern biological research projects regularly employ techniques capable of generating extremely large data sets. Specifically, microbiome investigations use amplicon surveys (16S rRNA, ITS, or 18S rRNA gene sequencing) or metagenomic approaches to assess microbial ecology, and gene expression studies use RNA-seq technology to identify differential gene regulation. Analysis of data resulting from any of these techniques requires proficiency in computational (UNIX, R) and statistical (exploratory data analysis, hypothesis testing, uni- and multivariate analysis) techniques.
The analysis of both microbiome and gene expression data begins with an appropriate understanding of the study design and metadata, proceeds through data quality control and filtering, quantifies the filtered data, and ultimately produces tabular count data. For microbiome studies, this is a table of counts for each microbial taxon per sample; for gene expression studies, it is a table of transcript counts per sample. Additional data types (e.g. taxonomic assignments, gene functional information) may also be created during this process and ultimately associated or merged with the count table. These data types can then be explored using various plots and interrogated using statistical techniques. This workshop provides instruction on how to proceed through each of these stages, providing a strong foundation for working with count data and for subsequent statistical analysis, plotting, and interpretation.
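As a rough illustration of the count table described above, the following minimal Python sketch filters a toy table of counts per taxon per sample and then merges it with taxonomic assignments. It is not taken from the course materials: the feature identifiers, sample names, counts, taxon labels, the `min_total` threshold, and both helper functions are hypothetical and chosen only for illustration.

```python
# Hypothetical toy count table: one row per feature (taxon), one column per sample.
counts = {
    "ASV1": {"sampleA": 120, "sampleB": 0, "sampleC": 35},
    "ASV2": {"sampleA": 3, "sampleB": 1, "sampleC": 0},
    "ASV3": {"sampleA": 0, "sampleB": 250, "sampleC": 410},
}

# Hypothetical taxonomic assignments keyed by the same feature identifiers.
taxonomy = {
    "ASV1": "Lactobacillus",
    "ASV2": "Unclassified",
    "ASV3": "Prevotella",
}


def filter_low_abundance(table, min_total):
    """Keep only features whose summed count across all samples is >= min_total."""
    return {
        feature: row
        for feature, row in table.items()
        if sum(row.values()) >= min_total
    }


def merge_taxonomy(table, tax):
    """Attach a taxon label to each feature row; missing labels become 'Unassigned'."""
    return {
        feature: {"taxon": tax.get(feature, "Unassigned"), **row}
        for feature, row in table.items()
    }


# Filtering step: drop features with fewer than 10 total reads (arbitrary threshold).
filtered = filter_low_abundance(counts, min_total=10)

# Merge step: associate the additional data type (taxonomy) with the count table.
merged = merge_taxonomy(filtered, taxonomy)

for feature, row in merged.items():
    print(feature, row)
```

In practice, tables like this are usually handled with dedicated data-frame tooling rather than plain dictionaries; the sketch only shows the shape of the data and the order of the filter and merge steps.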
Requirements: Students are required to bring their own laptops to participate in the course. All software and data will be provided and managed by the course organizers. No previous experience in bioinformatics is needed.
Collaborating Institutions:
Harvard University Center for AIDS Research (CFAR)
Sub-Saharan African Network for TB/HIV Research Excellence (SANTHE)
Centre for the AIDS Programme of Research in South Africa (CAPRISA)
Organizing Team:
Scott Handley, Washington University
Doug Kwon, Ragon Institute
Matt Hayward, Harvard University
Barry L. Hykes, Washington University
Chandni Desai, Washington University
KRISP Teaching Assistants
SCHEDULE
Date | Day | Time | Presenter | Topic | Location |
---|---|---|---|---|---|
8 Oct | Monday | 9a – 12p | Sophie Shaw | Introduction to Unix | AHRI Seminar Rooms 1&2 |
8 Oct | Monday | 2p – 5p | Sophie Shaw | Introduction to Sequencing Data and Quality Control | AHRI Seminar Rooms 1&2 |
9 Oct | Tuesday | 9a – 12p | Lindsay Droit | Building Successful Sequencing Libraries | AHRI Seminar Rooms 1&2 |
9 Oct | Tuesday | 2p – 5p | Scott Handley | Introduction to Data Science with R (slides, exercise) | AHRI Seminar Rooms 1&2 |
10 Oct | Wednesday | 9a – 12p | Scott Handley | Preprocessing Microbiome Data for Quantitative Microbiome Analysis (slides) | AHRI Seminar Rooms 1&2 |
10 Oct | Wednesday | 2p – 5p | Scott Handley | Processing Microbiome Data for Quantitative Microbiome Analysis (exercise) | AHRI Seminar Rooms 1&2 |
11 Oct | Thursday | 9a – 12p | Chandni Desai | Host Differential Gene Expression Analysis (slides) | Onomo Hotel |
11 Oct | Thursday | 2p – 5p | Chandni Desai & Barry Hykes | Host Differential Gene Expression Analysis (exercise) | Onomo Hotel |
12 Oct | Friday | 9a – 12p | Matt Hayward | Metagenomic Analysis | Onomo Hotel |
12 Oct | Friday | 2p – 5p | Everyone | Open Lab | Onomo Hotel |
12 Oct | Friday | 5p – 6p | Everyone | Reception | Onomo Hotel |