
2020_Bioinformatics_Research_Experience

Base repository for Coriell Institute's inaugural Bioinformatics Research Experience

Coriell Bioinformatics Research Experience 2020

The Bioinformatics Research Experience is a four-week research training program for undergraduate students interested in learning to analyze biological data. The following year’s material is available here: 2021


Getting Started

Programs to Install

  1. Zoom https://zoom.us/download
  2. Slack https://slack.com/. If you’re familiar with Slack, our workspace name is coriellbioinf-se37156. Otherwise, follow the attached directions in slack_intructions.pdf.
  3. Download and install TeamViewer https://www.teamviewer.com/en-us/. This will allow the research experience team to screen share with your computer to assist with technical problems.
  4. R. If you don’t already have it installed, go to CRAN https://cloud.r-project.org/ to download and install R.
  5. RStudio. Go to RStudio’s website https://rstudio.com/products/rstudio/download/ and download the FREE version.
  6. Cisco AnyConnect VPN. See the attached directions in cisco_vpn.pdf.
  7. GitHub https://github.com/. If you don’t already have a GitHub account, sign up for one; it’s free.

Operating System Specific Programs

Mac
  1. You will need to use the Terminal application to connect to remote servers. It comes with the Mac, so no additional software is needed, but it sits in the Utilities subfolder rather than at the top level of the Applications folder. Use Spotlight to search for Terminal, then pin it to the Dock so it’s ready when we get to it.
  2. Download and install XQuartz https://www.xquartz.org/. This will let you view images from the server.
  3. Download and install Cyberduck https://cyberduck.io/. This will let you transfer files to and from your computer and the server.
  4. Install Git https://git-scm.com/download/mac
PC
  1. Download and install PuTTY https://www.chiark.greenend.org.uk/~sgtatham/putty/. This will let you connect to the server.
  2. Download and install Xming https://sourceforge.net/projects/xming/. This will let you view images from the server.
  3. Download and install WinSCP https://winscp.net/eng/index.php. This will let you transfer files to and from your computer and the server.
  4. Install Git https://git-scm.com/download/win
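
Once the programs above are installed, one way to confirm that the command-line ones are reachable is to look them up on your PATH. The following is a minimal sketch and not part of the course material. It assumes Python 3 is available, and the executable names in `TOOLS` (`git`, `Rscript`, `ssh`) are assumptions; on a PC, for example, you connect with PuTTY rather than `ssh`, and `Rscript` may not be on the PATH even when R is installed.

```python
import shutil

# Hypothetical list of executables to look for; adjust it to your own setup.
TOOLS = ["git", "Rscript", "ssh"]


def find_tools(names):
    """Map each executable name to its full path, or None if it is not on the PATH."""
    return {name: shutil.which(name) for name in names}


def report(found):
    """Build one status line per tool from the mapping returned by find_tools."""
    lines = []
    for name, path in found.items():
        status = path if path else "NOT FOUND on PATH"
        lines.append(f"{name:10s} {status}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(report(find_tools(TOOLS)))
```

A tool reported as not found is not necessarily missing; it may simply be installed somewhere that is not on the PATH.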

Schedule

Daily Schedule

Event Schedule

| Date | Event | Time | Topic |
|------|-------|------|-------|
| 7/07 | Daily Lecture | 9AM | Program introduction; Introduction to Git |
|      | Coriell Journal Club | 12PM | Jaroslav Jelinek, MD, PhD |
| 7/08 | Daily Lecture | 9AM | R: Introduction to the tidyverse, Rmarkdown, and dplyr |
|      | BRE Talk | 12PM | Jaroslav Jelinek, MD, PhD: Epigenetics and DNA methylation |
| 7/09 | Daily Lecture | 9AM | R: Plotting with ggplot2 |
|      | BRE Talk | 12PM | Jozef Madzo, PhD |
| 7/10 | Daily Lecture | 9AM | How to read a scientific paper |
|      | BRE Talk | 12PM | TBD |
| 7/14 | Daily Lecture | 9AM | R: Importing and tidying data with readr and tidyr |
|      | No talk, Coriell Seminar cancelled | | |
| 7/15 | Daily Lecture | 9AM | R: Clustering |
|      | BRE Talk | 12PM | Himani Vaidya, MS |
| 7/16 | Daily Lecture | 9AM | R: Statistics |
|      | BRE Talk | 12PM | Dara Kusic, PhD |
| 7/17 | Daily Lecture | 9AM | Catch up day |
|      | Participant Presentations | 12PM | First 3 review papers presented, 20 min per person |
| 7/21 | Daily Lecture | 9AM | How to use a Linux server |
|      | Coriell Journal Club | 12PM | Nahid Turan, PhD |
| 7/22 | Daily Lecture | 9AM | Processing RNA-seq data |
|      | BRE Talk | 12PM | Laura Scheinfeldt, PhD |
| 7/23 | Daily Lecture | 9AM | Analyzing RNA-seq data |
|      | BRE Talk | 12PM | Gennaro Calendo, MS |
| 7/24 | Daily Lecture | 9AM | Introduction to final project |
|      | Participant Presentations | 12PM | First 3 review papers presented, 20 min per person |
| 7/28 | Final project work day | | |
|      | Coriell Seminar | 12PM | Dara Kusic, PhD |
| 7/29 | Final project work day | | |
|      | BRE Talk | 12PM | Matt Mitchell, PhD |
| 7/30 | Final project work day | | |
|      | BRE Talk | 12PM | Shoghag Panjarian, PhD |
| 7/31 | Final project presentations | 12PM | All participants: 10-minute presentation on the RNA-seq data analyzed |

BRE Material

July 07: Introduction to Git


Today’s Assignment: GitHub Practice https://classroom.github.com/a/kNgsFDtr


July 08: Introduction to R and the Tidyverse


Today’s Assignment: dplyr https://classroom.github.com/a/6j30_kft


July 09: Plotting with ggplot2


Today’s Assignment: dplyr https://classroom.github.com/a/WCFxfVF_


July 10: How to Troubleshoot Code and How to Read a Scientific Paper


Journal Club Assignment


July 14: Reading Data with readr and Tidying Data with tidyr


Today’s Assignment: readr and tidyr https://classroom.github.com/a/rLa6syth


July 15: Clustering


Today’s Assignment: Clustering https://classroom.github.com/a/UDYxMOkf


July 16: Statistics


Today’s Assignment: Statistics https://classroom.github.com/a/4NWo-P_3


July 17: Grab Bag and Exploratory Data Analysis


Today’s Assignment: Exploratory Data Analysis https://classroom.github.com/a/sUlgFd-I
Because this is a more involved assignment, it’s due Friday 7/24.


July 21: Command-Line Server Navigation


Today’s Assignment: Practice Server Commands by Playing Terminus https://classroom.github.com/a/sIUEXjYO


July 22: More Command-Line


July 23: Process RNA-seq Data Day 1


July 24: Process RNA-seq Data Day 2


Assignment: Process RNA-seq Data


July 28: Analyze RNA-seq Day 1


July 29: Analyze RNA-seq Day 2


Assignment: Analyze RNA-seq


July 30: Analyze RNA-seq Day 3: Pathway Analysis


Example RNA-seq Analysis

Here’s an example RNA-seq analysis in Rmd, HTML, and PDF formats.

The count files used in the example are on the course server at /mnt/data/encode_tissue_data/ and are also posted here on GitHub at RNA-seq/rnaseq_example/count_tables/.
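
If you have cloned this repository, a quick way to see which count files are available locally is to list that directory. The sketch below is illustrative only. It assumes Python 3 and a local clone whose root you pass as `repo_root`. The helper name `list_count_tables` is hypothetical, and nothing is assumed about the format of the files themselves, so the code lists names without reading them.

```python
from pathlib import Path


def list_count_tables(repo_root):
    """Return the sorted file names in RNA-seq/rnaseq_example/count_tables/.

    repo_root is assumed to be the top-level directory of a local clone.
    An empty list is returned when the directory does not exist.
    """
    table_dir = Path(repo_root) / "RNA-seq" / "rnaseq_example" / "count_tables"
    if not table_dir.is_dir():
        return []
    return sorted(p.name for p in table_dir.iterdir() if p.is_file())


if __name__ == "__main__":
    # Assumes the script is run from the root of the clone.
    names = list_count_tables(".")
    if names:
        print("\n".join(names))
    else:
        print("No count tables found under the current directory.")
```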