PMacDaSci/R4CancerSci

Introduction

This site contains the materials for an R course run by Peter Mac.

The materials were adapted from the course run by the Bioinformatics Core at the Cancer Research UK Cambridge Institute (contributors: Matthew Eldridge, Chandra Chilamakuri, Mark Fernandes, Ashley Sawle, Kamal Kishore, Sergio Martinez Cuesta, Rory Stark).

April-May 2022

Instructors

  • Maria Doyle, Research Computing
  • Cassie Litchfield, Bioinformatics Core
  • Miriam Yeung, Dawson Labs

Description

R is one of the leading programming languages in Data Science and the most widely used within Peter Mac for interacting with, analyzing and visualizing cancer biology datasets. This training is a friendly 'next-steps' R course for beginners who have completed an introductory R training and would like to consolidate their skills. It is an opportunity to build those skills through supported practice with cancer biology data as part of a class.

Learning Objectives

By the end of this course you should be able to:

  • Interact with R using RStudio
  • Import/export data (e.g. Excel, csv files) into/from R
  • Manipulate and reformat tabular data
  • Create visualisations (scatterplots, bar charts, boxplots, histograms and line graphs for time series data)
  • Generate reproducible reports for your analyses

This course is for you if

  • You've completed DataCamp's Introduction to the Tidyverse (contact Maria if you're a Peter Mac staff/student who needs access) and are looking to build on your skills. Alternatively, previous attendance at University of Melbourne's Introduction to R or equivalent is sufficient.
  • You have ~1-2 hours to spend on the course each week during the 4 weeks (working through material and exercises in your own time at your own pace) and are able to attend all 4 live 1hr recap/Q&A sessions.

Structure

The course will be run over 4 sessions with the following structure:

  • Material covering the concepts introduced in each lesson, to go through in your own time, with an assignment of exercises to practise concepts from that lesson and previous ones (~1-2hrs per week)
  • Live recap/Q&A session in hybrid format, online and in person (1hr per week)
  • Teams channel for the class to ask any questions or get help in-between the live sessions

Schedule

Course setup
Installing R and RStudio

Week 1
Introduction to working with data in RStudio
Interacting with R using RStudio, importing and viewing data, generating and editing a reproducible report.

Live recap/Q&A session Tuesday April 26th 3.30-4.30pm

Week 2
Data visualization with ggplot2
A common grammar to create scatter plots, bar charts, boxplots, histograms and line graphs for time series data.

Live recap/Q&A session Tuesday May 3rd 3.30-4.30pm

Week 3
Data manipulation using dplyr
Filtering and modifying tabular data, computing summary values, faceting with ggplot2.

Live recap/Q&A session Tuesday May 10th 3.30-4.30pm

Week 4
Grouping, combining, and restructuring data for analysis
Advanced grouping and summarization operations, joining data from different tables, the concept of 'tidy data', pivoting and separating operations, customizing ggplot2 plots.

Live recap/Q&A session Tuesday May 17th 3.30-4.30pm

Competition!
Submit an Rmd report showing some of the skills you've learned, using the course data or your own data, as simple or as complex as you like. Submit after the course ends, by May 24th. The best report will win a small prize.

Resources

Refer to the R for Data Science book for more information on the topics covered in this course.
