This workshop is designed to give beginners a solid foundation in Python programming, with a specific focus on applications in cancer biology. Participants will gain a thorough understanding of essential programming concepts through a blend of theoretical lessons, hands-on coding exercises, and practical applications.
The workshop starts with the fundamentals and gradually introduces more advanced topics, focusing on the pandas library for efficient data handling and analysis and the matplotlib library for data visualization. By the end of the workshop, attendees will be equipped with the skills to make their scientific research more reproducible and efficient through powerful data analysis tools and effective visualization techniques.
Sanduni Rajapaksa
Participants will gain the following skills:
- Proficiency in using Python for data analysis.
- Basic Python programming skills.
- Reading, tidying, and joining datasets using the pandas library.
- Data manipulation and transformation using the pandas library.
- Creating various types of plots using the matplotlib library.
The Metabric study characterized the genomic mutations and gene expression profiles for 2509 primary breast tumours. In addition to the gene expression data generated using microarrays, genome-wide copy number profiles were obtained using SNP microarrays. Targeted sequencing was performed for 2509 primary breast tumours, along with 548 matched normals, using a panel of 173 of the most frequently mutated breast cancer genes as part of the Metabric study.
Both the clinical data and the gene expression values were downloaded from cBioPortal.
We excluded patient tumour samples that lack expression data, so the resulting dataset has fewer rows.
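As a rough illustration of that filtering step, the sketch below joins a toy clinical table to a toy expression table with pandas and keeps only the patients present in both. The column names (`Patient_ID`, `ER_status`, `ESR1`) and all values are invented for this example and are not taken from the Metabric files; the workshop's actual preprocessing may differ.

```python
import pandas as pd

# Toy stand-ins for the clinical and expression tables.
# Column names and values are hypothetical, for illustration only.
clinical = pd.DataFrame({
    "Patient_ID": ["P1", "P2", "P3", "P4"],
    "ER_status": ["Positive", "Negative", "Positive", "Negative"],
})
expression = pd.DataFrame({
    "Patient_ID": ["P1", "P3", "P4"],
    "ESR1": [10.2, 9.8, 6.1],
})

# An inner join keeps only patients found in both tables,
# so patients without expression data (here P2) are dropped.
combined = clinical.merge(expression, on="Patient_ID", how="inner")

print(f"{len(clinical)} clinical rows -> {len(combined)} rows with expression data")
```

An inner join is only one way to do this; a left join followed by `dropna` on the expression columns would give the same rows while making the missing values visible first.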
This content was adapted from the following course materials:
- R for Data Science book
- OHI Data Science Training
- Data Carpentry
- WEHI tidyr coursebook by Brendan R. E. Ansell
- Content developed by Maria Doyle