2018-spring_compbio


CMSE 491: Bioinformatics and Computational Biology

Description

This course is an introduction to contemporary topics in bioinformatics and computational biology, dealing with combining large-scale data and modern analytical techniques to gain biological/biomedical insights. In each topic, centered around a recent paper, we will discuss the major biological & biomedical questions, explore the relevant molecular/genomics/biomedical datasets, and understand the underlying statistical, probabilistic, & machine-learning approaches.

Students will learn how to formulate problems for quantitative inquiry, design computational projects, understand and think critically about data & methods, communicate research findings, perform reproducible research, and practice open science. Students will apply all of these by carrying out a group project, presenting their project in class, and submitting a report at the end of the course.

Note
Open to both undergraduate and graduate students. Counts toward the CMSE minor, graduate certificates, and dual PhD. Please email Heather Johnson at [email protected] for an override.

Prerequisites
CMSE 201 and two semesters of introductory biology (LB 144 and 145 OR BS 161 and 162 OR BS 181H and 182H, or equivalent). Statistics at the level of STT 231 is strongly recommended. In short, we assume that you:

  • know how to code in one of the mainstream languages such as Python or R,
  • have an understanding of basic statistics and probability, and
  • have studied basic genetics, molecular biology, and cellular biology.

Instructor Contact Information

Arjun Krishnan ...
Affiliation Dept. Computational Mathematics, Science, and Engineering
Dept. Biochemistry and Molecular Biology
Office 2507H Engineering Building
Contact Email: [email protected]
Twitter: @compbiologist
Website: https://cmse.msu.edu/directory/faculty/arjun-krishnan/

[ Top ]

Course Outline and Materials

Major Topics

(subject to changes)

  • Genome assembly and annotation
  • Sequence alignment and pattern finding
  • Comparative genomics
  • Genetic variation and quantitative genetics
  • Regulatory genomics
  • Functional genomics
  • Molecular and digital evolution
  • Molecular docking and dynamics
  • Protein structure prediction
  • Modeling cellular pathways
  • Metabolomics
  • Large-scale biological networks

Each topic is covered over two classes, a "Lecture" and a "Paper discussion".

Recommended Materials

TBD

[ Top ]

Schedule, Location, Calendar, and Office Hours

S/L Info
Schedule Mon and Wed
8:30-9:50 am
Location 202 Urban Planning & Land. Arch. Building

Calendar

This calendar contains the class schedule and the links to the lecture slides and reading materials. Download the PDF file here.

Week Date Topic Content Learning Materials
Week 1 Mon, Jan 08 Introduction and Overview Course overview In-coming survey [Link]

Lecture1 [PDF]

Required reading:
- All biology is computational biology [Journal] [PDF]
- Computational biologists: moving to the driver's seat [Journal] [PDF]
Week 1 Wed, Jan 10 Introduction and Overview Getting started Lecture2 [PDF]

Required reading:
- So, you want to be a computational biologist [Journal] [PDF]
Week 2 Mon, Jan 15 No class (MLK Day)
Week 2 Wed, Jan 17 Genome assembly and annotation Lecture Lecture3 [PDF]
Week 3 Mon, Jan 22 Genome assembly and annotation Paper discussion Velvet: Algorithms for de novo short read assembly using de Bruijn graphs [Journal] [PDF]
Week 3 Wed, Jan 24 Sequence alignment and pattern finding Lecture Lecture4 [PDF]

Optional reading:
- BLAST Wikipedia page
- BLAST Glossary
- BLAST Practical details
- Statistics of sequence similarity scores
Week 4 Mon, Jan 29 Sequence alignment and pattern finding Paper discussion Basic local alignment search tool [Journal] [PDF]

Additional reading:
- Rapid and sensitive protein similarity searches [Journal] [PDF]
- FASTA (the algorithm) Wikipedia page
- An introduction to sequence similarity ("homology") searching [Journal] [PDF]
- Substitution matrix Wikipedia page
- Selecting the right similarity-scoring matrix [Journal] [PDF]
Week 4 Wed, Jan 31 Comparative genomics Lecture TBD

Additional reading:
- Comparative genomics as a tool to understand evolution and disease [Journal] [PDF]
- Dissecting evolution and disease using comparative vertebrate genomics [Journal] [PDF]
Week 5 Mon, Feb 05 Comparative genomics Paper discussion Whole-genome alignment:
- MUMmer1: Alignment of whole genomes [Journal] [PDF]
- MUMmer2: Fast algorithms for large-scale genome alignment and comparison [Journal] [PDF]
- MUMmer3: Versatile and open software for comparing large genomes [Journal] [PDF]
- MUMmer4: A fast and versatile genome alignment system [Journal] [PDF]
Week 5 Wed, Feb 07 Genetic variation and quantitative genetics Lecture TBD
Week 6 Mon, Feb 12 Genetic variation and quantitative genetics Paper discussion Genome-wide association studies and multiple hypothesis correction:
- Mining Genome-Wide Genetic Markers [Journal] [PDF]
- Genome-Wide Association Studies [Journal] [PDF]
- How does multiple testing correction work? [Journal] [PDF]
- Statistical significance for genomewide studies [Journal] [PDF]
Week 6 Wed, Feb 14 Regulatory genomics Lecture Lecture7 [PDF]
Week 7 Mon, Feb 19 Regulatory genomics Paper discussion - What are DNA sequence motifs? [Journal] [PDF]
- How does DNA sequence motif discovery work? [Journal] [PDF]
- What is the Expectation Maximization algorithm? [Journal] [PDF]
- Practical Strategies for Discovering Regulatory DNA Sequence Motifs [Journal] [PDF]
Week 7 Wed, Feb 21 Functional genomics Lecture Lecture8 [PDF]
Week 7 Mon, Feb 26 Functional genomics Paper discussion TBD
Week 8 Wed, Feb 28 Mini Primers TBD TBD
Week 8 Mon, Mar 05 No class (Spring break)
Week 9 Wed, Mar 07 No class (Spring break)
Week 9 Mon, Mar 12 Mid-term project proposal presentations
Week 10 Wed, Mar 14 Molecular and digital evolution Lecture Lecture9 [PDF]
Week 10 Mon, Mar 19 Molecular and digital evolution Paper discussion TBD
Week 11 Wed, Mar 21 Molecular docking and dynamics Lecture Lecture10 [PDF]
Week 11 Mon, Mar 26 Molecular docking and dynamics Paper discussion - Predicting protein structures with a multiplayer online game [Journal] [PDF]
- GROMACS tutorial
Week 12 Wed, Mar 28 Protein structure prediction Lecture Lecture11 [PDF]
Week 12 Mon, Apr 02 Protein structure prediction Paper discussion - Evolutionarily conserved networks of residues mediate allosteric communication in proteins [Journal] [PDF]

Additional reading:
- Inferring Pairwise Interactions from Biological Data Using Maximum-Entropy Probability Models [Journal] [PDF]
- Gleaning structural and functional information from correlations in protein multiple sequence alignments [Journal] [PDF]
Week 13 Wed, Apr 04 Modeling cellular pathways Lecture Lecture12 [PDF]
Week 13 Mon, Apr 09 Modeling cellular pathways Paper discussion Construction of a genetic toggle switch in Escherichia coli [Journal] [PDF]

Additional reading:
- Networks dynamics and cell physiology [Journal] [PDF]
- Computational studies of gene regulatory networks: in numero molecular biology [Journal] [PDF]
- Network motifs: theory and experimental approaches [Journal] [PDF]
Week 14 Wed, Apr 11 Metabolomics Lecture Lecture13 [PDF]
Week 14 Mon, Apr 16 Metabolomics Paper discussion - Network-based prediction of human tissue-specific metabolism [Journal] [PDF]
- Integration of expression data in genome-scale metabolic network reconstructions (Mini supplementary reading to help with the main paper) [Journal] [PDF]

Additional reading:
- A protocol for generating a high-quality genome-scale metabolic reconstruction [Journal] [PDF]
- What is flux balance analysis? [Journal] [PDF]
- Applications of genome‐scale metabolic reconstructions [Journal] [PDF]
Week 15 Wed, Apr 18 Large-scale biological networks Lecture Lecture14 [PDF]
Week 15 Mon, Apr 23 Large-scale biological networks Paper discussion Genomic analysis of regulatory network dynamics reveals large topological changes [Journal] [PDF]

Additional reading:
- Network biology: understanding the cell's functional organization [Journal] [PDF]
- Network medicine: a network-based approach to human disease [Journal] [PDF]
- Learning biological networks: from modules to dynamics [Journal] [PDF]
- Inferring cellular networks – a review [Journal] [PDF]
Week 16 Wed, Apr 25 Final project presentations 1
Week 16 Mon, Apr 30 Final project presentations 2

Project deadlines

Item Due date
Project profile Mon, Jan 29
Project topic/team Wed, Feb 07
Project pre-proposal Wed, Feb 14
Project proposal Mon, Feb 26
Proposal reviews Mon, Mar 05
Mid-term project proposal presentations Mon, Mar 12
Review response Wed, Mar 14
Mid-course project report Wed, Apr 04
Final project report Wed, Apr 25
Final project presentations 1 Wed, Apr 25
Final project presentations 2 Mon, Apr 30

Office Hours

Wednesday 5-7pm

I will block this time from my schedule and be present in my office.

A couple of things to note:

  1. While I'm happy to chat with you in person, many times, just sending me a message on Slack with your questions/concerns might work as well. So, if you have specific Qs in mind, just shoot me a message and let's see if we can resolve it then and there.
  2. If you would indeed like to meet in person, please try to meet me during this time. But, don't worry if you can't make it during this window for some reason. Again, just send me a message on Slack and we'll find a time that works for both of us.

[ Top ]

Website and Communication

Course website

This GitHub repo will serve as the course website.

Communication

The primary mode of communication in this course (including major announcements) will be the course Slack account https://cmse491bioinfocompbio.slack.com. All of you should have invitations to join this account in your MSU email.

Emails
Although the bulk of the communication will take place via Slack, at times (rarely), we will send out important course information via email. This email is sent to your MSU email address (the one that ends in “@msu.edu”). You are responsible for all information sent out to your University email account, and for checking this account on a regular basis.

[ Top ]

Course Activities

Pre-class Assignments

For each topic, you will be assigned a paper after the topic's "lecture" class that you are required to read. The link to the PDF of the paper will be posted on this page next to the topic on the Calendar along with instructions on a specific analysis in the paper you should pay special attention to.

You are required to turn in a two-page report containing the following:

Part 1 (page 1) of your report: Analysis and summary of the paper as a whole. It should contain the following three sections:

  1. Background
    • What problem/task is the paper trying to address? What is the biological problem and/or the computational challenge?
    • How was the problem/task addressed up to that point and what were the limitations of the then current practice?
  2. Summary of contributions
    • What is the contribution of this paper in terms of approach, algorithmic techniques, computational ideas?
    • Summarize the conclusion(s) you draw from each main figure in the paper. Based on your conclusions, are the major claims of the paper justified?
  3. Limitations and Future directions
    • What are the limitations of this paper or open questions that arise from this paper?
    • What are your suggestions for addressing these limitations?

Part 2 (page 2) of your report: Critical and thorough analysis of a specific section/figure. It should contain the following four sections:

  1. Data
    • What data did they use to perform the analysis presented in the figure?
    • Where did each piece of data come from?
  2. Methods
    • What techniques and algorithms did they use?
    • Does the paper have a detailed description of how to perform this analysis, enough for someone to repeat that analysis?
    • Does the paper have source code to reproduce the presented results?
  3. Evaluation
    • What are the measures/metrics plotted for each figure panel?
    • How is the success of each analysis evaluated?
  4. Conclusions
    • Are the conclusions you draw from the figure in agreement with those drawn by the authors?

This report is due before the topic's "Paper discussion" class. Points will be deducted for reports shorter or longer than two pages. A report on the wrong paper will receive zero points.

Class Participation

In general:

  • Do the pre-class assignments and additional readings.
  • Show up to class.
  • Work in groups during in-class discussion sessions.
  • No one will have the perfect background: Ask questions about computational or biological concepts.
  • Correct me when I am wrong.

Paper discussion

You will also take turns presenting the assigned paper during each topic's "Paper discussion" class. Make sure you sign up.

  • Two students together present each paper.
  • The presentation should focus on the computational/analytical parts, not necessarily on detailed biological background & conclusions.
    1. What is the problem the authors are trying to solve? [description of the problem along with why it is important]
    2. What are their claims about the then current practices and their limitations? [existing approaches to solve the problem & their pros-and-cons]
    3. What’s their approach? What’s new in it and what is their rationale for it being potentially successful? [description of the new ideas, their merits in comparison to existing ones, and rationale]
    4. What are the major contributions and limitations of this paper?
    5. What are some open questions and next steps (for addressing the limitations)?
  • The two students will also make a note of all the points discussed in class during the presentation, write them up by working with me, and post the discussion on PubPeer.

Scribing

Each topic's "Lecture" class will have two dedicated scribes who will take notes on the lecture, work with Arjun to refine the notes, and circulate a final version to the rest of the class.

  • The scribes should submit their individually completed drafts of their scribe notes within 3 days of the lecture. I will read those notes and give comments/suggestions.
  • The two scribes should then work together to combine their drafts and comments into a single final set of scribe notes and submit it within 6 days of the lecture.

Semester Project and Presentation

A major goal of this course is to prepare you for performing original research in computational biology, and for effectively presenting your ideas and research. The semester project will serve as the most practical way to do exactly that.

Students are encouraged to form teams of two and work on this project by leveraging their complementary skills. Projects can take any one of the following flavors:

  • Design and implement a new computational method for a task in biology
  • Improve an existing method
  • Perform an evaluation of several existing methods
  • Develop fully reproducible documentation and a codebase for an existing analysis in a paper

The outcomes of this semester-long project should include:

  1. Well-documented code to:
    • download and process the data
    • perform the computational analysis and generate all the results
    • visualize the results as various plots
  2. Detailed final report containing the following sections:
    • Abstract
    • Introduction
    • Data and Methods
    • Results and Discussion
    • Limitations and Future Directions
    • References
  3. A presentation (slides) that describes your project: motivation, exact problem, approach, results, discussion & conclusions, limitations & future directions, acknowledgements.

There are several project deadlines throughout the course that will help you stay on track, allowing you to complete a substantial project.

  1. Describe your previous research, areas of research interest in bioinformatics / computational-biology, type of project that best fits your interests. Post this description in a profile that lets your classmates know you and find potential partners. Project profile due Mon, Jan 29.
  2. Discuss with Arjun (and any other PI), read recent papers, talk to potential partners. Describe project ideas and form groups. Project topic/team due Wed, Feb 07.
  3. Prepare a two-page pre-proposal (Page1: text; Page2: figures & references). Project pre-proposal due Wed, Feb 14.
  4. Write a 4-page proposal describing project goals, division of work, milestones, datasets, and challenges. Project proposal due Mon, Feb 26.
  5. Review and discuss proposals (NIH review format). Reviews due Mon, Mar 05.
  6. Address peer evaluations, revise aims, scope, list of final goals & deliverables. Response due Wed, Mar 14 (note: due after presentation, not Mar 12).
  7. Continue making substantial progress on proposed milestones. Write outline/first-draft of final report. Meet Arjun to discuss all results and get feedback on the draft. Mid-course project report due Wed, Apr 04.
  8. Complete milestones, finalize results, figures, write-up in conference publication format. As part of the report, comment on your overall project experience. Final project report due Wed, Apr 25.
  9. Final project presentations will take place on Wed, Apr 25 and Mon, Apr 30.

After you have formed teams, I will create private channels for each team on Slack so that you can effectively communicate with each other.

[ Top ]

Grading Information

Activity Percentage
Pre-class assignments ~35%
Class participation ~15%
Scribing ~10%
Project ~40%

Grading Scale

Point Percentage
4.0 ≥ 90%
3.5 ≥ 85%
3.0 ≥ 80%
2.5 ≥ 75%
2.0 ≥ 70%
1.5 ≥ 65%
1.0 ≥ 60%
0.0 < 60%

Note: Grades will not be curved. Your grade is based on your own effort and progress, not based on competition with your classmates.
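
To make the arithmetic concrete, here is a minimal sketch of how the two tables above could combine into a final grade point. It assumes the approximate ("~") activity percentages are used as exact weights and that each activity is scored from 0 to 100; the function names and example scores are hypothetical, and attendance-based reductions are not modeled.

```python
# Illustrative sketch only: the weights are the approximate ("~") values from
# the Activity table, treated here as exact for the sake of the example.
WEIGHTS = {
    "pre_class": 0.35,
    "participation": 0.15,
    "scribing": 0.10,
    "project": 0.40,
}

# (minimum percentage, grade point) pairs from the Grading Scale, highest first.
SCALE = [(90, 4.0), (85, 3.5), (80, 3.0), (75, 2.5), (70, 2.0), (65, 1.5), (60, 1.0)]


def overall_percentage(scores):
    """Weighted average of per-activity scores, each given as a percentage (0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def grade_point(percentage):
    """Map an overall percentage to a grade point; anything below 60% maps to 0.0."""
    for cutoff, point in SCALE:
        if percentage >= cutoff:
            return point
    return 0.0


# Hypothetical scores for one student, not real data.
example = {"pre_class": 88, "participation": 95, "scribing": 90, "project": 84}
total = overall_percentage(example)
print(f"overall: {total:.2f}%, grade point: {grade_point(total)}")
```

Because grades are not curved, the mapping depends only on your own overall percentage and never on how classmates score.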

[ Top ]

Attendance, Conduct, Honesty, and Accommodations

Class Attendance

This class is heavily based on material presented and worked on in class, and it is critical that you attend and participate fully every week! Therefore, class attendance is absolutely required. An unexcused absence will result in zero points for the day. Arriving late or leaving early without prior arrangement with the instructor of your session counts as an unexcused absence. Note that if you have a legitimate reason to miss class (such as job, graduate school, or medical school interviews), you must arrange this ahead of time to be excused from class. Three unexcused absences will result in the reduction of your grade by one step (e.g., from 4.0 to 3.5), with additional absences reducing your grade further at the discretion of the course instructor.

Code of Conduct

All conduct should serve the singular goal of sustaining a friendly, supportive, and fun environment where we can do our best work and have a great time doing it.

  • Do work that you’re proud of, from the smallest piece of code to the entire project.
  • Be supportive of your classmates; respect each other's strengths, weaknesses, differences, and beliefs.
  • Communicate openly & respectfully with everyone in the class.
  • Ask for help; at the same time, respect and appreciate others' time and effort.

Respectful and responsible behavior is expected at all times, which includes not interrupting other students, turning your cell phone off, refraining from non-course-related use of electronic devices, and not using offensive or demeaning language in our discussions. Flagrant or repeated violations of this expectation may result in ejection from the classroom, grade-related penalties, and/or involvement of the university Ombudsperson.

I am unequivocally dedicated to providing a harassment-free experience for everyone, regardless of gender, gender identity and expression, age, sexual orientation, disability, physical appearance, body size, race, or religion (or lack thereof). We will not tolerate harassment of colleagues in any form. Behaviors that could be considered discriminatory or harassing, or unwanted sexual attention, will not be tolerated and will be immediately reported to the appropriate MSU office (which may include the MSU Police Department).

Academic honesty

Intellectual integrity is the foundation of the scientific enterprise. In all instances, you must do your own work and give proper credit to all sources that you use in your papers and oral presentations – any instance of submitting another person's work, ideas, or wording as your own counts as plagiarism. This includes failing to cite any direct quotations in your essays, research paper, class debate, or written presentation. The MSU College of Natural Science adheres to the policies of academic honesty as specified in the General Student Regulations 1.0, Protection of Scholarship and Grades, and in the all-University statement on Integrity of Scholarship and Grades, which are included in Spartan Life: Student Handbook and Resource Guide. Students who plagiarize will receive a 0.0 in the course. In addition, University policy requires that any cheating offense, regardless of the magnitude of the infraction or punishment decided upon by the professor, be reported immediately to the dean of the student's college.

It is important to note that plagiarism in the context of this course includes, but is not limited to, directly copying another student's solutions to in-class or homework problems; copying materials from online sources, textbooks, or other reference materials without citing those references in your source code or documentation; or having somebody else do your pre-class work, in-class work, or homework on your behalf. Any work that is done in collaboration with other students should state this explicitly, and have their names as well as yours listed clearly.

More broadly, we ask that students adhere to the Spartan Code of Honor academic pledge, as written by the Associated Students of Michigan State University (ASMSU): "As a Spartan, I will strive to uphold values of the highest ethical standard. I will practice honesty in my work, foster honesty in my peers, and take pride in knowing that honor is worth more than grades. I will carry these values beyond my time as a student at Michigan State University, continuing the endeavor to build personal integrity in all that I do."

Accommodations

If you have a university-documented learning difficulty or require other accommodations, please provide me with your VISA as soon as possible and speak with me about how I can assist you in your learning. If you do not have a VISA but have been documented with a learning difficulty or other problems for which you may still require accommodation, please contact MSU’s Resource Center for People with Disabilities (355-9642) in order to acquire current documentation.

[ Top ]