Prologue

This course was designed to meet the need for a modern bio data science course in R as part of the Master of Science (MSc) in Bioinformatics and Systems Biology. The pilot course ran in 2019, and the first fully fledged course was launched in 2020 with ~35 students. Since then, the course has grown to ~150 students.

The course is designed as a semi-flipped classroom, with an emphasis on active learning. This means that during the 4h classes, the first hour will be dedicated to reviewing key points from last week, followed by a brief introduction to the topic of the day and then a break. Afterwards, the students will work hands-on in groups on computational exercises using cloud computing infrastructure. The exercises will rely on relevant bioinformatics data from publicly available databases and gradually build the students' toolbox, with an emphasis on collaborative project work. This exercise part will take up the first 9 labs. Each lab is defined by a set of specified learning objectives (LOs), and it is essential that students continuously make sure that they are on track with these LOs.

The first hour of the 10th lab is dedicated to introducing the project part of the course, and the subsequent 3h are dedicated to a mini symposium on “Application of R for Bio Data Science in Industry”. Here, students will get a chance to gain insight into how the course topics are implemented in industry and get a glimpse of the options available upon completing their education. This hybrid event typically attracts ~250 participants and has featured talks from major national and international pharma/biotech companies, including Novo Nordisk, Lundbeck, Chr. Hansen, and Bristol Myers Squibb.

In the project part of the course, students will form groups of 4-5 based on common interests. Each group will then seek out and select a data set, which will form the foundation for the project work. In the project work, the students will go through the entire data science cycle and produce the code base for a complete bioinformatics project. This entails a complete synthesis of all components of the exercise labs and supports the collaborative aspect. The groups must then condense the project into a presentation, thereby addressing communicative competencies as an essential part of being a modern bio data scientist.