Reproducible Quantitative Methods: Data analysis workflow using R

Reproducibility and open scientific practices are increasingly being requested or required of scientists and researchers, but training on these practices has not kept pace. This course, offered by the Danish Diabetes Academy, intends to help bridge that gap.

Course Syllabus

Reproducibility and open scientific practices are increasingly demanded of scientists and researchers. Training on how to apply these practices in data analysis is still limited and has not kept up with demand. This course is aimed at early career researchers conducting quantitative analyses (ranging from lab-based research to epidemiology). By the end of the course, students will have:

  1. An understanding of why an open and reproducible data workflow is important.
  2. Practical experience in setting up and carrying out an open and reproducible data analysis workflow.
  3. Knowledge of how to continue learning methods and applications in this field.

Students will develop proficiency in the R statistical computing language and improve their data and code literacy. Throughout the course we will focus on a general quantitative analytical workflow, using R and other modern tools. The course places particular emphasis on research in diabetes and metabolism: it is taught by instructors working in this field and uses relevant examples where possible. The course will not teach statistical techniques, as these are already covered in university curricula.

Prerequisites and installation instructions

No experience in data analysis or programming is assumed or required. However, there are a few prerequisites to complete before attending the workshop.

  1. Install the latest version of R
  2. Install the latest version of RStudio
  3. Install the packages listed in the Course Materials
  4. Install Git
  5. Read or scan through Chapter 1 of the online book “R for Data Science”
  6. Read and abide by the Code of Conduct

Instructors and helpers

Course Schedule

The workshop is structured as a series of participatory live-coding sessions (instructor and learner coding together) interspersed with hands-on exercises, using either a practice dataset or the participants’ own datasets. Some lectures will be given, mainly at the start and end of the workshop.

Date and time | Session topic | Type | Instructor

March 4
9:30-10:00 | Arrival; coffee and snacks
10:00-10:30 | Introduction to the course, to reproducibility, and to open science | Lecture | Luke
10:30-12:30 | Project management and best practices | Code-along | Luke
12:30-13:15 | Lunch
13:15-15:00 | Data management, wrangling, and best practices | Code-along | Anna
15:00-15:45 | Science in the era of (ir)reproducibility | Lecture | Daniel
15:45-16:00 | Coffee break
16:00-16:45 | Collaboration and teamwork in research | Lecture | Daniel
16:45-17:00 | Describe assignment | | Luke
17:00-17:40 | Form groups and exercises | Group work
17:40-18:30 | Free time
18:30-20:00 | Dinner

March 5
7:00-7:30 | Run / swim (optional)
7:30-8:30 | Breakfast
8:30-9:00 | Review of last day’s topics | Lecture
9:00-9:45 | Finding and obtaining open datasets | Lecture | Daniel
9:45-11:45 | Version control and collaborative practices | Code-along | Luke
11:45-12:15 | Group hands-on practical work | Group work
12:15-13:00 | Lunch
13:00-15:15 | Data visualization and best practices | Code-along | Luke
15:15-15:30 | Wrap up

March 18
9:30-10:00 | Arrival; coffee and snacks
10:00-10:30 | Review of last session’s topics | Lecture
10:30-12:30 | Creating reproducible documents | Code-along | Santiago
12:30-13:15 | Lunch
13:15-15:15 | Efficiency in data analysis and best practices | Code-along | Luke
15:15-15:45 | Coffee break
15:45-17:45 | Group hands-on practical work | Group work
17:45-18:30 | Free time
18:30-20:00 | Dinner

March 19
7:00-7:30 | Run / swim (optional)
7:30-8:30 | Breakfast
8:30-9:30 | Review of last day’s topics | Lecture
9:30-10:30 | Data analysis in the era of reproducibility and open science | Lecture | Daniel/Luke
10:30-12:15 | Hands-on practical and coding exercises | Group work
12:15-13:00 | Lunch
13:00-13:30 | Publishing your project’s output (code and paper) | Lecture | Luke
13:30-14:30 | Presentations of group work | Group work
14:30-15:15 | Discussion of assignments
15:15-15:30 | Closing remarks

Contact

Sponsors

Aarhus University

Danish Diabetes Academy

  • Milling Hotel Park, Viaduktvej 28, 5500 Middelfart, Denmark