Utrecht, Netherlands

Data Science: Multiple Imputation in Practice

when 13 July 2021 - 16 July 2021
language English
duration 1 week
credits 1.5 EC
fee EUR 600

Due to the COVID-19 outbreak, this course has been postponed to 2021.

This 4-day course teaches you the basics of solving your own missing data problems appropriately. Participants will learn how to form imputation models, how to combine data sets, how to model non-response, how to use diagnostics to inspect the imputed values, how to obtain valid inference on incomplete data, and how to avoid many of the pitfalls associated with real-life missing data problems.

Most researchers in the social and behavioural sciences have encountered the problem of missing data. It seriously complicates the statistical analysis of data, and simply ignoring it is not a good strategy. A general and statistically valid technique for analyzing incomplete data is multiple imputation, which is rapidly becoming the standard in social and behavioural science research.

This course will explain a modern and flexible imputation technique that is able to preserve important features in the data. The aim of this course is to enhance participants’ knowledge of imputation methodology and to provide a flexible solution to their incomplete data problems using R. The course will explain the principles of missing data theory, outline a step-by-step approach toward creating high-quality imputations, and provide guidelines on how the results can be reported. The course will use the authors' MICE package in R.

The lectures will follow the book “Flexible Imputation of Missing Data” by Stef van Buuren (2nd edition, Chapman & Hall, 2018). The book can be read online for free at https://stefvanbuuren.name/fimd/.

Course leader

Dr. Gerko Vink

Target group

This course is relevant for applied researchers or statistical researchers who would like to get acquainted with the theory and practice of multiple imputation. Participants should have a basic understanding of statistical techniques (such as analysis of variance and (non)linear regression) and the concept of statistical inference. This course is suitable for students at Master level, Advanced Master level and PhD level. A maximum of 50 participants will be admitted to this course. Please note that selection for this course is on a first-come, first-served basis.

Course aim

The aim of this course is to enhance participants’ knowledge of imputation methodology and to provide a flexible solution to their incomplete data problems using R.

Fee info

EUR 600: Course + course materials
EUR 200: Housing fee (optional)

Register for this course
on course website