Utrecht, Netherlands

Data Science: Statistical Programming with R

When: 8 July 2024 - 12 July 2024
Language: English
Duration: 1 week
Credits: 1.5 EC
Fee: EUR 850

This course offers a thorough introduction to statistical programming with R. Students learn to operate R, build pipelines for data analysis, make high-quality graphics, fit, assess, and interpret a variety of statistical models, and do advanced statistical programming. The statistical theory in this course covers t-testing, regression models for linear, dichotomous, ordinal, and multivariate data, statistical inference, statistical learning, bootstrapping, and Monte Carlo simulation techniques.

R is a popular and powerful platform for data manipulation, visualization, and analysis, and it has a number of advantages over other statistical software packages. A wide community of users contributes to R, resulting in broad coverage of statistical procedures, including many that are not available in any other statistical program. However, R lacks standard GUI menus from which to choose which statistical test to perform or which graph to create. Consequently, R is more challenging to master. This course helps flatten the learning curve for those who wish to begin working with R by introducing statistical programming in R step by step.

In this course we will cover the following topics:

An introduction to the R environment
Basic to advanced programming skills: data generation, manipulation, pipelines, summaries, and plotting
Fitting statistical models: estimation, prediction, and testing
Drawing statistical inference from data
Basic statistical learning techniques
Bootstrapping and Monte Carlo simulation

The course starts at a very basic level and builds up gradually, so no previous experience with R is required. By the end of the week, participants will have mastered advanced programming skills in R.

Participants are requested to bring their own laptop computer. Software will be available online.

R can be installed (for free) from the Comprehensive R Archive Network (CRAN).

RStudio can be installed (for free) from the Posit website.

- The Open-Source Desktop license is fine.

This course is part of a series of five courses in the Summer School Data Science specialisation taught by Utrecht University's Department of Methodology & Statistics. Please see here for more information about the full specialisation. This course can also be taken separately.

Summer School Data Science specialisation:

Data science: Statistical Programming with R (This course)
Data science: Multiple Imputation in Practice (S28)
Data science: Introduction to Text Mining with R (S41)
Data science: Data analysis (S31)
Data science: Applied Text Mining (S42)

Students who complete three of the five courses in the specialisation (including no more than one text mining course) can obtain a certificate.

Course leader

Dr. Laurence E. Frank

Target group

Applied researchers and (master's) students who already use statistical software and would like to learn to use, or improve their use of, the R environment. An understanding of basic statistical theory, such as t-tests, hypothesis testing, and regression, is required. Participants from a variety of fields (including sociology, psychology, education, human development, marketing, business, biology, medicine, political science, and communication sciences) will benefit from this course.

The course admits a maximum of 80 participants. Places are allocated on a first-come, first-served basis.

For an overview of all our summer school courses offered by the Department of Methodology and Statistics please click here.

Fee info

EUR 850: Course + course materials
EUR 250: Housing fee (optional)

Register for this course on the course website.