Amsterdam, Netherlands

Machine-Learning for the Social & Behavioural Sciences

when 3 July 2022 - 21 July 2022
language English
duration 3 weeks
credits 6 EC
fee EUR 1650

How can we use Twitter data to understand how societies become polarized? How might we extract emotion dynamics from individual time series data? To answer research questions such as these, the social and behavioural sciences increasingly rely on large and diverse data sets. This course gives students the practical skills needed to take advantage of these novel data sources, with an introduction to data wrangling, data visualization and machine learning models in the programming language Python.

This three-week programme gives students a solid introduction to data analysis using Python, a widely used programming language that is easy to learn. Students learn about a new method or skill in the morning lecture and apply it straight away in a practical session in the afternoon, so that they acquire hands-on knowledge they can apply to their own research questions. The practical sessions consist of data analysis problems based on real data from the social and behavioural sciences, such as the European Social Survey, social media data (Twitter and LinkedIn), questionnaire data, and time series collected with the Experience Sampling Method (ESM).

Course leader

Dr. Javier Garcia-Bernardo & Dr. Jonas Haslbeck

Target group

This course is intended for current university students (Bachelor's and Master's) who want to acquire skills in machine learning methods and have an interest in the social and behavioural sciences. PhD candidates who wish to learn about machine learning are also welcome to apply. Participants should have taken at least one class in statistics.

Course aim

After a short introduction to programming with Python, we first focus on data preparation, including cleaning data, merging data and handling missing values. Next, students learn how to explore data with descriptive statistics and meaningful data visualizations. The largest part of the course focuses on learning and applying statistical and machine learning methods. Starting with linear regression and logistic regression, we carefully introduce more advanced prediction models such as random forests and support vector machines. In addition to prediction models, we cover clustering and related unsupervised methods such as k-means, hierarchical clustering and t-SNE. While we cover some advanced methods, the focus of the course is on providing a conceptual understanding of the methods and ensuring that students know how to apply them in practice.
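
To give a flavour of the workflow described above (data preparation, descriptive statistics, then a first regression model), here is a minimal sketch in Python. It is not taken from the course materials. The survey rows, the variable names and the choice to use only the Python standard library are assumptions made for illustration, and the course itself may well use dedicated data analysis libraries instead.

```python
from statistics import mean

# Hypothetical survey rows: (respondent_id, age, trust_score).
# None marks a missing value. All data here is invented for illustration.
survey = [(1, 23, 4.0), (2, 35, None), (3, 41, 6.0), (4, 29, 5.0), (5, 52, 8.0)]

# Hypothetical second table keyed by respondent_id: weekly hours on social media.
usage = {1: 12.0, 2: 9.0, 3: 8.0, 4: 10.0, 5: 4.0}

# Data preparation: drop rows with missing values, then merge on respondent_id.
clean = [row for row in survey if None not in row]
merged = [(rid, age, trust, usage[rid]) for rid, age, trust in clean if rid in usage]

# Descriptive statistics for the two variables of interest.
hours = [row[3] for row in merged]
trust = [row[2] for row in merged]
mean_hours = mean(hours)
mean_trust = mean(trust)

# Simple linear regression (ordinary least squares): trust = intercept + slope * hours.
sxy = sum((x - mean_hours) * (y - mean_trust) for x, y in zip(hours, trust))
sxx = sum((x - mean_hours) ** 2 for x in hours)
slope = sxy / sxx
intercept = mean_trust - slope * mean_hours


def predict(weekly_hours):
    """Predict the trust score for a given number of weekly hours."""
    return intercept + slope * weekly_hours


print(f"rows after cleaning and merging: {len(merged)}")
print(f"slope: {slope:.2f}, intercept: {intercept:.2f}")
```

The same three steps (clean and merge, describe, model) carry over to the more advanced prediction and clustering methods listed above. Only the modelling step changes.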

By the end of the programme, students will be able to use large datasets to answer research questions by applying machine learning methodology. Using Python, participants will be able to clean and combine datasets, create meaningful and attractive visualizations, and carry out statistical analyses and draw conclusions from them. Upon completion, participants will understand when and how to use machine learning tools in the social and behavioural sciences.

Fee info

EUR 1650: Tuition fee includes class excursions, all course materials, welcome and farewell events and a public transportation card.
EUR 650: Approximate housing fee for the duration of the course. Arranging housing through our office is optional.


Please check our website for several scholarship opportunities.

Register for this course
on course website