30 July 2020
Machine-Learning for the Social & Behavioural Sciences
[Due to the Covid-19 pandemic, this programme has been cancelled for summer 2020. If you are interested, please visit the programme website and send us an email. We will keep you posted about online alternatives, other developments, and summer 2021.]
How can we use Twitter data to understand how societies become polarized? How might we extract emotion dynamics from individual time series data? To answer research questions such as these, the social and behavioural sciences increasingly rely on large and diverse data sets. This course gives students the practical skills needed to take advantage of these novel data sources through an introduction to data wrangling, data visualization and machine learning models in the programming language Python.
This three-week programme will give students a solid introduction to data analysis using Python, an easy-to-learn and widely used programming language. Students learn about a new method or skill in the morning lecture and then immediately apply it in a practical session in the afternoon, so that they acquire hands-on knowledge they can apply to their own research questions. The practical sessions consist of data analysis problems based on real data from the social and behavioural sciences, such as the European Social Survey, social media data (Twitter and LinkedIn), questionnaire data, and time series collected with the Experience Sampling Method (ESM).
Javier Garcia-Bernardo & Jonas Haslbeck
For current university students (Bachelors and Masters) who want to acquire skills in machine learning methods and have an interest in the social and behavioural sciences. PhD candidates who wish to learn about machine learning are also welcome to apply. Participants should have taken at least one statistics course.
After a short introduction to programming with Python, we first focus on data preparation, including cleaning data, merging data and handling missing values. Next, students will learn how to explore data with descriptive statistics and meaningful data visualizations. The largest part of the course will focus on learning and applying statistical and machine learning methods. Starting with linear regression and logistic regression, we carefully introduce more advanced prediction models such as random forests and support vector machines. In addition to prediction models, we also cover clustering and dimensionality-reduction methods such as k-means, hierarchical clustering and t-SNE. While we cover some advanced methods, the focus of the course is on providing a conceptual understanding of the methods and ensuring that students know how to apply them in practice.
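To give a flavour of the first steps described above (merging data, handling missing values, then fitting a linear regression), here is a minimal sketch. It is not taken from the course materials: the toy data, the column names (trust, education_years) and the helper functions are hypothetical, and it uses only the Python standard library, whereas the course may well use dedicated data analysis libraries.

```python
from statistics import mean

# Hypothetical toy data: survey answers and demographics keyed by respondent id.
survey = [
    {"id": 1, "trust": 6.0},
    {"id": 2, "trust": None},  # missing answer
    {"id": 3, "trust": 8.0},
    {"id": 4, "trust": 4.0},
]
demographics = [
    {"id": 1, "education_years": 12},
    {"id": 2, "education_years": 14},
    {"id": 3, "education_years": 16},
    {"id": 4, "education_years": 10},
    {"id": 5, "education_years": 18},  # no survey row; dropped by the inner join
]


def inner_join(left, right, key):
    """Merge two lists of records on a shared key (inner join)."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


def impute_mean(rows, column):
    """Replace missing values (None) in one column with the column mean."""
    fill = mean(r[column] for r in rows if r[column] is not None)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Merge, fill the missing value, then regress trust on years of education.
merged = impute_mean(inner_join(survey, demographics, "id"), "trust")
intercept, slope = fit_line(
    [r["education_years"] for r in merged],
    [r["trust"] for r in merged],
)
print(f"trust ~ {intercept:.2f} + {slope:.2f} * education_years")
```

Mean imputation is used here only because it is the simplest option to show; whether it is appropriate depends on why the values are missing.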
By the end of the programme, students will be able to leverage large data sets to answer research questions by applying machine learning methodology. Using Python, participants will be able to clean and combine data sets, create meaningful and beautiful visualizations, and carry out statistical analyses and draw conclusions from them. Upon completion, participants will understand when and how to use machine learning tools in the social and behavioural sciences.
EUR 1600: Tuition fee includes class excursions, all course materials, welcome and farewell events and a public transportation card.
EUR 500: Optional housing (price subject to change)
Please check our website for several scholarship opportunities.