
Political Science & Economics Summer Course

Data Scraping and Management for Social Scientists with R

When: 15 June - 19 June 2026
School: Global School in Empirical Research Methods
Institution: University of St. Gallen
City: St. Gallen
Country: Switzerland
Language: English
Credits: 4 EC
Fee: 1100 CHF

About

Online platforms such as Yelp, Twitter, Amazon, or Instagram are large-scale, rich and relevant sources of data. Researchers in the social sciences increasingly tap into these data for field evidence when studying various phenomena.

In this course, you will learn how to find, acquire, store, and manage data from such sources and prepare them for follow-up statistical analysis for your own research.

After a short introduction to the relevance of data science skills for the social sciences, we will review R as a programming language and its basic data formats. We will then use R to program simple scrapers that systematically extract data from websites, using the packages rvest, httr, and RSelenium, among others. Along the way, you will learn how to read HTML, CSS, JSON, and XML, to use regular expressions, and to handle string, text, and image data. To store the data, we will look into relational databases, (My)SQL, and related R packages. Many websites, such as Twitter and Yelp, offer convenient application programming interfaces (APIs) that facilitate the extraction of data, and we will look into accessing them from R. Finally, we will highlight some options for feature extraction from images and text, which allow us to augment the collected data with meaningful variables for analysis.

At the end of this course, students should be able to identify valuable online data sources, write basic scrapers, and prepare the collected data for statistical analysis as part of their own research projects.

Throughout the course, students will work on a data-scraping project related to their theses. They will present this project on the final day of the course.

Course leader

Reto Hofstetter

Target group

Master | PhD | Postdoc | Professionals

Fee info

1100 CHF: Master | PhD

2000 CHF: Postdoc | Professionals

Other relevant courses

Strasbourg, France

European Summer School 2026 – The European Identity: Past, Present and Future

When: 15 June - 26 June 2026
Credits: 0 EC

Online

Sulmona, Italy

Linear Panel Data Models in Stata

When: 23 April - 24 April 2026
Credits: 0 EC

Copenhagen, Denmark

Machine Learning for Predictive Analytics in Business

When: 22 June - 31 July 2026
Credits: 7.5 EC
