Lugano, Switzerland

Content Analysis and Natural Language Processing

when 19 August 2024 - 23 August 2024
language English
duration 1 week
fee CHF 700

Workshop contents and objectives

The aim of this workshop is to give participants both a practical, hands-on and a theoretical understanding of new methods in content analysis made possible by applying digital technology to text corpora.

This approach scales from words to documents to large text corpora.

Some of the issues this approach addresses include the following:

Understanding the speech of political leaders: Which U.S. president is viewed most negatively? Does political speech on Twitter incite violence?
Detecting historical changes in happiness: Which nations are happiest, and how has their happiness changed over time? Does national happiness correlate with GDP, longevity, democratisation, etc.?
Predicting views of brands: What does it mean to be a luxury brand? What associations do people have with different products?
Using language to predict personality or changes across an individual’s lifespan: How did the writing of Darwin, Mozart, and Van Gogh change across their lifespan?

The course will begin by providing participants with an understanding of what natural language processing offers content analysis. Automation can allow interesting content questions to be answered in very short periods of time (sometimes minutes), saving weeks or months of research time. It can also introduce new questions that lead to innovative research programmes.

Specific cases will be used to show how natural language processing can be applied to theoretical questions in the social sciences. Each day will present published research and then demonstrate how the research was done, providing code and data.

On completion of the course, participants will be able to recognise and implement many common approaches to content analysis using natural language processing and take the first steps towards formulating and addressing problems of their own in social data science or the digital humanities. Participants will also be provided with detailed information about how to follow up and learn more about their particular area of interest.



Workshop design

The course will alternate between lectures and interactive programming using pre-written code in R.



Detailed lecture plan (daily schedule)

Day 1.
Intro to content analysis and natural language processing, off-the-shelf tools and simplicity

Day 2.
Word features (document sentiment and feature analysis)

Day 3.
Word and document semantics and similarity

Day 4.
Topics (what are my documents about and how can I organize them?)

Day 5.
Advanced topics and short presentations from students

Prerequisites

Students taking this workshop should have some experience with R and RStudio. A number of free or inexpensive online providers (e.g., DataCamp) offer introductory R courses that are well worth the time and are sufficient preparation for this course. A general introductory book on statistics in R will also work (e.g., Dalgaard, P. 2008. Introductory statistics with R). The course will primarily use R, but I will provide all the code, so the course can also be a way to improve your R skills.

Course leader

Thomas Hills is currently the Director of the Behavioural and Data Science MSc and the Bridges Doctoral Training Centre in Mathematical and Social Sciences (University of Warwick).

Target group

doctoral researchers, early career researchers, experienced researchers

Credits info

The Summer School cannot grant credits. We only deliver a Certificate of Participation, i.e. we certify your attendance.

If you are considering using Summer School workshops to obtain credits (ECTS), you will have to check with your home institution (contact the person/institute responsible for your degree) to find out whether they recognise the Summer School and how many credits can be earned from a workshop/course with roughly 35 hours of teaching, no graded work, and no exams.

Make sure to investigate this matter before registering if this is important to you.

Fee info

CHF 700: Reduced fee: 700 Swiss Francs per weekly workshop for students (requires proof of student status).*

Reduced Fee

To qualify for the reduced fee, you are required to send a copy of an official document that certifies your current student status or a letter from your supervisor stating your current position as a doctoral or postdoctoral researcher. Send this letter/document by e-mail to methodssummerschool@usi.ch.

CHF 1100: Normal fee: 1100 Swiss Francs per weekly workshop for all others.*

*These fees also include participation in one of the preliminary workshops (a two- or three-day workshop preceding the Summer School). The registration fee for a preliminary workshop booked on its own is CHF 200.