Social Sciences
When:
23 September - 27 September 2024
School:
Institution:
In cooperation with University of Cologne
City:
Country:
Language:
English
Credits:
2.0 EC
Fee:
550 EUR
Basic “bag-of-words” methods of text analysis that rely on counting words or n-grams are limited in their ability to account for the complexity of natural language. This has implications for our ability to apply these approaches to measure social science concepts in textual data. Deep learning methods for text embedding and neural language modeling help overcome the limitations of bag-of-words text analysis approaches, and thus are an essential addition to the toolkit of computational social science researchers.
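The word-order limitation described above can be illustrated with a minimal sketch. It uses only the Python standard library; the helper `bag_of_words` and the two example sentences are assumptions made for illustration and are not taken from the course materials.

```python
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text, split on whitespace, and count each token."""
    return Counter(text.lower().split())


# Two sentences with opposite meanings but identical vocabulary.
doc_a = "the dog bites the man"
doc_b = "the man bites the dog"

bow_a = bag_of_words(doc_a)
bow_b = bag_of_words(doc_b)

# Word counts ignore word order, so both sentences receive
# exactly the same bag-of-words representation.
same_representation = bow_a == bow_b

print(bow_a)
print(same_representation)
```

Because the two count vectors are equal, any model built only on these counts cannot tell who bit whom. This is the kind of information that embedding and language-model approaches aim to retain.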
This course introduces social scientists to advanced, deep learning-based text analysis methods such as word embeddings and large neural language models (LLMs). Participants will learn about the conceptual motivation and methodological foundations of these methods and gain extensive practical experience applying them in social science research using the Python programming language. Besides conveying a solid conceptual understanding and hands-on experience, the course places strong emphasis on introducing and discussing potential social science use cases as well as ethical considerations.
We will start by introducing classical word embedding models like GloVe and word2vec, and participants will learn how to use word embeddings in social science research. We will then introduce state-of-the-art Transformer models like BERT and GPT. We will first cover their methodological foundations: the attention mechanism, masked and autoregressive language modeling, and the neural network architectures that characterize BERT and GPT. Participants will then apply these models in exercises covering various supervised learning tasks (single- and multilabel sentence classification, token classification, and pairwise comparison) as well as topic modeling with BERTopic. Finally, we will introduce strategies and techniques for prompting pre-trained generative language models to code texts based on no or only a few labelled examples (i.e., zero-shot prompting and few-shot in-context learning).
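The difference between zero-shot prompting and few-shot in-context learning can be sketched as plain prompt construction. The function `build_prompt`, the label set, and the example statements below are hypothetical and chosen only for illustration. The sketch stops at building the prompt text and does not call any particular language model.

```python
def build_prompt(instruction, labels, text, examples=()):
    """Assemble a text-coding prompt.

    With no examples the prompt is zero-shot; with a few labelled
    examples it becomes a few-shot (in-context learning) prompt.
    """
    lines = [instruction, "Allowed labels: " + ", ".join(labels), ""]
    for example_text, example_label in examples:
        lines.append(f"Text: {example_text}")
        lines.append(f"Label: {example_label}")
        lines.append("")
    # The text to be coded comes last, with the label left open
    # for the model to complete.
    lines.append(f"Text: {text}")
    lines.append("Label:")
    return "\n".join(lines)


# Hypothetical coding task and label set.
instruction = "Classify the sentiment of the statement."
labels = ["positive", "negative", "neutral"]
target = "The reform has improved access to health care."

zero_shot = build_prompt(instruction, labels, target)

few_shot = build_prompt(
    instruction,
    labels,
    target,
    examples=[
        ("This policy has been a complete failure.", "negative"),
        ("The committee will meet on Tuesday.", "neutral"),
    ],
)

print(zero_shot)
print()
print(few_shot)
```

The resulting string would then be passed to a generative model of the researcher's choice. The only difference between the two settings is whether labelled examples precede the text to be coded.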
This is an advanced-level course. Participants should have prior knowledge of basic text analysis techniques. Specifically, they should have experience with standard bag-of-words pre-processing techniques and text representation approaches, such as word count-based document-feature matrices. Those looking for a more introductory-level course should consider taking “Introduction to Machine Learning for Text Analysis with Python” (16-20 September). Moreover, participants should have experience with programming in Python. The instructors cannot provide an introduction to or recap of basics in Python programming in the course due to limited time.
Lisa Maria Lechner, University of Innsbruck, Austria
Hauke Licht, University of Cologne, Germany
You will find the course useful if:
- you have a background in the social sciences or humanities (e.g., communication science, economics, political science, sociology, or related fields)
- you have a solid understanding of basic text analysis methods
- you want to advance your knowledge, skills, and practical experience
- you want to get up to speed with applying state-of-the-art NLP methods to text analysis problems in social science research.
By the end of the course you will:
- know the methodological foundations of text embedding methods, transfer learning, Transformers, and large language models (LLMs)
- be able to apply these methods to analyze social scientific text data
- be able to reflect critically about the application of the techniques in social science research, including relevant ethical considerations.
Fee
550 EUR, Student/PhD student rate.
825 EUR, Academic/non-profit rate.
The rates include the tuition fee, course materials, the academic program, and coffee/tea breaks.