Barcelona, Spain
Service Design for Innovation
When:
14 July - 25 July 2026
Credits:
0 EC
Social Sciences
When:
13 July - 17 July 2026
School:
Institution:
Utrecht Summer School
Language:
English
Credits:
1.5 EC
Fee:
1050 EUR
In this course, students will learn how to apply text mining and NLP methods to text data and analyse the data in a pipeline with machine learning and deep learning algorithms. The course has a strongly practical, hands-on focus, and students will gain experience in using text mining on real data from the social sciences, humanities, and healthcare, and in interpreting the results.
Given the rapid rate at which text data are being digitally gathered in many domains of science, there is a growing need for automated tools that can analyze, classify, and interpret these kinds of data. Text mining and NLP techniques can be applied to create a structured representation of text, making its content more accessible for researchers. Applications of text mining are everywhere: social media, web search, advertising, emails, customer service, healthcare, marketing, etc. This course offers an extensive exploration of text mining with Python. The course has a strongly practical, hands-on focus, and students will gain experience in using text mining on real data from, for example, the social sciences and healthcare, and in interpreting the results. Through lectures and practicals, the students will learn the necessary skills to design, implement, and understand their own text mining pipeline. The topics in this course include preprocessing text, text classification, topic modeling, word embedding, deep learning models, large language models, prompting, and responsible text mining.
The course deals with:
Reviewing the fundamental approaches to text mining;
Understanding and applying current methods for analyzing texts;
Defining a text mining pipeline given a practical data science problem;
Implementing all steps in a text mining pipeline: feature extraction, feature selection, model learning, model evaluation;
Understanding and applying state-of-the-art methods in text mining;
Implementing word embedding and advanced deep learning techniques;
Understanding, employing, and prompting large language models with responsible text mining.
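The pipeline steps listed above (feature extraction, feature selection, model learning, model evaluation) can be illustrated with a minimal sketch. This is not course material: the toy corpus is invented, the function names are hypothetical, and a small Naive Bayes classifier written with the Python standard library stands in for whichever models and packages the course actually uses.

```python
import math
import re
from collections import Counter

# Toy labelled corpus, invented purely for illustration.
TRAIN = [
    ("the treatment worked and the patient recovered well", "pos"),
    ("great service and very helpful staff", "pos"),
    ("the patient felt better after the helpful therapy", "pos"),
    ("the treatment failed and the patient got worse", "neg"),
    ("terrible service and rude staff", "neg"),
    ("the therapy was painful and the outcome was worse", "neg"),
]
TEST = [
    ("helpful staff and the patient recovered", "pos"),
    ("rude staff and the treatment failed", "neg"),
]


def tokenize(text):
    """Feature extraction: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def select_features(docs, min_df=1, max_df_ratio=0.8):
    """Feature selection: keep tokens whose document frequency is in range."""
    df = Counter(tok for text, _ in docs for tok in set(tokenize(text)))
    limit = max_df_ratio * len(docs)
    return {tok for tok, n in df.items() if min_df <= n <= limit}


def train_naive_bayes(docs, vocab):
    """Model learning: multinomial Naive Bayes with add-one smoothing."""
    class_counts = Counter(label for _, label in docs)
    token_counts = {label: Counter() for label in class_counts}
    for text, label in docs:
        token_counts[label].update(t for t in tokenize(text) if t in vocab)
    model = {}
    for label, counts in token_counts.items():
        total = sum(counts.values()) + len(vocab)
        model[label] = {
            "prior": math.log(class_counts[label] / len(docs)),
            "loglik": {t: math.log((counts[t] + 1) / total) for t in vocab},
        }
    return model


def predict(model, vocab, text):
    """Return the label with the highest log-probability score."""
    tokens = [t for t in tokenize(text) if t in vocab]
    scores = {
        label: p["prior"] + sum(p["loglik"][t] for t in tokens)
        for label, p in model.items()
    }
    return max(scores, key=scores.get)


def accuracy(model, vocab, docs):
    """Model evaluation: fraction of documents labelled correctly."""
    correct = sum(predict(model, vocab, text) == label for text, label in docs)
    return correct / len(docs)


vocab = select_features(TRAIN)
model = train_naive_bayes(TRAIN, vocab)
print("held-out accuracy:", accuracy(model, vocab, TEST))
```

Each function corresponds to one pipeline step, so a step can be swapped out (for example, a different feature selector or classifier) without touching the others. A realistic pipeline would use far more data and a proper train/test split.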
The course starts by reviewing basic concepts of text mining and moves on to implementing advanced concepts in natural language processing. By the end of the week, participants will have acquired advanced text mining skills with Python.
Participants should have basic knowledge of, and motivation for, scripting and programming in Python.
Participants are requested to bring their own laptop. Software will be available online.
Dr. Ayoub Bagheri
This course works best for learners who are comfortable programming in Python, who want to acquire skills in text mining approaches, and who have a basic knowledge of machine learning.
Participants should also have basic knowledge of, and motivation for, scripting and programming in Python. Participants from computer science and related disciplines, as well as diverse fields such as sociology, psychology, education, medicine, statistics, and beyond, will benefit from the course.
Please note that the selection for this course will be done on a first-come-first-served basis.
There are no restrictions on who can participate beyond the prerequisites; academics, researchers, and professional participants are all welcome to register for the course.
For an overview of all our summer courses offered by the Department of Methodology and Statistics, please click here.
We also offer tailor-made M&S courses and in-house M&S training. If you want to look at the possibilities, please contact Dr. Laurence Frank at pe.dsai@uu.nl.
The course teaches students basic and advanced text mining techniques using Python, applied to a variety of problems in many domains of science. The skills addressed in this course are:
Python environment
Preprocessing text and feature extraction
Python NLP packages: NLTK, Gensim, spaCy and more
Text classification
Sentiment classification
Text clustering
Topic modeling
Word embedding
CBOW vs Skip-gram
Contextual word embedding
Convolutional neural networks
Recurrent neural networks
Attention models
Responsible text mining
Large language models
BERT, GPT and other Transformers from the package
LLMs: pre-training, prompting, and learning from human feedback
Applications
Fee
1050 EUR, Course + course materials
Fee
275 EUR, Housing fee (optional)