Utrecht, Netherlands

AI-Aided Systematic Reviewing

when 29 August 2022 - 2 September 2022
language English
duration 1 week
credits 1.5 EC
fee EUR 620

More and more researchers rely on systematic reviews: attempts to synthesize the state of the art in a particular scientific field. However, the scientific output of the world doubles every nine years. In this tsunami of new knowledge, there is not enough time to read everything – resulting in costly, abandoned or error-prone work. Using the latest methods from the field of Artificial Intelligence (AI), you can reduce the number of papers to screen by up to 95%(!). This summer school course introduces you to the open-source software ASReview to help you speed up your systematic review.

Performing a systematic review is a rigorous process that is becoming increasingly labor-intensive as the number of scientific publications to review keeps growing. Nevertheless, systematic reviews are pivotal for scholars, clinicians, policymakers, journalists, and, ultimately, the general public. Developing a search strategy for a systematic review is an iterative process that balances recall, precision and quality: including as many potentially relevant and, ideally, high-quality studies as possible (recall, also called sensitivity), while at the same time limiting the total number of studies to screen (precision). Because rigorous systematic reviews are time-consuming and costly, and the volume of scientific publications, reports, guidelines and other data sources keeps growing, recent advances in natural language processing (NLP), text mining, and machine learning have produced new algorithms that can accurately mimic human effort in systematic review tasks, faster and more cheaply.
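To make the trade-off between recall and precision concrete, here is a minimal sketch in Python. The record identifiers and the two example searches are made up for illustration and are not course material; the function simply compares the set of records a search returns with the set of records that are truly relevant.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for a search result.

    retrieved: identifiers returned by the search query
    relevant:  identifiers of the studies that are truly relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    # Recall: share of all relevant studies that the search found.
    recall = len(hits) / len(relevant) if relevant else 0.0
    # Precision: share of retrieved records that are relevant.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical example: four relevant papers (p1..p4), the rest is noise (x...).
relevant = {"p1", "p2", "p3", "p4"}
narrow_search = {"p1", "p2", "x1"}
broad_search = {"p1", "p2", "p3", "p4", "x1", "x2", "x3", "x4", "x5", "x6"}

narrow = recall_precision(narrow_search, relevant)
broad = recall_precision(broad_search, relevant)
print("narrow search: recall=%.2f precision=%.2f" % narrow)
print("broad search:  recall=%.2f precision=%.2f" % broad)
```

In this toy example the broad search finds every relevant paper but leaves more than three times as many records to screen, which is the kind of workload that screening prioritization is meant to reduce.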

Within this course, every day consists of both lectures and do-it-yourself computer labs. The talks will be provided by a multidisciplinary team of experts from different fields: statistics, systematic reviewing, data science, open science, bibliometrics and transparent software engineering. For the computer sessions, we have a team ready to help you.

On the first day of the course, we compare the classical manual pipeline of performing a systematic review using the PRISMA steps with the AI-aided approach using screening prioritization. We assume participants are familiar with PRISMA. If not, you are requested to read the information on the PRISMA website before the start of the summer school: http://prisma-statement.org/. In the afternoon, we will work with the open-source software ASReview (www.asreview.nl), using example datasets to experience the benefits of active learning. Make sure you have installation rights on your PC!

The second day will be devoted to systematically obtaining the perfect dataset. The basics of searching online databases will be discussed using examples and demos: composing a search query and obtaining the highest-quality data (e.g., complete abstracts). Because of the use of active learning, the size of the dataset can differ from that of a classical systematic review. How does this affect your search? Is there still a need to search multiple databases? How do you process these large datasets? A much larger dataset can be screened with the same effort, for example, the CORD-19 database containing over 500K papers on the coronavirus. Imagine screening such a database in a couple of days instead of a lifetime!

The third day will be devoted to an in-depth explanation of the different feature extraction techniques (TF-IDF, word2vec, SBERT), classifiers (e.g., Naive Bayes, SVM, neural nets), query strategies (certainty, uncertainty, random sampling) and balancing strategies to deal with the scarcity of relevant papers in the dataset. Although this part of the course is technical, we consider it essential to better understand how AI works if you want to use AI-aided tools (and to be able to answer questions from your supervisors, reviewers, peers and friends).
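As a rough illustration of the three query strategies named above, the sketch below picks the next record to show to the reviewer from a dictionary of predicted relevance probabilities. It is an assumption-laden toy in plain Python, not ASReview's implementation: the function names, record identifiers and probabilities are all hypothetical, and in practice the probabilities would come from a classifier trained on the records labeled so far.

```python
import random


def query_certainty(probs):
    """Certainty-based sampling: pick the record most likely to be relevant."""
    return max(probs, key=probs.get)


def query_uncertainty(probs):
    """Uncertainty-based sampling: pick the record the model is least sure about,
    i.e. the one whose predicted probability is closest to 0.5."""
    return min(probs, key=lambda record_id: abs(probs[record_id] - 0.5))


def query_random(probs, rng):
    """Random sampling: ignore the model and pick any unlabeled record."""
    return rng.choice(sorted(probs))


# Hypothetical model output: predicted probability that each unlabeled record is relevant.
probs = {"record_a": 0.91, "record_b": 0.48, "record_c": 0.07}

next_certain = query_certainty(probs)
next_uncertain = query_uncertainty(probs)
next_random = query_random(probs, random.Random(0))

print("certainty:  ", next_certain)
print("uncertainty:", next_uncertain)
print("random:     ", next_random)
```

Certainty-based sampling surfaces likely relevant papers early, which is what screening prioritization relies on, while uncertainty-based and random sampling give the model more varied examples to learn from.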

The fourth day is devoted to Open Science. Although sharing the search query and data is part of the PRISMA checklist, sharing the complete (meta)data underlying a systematic review, including all labeling decisions, is not standard. Therefore, we will discuss a data-sharing protocol, including the importance of persistent identifiers (DOIs), abstract retrieval and trusted repositories. Moreover, it is not enough to make the search query and the metadata FAIR (Findable, Accessible, Interoperable and Reusable). The AI also makes decisions throughout the process, and these should be made FAIR as well. Therefore, all settings of the AI and every iteration of the model have to be stored and made human-readable. We will explain this process and demonstrate how this can be done.
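One simple way to picture storing the settings and every iteration in a human-readable form is a JSON log, sketched below in Python. This is only an illustration under stated assumptions: the field names, setting values and placeholder DOIs are invented for the example and do not describe the file format that ASReview actually uses.

```python
import json


def make_screening_log(settings, decisions):
    """Combine model settings and labeling decisions into one plain dictionary.

    settings:  dict describing how the AI was configured
    decisions: list of (record_id, label) pairs in the order they were made,
               where label 1 means relevant and 0 means irrelevant
    """
    return {
        "settings": settings,
        "iterations": [
            {"iteration": i, "record_id": record_id, "label": label}
            for i, (record_id, label) in enumerate(decisions, start=1)
        ],
    }


# Hypothetical settings and decisions; the DOIs are made-up placeholders.
settings = {
    "feature_extraction": "tfidf",
    "classifier": "naive_bayes",
    "query_strategy": "certainty",
    "random_seed": 42,
}
decisions = [("doi:10.0000/example-1", 1), ("doi:10.0000/example-2", 0)]

log = make_screening_log(settings, decisions)

# Indented, sorted JSON is readable by humans and by other software.
text = json.dumps(log, indent=2, sort_keys=True)
restored = json.loads(text)
print(text)
```

Because the log is plain text keyed by persistent identifiers, it can be deposited in a repository alongside the search query and read back without the original software.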

The fifth day consists of Q&A sessions and consultations.

Course leader

Laura Hofstee

Target group

Participants from various fields — including psychology, education, human development, public health, prevention science, sociology, marketing, business, biology, medicine, political science, and communication — will benefit from the course.

It helps if you have a concrete plan for carrying out a systematic review to start working with your own data immediately.

Course aim

After engaging in the course lectures and discussions, and completing the hands-on practice activities, participants will be able to carry out their own AI-aided systematic review using active learning.

Fee info

EUR 620: Course fee