How can journalists use Large Language Models (LLMs) in a reliable way? In this short online workshop, Joris Veerbeek introduces the basics of extractive AI, using LLMs to extract information from unstructured documents. Drawing on his investigative work for De Groene Amsterdammer, he demonstrates how these methods can support investigative journalism and other research.
With the advent of generative AI, journalism has been flooded with big claims about what these systems can create. But behind the spectacle of bot-written articles and synthetic images lies a quieter question: what can AI pull out of the material that journalists already have?
This workshop takes that angle, focusing on what Derek Willis (University of Maryland) calls ‘extractive AI’: using LLMs not to invent new texts, but to systematically extract information from unstructured documents. Given that generative AI still frequently hallucinates and cannot function as a source of facts, extractive AI offers a more reliable path forward.
Extractive AI takes information already present in the input data (for example, large collections of messy PDFs) and turns it into structured outputs such as spreadsheets that can be analyzed further. Drawing on real investigative reporting from the Dutch weekly De Groene Amsterdammer, this workshop introduces the basics of applying extractive AI to investigative work.
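To give a rough sense of what that looks like in practice, here is a minimal Python sketch. It is not taken from the workshop materials. The field names, the example document, and the `fake_llm` function (a stand-in for a real model call) are all hypothetical. The sketch asks a model for a fixed set of fields as JSON, keeps only values that literally appear in the source text, and writes the result as spreadsheet-ready CSV.

```python
import csv
import io
import json
from typing import Callable

# Hypothetical schema: the columns we want in the final spreadsheet.
FIELDS = ["organisation", "amount", "date"]


def build_prompt(text: str) -> str:
    """Ask for the fields as JSON, using only what the document states."""
    return (
        "Extract the following fields from the document as a JSON object "
        f"with keys {FIELDS}. Use null when a field is not stated.\n\n"
        f"Document:\n{text}"
    )


def extract_row(text: str, call_llm: Callable[[str], str]) -> dict:
    """Run one document through the model and keep only grounded values."""
    raw = json.loads(call_llm(build_prompt(text)))
    row = {}
    for field in FIELDS:
        value = raw.get(field)
        # Keep a value only if it literally occurs in the source text,
        # so anything the model made up is dropped rather than trusted.
        row[field] = value if isinstance(value, str) and value in text else None
    return row


def to_csv(rows: list[dict]) -> str:
    """Serialise the extracted rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; returns a canned JSON answer."""
    return json.dumps(
        {"organisation": "Acme BV", "amount": "12,500", "date": "1 May 2020"}
    )


document = "Invoice from Acme BV for the amount of 12,500 euros."
rows = [extract_row(document, fake_llm)]
print(to_csv(rows))
```

In this example the canned answer includes a date that does not occur in the document, so the check leaves that cell empty instead of passing it on. An exact-match check like this is deliberately strict. A real project would likely need looser matching (for reformatted dates or amounts) and manual spot checks.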
No prior coding experience is required. The hands-on part of the workshop includes a small amount of coding, but nothing advanced. As long as you’re not hesitant to give it a go, you should be able to follow along. The workshop will be in English.
The workshop is open to all. While the focus is on investigative journalism, the methods are relevant to anyone working with textual data and can easily be adapted to other contexts.

Joris Veerbeek is a PhD candidate in the Department of Media and Culture Studies at Utrecht University. He focuses on the application of AI in investigative journalism and, as part of his PhD, works on investigative data journalism projects with the Dutch weekly De Groene Amsterdammer.