AI Utrecht Region

How can journalists use Large Language Models (LLMs) in a reliable way? In this short online workshop, Joris Veerbeek introduces the basics of extractive AI: using LLMs to extract information from unstructured documents. Drawing on his investigative work for De Groene Amsterdammer, he demonstrates how these methods can support investigative journalism and other research.

With the advent of generative AI, journalism has been flooded with big claims about what these systems can create. But behind the spectacle of bot-written articles and synthetic images lies a quieter question: what can AI pull out of the material that journalists already have?

This workshop takes that angle, focusing on what Derek Willis (University of Maryland) calls ‘extractive AI’: using LLMs not to invent new texts, but to systematically extract information from unstructured documents. Given that generative AI still frequently hallucinates and cannot function as a source of facts, extractive AI offers a more reliable path forward.

Workshop content

Extractive AI takes information already present in the input data (for example, large collections of messy PDFs) and turns it into structured outputs such as spreadsheets that can be analyzed further. Drawing on real investigative reporting from the Dutch weekly De Groene Amsterdammer, this workshop introduces the basics of applying extractive AI to investigative work.

Practical structure:

First half hour: An informative talk about the advantages of having LLMs return predefined, structured outputs. This approach makes it possible to process large volumes of unstructured materials such as parliamentary records, or to create bots that automatically scroll through TikTok.
Second half hour: A hands-on session demonstrating how to use structured outputs in practice, including querying LLM APIs to extract structured information directly from documents.
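
To give a feel for what "predefined, structured outputs" means in practice, here is a minimal Python sketch. It is not the workshop's own material: the Motion schema, the field names, the helper functions and the fake_llm stand-in are all hypothetical and chosen for illustration. A real version would replace fake_llm with a call to an actual LLM API. The idea shown is that the model is asked for a fixed set of fields, every reply is checked against that schema, and only valid records end up in the spreadsheet.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, fields
from typing import Callable


@dataclass
class Motion:
    """Hypothetical target schema: one spreadsheet row per parliamentary motion."""
    submitter: str
    party: str
    topic: str
    adopted: bool


def build_prompt(document: str) -> str:
    """Ask the model for a JSON object with exactly the schema's keys."""
    keys = ", ".join(f.name for f in fields(Motion))
    return (
        "Extract the following fields from the document and reply with a "
        f"single JSON object with exactly these keys: {keys}.\n"
        "Use only information stated in the document.\n\n"
        f"Document:\n{document}"
    )


def extract(document: str, call_llm: Callable[[str], str]) -> Motion:
    """Query the model and reject any reply that does not fit the schema."""
    raw = json.loads(call_llm(build_prompt(document)))
    expected = {f.name: f.type for f in fields(Motion)}
    if not isinstance(raw, dict) or set(raw) != set(expected):
        raise ValueError("reply does not contain exactly the expected keys")
    for name, expected_type in expected.items():
        if not isinstance(raw[name], expected_type):
            raise ValueError(f"field {name!r} has the wrong type")
    return Motion(**raw)


def to_csv(rows: list[Motion]) -> str:
    """Turn validated records into CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(Motion)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


def fake_llm(prompt: str) -> str:
    """Assumption: stand-in for a real API call; it returns a canned reply."""
    return json.dumps(
        {
            "submitter": "A. Example",
            "party": "Example Party",
            "topic": "housing",
            "adopted": True,
        }
    )


if __name__ == "__main__":
    motion = extract("(text of a parliamentary record)", fake_llm)
    print(to_csv([motion]))
```

Passing the model call in as a function keeps the validation logic independent of any particular provider, and rejecting malformed replies outright means a faulty answer never silently enters the dataset.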

Level

No prior coding experience is required. The hands-on part of the workshop includes a small amount of coding, but nothing advanced. As long as you’re not hesitant to give it a go, you should be able to follow along. The workshop will be in English.

For whom?

The workshop is open to all. While the focus is on investigative journalism, the methods are relevant to anyone working with textual data and can easily be adapted to other contexts.

About

Joris Veerbeek is a PhD candidate in the Department of Media and Culture Studies at Utrecht University. He focuses on the application of AI in investigative journalism and, as part of his PhD, works on investigative data journalism projects with the Dutch weekly De Groene Amsterdammer.



AI and investigative journalism: a workshop on extractive AI

Online, Thu 26/3
