ECPR

Mapping Political-Elite Networks in Europe with a Multilingual Joint Entity-Relation Extraction Pipeline: The VALPOP Project

Europe (Central and Eastern)
Civil Society
Democracy
European Politics
Governance
Corruption
Big Data
Kirill Solovev
University of Graz
Jana Lasser
University of Graz


Abstract

VALPOP is an EU-funded project that investigates how networks of political elites and connected organisations shape the creation and distribution of public goods in Europe, with particular attention to populist actors and their allies. Addressing the workshop’s focus on large-scale, cross-national elite-network research, the project develops a multilingual Joint Entity-Relation Extraction (JERE) pipeline that uses large language models and other AI tools to derive temporal political elite networks from unstructured news text. The current implementation covers 13 European countries and languages, using Factiva as the main source of newspaper articles. Methodologically, the pipeline combines open-source components for semantic chunking, multilingual named-entity recognition, hybrid entity linking and hybrid search, and LLM-based relation extraction. Articles are first segmented into semantically coherent units using BGE-M3 embeddings, which improves downstream recognition of political and organisational actors by the multilingual GLiNER-X-Large model. A three-stage entity-linking and retrieval module then canonicalises mentions to Wikidata: exact alias matching via a SQLite index, fuzzy matching with rapidfuzz, and semantic vector search in a Qdrant database constructed from a filtered, Europe-focused Wikidata snapshot. This hybrid design brings together symbolic lookup and dense retrieval, reducing hallucinations while handling multilingual aliases, abbreviations, and minor spelling variation. Relations between canonicalised actors are extracted with a large multilingual instruction-tuned model (Qwen3-Next-80B via vLLM), orchestrated through DSPy. Typed Pydantic outputs enforce a strict schema that links subjects and objects (labels, QIDs, ontology types) to Wikidata property IDs and temporal information (events, intervals, or ongoing relations). 
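The three-stage entity-linking cascade described above (exact alias lookup, then fuzzy matching, then semantic vector search) can be illustrated with a minimal sketch. This is not the project's code: it uses only the Python standard library, so `difflib.SequenceMatcher` stands in for rapidfuzz and a toy character-trigram cosine similarity stands in for dense retrieval in Qdrant. The alias table, the `Q-DEMO-*` identifiers, the function names, and both thresholds are hypothetical placeholders chosen for illustration.

```python
import math
import sqlite3
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical alias table; the Q-DEMO-* identifiers are placeholders, not real QIDs.
ALIASES = [
    ("jane doe", "Q-DEMO-1"),
    ("j. doe", "Q-DEMO-1"),
    ("example energy holding", "Q-DEMO-2"),
    ("eeh", "Q-DEMO-2"),
    ("ministry of infrastructure", "Q-DEMO-3"),
]


def build_index():
    """Create an in-memory SQLite alias index (assumed schema: alias -> qid)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE alias (alias TEXT PRIMARY KEY, qid TEXT)")
    conn.executemany("INSERT INTO alias VALUES (?, ?)", ALIASES)
    return conn


def exact_match(conn, mention):
    """Stage 1: case-insensitive exact alias lookup."""
    row = conn.execute(
        "SELECT qid FROM alias WHERE alias = ?", (mention.lower(),)
    ).fetchone()
    return row[0] if row else None


def fuzzy_match(conn, mention, threshold=0.85):
    """Stage 2: string similarity (difflib here, as a stand-in for rapidfuzz)."""
    best_qid, best_score = None, 0.0
    for alias, qid in conn.execute("SELECT alias, qid FROM alias"):
        score = SequenceMatcher(None, mention.lower(), alias).ratio()
        if score > best_score:
            best_qid, best_score = qid, score
    return best_qid if best_score >= threshold else None


def trigram_vector(text):
    """Toy 'embedding': a bag of character trigrams (stand-in for a dense vector)."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def vector_match(conn, mention, threshold=0.5):
    """Stage 3: nearest neighbour by cosine similarity over the toy vectors."""
    query = trigram_vector(mention)
    best_qid, best_score = None, 0.0
    for alias, qid in conn.execute("SELECT alias, qid FROM alias"):
        score = cosine(query, trigram_vector(alias))
        if score > best_score:
            best_qid, best_score = qid, score
    return best_qid if best_score >= threshold else None


def link(conn, mention):
    """Try the stages in order and report which one resolved the mention."""
    stages = (("exact", exact_match), ("fuzzy", fuzzy_match), ("vector", vector_match))
    for name, stage in stages:
        qid = stage(conn, mention)
        if qid is not None:
            return qid, name
    return None, "unlinked"


conn = build_index()
for mention in ("Jane Doe", "Jane Do", "Infrastructure Ministry", "Completely Unknown"):
    print(mention, "->", link(conn, mention))
```

Ordering the stages from cheapest and most precise to most permissive means that a mention only reaches the vector stage when symbolic lookup has failed, and a mention that clears no threshold stays unlinked rather than being forced onto a candidate.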
A VALPOP-specific ontology, aligned with Wikidata and implemented in SKOS, distinguishes relationship types and sectors relevant for public goods and regulatory governance (for example office-holding, oversight, corporate ownership, advisory roles, and explicit alliances or conflicts). This design allows us to construct cross-national, multiplex, and potentially signed networks of political elites, firms, media organisations, and public institutions. Substantively, the paper demonstrates how these tools can be used to move beyond small inner circles of top leaders and to reconstruct broader networks surrounding key domains of public-good provision, such as welfare, energy, and infrastructure. Using Wikidata QIDs and sitelinks as anchors, we discuss strategies for external validation and linkage to existing elite datasets, as well as the limitations imposed by proprietary text sources. The open and modular implementation addresses the workshop’s concerns about replicability and future-proofing: the architecture separates data, ontology, and models, allowing the replacement of individual components (for example different LLMs) while keeping outputs comparable over time. The contribution is threefold. First, it presents a concrete, open, and multilingual JERE architecture tailored to political-elite research that combines career-type and co-occurrence-based relations. Second, it evaluates trade-offs between open-weight and proprietary LLMs and between different retrieval strategies for identifying political entities and their ties in multilingual news. Third, it outlines how the resulting networks can be used to study elite capture, nepotistic allocation of public goods, and the role of populist actors in transforming public goods into club goods across democratic and hybrid regimes.
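As an illustration of how schema-constrained relation records could feed a multiplex, signed network, the sketch below models one record type and groups edges into one layer per relationship type. It is a simplified assumption, not the VALPOP schema: standard-library dataclasses with manual validation stand in for the Pydantic models, and the field names, the `P-DEMO-*` property identifiers, the `Q-DEMO-*` entity identifiers, and the relationship-type labels are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    label: str
    qid: str            # placeholder identifier standing in for a Wikidata QID
    ontology_type: str  # e.g. "person", "firm", "public_institution"


@dataclass(frozen=True)
class Relation:
    subject: Entity
    obj: Entity
    relation_type: str          # network layer, e.g. "office_holding"
    property_id: str            # placeholder for a Wikidata property ID
    sign: int = 1               # +1 for cooperative ties, -1 for conflict
    start: Optional[str] = None  # ISO date (YYYY-MM-DD); None if unknown
    end: Optional[str] = None    # None if ongoing or unknown

    def __post_init__(self):
        # Manual checks in place of the validation a Pydantic model would provide.
        if self.sign not in (-1, 1):
            raise ValueError("sign must be +1 or -1")
        # ISO dates in the same format compare correctly as strings.
        if self.start and self.end and self.end < self.start:
            raise ValueError("end date precedes start date")


def build_multiplex(relations):
    """Group edges by relation type: {layer: {(subject_qid, object_qid): sign}}.

    A later record for the same pair in the same layer overwrites the earlier one.
    """
    layers = defaultdict(dict)
    for rel in relations:
        layers[rel.relation_type][(rel.subject.qid, rel.obj.qid)] = rel.sign
    return dict(layers)


minister = Entity("Jane Doe", "Q-DEMO-1", "person")
firm = Entity("Example Energy Holding", "Q-DEMO-2", "firm")
ministry = Entity("Ministry of Infrastructure", "Q-DEMO-3", "public_institution")

relations = [
    Relation(minister, ministry, "office_holding", "P-DEMO-A", start="2019-01-01"),
    Relation(minister, firm, "advisory_role", "P-DEMO-B",
             start="2015-03-01", end="2018-12-31"),
    Relation(ministry, firm, "conflict", "P-DEMO-C", sign=-1),
]

network = build_multiplex(relations)
for layer, edges in network.items():
    print(layer, edges)
```

Keeping each relationship type in its own layer preserves the distinction between, for example, office-holding and conflict ties, while the sign on each edge allows alliances and conflicts to be analysed in the same graph.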