Measuring Personal Attacks in Parliamentary Debates
Comparative Politics
Parliaments
Quantitative
Big Data
Abstract
Understanding the behavior of political parties, such as who attacks whom and when, is essential for grasping the dynamics of political discourse over time, including political conflict and polarization. Although current research has shed light on group-based rhetoric and relationships in political communication, many studies have been hindered by their dependence on demanding (semi-)manual content analysis with small sets of labeled data, or by their primarily English-centric language coverage. This limitation has particularly affected comparative research across languages and political contexts: when comparing parliaments across countries, an efficient multilingual approach has so far been lacking.
We therefore address three shortcomings in the analysis of actor-centric rhetoric in text. First, current approaches often lack automation, resulting in a labor-intensive process that undermines efficiency. Second, they exhibit limited multilingual capabilities, restricting their effectiveness in diverse linguistic contexts. Third, they typically require extensive computational resources to analyze large numbers of speeches.
To tackle these challenges, we introduce a new method for automatically identifying mentions of political actors in debates, together with the polarity of those mentions. Our method minimizes the need for manual annotation in target languages and reduces the computational resources required when employing large language models (LLMs). It surpasses existing techniques in three significant respects. First, we develop a new hand-coded multilingual dataset (English, German, and Spanish) to evaluate our approach, offering novel annotations for two tasks: (1) analyzing actor-centric utterances and (2) identifying polarization in debates. Our coding scheme distinguishes between categories of political actors in parliaments (governments, opposition, parties, and politicians), and we evaluate the polarization of these actors over time. Second, we leverage training data from one language to adapt an open-source LLM (LLaMA) to our two actor-based rhetoric tasks, employing LoRA adapter training to reduce computational needs. Third, we use this tailored model to validate our cross-lingual approach on our labeled German and Spanish datasets. This validation demonstrates that the task-adapted LLM can extend our training data beyond the language of the original dataset, thanks to a language-agnostic task representation it has learned. Furthermore, we expand our training data to 10,000 samples per language and use it to train a compact student RoBERTa model that performs the tasks in the target languages. This lightweight model makes computational analysis of political discourse more accessible to researchers. We illustrate the effectiveness of these approaches by comparing them to a range of baselines.
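The LoRA adapter training mentioned above rests on a simple idea: instead of updating a full pretrained weight matrix, one trains only a small low-rank pair of matrices whose product is added to the frozen weights. The following NumPy sketch is our own minimal illustration of that idea under toy dimensions; it is not the paper's actual implementation, and all names and sizes are hypothetical.

```python
import numpy as np

# Minimal illustration of the LoRA idea: freeze W, train only the
# low-rank factors A (r x d_in) and B (d_out x r). The effective
# weight is W + (alpha / r) * B @ A, so only r * (d_in + d_out)
# parameters are trainable instead of d_in * d_out.
rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 64, 8, 16

W = rng.standard_normal((d_out, d_in))    # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01 # small random init
B = np.zeros((d_out, r))                  # B starts at zero: delta is 0

def forward(x, W, A, B):
    """Adapted layer: base output plus scaled low-rank update."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B = 0 the adapted layer is exactly the frozen base layer.
assert np.allclose(forward(x, W, A, B), W @ x)

trainable = A.size + B.size   # 1024 parameters
total = W.size                # 4096 parameters
print(f"trainable fraction: {trainable / total:.3f}")
```

At these toy dimensions the trainable fraction is 0.25; for a billion-parameter LLM the same construction yields well under one percent of the total parameters, which is what makes adapter training computationally cheap.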
This comparison evaluates our method alongside state-of-the-art zero- and few-shot prompting without task-adapted LLMs, primarily using open-source models, with GPT-4o as a black-box comparison.
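The zero- and few-shot baselines rely purely on prompting rather than task adaptation. A hypothetical few-shot prompt template for the polarity task might look as follows; the label set, instructions, and example sentences are illustrative only and not drawn from the paper's codebook.

```python
# Hypothetical few-shot prompt for actor-mention polarity classification.
# Labels and example sentences are invented for illustration.
FEW_SHOT_EXAMPLES = [
    ("The minister's plan is a disgrace to this house.", "negative"),
    ("I thank the honourable member for her constructive proposal.", "positive"),
]

def build_prompt(speech: str) -> str:
    """Assemble an instruction, worked examples, and the query sentence."""
    lines = [
        "Classify the polarity of the mention of a political actor",
        "in the sentence. Answer with one word: positive, negative, or neutral.",
        "",
    ]
    for text, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Sentence: {text}\nPolarity: {label}\n")
    lines.append(f"Sentence: {speech}\nPolarity:")
    return "\n".join(lines)

prompt = build_prompt("The opposition leader has once again misled parliament.")
print(prompt)
```

The prompt string would then be sent to the LLM, whose next-token completion is read off as the predicted label; no weights are updated, which is exactly what distinguishes these baselines from the adapter-trained model.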
Overall, our approach shows how advanced and efficient NLP can address key questions in comparative politics while ensuring methodological rigor and research accessibility. By automating actor-reference identification and polarization analysis, we enable larger-scale comparisons of how political actors engage with one another across languages and national contexts.