The workshop explores the rapidly advancing field of natural language processing (NLP), or text-as-data analysis, in political research. Its aim is to bring together researchers from diverse subfields of political science, using varied types of text data, and employing different NLP tools. It investigates how NLP is deployed for ever more nuanced and difficult text-analytical tasks and how it helps address important research questions through novel means. The workshop also explores best practices and standards for the use and validation of the most powerful but also largely opaque NLP tools, including LLMs and generative AI, in political research.
The developments in NLP (text-as-data analysis) over the last decade, and especially the last few years, are transforming political science. They enable researchers to analyse textual data with unprecedented nuance, at scale, across languages, at little cost, and within increasingly user-friendly environments (Grimmer et al. 2022). Combined with the expanding availability of relevant, accessible text data sources, this opens the door to new research questions and agendas in political science.
Political research has been quick to seize these opportunities. In comparative politics, international relations, European studies, public administration, and other related fields, researchers deliver important new insights by studying data sources such as parliamentary records, social and news media, legal regulations, and verbatim speeches on an unprecedented scale (e.g. Eckhard et al. 2023; Heidenreich et al. 2019; Martin and McCrain 2019; Rauh 2023; Parizek 2024). A robust methodological literature is rapidly advancing politics-focused NLP research (e.g. Baden et al. 2022; Benoit et al. 2019; Rodriguez and Spirling 2022; Watanabe 2021). We are moving from lexicon-based analysis to the use of language models that are on par with trained human annotators on nuanced coding tasks (Gilardi et al. 2023).
As a result, the number of political scientists engaged with NLP, not least junior scholars, is growing fast. An active exchange oriented around substantive research questions, as well as methodological innovations, is needed. Importantly, the discipline also needs a robust discussion about best practices and standards for the use, and in particular for the validation, of NLP tools.
Baden, C., Pipal, C., Schoonvelde, M., & van der Velden, M. (2022). Three Gaps in Computational Text Analysis Methods for Social Sciences: A Research Agenda. Communication Methods and Measures, 16(1), 1–18.
Benoit, K., Munger, K., & Spirling, A. (2019). Measuring and Explaining Political Sophistication through Textual Complexity. American Journal of Political Science, 63(2), 491–508.
Eckhard, S., Jankauskas, V., Leuschner, E., Burton, I., Kerl, T., & Sevastjanova, R. (2023). The performance of international organizations: a new measure and dataset based on computational text analysis of evaluation reports. The Review of International Organizations, 18(4), 753–776.
Gilardi, F., Alizadeh, M., & Kubli, M. (2023). ChatGPT outperforms crowd workers for text-annotation tasks. Proceedings of the National Academy of Sciences, 120(30).
Grimmer, J., Roberts, M. E., & Stewart, B. M. (2022). Text as data: A new framework for machine learning and the social sciences. Princeton University Press.
Heidenreich, T., Lind, F., Eberl, J.-M., & Boomgaarden, H. G. (2019). Media Framing Dynamics of the ‘European Refugee Crisis’: A Comparative Topic Modelling Approach. Journal of Refugee Studies, 32(1), i172–i182.
Martin, G. J., & McCrain, J. (2019). Local News and National Politics. American Political Science Review, 113(2), 372–384.
Parizek, M. (2024). Less in the West: The tangibility of international organizations and their media visibility around the world. The Review of International Organizations. Online First.
Rodriguez, P., & Spirling, A. (2022). Word Embeddings: What Works, What Doesn’t, and How to Tell the Difference for Applied Research. The Journal of Politics, 84(1), 101–15.
Watanabe, K. (2021). Latent Semantic Scaling: A Semisupervised Text Analysis Technique for New Domains and Languages. Communication Methods and Measures, 15(2), 81–102.
1: What NLP tools do scholars deploy for the most difficult NLP tasks involving political text?
2: How should researchers navigate the choice between NLP tools, e.g. bag-of-words and LLMs?
3: What are best practices for the validation of text-as-data models?
4: What are the emerging standards, if any, for the use of LLMs and generative AI?
5: Can subfields of political science learn from each other about the best uses of different NLP tools?
Title
From Paradise to Paradox: Examining Family Policy in Quebec’s Welfare State
Exploring Nostalgic Rhetoric in UK Parliamentary Speeches Using a LLM
Rules and Revelations: Balancing Interpretability and Flexibility in NLP Approaches to Legislative Design Analysis
Measuring the communication of threat
Conceptualizing Illiberalism Using Natural Language Processing
Second-Order Saliency Theory - A Theoretical Foundation for Ideal Point Estimation from Text
Seeing China from different perspectives: Analyzing Media reporting on the sustainability impacts of Chinese overseas investments.
AI-Driven Text Analysis in the Political Economy of Sustainability: Hybrid Retrieval-Augmented Generation and LLM Multi-Agent Approach
Conflicts in the UN Security Council: Manual coding, automatic classification and a visualization tool
Challenges and opportunities in validating automated stance detection: The coverage of migration in Slovak media (2003-2024)
Trade Talk: The Changing Nature of Global Trade Narratives
Customizing GPT for gender equality in judicial decision-making: opportunities and ethical challenges
Supranational Entrepreneurship in Action: How the UN Secretary-General Influences Peacekeeping Policy Through Reports
Validating Text-as-Data Approaches: A Framework for NLP Applications in Political Science
Can we Automatedly Measure the Quality of Online Political Discussion? How to (Not) Measure Interactivity, Diversity, Rationality, and Incivility in Online Comments to the News
The Worldwide Reception of the Competing Strategic Narratives of the Russia–Ukraine and Hamas–Israel Wars
1,000 Speeches vs. GPT-4o: Analyzing Position Shifting in the European Parliament
Bridging the gap between data and understanding for novice users
Classifying ideological policy frames: Introducing the ILLFRAMES codebook and the Babel Machine pipeline
Measuring Personal Attacks in Parliamentary Debates
"China's Strategic Narratives About AI: Regional Variations In Communication via CGTN on YouTube"
Policy actor alignments and social media discourse: a structural topic modeling approach to Twitter/X discussions on German legislation
German Far-Right Mainstreaming on TikTok