Text-as-Data Analysis in Political Science: Unlocking Political Insights with New Research Tools

Political Methodology

Big Data

Code: P025

Director: Michal Parizek

Charles University

Co-Director: Steffen Eckhard

Zeppelin University Friedrichshafen

Date, Time and Location

Tuesday 09:00 – Friday 17:00 (20/05/2025 – 23/05/2025)

The workshop explores the rapidly advancing field of natural language processing (NLP), or text-as-data analysis, in political research. Its aim is to bring together researchers from diverse subfields of political science, using varied types of text data, and employing different NLP tools. It investigates how NLP is deployed for ever more nuanced and difficult text-analytical tasks and how it helps address important research questions through novel means. The workshop also explores best practices and standards for the use and validation of the most powerful but also largely opaque NLP tools, including LLMs and generative AI, in political research.

The developments in NLP (text-as-data analysis) over the last decade, and especially the last few years, are transforming political science. They enable researchers to analyse textual data with unprecedented nuance, at scale, across languages, at little cost, and within increasingly user-friendly environments (Grimmer et al. 2022). Combined with the expanding availability of accessible, relevant text data sources, this opens the door to new research questions and agendas in political science. Political research has been quick to seize these opportunities. In comparative politics, international relations, European studies, public administration, and related fields, researchers deliver important new insights by studying data sources such as parliamentary records, social and news media, legal regulations, and verbatim speeches on an unprecedented scale (e.g. Eckhard et al. 2023; Heidenreich et al. 2019; Martin and McCrain 2019; Rauh 2023; Parizek 2024). A robust methodological literature is rapidly advancing politics-focused NLP research (e.g. Baden et al. 2022; Benoit et al. 2019; Rodriguez and Spirling 2022; Watanabe 2021). We are moving from lexicon-based analysis to language models that are on par with trained human annotators on nuanced coding tasks (Gilardi et al. 2023). As a result, the number of political scientists engaged with NLP, not least junior scholars, is growing fast. An active exchange oriented around substantive research questions as well as methodological innovations is needed. Importantly, the discipline also needs a robust discussion about best practices and standards for the use, and in particular the validation, of NLP tools.

Baden, C., Pipal, C., Schoonvelde, M., & van der Velden, M. (2022). Three Gaps in Computational Text Analysis Methods for Social Sciences: A Research Agenda. Communication Methods and Measures, 16(1), 1–18.
Benoit, K., Munger, K., & Spirling, A. (2019). Measuring and Explaining Political Sophistication through Textual Complexity. American Journal of Political Science, 63(2), 491–508.
Eckhard, S., Jankauskas, V., Leuschner, E., Burton, I., Kerl, T., & Sevastjanova, R. (2023). The performance of international organizations: a new measure and dataset based on computational text analysis of evaluation reports. The Review of International Organizations, 18(4), 753–776.
Gilardi, F., Alizadeh, M., & Kubli, M. (2023). ChatGPT outperforms crowd workers for text-annotation tasks. Proceedings of the National Academy of Sciences, 120(30).
Grimmer, J., Roberts, M. E., & Stewart, B. M. (2022). Text as data: A new framework for machine learning and the social sciences. Princeton University Press.
Heidenreich, T., Lind, F., Eberl, J.-M., & Boomgaarden, H. G. (2019). Media Framing Dynamics of the ‘European Refugee Crisis’: A Comparative Topic Modelling Approach. Journal of Refugee Studies, 32(1), i172–i182.
Martin, G. J., & McCrain, J. (2019). Local News and National Politics. American Political Science Review, 113(2), 372–384.
Parizek, M. (2024). Less in the West: The tangibility of international organizations and their media visibility around the world. The Review of International Organizations. Online First.
Rodriguez, P., & Spirling, A. (2022). Word Embeddings: What Works, What Doesn’t, and How to Tell the Difference for Applied Research. The Journal of Politics, 84(1), 101–115.
Watanabe, K. (2021). Latent Semantic Scaling: A Semisupervised Text Analysis Technique for New Domains and Languages. Communication Methods and Measures, 15(2), 81–102.

1: What NLP tools do scholars deploy for the most difficult political text analysis tasks?
2: How should researchers navigate the choice between NLP tools, e.g. bag-of-words and LLMs?
3: What are best practices for the validation of text-as-data models?
4: What are the emerging standards, if any, for the use of LLMs and generative AI?
5: Can subfields of political science learn from each other about the best uses of different NLP tools?

Title
From Paradise to Paradox: Examining Family Policy in Quebec’s Welfare State
Analyzing Sentiments towards the European Union in Slovak Parliamentary Speeches (1994–2023)
Rules and Revelations: Balancing Interpretability and Flexibility in NLP Approaches to Legislative Design Analysis
Measuring the communication of threat
Conceptualizing Illiberalism Using Natural Language Processing
Second-Order Saliency Theory - A Theoretical Foundation for Ideal Point Estimation from Text
Seeing China from different perspectives: Analyzing Media reporting on the sustainability impacts of Chinese overseas investments.
AI-Driven Text Analysis in the Political Economy of Sustainability: Hybrid Retrieval-Augmented Generation and LLM Multi-Agent Approach
Challenges and opportunities in validating automated stance detection: The coverage of migration in Slovak media (2003-2024)
Trade Talk: The Changing Nature of Global Trade Narratives
Not In My Backyard... Or Is It? The Psychological Distance of Climate Change in Parliamentary Speech
Customizing GPT for gender equality in judicial decision-making: opportunities and ethical challenges
Supranational Entrepreneurship in Action: How the UN Secretary-General Influences Peacekeeping Policy Through Reports
Validating Text-as-Data Approaches: A Framework for NLP Applications in Political Science
Can we Automatedly Measure the Quality of Online Political Discussion? How to (Not) Measure Interactivity, Diversity, Rationality, and Incivility in Online Comments to the News
Power and the Global Flows of Political Information
1,000 Speeches vs. GPT-4o: Analyzing Position Shifting in the European Parliament
Bridging the gap between data and understanding for novice users
Classifying ideological policy frames: Introducing the ILLFRAMES codebook and the Babel Machine pipeline
Measuring Personal Attacks in Parliamentary Debates
China's Strategic Narratives About AI: Regional Variations In Communication via CGTN on YouTube
Policy actor alignments and social media discourse: a structural topic modeling approach to Twitter/X discussions on German legislation
German Far-Right Mainstreaming on TikTok
