ECPR

Roundtable

Using Synthetic Data in Political Science Research – Promises and Pitfalls

Friday 29 July, 14:30 – 16:00 CEST, Online

The field of political science methods is evolving rapidly, with a sustained shift towards computational social science in which large datasets and machine learning methods are becoming ubiquitous. At the same time, a shift towards replicability and open science means that researchers are increasingly required to share their data. This can be problematic because subjects' privacy must be protected and individuals must not be identifiable.

Simply anonymising data by deleting identifiers such as names is often not sufficient to protect individuals' privacy or to prevent their re-identification.
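To illustrate why removing names may not be enough, here is a minimal sketch in Python. The records, attribute names, and the `unique_on` helper are hypothetical and not taken from the roundtable. It counts how many records remain unique on a combination of ordinary attributes, which is one rough indication that a person could be singled out.

```python
from collections import Counter

# Hypothetical toy records with names already removed.
records = [
    {"postcode": "CF10", "birth_year": 1980, "gender": "F"},
    {"postcode": "CF10", "birth_year": 1980, "gender": "F"},
    {"postcode": "CF10", "birth_year": 1975, "gender": "M"},
    {"postcode": "SW1A", "birth_year": 1990, "gender": "F"},
]

def unique_on(records, keys):
    """Return the records whose combination of `keys` appears exactly once."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return [r for r in records if counts[tuple(r[k] for k in keys)] == 1]

at_risk = unique_on(records, ["postcode", "birth_year", "gender"])
print(len(at_risk), "of", len(records), "records are unique on these attributes")
```

A record that is unique on such attributes could in principle be linked to an external source that contains the same attributes alongside a name.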

One possible solution to this problem is synthetic data: researchers create datasets that reflect the underlying dependencies and statistical properties of the original dataset but do not allow any individual to be identified. These synthetic datasets can be shared without serious consequences for individuals, while still allowing replication and reuse in other research projects.
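As a rough illustration of the idea, the sketch below generates synthetic rows for two categorical variables by sampling the first from its observed distribution and the second conditional on the first. The toy data and the `fit` and `sample` helpers are assumptions made for this example, not a method presented at the roundtable, and this simple approach offers no formal privacy guarantee.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy data: (education, voted) pairs.
original = [
    ("degree", "yes"), ("degree", "yes"), ("degree", "no"),
    ("no_degree", "yes"), ("no_degree", "no"), ("no_degree", "no"),
]

def fit(rows):
    """Count the first column's values and the second column's values given the first."""
    marginal = Counter(x for x, _ in rows)
    conditional = defaultdict(Counter)
    for x, y in rows:
        conditional[x][y] += 1
    return marginal, conditional

def sample(marginal, conditional, n, seed=0):
    """Draw n synthetic rows: x from its marginal, then y given x."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.choices(list(marginal), weights=list(marginal.values()))[0]
        ys = conditional[x]
        y = rng.choices(list(ys), weights=list(ys.values()))[0]
        rows.append((x, y))
    return rows

marginal, conditional = fit(original)
synthetic = sample(marginal, conditional, n=1000)
```

The synthetic rows follow the association between the two variables in the toy data without copying any single original row directly. Whether such data are safe to share depends on the generator and the data, which is among the pitfalls the roundtable questions address.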

Join us in discussing questions related to synthetic data: 
  • How can such data be generated?
  • When is it useful or necessary to do so?
  • Which tools exist to create synthetic data and how easily can they be implemented by researchers?
  • What are the potential pitfalls and problems with generating or sharing synthetic data?

Chair

Sebastian Koehler, King's College London

Speakers

Christian Arnold

Senior Lecturer at the School of Law and Politics, University of Cardiff

Using data-driven methods from statistics and machine learning, Christian's work lies at the intersection between social science and computer science.

He previously held positions at Oxford University and as a Data Scientist in industry.

Christian holds a PhD in Political Science from the Graduate School of the University of Mannheim. His research has been published in the Journal of Politics and International Interactions, among others.

He also presented findings relevant to computer science at the International Conference on Machine Learning and the Theory and Practice of Differential Privacy Workshop Series.

Marcel Neunhoeffer

Research Associate at the Chair of Statistics and Data Science for the Social Sciences and Humanities, Ludwig-Maximilians University, Munich

Marcel Neunhoeffer is a postdoctoral scholar at Boston University (BU) with a joint position at LMU Munich, Germany. Before BU, he spent six months as a research associate at LMU Munich at the chair of Prof. Frauke Kreuter. Marcel is a political scientist by training, with a focus on computational methods for applied research. In particular, he works on applying new learning algorithms to social science problems, with a focus on differential privacy.