ECPR

Panel Data Analysis

Andrew X. Li
lixiang577@gmail.com

Central European University

Andrew X. Li is an assistant professor at the Department of International Relations, Central European University.

He obtained his PhD from the National University of Singapore and King’s College London.

Andrew's research and teaching interests include international political economy, research design and quantitative methods. He teaches the Research Design and Methods in IR course series at CEU.

Twitter @lixiang577

Course Dates and Times

Monday 29 July – Friday 2 August

Prerequisite Knowledge

This course builds on ordinary least squares (OLS) regression and extends it to data with a time-series-cross-sectional structure.

You need

  • familiarity with basic statistical concepts such as sample mean and sample variance as well as their properties
  • basic knowledge of regression analysis (up to multiple regression)
  • basic skills in Stata.

If you do not have this knowledge, I strongly encourage you to take one of the following, either before or concurrently with this course:

Introduction to STATA

Introduction to Inferential Statistics: What you need to know before you take regression

Multiple Regression Analysis: Estimation, Diagnostics, and Modelling

This course is designed for students who are somewhat familiar with the content covered up to Chapter 9 of Wooldridge’s 2016 textbook (see below) but do not have a good knowledge of the materials beyond that. This is an easy way to check if this is the right course for you.


Short Outline

This course places an emphasis on the connections between panel data methods and causal inference, which is the primary goal of social science research.

Panel data is particularly advantageous for causal inference because it allows researchers to control for entity-specific factors (individual heterogeneity) such as geography and culture, which may not be observed and can be difficult to measure.

Panel data usually contains more degrees of freedom and more sample variability than purely cross-sectional or time-series data, and hence improves the efficiency of the parameter estimates.

The course begins with a quick overview of causal inference and a review of the standard OLS assumptions.

It then moves on to simple panel data methods and the fixed and random effects estimators, followed by more advanced methods such as clustered samples, panel instrumental variable methods, the panel-corrected standard error estimator and dynamic panel regressions, which deal with various violations of the standard OLS assumptions.

Two or three lab sessions put these methods into practice, using Stata.

The last session is a seminar in which those who want extra credits can present their research or research proposals and receive feedback from the Instructor and peers. You are therefore encouraged to bring your own data to the course.

ECTS Credits

2 credits: Meet the attendance requirements and participate actively in class.

3 credits: As above, plus give a presentation on your research or research proposal during the final session.

4 credits: As above, plus submit a 2,500–3,000 word research paper/proposal that uses panel data analysis in the research design.


Long Course Outline

This course introduces various econometric models and techniques that can be applied to panel data. It also emphasises the connections between these methods and causal inference, which is the primary goal of social science research.


Day 1
We begin with an overview of causal inference in the context of observational studies. We quickly survey the idea of causality and relevant concepts such as average treatment effect (ATE). Importantly, we will draw connections between these concepts and regression analysis. We also quickly review the set of assumptions on which OLS regression hinges.

In the second half of the day, we look at simple panel data methods. Specifically, we make a distinction between ‘independently pooled cross-section’ data and ‘panel’ data. The latter is the focus of this course. We’ll see that panel data differs in some important respects from an independently pooled cross-section in that a panel consists of the same entities (individuals, firms, countries and so on) across time. We then study two basic panel data methods: two-period panel data analysis and first differencing.
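To make the idea of first differencing concrete, here is a minimal sketch in plain Python (the course itself uses Stata). It assumes a hypothetical two-period panel in which the outcome is y_it = beta * x_it + a_i, with no idiosyncratic noise so the arithmetic can be checked by hand. The entity names, the values and the helper `ols_slope` are all invented for illustration.

```python
# Hypothetical two-period panel: y_it = BETA * x_it + a_i (no noise).
# All numbers are made up for illustration.
BETA = 2.0

# (entity, unobserved effect a_i, x in period 1, x in period 2)
entities = [
    ("A", 10.0, 1.0, 2.0),
    ("B", 20.0, 2.0, 4.0),
    ("C", 30.0, 3.0, 3.5),
    ("D", 40.0, 4.0, 7.0),
]


def ols_slope(xs, ys):
    """Slope from a simple regression of ys on xs (with an intercept)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Pooled OLS ignores a_i, which in this example is correlated with x.
pooled_x, pooled_y = [], []
for _, a, x1, x2 in entities:
    for x in (x1, x2):
        pooled_x.append(x)
        pooled_y.append(BETA * x + a)
pooled_slope = ols_slope(pooled_x, pooled_y)

# First differencing: y_i2 - y_i1 = BETA * (x_i2 - x_i1); a_i cancels out.
dx = [x2 - x1 for _, _, x1, x2 in entities]
dy = [(BETA * x2 + a) - (BETA * x1 + a) for _, a, x1, x2 in entities]
fd_slope = ols_slope(dx, dy)

print(f"pooled OLS slope:       {pooled_slope:.3f}")
print(f"first-difference slope: {fd_slope:.3f}")
```

Because the unobserved effect rises with x in this made-up data, the pooled slope overstates the true coefficient, while the first-difference slope recovers it.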

Day 2
We focus on slightly more advanced methods for estimating unobserved effects models with panel data. We introduce the fixed effects estimator, which, like first differencing, uses a transformation to remove the unobserved effect prior to estimation. As a result, any time-constant explanatory variables are removed in the process. We then introduce the random effects estimator, which is attractive when we think the unobserved effect is uncorrelated with all the explanatory variables. For example, when we have good knowledge about the factors affecting the dependent variable and have controlled for these factors in the equation, the random effects estimator can sometimes be the preferred strategy. With these foundations, we then look into the relatively new correlated random effects approach, which provides a synthesis of fixed effects and random effects methods and has been shown to be very useful in practice. Along the way, you’ll also learn that the usefulness of these methods hinges critically on the assumptions made about the error term and about the relationship between the regressors and the error term.
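The transformation behind the fixed effects estimator can be sketched in a few lines of plain Python (again, not the Stata code used in the labs). The sketch assumes a hypothetical long-format panel where y = 1.5 * x + an entity effect, and where z is constant over time within each entity. The data and the helper `demean_by_entity` are invented for illustration.

```python
from collections import defaultdict

# Hypothetical long-format panel: (entity, year, x, z, y).
# z is time-constant within each entity; y = 1.5 * x + entity effect,
# with effects 5.0 for "A" and 9.0 for "B". All values are invented.
rows = [
    ("A", 2001, 1.0, 0.0, 6.5),
    ("A", 2002, 2.0, 0.0, 8.0),
    ("A", 2003, 3.0, 0.0, 9.5),
    ("B", 2001, 3.0, 1.0, 13.5),
    ("B", 2002, 5.0, 1.0, 16.5),
    ("B", 2003, 7.0, 1.0, 19.5),
]


def demean_by_entity(rows, col):
    """Subtract each entity's time average from column `col`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row[col])
    means = {entity: sum(vals) / len(vals) for entity, vals in groups.items()}
    return [row[col] - means[row[0]] for row in rows]


x_w = demean_by_entity(rows, 2)
z_w = demean_by_entity(rows, 3)
y_w = demean_by_entity(rows, 4)

# OLS on the demeaned data; no intercept is needed because every
# demeaned variable has mean zero within each entity.
beta_fe = sum(x * y for x, y in zip(x_w, y_w)) / sum(x * x for x in x_w)

print("within-transformed z:", z_w)
print("fixed effects slope on x:", beta_fe)
```

The demeaned z is zero in every row, which is why the coefficient on a time-constant variable cannot be estimated by fixed effects, while the slope on x is recovered despite the entity effects.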

Day 3
We begin with a lab session during which we put into practice the methods introduced in the previous two days. We carry out these analyses in Stata and learn how to interpret the results. If you have brought your own data, this is a good opportunity to carry out the analysis for your current research projects. In the second session, we return to the classroom to learn more about the research designs and methods to which panel data can be applied. We start to relax the assumptions made in the previous discussion and learn methods designed to deal with various violations of these assumptions. The first is the instrumental variable (IV) method, which deals with violations of the strict exogeneity assumption. Again, I emphasise the connection between this method and causal inference.
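The intuition for the IV method can be shown with a deliberately simplified sketch in plain Python. It uses a single cross-section rather than a panel, and the data are constructed (not real) so that the instrument z is exactly uncorrelated with an unobserved confounder e in the sample. The variable names and the helper `cov` are assumptions made for illustration.

```python
# Hypothetical data, constructed so the sample moments are exact.
# z: instrument; e: unobserved confounder, uncorrelated with z in this sample.
z = [-1.0, -1.0, 1.0, 1.0]
e = [-1.0, 1.0, -1.0, 1.0]
BETA = 2.0  # assumed true effect of x on y

x = [zi + ei for zi, ei in zip(z, e)]               # x depends on z and on e
y = [BETA * xi + 3.0 * ei for xi, ei in zip(x, e)]  # e also enters y directly


def cov(a, b):
    """Sample covariance (divisor n; the divisor cancels in the ratios below)."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / n


beta_ols = cov(x, y) / cov(x, x)  # also picks up the confounder's effect
beta_iv = cov(z, y) / cov(z, x)   # uses only the variation in x driven by z

print("OLS slope:", beta_ols)
print("IV slope: ", beta_iv)
```

Here OLS is biased upwards because x is correlated with the omitted e, while the IV ratio recovers the assumed effect. Panel IV estimators apply the same idea after a transformation such as demeaning or differencing.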

Day 4
I introduce several more advanced panel data methods that address further violations of the standard assumptions. Since panel data contains repeated observations of the same entity over time, we may have different error variances for different panels as well as correlation of the error terms across panels and/or time. To deal with these potential challenges, we study methods such as clustering and robust estimation, panel-corrected standard error (PCSE) estimates and dynamic panel methods (Arellano-Bond estimator). For these more advanced methods, we will not look into the technical details but rely more on an intuitive understanding of the challenges these methods address.
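As an intuition aid for clustering, the sketch below computes a cluster-robust standard error by hand in plain Python. It is a simplified case under stated assumptions: one regressor, no intercept (the variables are treated as already demeaned), and no finite-sample correction, so the numbers would not match Stata output exactly. The data and the function name `slope_and_cluster_se` are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def slope_and_cluster_se(x, y, cluster):
    """No-intercept OLS slope and its cluster-robust standard error.

    Assumes x and y are already demeaned; no finite-sample correction.
    """
    sxx = sum(xi * xi for xi in x)
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    resid = [yi - beta * xi for xi, yi in zip(x, y)]

    # Sum the scores x_i * u_i within each cluster before squaring, so that
    # correlated errors inside a cluster are not treated as independent.
    score = defaultdict(float)
    for xi, ui, g in zip(x, resid, cluster):
        score[g] += xi * ui
    meat = sum(s * s for s in score.values())
    return beta, sqrt(meat) / sxx


# Hypothetical data: three countries observed twice each.
x = [-2.0, -1.0, 1.0, 2.0, -1.5, 1.5]
y = [-3.0, -2.5, 2.0, 4.5, -1.0, 2.0]
country = ["A", "A", "B", "B", "C", "C"]

beta, se_cluster = slope_and_cluster_se(x, y, country)

# Treating every observation as its own cluster gives the usual
# heteroskedasticity-robust (HC0) standard error as a special case.
_, se_robust = slope_and_cluster_se(x, y, list(range(len(x))))

print("slope:", beta)
print("cluster-robust SE:", se_cluster)
print("heteroskedasticity-robust SE:", se_robust)
```

The only difference between the two standard errors is the level at which the scores are summed before squaring, which is the core idea behind clustering by entity in panel data.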

Day 5
In the first session, we return to the lab and put these more advanced methods into practice. We learn how to carry out panel IV analysis, obtain the PCSE and Arellano-Bond estimators and calculate cluster robust standard errors. Again, this is a good opportunity for you to carry out the analyses using your own datasets and check the robustness of the results across various model specifications. The last session is a seminar. Students who want extra credits can present their research or research proposals that use panel data methods, and receive feedback from the Instructor and fellow participants.


This course is a general survey of panel data methods. It attempts to strike a balance between theory and practical implementation. If you are interested only in implementing these methods, you may find this course less beneficial.

The limited time available means we won’t have the luxury of going deeply into methodological niches, such as presenting all the techniques for tackling autocorrelation.

 

Day | Topic | Details
1 | Overview of causal inference; Review of OLS assumptions; Simple panel data methods | Lecture (3 hours)
2 | Fixed effect estimator; Random effect estimator; Correlated random effects approach | Lecture (3 hours)
3 | Practical session 1; Instrumental variable method | Lab (1.5 hours); Lecture (1.5 hours)
4 | Clustering and robust estimation; Heteroskedasticity; Autocorrelation; Dynamic panel models | Lecture (3 hours)
5 | Practical session 2; Student presentations | Lab (1.5 hours); Seminar (1.5 hours)

Day Readings
1

Rubin, Donald B
'Estimating causal effects of treatments in randomized and nonrandomized studies'
Journal of Educational Psychology 66, no. 5 (1974): 688-701

Wooldridge, Jeffrey M
Introductory Econometrics: A Modern Approach
Cengage Learning, 2016: Chapter 13, pp.402-433

2

Wooldridge, Jeffrey M
Introductory Econometrics: A Modern Approach
Cengage Learning, 2016: Chapter 14, pp.434-457

Beck, Nathaniel
'Time-series–cross-section data: What have we learned in the past few years?'
Annual Review of Political Science 4, no. 1 (2001): 271-293

3

Cameron, Adrian Colin and Pravin K. Trivedi
Microeconometrics Using Stata
College Station, TX: Stata Press, 2009. Chapter 8, pp. 229-280

Wooldridge, Jeffrey M
Introductory Econometrics: A Modern Approach
Cengage Learning, 2016: Chapter 15, pp.461-498

4

Arellano, Manuel, and Olympia Bover
'Another Look at the Instrumental Variable Estimation of Error-components Models'
Journal of Econometrics 68, no. 1 (1995): 29-51

Beck, Nathaniel, and Jonathan N. Katz
'What to do (and not to do) with time-series cross-section data'
American Political Science Review 89, no. 3 (1995): 634-647

Bond, Stephen R
'Dynamic panel data models: a guide to micro data methods and practice'
Portuguese Economic Journal 1, no. 2 (2002): 141-162

Newey, Whitney K., and Kenneth D. West
'A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix'
Econometrica (1987): 703-708

Wooldridge, Jeffrey M
Econometric Analysis of Cross Section and Panel Data
MIT Press, 2010. Chapter 20.3, pp. 853-894

5

Cameron, Adrian Colin and Pravin K. Trivedi
Microeconometrics Using Stata
College Station, TX: Stata Press, 2010. Chapter 9, pp. 281-312

Software Requirements

Stata (version 15)

Hardware Requirements

None – the course will be held in a lab

Literature

Holland, Paul W
'Statistics and Causal Inference'
Journal of the American Statistical Association 81, no. 396 (1986): 945-960.

Imbens, Guido W., and Jeffrey M. Wooldridge
'Recent developments in the econometrics of program evaluation'
Journal of Economic Literature 47, no. 1 (2009): 5-86

Nickell, Stephen
'Biases in dynamic models with fixed effects'
Econometrica: Journal of the Econometric Society (1981): 1417-1426

Blundell, Richard, Stephen Bond, and Frank Windmeijer
'Estimation in dynamic panel data models: improving on the performance of the standard GMM estimator'
In Nonstationary panels, panel cointegration, and dynamic panels, pp. 53-91
Emerald Group Publishing Limited, 2001

Recommended Courses to Cover Before this One

Summer School

Introduction to STATA

Introduction to Inferential Statistics: What you need to know before you take regression

Multiple Regression Analysis: Estimation, Diagnostics, and Modelling

Recommended Courses to Cover After this One

Summer School

Advanced Topics in Applied Regression

Causal Inference in the Social Sciences II: Difference in Difference, Regression Discontinuity and Instruments


Additional Information

Disclaimer

This course description may be subject to subsequent adaptations (e.g. taking into account new developments in the field, participant demands, group size, etc). Registered participants will be informed in due time.

Note from the Academic Convenors

By registering for this course, you confirm that you possess the knowledge required to follow it. The instructor will not teach these prerequisite items. If in doubt, contact the instructor before registering.