ECPR

Panel Data Analysis: Hierarchical Structures, Heterogeneity and Serial Dependence

Course Dates and Times

Monday 29 February to Friday 4 March 2016
Generally classes are either 09:00-12:30 or 14:00-17:30
15 hours over 5 days

Christian Aßmann

christian.assmann@uni-bamberg.de

University of Bamberg

The objective of this course is to familiarise participants with statistical methods for handling hierarchical structures, heterogeneity and serial dependence, which are typical features of panel data. These issues are usually treated as advanced topics in statistical analysis and modelling, yet handling them correctly is crucial for parameter inference with panel data. Hierarchical and serial dependence structures will be discussed in the form of the archetypical modelling devices for heterogeneity: mixed modelling, fixed cluster-specific parameters, and autoregressive structures. The methods will be illustrated using linear regression models for continuous variables as well as models for binary variables. The course thereby aims to convey the theoretical and implementation principles that allow participants to acquire, through self-study, further methodologies not covered by the course.


Long Course Outline

Empirical analysis based on panel data allows the user to specify richer models for assessing dynamics and conditional effects; at the same time, it requires explicit handling of latent heterogeneity and serial dependence among the observations. Panel data are defined as repeated measurements on the same sample of objects, so each observation is referenced by two indices, one for the unit and one for the time period. Panel data sets can be classified according to their dimensions and the scale of the dependent variable under consideration. This lecture will discuss the statistical approaches prevailing in the literature for typical panel data environments, covering metric, binary, ordinal, and categorical dependent variables. The data sets considered range from small cross sections (sample size) with a moderate time dimension (macro panel or time-series cross-section data) to large cross sections with a small time dimension (micro panel data). Parameter estimation within these model frameworks will be presented in terms of likelihood-based estimation, with extensive cross-references to alternative estimation procedures. Empirical illustration of the panel estimation routines will be based on typical publicly available panel data sets.

Day 1 – Lecture

Day 1 will cover the typical characteristics of panel data and highlight why these characteristics may cause standard properties of estimators routinely used in linear regression analysis to fail. Based on a review of the least squares estimator in matrix notation, the properties of maximum likelihood estimation and test routines are assessed in the context of panel data. In addition to the linear regression model, which serves as a workhorse for illustrating the effects of panel data on standard estimators, the lecture will discuss binary panel probit models, a model framework often applied in empirical analysis.
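
As a reminder of the matrix notation referred to above, the following block sketches the least squares estimator and its conditional variance. The notation is an assumption made for this illustration and is not taken from the course material: y stacks all observations over units and periods, X is the regressor matrix, and Ω is the conditional covariance matrix of the stacked errors.

```latex
\begin{align*}
  y &= X\beta + \varepsilon, \qquad
  \hat{\beta} = (X'X)^{-1} X'y, \\
  \operatorname{Var}(\hat{\beta} \mid X)
    &= (X'X)^{-1} X' \Omega X (X'X)^{-1}, \qquad
  \Omega = \operatorname{Var}(\varepsilon \mid X).
\end{align*}
```

Under the classical assumptions, Ω = σ²I and the variance reduces to σ²(X'X)^{-1}. With panel data, errors of the same unit are typically correlated, so Ω is no longer proportional to the identity matrix and the simplified formula can give misleading standard errors.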

Day 1 – Computer Exercise

The computer exercise aims to provide hands-on experience with the behaviour of standard estimators, e.g. the least squares estimator within the linear regression framework, when applied to panel data. This includes running simulations to gauge the behaviour of the estimator when classical assumptions, such as homoscedasticity and the absence of serial correlation among the errors of a linear regression model, are violated. In addition to estimators for the linear regression model, estimation of the standard binary panel probit model is implemented to serve as the basis for the upcoming panel extensions. Participants will learn how to use available implementations in R as well as their own code developed in the course.
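
A simulation of the kind described above might look like the following minimal sketch. It is written in Python with the standard library only, purely for illustration (the course itself uses R), and every design choice is an assumption: 50 units, 10 periods, a single regressor, no intercept because all variables have mean zero by construction, and AR(1) processes with the same coefficient `rho` for both regressor and error. The function names are hypothetical. The sketch compares the Monte Carlo standard deviation of the pooled least squares slope with the average conventional standard error.

```python
import math
import random
import statistics


def ar1_series(length, rho, rng):
    """Draw a stationary AR(1) series with unit innovation variance."""
    value = rng.gauss(0.0, 1.0) / math.sqrt(1.0 - rho ** 2)
    series = []
    for _ in range(length):
        series.append(value)
        value = rho * value + rng.gauss(0.0, 1.0)
    return series


def simulate_pooled_ols(n_units, n_periods, beta, rho, rng):
    """Simulate one panel; return (slope estimate, conventional SE)."""
    xs, ys = [], []
    for _ in range(n_units):
        x = ar1_series(n_periods, rho, rng)   # serially correlated regressor
        e = ar1_series(n_periods, rho, rng)   # serially correlated error
        xs.extend(x)
        ys.extend(beta * xi + ei for xi, ei in zip(x, e))
    sxx = sum(xi * xi for xi in xs)
    slope = sum(xi * yi for xi, yi in zip(xs, ys)) / sxx
    rss = sum((yi - slope * xi) ** 2 for xi, yi in zip(xs, ys))
    s2 = rss / (len(xs) - 1)                  # one estimated parameter
    return slope, math.sqrt(s2 / sxx)


def se_ratio(rho, replications=400, seed=1):
    """Monte Carlo SD of the slope divided by the mean conventional SE."""
    rng = random.Random(seed)
    draws = [simulate_pooled_ols(50, 10, 1.0, rho, rng)
             for _ in range(replications)]
    slopes = [d[0] for d in draws]
    ses = [d[1] for d in draws]
    return statistics.stdev(slopes) / statistics.mean(ses)


if __name__ == "__main__":
    print("ratio without serial correlation:", round(se_ratio(0.0), 2))
    print("ratio with rho = 0.8:", round(se_ratio(0.8), 2))
```

A ratio close to one indicates that the conventional standard error describes the sampling variability well; a ratio clearly above one indicates that it understates the true variability, which is what this design should produce once `rho` is large.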

Day 2 – Lecture

The second lecture will present the typical modelling and estimation procedures for coping with latent heterogeneity and serial dependence. In addition to the use of robust standard errors, the lecture discusses random coefficients, as an extension of a random effects specification, and cluster-specific or fixed effects as ways to deal with latent heterogeneity. The discussion will highlight how each of the proposed modelling devices copes with the incidental parameter problem inherent in panel data. The consequences for estimation of each modelling device will be presented for both the linear regression model and the binary panel probit framework.
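
The modelling devices named above can be written compactly as follows. This is a sketch in standard textbook notation, which is an assumption for this illustration and not necessarily the notation used in the lecture: i indexes units, t indexes periods, α_i is a unit-specific effect, u_i is a unit-specific deviation of the coefficient vector, and Φ is the standard normal distribution function.

```latex
\begin{align*}
  % Linear panel model with a unit-specific effect
  y_{it} &= x_{it}'\beta + \alpha_i + \varepsilon_{it},
    \qquad i = 1,\dots,N, \quad t = 1,\dots,T_i, \\
  % Random coefficient extension of the random effects specification
  y_{it} &= x_{it}'\beta_i + \varepsilon_{it},
    \qquad \beta_i = \beta + u_i, \\
  % Binary panel probit model
  \Pr(y_{it} = 1 \mid x_{it}, \alpha_i) &= \Phi(x_{it}'\beta + \alpha_i).
\end{align*}
```

If the α_i are treated as fixed parameters, their number grows with N while the number of observations per unit stays fixed, which is the source of the incidental parameter problem. In the linear model the α_i can be removed by subtracting unit means; no such transformation is available for the probit model.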

Day 3 – Computer Exercise

Participants will practise handling the panel data structures discussed in the lecture. One focus is on interpreting the parameter estimates in empirical analysis using available code and data sets. A further focus is on using the discussed modelling devices with unbalanced panel data sets, i.e. data sets with a different number of observations per sample unit.
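
To illustrate how fixed effects can be handled in an unbalanced panel, the following minimal sketch compares pooled least squares with the within (unit-demeaned) estimator. It is written in Python with the standard library only, for illustration (the course itself uses R). The data-generating process is an assumption: each unit has between 2 and 6 observations, and the single regressor is correlated with the unit-specific effect. All function names are hypothetical.

```python
import random
import statistics


def simulate_unbalanced_panel(n_units, beta, rng):
    """Return {unit: [(x, y), ...]} with 2 to 6 observations per unit."""
    panel = {}
    for unit in range(n_units):
        alpha = rng.gauss(0.0, 1.0)          # unit-specific effect
        n_obs = rng.randint(2, 6)            # unbalanced: T_i varies by unit
        rows = []
        for _ in range(n_obs):
            x = alpha + rng.gauss(0.0, 1.0)  # regressor correlated with alpha
            y = beta * x + alpha + rng.gauss(0.0, 1.0)
            rows.append((x, y))
        panel[unit] = rows
    return panel


def slope(pairs):
    """Least squares slope for already mean-centred (x, y) pairs."""
    sxy = sum(x * y for x, y in pairs)
    sxx = sum(x * x for x, _ in pairs)
    return sxy / sxx


def pooled_ols(panel):
    """Slope from a pooled regression with a common intercept."""
    rows = [row for unit_rows in panel.values() for row in unit_rows]
    x_bar = statistics.fmean([x for x, _ in rows])
    y_bar = statistics.fmean([y for _, y in rows])
    return slope([(x - x_bar, y - y_bar) for x, y in rows])


def within_estimator(panel):
    """Slope after subtracting each unit's own means (fixed effects)."""
    centred = []
    for unit_rows in panel.values():
        x_bar = statistics.fmean([x for x, _ in unit_rows])
        y_bar = statistics.fmean([y for _, y in unit_rows])
        centred.extend((x - x_bar, y - y_bar) for x, y in unit_rows)
    return slope(centred)


if __name__ == "__main__":
    panel = simulate_unbalanced_panel(500, 1.0, random.Random(42))
    print("pooled OLS slope:", round(pooled_ols(panel), 3))
    print("within slope:", round(within_estimator(panel), 3))
```

Because the demeaning is done unit by unit, differing numbers of observations per unit need no special treatment here. In this design the pooled slope is biased upwards, since the regressor picks up part of the omitted unit effect, while the within slope should be close to the true value of `beta`.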

Day 4 – Lecture

The lecture presents the modelling of correlation structures in Heckman-type simultaneous regression frameworks. A special focus is on the inherent identification problem and on the robustness of parameter inference under alternative model specifications that incorporate further panel structures to different degrees. Given the variety of modelling devices for coping with latent heterogeneity, the lecture will present model specification tests that allow the empirical adequacy of the different model specifications to be checked.

Day 4 – Computer Exercise

Estimation routines for the Heckman-type simultaneous regression framework are implemented. Using these routines, model specification and testing devices are applied to simulated as well as empirical data sets in order to assess their statistical power.

Day 5 – Lecture

As panel data are often based on surveys, missing values for single measurements on individuals will inevitably occur in any non-mandatory survey. In the presence of missing values, standard panel estimation routines risk losing not only single observations per individual, which reduces estimation efficiency, but also complete cross-sectional units, which induces bias as well. As one possible strategy, the lecture will discuss the use of multiple imputation adapted to panel data requirements, with a short reference to alternative approaches based on direct modelling or data augmentation.

Day 5 – Computer Exercise

Participants will be shown ways to adapt existing multiple imputation routines to panel data requirements. Using the provided implementation, participants will compare the effects of ignoring missing values in the data with those of accounting for them by multiple imputation, using both simulated and empirical data sets.
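
After a model has been estimated on each of the M imputed data sets, the results have to be combined. One standard way of doing so is Rubin's rules, sketched below in Python with the standard library only, for illustration (the course itself uses R). The function name and the numbers in the usage example are hypothetical and are not taken from the course material.

```python
import math
import statistics


def pool_rubin(estimates, variances):
    """Combine M completed-data estimates and their squared standard errors.

    Returns the pooled point estimate and its pooled standard error.
    """
    m = len(estimates)
    q_bar = statistics.fmean(estimates)          # pooled point estimate
    within = statistics.fmean(variances)         # average sampling variance
    between = statistics.variance(estimates)     # variance across imputations
    total = within + (1.0 + 1.0 / m) * between   # total variance
    return q_bar, math.sqrt(total)


if __name__ == "__main__":
    # Hypothetical coefficient estimates from M = 5 imputed data sets
    estimates = [0.50, 0.54, 0.46, 0.52, 0.48]
    variances = [0.010, 0.010, 0.010, 0.010, 0.010]
    estimate, std_error = pool_rubin(estimates, variances)
    print("pooled estimate:", round(estimate, 3))
    print("pooled standard error:", round(std_error, 3))
```

The between-imputation term is what distinguishes multiple imputation from filling in each missing value once: it carries the uncertainty about the missing values into the final standard error.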

Participants should have knowledge of:

  • Least-Squares Estimation in the Linear Regression Model under the classical assumptions
  • Properties of sample means, sample variances, and the expectation operator
  • Vector and matrix notation
  • Principles of maximum likelihood estimation under standard regularity conditions
  • Statistical software R

| Day | Topic | Details |
|---|---|---|
| 1 | Panel data and statistical concepts for parameter estimation | Scope of panel data, failures of classical assumptions within linear regression for panel data, maximum likelihood estimation and test routines (1.5h lecture + 1.5h lab) |
| 2 | Modelling hierarchical structures and parameter estimation I | Incidental parameter problem, random coefficient modelling, cluster or individual specific parameters, autoregressive structures (2 x 1.5h lecture) |
| 3 | Modelling hierarchical structures and parameter estimation II | Empirical application of estimation routines for the linear regression model and binary panel probit model enriched with random coefficients, cluster specific parameters and serial correlation (2 x 1.5h lab) |
| 4 | Correlation structures in Heckman type simultaneous regression models | Inference within Heckman type simultaneous equation systems (1.5h lecture + 1.5h lab) |
| 5 | Handling missing values in panel data analysis | Types of missing values in panel data, statistical methodologies to handle missing values in parameter estimation (1.5h lecture + 1.5h lab) |

| Day | Readings |
|---|---|
| 1 | Greene, William H. (2012): Econometric Analysis. 7th ed., ch. 3, 4, 13 and 14. |
| 2 | Baltagi, Badi H. (2014): Econometric Analysis of Panel Data. 5th ed., ch. 2 and 3. Lancaster, T. (2000): The incidental parameter problem since 1948. In: Journal of Econometrics 95, pp. 391-413. |
| 3 | Greene, W. (2004a): The behaviour of the maximum likelihood estimator for limited dependent variable models in the presence of fixed effects. In: Econometrics Journal 7, pp. 98-119. Greene, W. (2004b): Convenient estimators for the panel probit model: Further results. In: Empirical Economics 29, pp. 21-47. |
| 4 | Heckman, James J. (1978): Dummy Endogenous Variables in a Simultaneous Equation System. In: Econometrica 46 (6), pp. 931-959. |
| 5 | van Buuren, S.; Groothuis-Oudshoorn, K. (2011): mice: Multivariate Imputation by Chained Equations. In: Journal of Statistical Software 45, pp. 1-67. |

Software Requirements

Not applicable; the software available in Bamberg is sufficient.

Hardware Requirements

Not applicable; the hardware available in Bamberg is sufficient.

Literature


Frühwirth-Schnatter, Sylvia (2011): Panel data analysis: a survey on model-based clustering of time series. In: Advances in Data Analysis and Classification 5 (4), pp. 251-280.

Baltagi, Badi H. (2014): Econometric Analysis of Panel Data. New York: Wiley.

Beck, Nathaniel; Katz, Jonathan N. (2007): Random Coefficient Models for Time-Series-Cross-Section Data: Monte Carlo Experiments. In: Political Analysis 15, pp. 182-195.

Cameron, A. Colin; Trivedi, Pravin K. (2005): Microeconometrics: Methods and Applications. New York: Cambridge University Press.

Greene, William H. (2012): Econometric Analysis. 7th ed. Upper Saddle River, NJ: Pearson.

Recommended Courses to Cover Before this One

Winter School: Short Introduction to R; Introduction to R with Advanced Methods and Elements of Reproducible Research

Summer School: Introduction to Generalized Linear Modelling; Introduction to R

Recommended Courses to Cover After this One

Winter School: Advanced Discrete Choice Modelling; Introduction to Bayesian Inference