ECPR

Data Mining for Theorists: A Simulated Basis Regression Approach to Testing Formal Models

Curtis Signorino
University of Rochester
Brenton Kenkel
University of Rochester

Abstract

Among those interested in statistically testing formal models, two approaches dominate. The structural estimation approach assumes random utility, derives a structural probability model based on the formal model, and then estimates parameters associated with that model. The comparative statics approach generally applies off-the-shelf techniques, such as OLS, logit, or probit, to test whether one or more regressors are related to a decision variable according to the comparative statics predictions. Both methods have their limitations. Structural estimation applied to observational data suffers from well-known structural misspecification problems: if our structural model is not the “true” model, then our results will be biased. Although it is less well known, the comparative statics approach suffers from the same problem. Additionally, the comparative statics approach can be viewed as partial estimation of a larger model. In this context, omission of the other aspects of the model can lead to biased estimates as well. In this paper, we take a different approach to testing formal models, one that is closer to the comparative statics approach above, but without the structural component. We assume a formal model provides equilibrium predictions of the relationship between some exogenous variables and a decision variable of interest. We test that relationship using a variant of a basis regression. Basis regressions (using splines or polynomials) have seen little use in Political Science. The potential benefit of basis regression is the ability to model a wide range of highly nonlinear functional forms. One need not assume a particular functional relationship. There are, however, two potential problems with this approach: too many parameters and over-fitting.
Concerning the latter, we demonstrate that a simulated basis regression approach not only allows us to capture the highly nonlinear functional relationships embodied in game-theoretic models, but also avoids over-fitting. Concerning the former, we compare three methods for variable/parameter selection and demonstrate that (1) a third-order approximation performs relatively well, (2) an AIC-based selection process performs much better, and (3) the "lasso" method performs best of all. We demonstrate this technique using Monte Carlo, observational, and experimental data from a deterrence game, a signaling game, and an ultimatum game. Finally, we provide an R package (polywog) that implements the techniques demonstrated here.
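
To make the idea of a polynomial basis regression with lasso selection concrete, the following is a minimal sketch in plain Python. It is not the polywog API and does not reproduce the paper's simulation step. The data-generating function, the sample size, the penalty value `lam`, and the helper names `poly_basis`, `soft_threshold`, and `lasso` are all assumptions made for illustration. The sketch expands two regressors into every monomial up to third order, then fits a lasso by coordinate descent so that unneeded terms are shrunk to exactly zero.

```python
import itertools
import random


def poly_basis(row, degree):
    """Return all monomials of the inputs up to `degree` (no constant term)."""
    terms = []
    for d in range(1, degree + 1):
        for combo in itertools.combinations_with_replacement(range(len(row)), d):
            value = 1.0
            for j in combo:
                value *= row[j]
            terms.append(value)
    return terms


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso(X, y, lam, n_iter=200):
    """Coordinate-descent lasso on standardized columns.

    Minimizes (1/2n) * ||y - mean(y) - Z b||^2 + lam * ||b||_1, where Z is X
    with each column centered and scaled to unit variance. The returned
    coefficients are on that standardized scale.
    """
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    sds = [(sum((row[j] - means[j]) ** 2 for row in X) / n) ** 0.5 for j in range(p)]
    Z = [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    resid = [yi - y_mean for yi in y]  # residuals at b = 0
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the partial residual (column j added back).
            rho = sum(Z[i][j] * resid[i] for i in range(n)) / n + beta[j]
            new = soft_threshold(rho, lam)
            if new != beta[j]:
                delta = new - beta[j]
                for i in range(n):
                    resid[i] -= delta * Z[i][j]
                beta[j] = new
    return beta


# Hypothetical data: the outcome depends only on the interaction x1 * x2.
random.seed(0)
n = 200
X_raw = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(n)]
y = [2.0 * x1 * x2 + random.gauss(0, 0.1) for x1, x2 in X_raw]

X_poly = [poly_basis(row, 3) for row in X_raw]  # 9 candidate terms
beta = lasso(X_poly, y, lam=0.05)               # lam is an arbitrary choice
n_selected = sum(1 for b in beta if b != 0.0)
print("terms kept by the lasso:", n_selected, "of", len(beta))
```

In practice the penalty would be chosen by cross-validation rather than fixed in advance. The third-order expansion here has only nine terms, but the count grows quickly with more regressors, which is why a selection step of this kind matters.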