ECPR

Interpreting Binary Logistic Regression Models

Markus Wagner
markus.wagner@univie.ac.at

University of Vienna

Course Dates and Times

Monday 6 to Friday 10 March 2017
Generally classes are either 09:00-12:30 or 14:00-17:30
15 hours over 5 days

Prerequisite Knowledge

This course requires an intermediate level of prior knowledge. Students should have good knowledge of basic statistical techniques up to and including multiple linear (OLS) regression models. Some knowledge of binary logistic regression models is useful (though not expected) as this course will focus on interpretation and diagnostics for such models. Participants should have basic knowledge of how to use Stata, though detailed knowledge is not expected. For students without prior knowledge of Stata, the pre-sessional course on Stata is recommended.


Short Outline

This course will provide participants with the detailed understanding and advanced skills needed to interpret the results of logistic regression models with binary outcome variables. Logistic regression models are very common in the social sciences, but their results are interpreted differently from those of OLS regression models. This course will therefore help participants to interpret the results of binary logistic regression models using log odds, odds ratios and, above all, predicted probabilities. A particular focus will be how to present results in tables and graphs. This will help participants make the most of their results, both in academic papers and when making their research accessible to lay readers. In addition, the techniques learnt in this course can be transferred to related models such as ordinal and multinomial logistic regression. The software package Stata will be used, as it provides a user-friendly way of interpreting logistic regression results.


Long Course Outline

This course will provide participants with the detailed understanding and advanced skills needed to interpret the results of logistic regression models with binary outcome variables.

Many courses on statistical methods only briefly cover how to interpret the results of logistic regression models. However, the results of such models are more difficult to interpret and to present than those for OLS regression models. In particular, interpreting logistic regression results is not limited to looking at coefficients and standard errors: we need to go beyond the simple regression output to gain a full understanding of what our results mean. This course will provide researchers with the tools to make the most of their regression results and to present their results in the most effective way possible. The focus throughout this course will be on practical analysis rather than on statistical theory.

By the end of this course, participants will be able to:

  • interpret the results of binary logistic regression models using log odds, odds ratios and predicted probabilities,
  • present these results as tables and graphs in ways suitable for general and specialist audiences,
  • interpret interaction effects in the appropriate ways,
  • use simulations to create measures of uncertainty for the predicted effects,
  • distinguish different measures of model fit and include these in presentations of results,
  • run straightforward diagnostic tests of their model,
  • use Stata to run and understand binary logistic regression models.

The format of classes will be informal. Lectures will be short, and the focus of classes will be computer exercises and classroom discussions of results and homework. Lectures will take up at most a third of the overall classroom time as the focus in this class is on practical analysis. Students are encouraged to bring along their own data and research questions.

There will be a homework assignment after each class that will help participants to gain further understanding and experience in interpreting binary logistic regression models. We will begin each class with a short discussion of the homework.

On Day 1, we will begin with a review of binary logistic regression models. What is the statistical rationale for these models, and how do they differ from OLS models? This session will also provide a review of basic commands for Stata, with a particular focus on running binary logistic regression models. Then, we will consider the two easiest ways of interpreting the results of binary logistic regressions. We will discuss how to interpret regression coefficients and then move on to using the odds ratios interpretation of the results.
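
As a rough illustration of the two interpretations mentioned above, the sketch below converts log-odds coefficients into odds ratios and percentage changes in the odds. It is not course material: the coefficient values and variable names are invented, and Python is used only to keep the example self-contained (the course itself works in Stata).

```python
import math

# Hypothetical logit coefficients on the log-odds scale (invented for illustration).
coefs = {"age": 0.03, "female": -0.25, "education": 0.40}


def odds_ratio(b):
    """Exponentiate a log-odds coefficient to obtain an odds ratio."""
    return math.exp(b)


def percent_change_in_odds(b):
    """Percentage change in the odds for a one-unit increase in the variable."""
    return 100.0 * (math.exp(b) - 1.0)


for name, b in coefs.items():
    print(
        f"{name}: log odds {b:+.2f}, "
        f"odds ratio {odds_ratio(b):.3f}, "
        f"change in odds {percent_change_in_odds(b):+.1f}%"
    )
```

A coefficient of zero corresponds to an odds ratio of one (no change in the odds); negative coefficients give odds ratios below one.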

On Day 2, we will begin by reviewing the homework on log odds and odds ratios. Then, we will discuss advantages and disadvantages of interpretation using odds ratios. After that, we will move on to interpreting binary logistic regression results using predicted probabilities. We will learn how to calculate predicted probabilities ‘by hand’. We will then cover the various ways in which Stata can help in the calculation of predicted probabilities.
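
A minimal sketch of what calculating a predicted probability 'by hand' involves, assuming a model with an intercept and two predictors. The coefficients and the variables `age` and `educ` are hypothetical, and Python stands in for the Stata commands used in class.

```python
import math

# Hypothetical coefficients (invented for illustration).
INTERCEPT = -2.0
B_AGE = 0.03
B_EDUC = 0.40


def inv_logit(xb):
    """Map a linear predictor (log odds) to a probability."""
    return 1.0 / (1.0 + math.exp(-xb))


def predicted_probability(age, educ):
    """Predicted probability for one covariate profile."""
    xb = INTERCEPT + B_AGE * age + B_EDUC * educ
    return inv_logit(xb)


# Same age, two education levels: the change in probability is the
# quantity usually reported, rather than the raw coefficient.
p_low = predicted_probability(age=40, educ=1)
p_high = predicted_probability(age=40, educ=3)
print(f"P(y=1 | educ=1) = {p_low:.3f}")
print(f"P(y=1 | educ=3) = {p_high:.3f}")
print(f"difference      = {p_high - p_low:.3f}")
```

Because the inverse logit is non-linear, the same change in `educ` produces a different change in probability at other values of `age`, which is why the values chosen for the remaining variables matter.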

On Day 3, we will learn how to present regression results and predicted probabilities graphically. Advantages and disadvantages of presenting results using predicted probabilities will be discussed. We will then consider the importance of uncertainty and how this can be included in the presentation of predicted probabilities. We will learn how to use Stata and the Clarify package to run simulations. We will also run these simulations without Clarify in order to understand the intuition and processes involved. Finally, we will consider the level at which we should set other variables when calculating predicted probabilities.
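
The intuition behind simulation-based uncertainty, in the spirit of King et al. (2000), can be sketched as follows: draw many coefficient vectors from a multivariate normal distribution centred on the estimates, compute the predicted probability for each draw, and summarise the draws with percentiles. This is an illustration under stated assumptions, not the Clarify implementation: the estimates, the variance-covariance matrix and the covariate profile are all invented, and Python replaces Stata.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def inv_logit(xb):
    """Map a linear predictor (log odds) to a probability."""
    return 1.0 / (1.0 + math.exp(-xb))


def simulate_probability(beta, vcov, x, n_draws=1000, seed=12345):
    """Return (mean, lower, upper) of simulated predicted probabilities.

    lower and upper are rough 2.5th and 97.5th percentiles of the draws.
    """
    rng = random.Random(seed)
    chol = cholesky(vcov)
    sims = []
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in beta]
        # beta + L z is one draw from N(beta, vcov).
        draw = [
            b + sum(chol[i][k] * z[k] for k in range(i + 1))
            for i, b in enumerate(beta)
        ]
        sims.append(inv_logit(sum(bi * xi for bi, xi in zip(draw, x))))
    sims.sort()
    lower = sims[int(0.025 * n_draws)]
    upper = sims[int(0.975 * n_draws) - 1]
    return sum(sims) / n_draws, lower, upper


# Hypothetical estimates: intercept, age, education.
beta = [-2.0, 0.03, 0.40]
vcov = [
    [0.2500, -0.0030, -0.0200],
    [-0.0030, 0.0001, 0.0000],
    [-0.0200, 0.0000, 0.0100],
]
profile = [1.0, 40.0, 2.0]  # constant, age = 40, education = 2

mean, lower, upper = simulate_probability(beta, vcov, profile)
print(f"predicted probability {mean:.3f}, 95% interval [{lower:.3f}, {upper:.3f}]")
```

The fixed seed only makes the sketch reproducible; in practice the number of draws and the covariate profile are choices the analyst has to justify.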

On Day 4, we will concentrate on the interpretation of interaction effects in binary logistic regression models. Here, results need to be presented particularly clearly and carefully if readers are to understand them.
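
One way to see why interactions need care in these models is to compute the change in predicted probability for one variable at several values of the other. The sketch below assumes a model with a product term; the coefficients and the variables `x` and `z` are hypothetical, and Python is used purely for illustration.

```python
import math

# Hypothetical coefficients: intercept, x, z and the product term x*z.
B0, B_X, B_Z, B_XZ = -1.0, 0.8, 0.5, -0.6


def inv_logit(xb):
    """Map a linear predictor (log odds) to a probability."""
    return 1.0 / (1.0 + math.exp(-xb))


def prob(x, z):
    """Predicted probability for given values of x and z."""
    return inv_logit(B0 + B_X * x + B_Z * z + B_XZ * x * z)


def first_difference(z, x_from=0.0, x_to=1.0):
    """Change in predicted probability when x moves from x_from to x_to, holding z fixed."""
    return prob(x_to, z) - prob(x_from, z)


for z in (0, 1, 2):
    print(f"z = {z}: effect of x on P(y=1) = {first_difference(z):+.3f}")
```

With these invented values the effect of `x` on the probability shrinks and then changes sign as `z` rises, something the sign and significance of the product term alone would not show.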

Day 5 will consider other topics related to the interpretation of binary logistic regression models. The focus will lie on diagnostic techniques and on measures of model fit. We will discuss how best to present information on the variation explained by the model in the absence of a clear equivalent of R² values.
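
As a small illustration of two commonly reported fit measures, the sketch below computes McFadden's pseudo-R² and the percentage of cases correctly classified from a set of outcomes and predicted probabilities. The data are invented, the predicted probabilities are assumed to come from an already fitted model, and Python stands in for the Stata commands used in class.

```python
import math


def log_likelihood(y, p):
    """Bernoulli log-likelihood of outcomes y given predicted probabilities p."""
    return sum(
        math.log(pi) if yi == 1 else math.log(1.0 - pi)
        for yi, pi in zip(y, p)
    )


def mcfadden_r2(y, p):
    """McFadden's pseudo-R²: 1 - LL(model) / LL(intercept-only model)."""
    p_bar = sum(y) / len(y)
    ll_null = log_likelihood(y, [p_bar] * len(y))
    return 1.0 - log_likelihood(y, p) / ll_null


def percent_correct(y, p, cutoff=0.5):
    """Share of cases (in percent) classified correctly at the given cutoff."""
    hits = sum(1 for yi, pi in zip(y, p) if (pi >= cutoff) == (yi == 1))
    return 100.0 * hits / len(y)


# Invented outcomes and predicted probabilities (assumed to come from a fitted model).
y = [1, 0, 1, 1, 0, 0, 1, 0]
p = [0.80, 0.30, 0.60, 0.70, 0.40, 0.20, 0.55, 0.45]

print(f"McFadden pseudo-R2: {mcfadden_r2(y, p):.3f}")
print(f"Correctly classified: {percent_correct(y, p):.1f}%")
```

The two measures answer different questions, so they can disagree: here every case is classified correctly at the 0.5 cutoff even though the pseudo-R² is well below one.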

This course is at an advanced level, so some familiarity with OLS regression as well as with the basics of logistic regression will be assumed. As a mainly hands-on, practical course, knowledge of algebra or statistical theory is not required. Potential participants should contact the instructor if they are unsure whether they have the required knowledge and if they would like to know how to prepare for the class.

The main datasets we use will be high-quality election surveys. Participants are also encouraged to bring along their own data and research questions. Time permitting, these can be used in class as examples. Personal appointments for discussing research results will also be possible.

While this course does not cover logistic regression models with categorical or ordinal outcome variables, the techniques introduced in the class can easily be applied to such models as well. Where possible, reference to these models can be made in class. However, students who wish to focus on categorical data as their dependent variable may wish to choose a different class.

Day | Topic | Details
1 | Review; log odds and odds ratios | Rationale for logistic regression models; differences between OLS and logistic regression models; running binary logistic regression models in Stata; interpreting the effects of explanatory variables as the effects on the log odds and on odds ratios; useful Stata commands for understanding log odds and odds ratios; presenting and interpreting odds ratios in presentations and papers (1.5 hrs lecture, 1.5 hrs lab)
2 | Predicted probabilities (1) | Advantages and disadvantages of odds ratio interpretation of logistic regression models; advantages of using predicted probabilities; basic Stata commands for predicted probabilities (1 hr lecture, 2 hrs lab)
3 | Predicted probabilities (2) | Graphs of effects (coefficients and predicted probabilities); advantages of using simulations to assess effect uncertainty; running simulations using Stata (with Clarify and without); calculating predicted probabilities: observed case versus average value approaches (1 hr lecture, 2 hrs lab)
4 | Interaction effects | Evaluating and presenting the results of interaction effects, including quadratic effects; differences in interpreting interaction effects compared to OLS models (1 hr lecture, 2 hrs lab)
5 | Diagnostics and model fit | Simple diagnostic techniques for binary logistic regression models; appropriate measures of model fit (1.5 hrs lecture, 1.5 hrs lab)

Day | Readings
1 | Long (1997), ch. 1-3. Orme and Combs-Orme (2009), ch. 1-2. Long and Freese (2006), section 4.7.
2 | Orme and Combs-Orme (2009), ch. 2. Long and Freese (2006), sections 3.6, 4.6 and 4.7.
3 | Mood (2010). Hanmer and Ozan Kalkan (2012). King et al. (2000).
4 | Long and Freese (2006), sections 9.2 to 9.4. Brambor et al. (2006). Berry et al. (2010). Berry et al. (2012). Tsai and Gill (2013).
5 | Long and Freese (2006), sections 4.4 and 4.5. Menard (2001), chs. 2 and 4. Esarey and Pierce (2012).

Software Requirements

Stata 13 or above

Hardware Requirements

None

Literature


Introductory

Orme, John G. and Terri Combs-Orme (2009) Multiple Regression with Discrete Dependent Variables, Oxford University Press: Oxford.

Long, J. Scott (1997) Regression Models for Categorical and Limited Dependent Variables, Sage: Thousand Oaks.

Long, J. Scott and Jeremy Freese (2006) Regression Models for Categorical Dependent Variables Using Stata, 2nd edition, Stata Press: College Station.

Menard, Scott (2001) Applied Logistic Regression Analysis, 2nd edition, Sage: London.

Pampel, Fred C. (2000) Logistic Regression: A Primer, Sage: London.

General Issues

Hanmer, M. J. and K. Ozan Kalkan (2012) Behind the Curve: Clarifying the Best Approach to Calculating Predicted Probabilities and Marginal Effects from Limited Dependent Variable Models, American Journal of Political Science 57 (1): 263-277.

Hellevik, O. (2009) Linear versus logistic regression when the dependent variable is a dichotomy, Quality & Quantity 43 (1): 59-74.

Hoetker, G. (2007) The use of logit and probit models in strategic management research: Critical issues, Strategic Management Journal 28 (4): 331-343.

Mood, Carina (2010) Logistic regression: Why we cannot do what we think we can do, and what we can do about it, European Sociological Review 26 (1): 67-82.

Simulations

King, Gary, Michael Tomz, and Jason Wittenberg (2000) Making the Most of Statistical Analyses: Improving Interpretation and Presentation, American Journal of Political Science 44: 341-355.

Zelner, B. A. (2009) Using simulation to interpret results from logit, probit, and other nonlinear models, Strategic Management Journal 30 (12): 1335-1348.

Interaction Effects

Berry, William D., Jacqueline H.R. DeMeritt and Justin Esarey (2010) Testing for interaction in binary logit and probit models: Is a product term essential? American Journal of Political Science 54 (1): 248-266.

Berry, William D., Matt Golder and Daniel Milton (2012) Improving tests of theories positing interaction, Journal of Politics 74 (3): 653-671.

Brambor, Thomas, William Clark and Matt Golder (2006) Understanding interaction models: Improving empirical analyses, Political Analysis 14 (1): 63-82.

Tsai, T.-h. and J. Gill (2013) Interactions in Generalized Linear Models: Theoretical Issues and an Application to Personal Vote-Earning Attributes, Social Sciences 2 (2): 91-113.

Goodness-of-fit Tests

Esarey, Justin and Andrew Pierce (2012) Assessing Fit Quality and Testing for Misspecification in Binary-Dependent Variable Models, Political Analysis 20 (4): 480-500.

Recommended Courses to Cover Before this One

Summer School: Multivariate Statistical Analysis and Comparative Crossnational Surveys Data; Multiple Regression Analysis: Estimation, Diagnostics and Modelling

Recommended Courses to Cover After this One

Winter School: Advanced Discrete Choice Modelling


Additional Information

Disclaimer

This course description may be subject to subsequent adaptations (e.g. taking into account new developments in the field, participant demands, group size, etc). Registered participants will be informed in due time.

Note from the Academic Conveners

By registering for this course, you confirm that you possess the knowledge required to follow it. The instructor will not teach these prerequisite items. If in doubt, contact the instructor before registering.