Regression Refresher (before you take a more advanced stats course)

Levente Littvay

Central European University

Levente Littvay researches survey and quantitative methodology, twin and family studies and the psychology of radicalism and populism.

He is an award-winning teacher of graduate courses in applied statistics with a topical emphasis in electoral politics, voting behaviour, political psychology and American politics.

He is one of the Academic Convenors of ECPR’s Methods School, and is Associate Editor of Twin Research and Human Genetics and head of the survey team at Team Populism.


Course Dates and Times

Friday 2 March
13:00–15:00 and 15:30–17:00

Saturday 3 March
11:00–12:00, 13:00–14:30 and 15:00–16:30

Prerequisite Knowledge

This course is a refresher about regression. This means that I assume you know regression. You may just need more context, more in-depth knowledge, maybe just a review before you take a more advanced class.

If you don't know regression, this course is NOT for you.

I expect you to know regression and the basics of inferential statistics, including:

  • hypothesis testing
  • central limit theorem
  • t-test
  • correlations.

If you don’t, I recommend instead the week-long regression class, or, if you have no statistical background, the intro class.

If, however, you are ready to venture into a more advanced statistical topic, such as:

  • panel data
  • time series
  • structural equations
  • multilevel modelling

this is the course for you.

Short Outline

So, you know regression. You run regression models in your work regularly, but you have not stopped to think much about them lately. Once upon a time you may have learned (or even cared) about the assumptions of the models. You may have thought often about what could go wrong.

Then you got into a workflow that simply runs the models, and now you worry that

  1. you may be making mistakes and/or
  2. you need more advanced techniques but maybe do not have the necessary foundations to venture into those topics.

If that's the case, this course was designed for you.

If you have not thought about linearity, collinearity or heteroscedasticity beyond realising that you should look into these issues (but never do), or if you have never heard of them and yet regularly run regressions as if you knew what you were doing, please take this class. It will be good for you.

The secondary goal of this class is to review the foundations you may not have thought much about lately. This could be crucial when taking more advanced courses, where the instructor will probably assume that you know the problems associated with regression models and want to solve them. Typical examples are:

  • autocorrelation in multilevel regression, panel data and time series
  • measurement error in structural equations
  • limited dependent variables in logistic regression, etc.

Long Course Outline

We cover the conceptual and mathematical foundations of bivariate regression, going briefly into the associated hypothesis test. Then we turn to regression’s multivariate form, focusing mainly on how ordinary least squares works.
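To make the bivariate case concrete, here is a minimal sketch of ordinary least squares with one predictor, including the t statistic for the slope. It is written in plain Python purely for illustration (the class examples are in R, and any software would do); the function name `bivariate_ols` and the data are made up for this sketch.

```python
from math import sqrt


def bivariate_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sum of squares of x and sum of cross-products of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance uses n - 2 degrees of freedom (two estimated parameters).
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sigma2 = sum(e ** 2 for e in residuals) / (n - 2)

    # Standard error and t statistic for the null hypothesis slope = 0.
    se_slope = sqrt(sigma2 / sxx)
    return {
        "intercept": intercept,
        "slope": slope,
        "se_slope": se_slope,
        "t_slope": slope / se_slope,
        "residuals": residuals,
    }


# Made-up illustrative data, not taken from the course materials.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = bivariate_ols(x, y)
print(fit["slope"], fit["intercept"], fit["t_slope"])
```

The t statistic would be compared against a t distribution with n - 2 degrees of freedom, which is the hypothesis test referred to above.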

We review the assumptions of regression models, examining why these assumptions exist and what happens if you violate them.

I will offer practical tips on how to deal with assumption violations. We start by discussing model specification, from the perspective of omitted variable bias and the inclusion of unnecessary variables.
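Omitted variable bias can be illustrated with a small sketch, again in plain Python with made-up numbers. It assumes a noise-free outcome that depends on two correlated predictors, then regresses the outcome on only one of them. The helper `ols_slope` and all values are hypothetical.

```python
def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Made-up data: x2 is correlated with x1, and y depends on both (no noise).
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
beta1, beta2 = 2.0, 3.0
y = [1.0 + beta1 * a + beta2 * b for a, b in zip(x1, x2)]

short_slope = ols_slope(x1, y)  # regression of y on x1 that omits x2
delta = ols_slope(x1, x2)       # how strongly x2 moves with x1
bias = short_slope - beta1      # equals beta2 * delta

print(short_slope, beta1 + beta2 * delta, bias)
```

The slope on x1 in the short regression absorbs the effect of the omitted x2 in proportion to how strongly the two predictors are related. If x2 were unrelated to x1 (delta equal to zero), leaving it out would not bias the slope.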

We will discuss measurement scales and measurement errors of the variables in the model. We cover homoscedasticity, mean independence, autocorrelation, linearity of relationships and collinearity.
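As one example of a collinearity diagnostic, the sketch below computes the variance inflation factor (VIF) for a model with exactly two predictors, where it reduces to 1 / (1 - r²) with r the correlation between them. This is an illustrative Python sketch with made-up data, not necessarily the diagnostic used in class; the function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF of either predictor in a regression with exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Made-up predictors: one moderately and one almost perfectly related to x1.
x1 = [1, 2, 3, 4, 5, 6]
x2_mild = [2, 1, 4, 3, 6, 5]
x2_severe = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]

vif_mild = vif_two_predictors(x1, x2_mild)
vif_severe = vif_two_predictors(x1, x2_severe)
print(vif_mild, vif_severe)
```

The VIF is the factor by which the sampling variance of a coefficient is inflated relative to the case of uncorrelated predictors, so the nearly collinear pair yields far less precise estimates.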

I will also outline good practices that, though strictly speaking not assumptions, help prevent assumption violations, such as the treatment of outliers and the diagnosis of all variable distributions.

We cover these topics both in theory and through examples (presented in R, though no knowledge of R is assumed; it really could be done in any software).

Day                 Topic             Details
Friday afternoon    Review of Basics  We review the mathematical foundations of OLS regression.
Saturday morning    Assumptions       We go through the assumptions of regression models.
Saturday afternoon  Examples          We look at an applied example in great detail.

Day               Readings
Friday afternoon  Gravetter and Wallnau, Statistics for the behavioral sciences, Ch. 16
                  Lewis-Beck and Lewis-Beck, Applied regression: An introduction (2015)
Saturday morning  Fox, Regression diagnostics: An introduction (1991)

Software Requirements

I will present a few examples in a recent version of R. No need to follow along in R in the class.

If R is what you would like to learn, take one of the following three short courses:

Automated Web Data Collection with R

Introduction to R (entry level)

Introduction to R (for participants with some prior knowledge in command-line programming)

Hardware Requirements

No computer needed in class. I will show some software examples but the purpose is not to follow the examples together.


Gravetter, Frederick J., and Larry B. Wallnau. Statistics for the behavioral sciences. Cengage Learning, 2016. Chapter 16.

Lewis-Beck, Colin, and Michael Lewis-Beck. Applied regression: An introduction. Vol. 22. Sage Publications, 2015.

Fox, John. Regression diagnostics: An introduction. Vol. 79. Sage, 1991.

Reference text / additional reading

Fox, John. Applied regression analysis and generalized linear models. Sage Publications, 2015.

Recommended Courses to Cover Before this One

Summer and Winter School

  • Introduction to Statistics
  • Introduction to Regression (not if you have taken it recently)

Recommended Courses to Cover After this One

Summer and Winter School

  • Logistic Regression / General Linear Model
  • Panel Data Analysis
  • Time Series Analysis
  • Multilevel Modelling
  • Structural Equation Modelling

Additional Information


This course description may be subject to later changes (e.g. to take into account new developments in the field, participant demands, group size, etc.). Registered participants will be informed at the time of any change.

By registering for this course, you confirm that you possess the knowledge required to follow it. The instructor will not teach these prerequisite items. If in doubt, please contact us before registering.