ECPR

Introduction to Statistics for Political and Social Scientists

Course Dates and Times

Monday 29 February to Friday 4 March 2016
Generally classes are either 09:00-12:30 or 14:00-17:30
15 hours over 5 days

Florian Weiler

florian.weiler@rug.nl

Rijksuniversiteit Groningen

This course introduces students to the basic ideas of descriptive and inferential statistics. The first two days of the course cover the necessary basics: variables, randomisation, centrality and variability, probability distributions, point and interval estimates, etc. On the third day hypothesis testing will be covered, and in the last two days we will discuss correlation, simple and multiple linear regression models. Finally, we will look at model assumptions and violations, and how regression models can be improved. The course and the lab session will be taught in R.


Instructor Bio

Florian Weiler is a senior researcher at the University of Basel, where he teaches statistics and content courses. He earned his doctoral degree at ETH Zurich.

Before joining the University of Basel, he worked as a lecturer in Quantitative Politics at the University of Kent, and as a postdoctoral researcher at the University of Bamberg. 

Florian's main research interests are in the fields of environmental politics and interest group research.

How can we detect voting irregularities? What are the conditions for the onset (or cessation) of civil war? How do democracies choose electoral systems? In what sense (if any) does democracy (or trade) facilitate international cooperation? The field of quantitative political methodology addresses these questions and many others by using and developing statistical methods that combine data analysis with political science theory. This course provides an introduction to the tools used in basic quantitative political methodology. The first half of the course covers introductory (univariate) statistics, while the second half of the course focuses on regression models.

On the first day, after clarifying basic terminology, we start by defining what variables are, and then discuss randomisation and various sampling techniques (e.g. simple random sampling, cluster sampling, stratified sampling). We then turn to descriptive statistics: how tables and graphs can be used to summarise (and better understand) the data, and how to describe the centre and variability of a variable. I will also present bivariate descriptive statistics (both in table form and as graphs).
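As a rough illustration of the day 1 material, the following base R sketch draws a simple random sample and computes a few measures of centre and variability. All data and variable names (`age`, `gender`, `voted`) are made up for this example and are not taken from the course materials.

```r
# Hypothetical data, invented purely for illustration
age    <- c(23, 35, 41, 29, 52, 35, 47, 31, 38, 35)
gender <- c("f", "m", "f", "f", "m", "m", "f", "m", "f", "m")
voted  <- c("yes", "yes", "no", "yes", "no", "yes", "yes", "no", "yes", "yes")

# Simple random sample: draw 5 of 1000 hypothetical unit IDs without replacement
set.seed(123)                      # makes the draw reproducible
srs <- sample(1:1000, size = 5)

# Centre of a variable
age_mean   <- mean(age)
age_median <- median(age)

# Variability of a variable
age_sd    <- sd(age)               # sample standard deviation (divides by n - 1)
age_range <- range(age)            # minimum and maximum
age_iqr   <- IQR(age)              # interquartile range

# Bivariate description in table form: cross-tabulate two categorical variables
tab <- table(gender, voted)
print(tab)
```

Calling `summary(age)` gives a compact overview of the same variable, and `barplot(tab)` or `hist(age)` would produce the corresponding basic plots.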

The second session starts with a discussion of probability distributions for both discrete and continuous variables, with a particular focus on the normal distribution. Then we will talk about sampling distributions, and how to use sample data to estimate population parameters by means of point and interval estimates. The choice of the sample size will also be covered.
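To make the idea of point and interval estimation concrete, here is a minimal base R sketch that computes a 95% confidence interval for a mean by hand and cross-checks it against `t.test()`. The data are simulated, and the population values (mean 100, standard deviation 15) are arbitrary assumptions chosen only for this example.

```r
# Simulated sample; the population parameters are arbitrary assumptions
set.seed(42)
x <- rnorm(50, mean = 100, sd = 15)

n    <- length(x)
xbar <- mean(x)                    # point estimate of the population mean
se   <- sd(x) / sqrt(n)            # estimated standard error of the mean

# 95% confidence interval based on the t distribution with n - 1 df
t_crit <- qt(0.975, df = n - 1)
ci <- c(lower = xbar - t_crit * se, upper = xbar + t_crit * se)
print(ci)

# Cross-check: t.test() reports the same interval
ci_builtin <- as.numeric(t.test(x)$conf.int)
print(ci_builtin)
```

Replacing `qt(0.975, df = n - 1)` with `qnorm(0.975)` gives the large-sample (normal) version of the interval, which is slightly narrower.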

On Day 3 we will mostly focus on hypothesis testing. The logic of a significance test is described, type I and type II errors are distinguished, and we learn how to employ statistical tests to compare two groups.
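A comparison of two group means might look like the following base R sketch. The two groups are simulated with assumed means of 5.0 and 5.5, and the 5% significance level is a conventional choice rather than something prescribed by the course.

```r
# Two simulated groups; the group means and sizes are assumptions for illustration
set.seed(1)
group_a <- rnorm(30, mean = 5.0, sd = 1)
group_b <- rnorm(30, mean = 5.5, sd = 1)

# Two-sample t-test of H0: both groups have the same population mean.
# By default t.test() uses the Welch version, which does not assume equal variances.
res <- t.test(group_a, group_b)
print(res)

# Decision rule at a conventional significance level
alpha <- 0.05
reject_null <- res$p.value < alpha

# Rejecting a true H0 is a type I error (its probability is alpha);
# failing to reject a false H0 is a type II error.
print(reject_null)
```

For comparing two proportions rather than two means, base R offers `prop.test()`, which is used in much the same way.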

At the beginning of Day 4 we cover associations between categorical variables, how to detect patterns of association, and how to measure association in contingency tables. In the second part of this day’s lecture simple linear regression will be introduced. The following topics will be covered: ordinary least squares, interpreting linear models, model assumptions and violations, and graphical representation.
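The two halves of this session can be sketched in base R as follows. The contingency table counts are invented, and the regression uses `mtcars`, a dataset that ships with R, only because it is available without any download; neither is taken from the course materials.

```r
# Part 1: association in a 2x2 contingency table (counts are made up)
tab <- matrix(c(30, 15,
                20, 35),
              nrow = 2, byrow = TRUE,
              dimnames = list(education = c("low", "high"),
                              voted     = c("no", "yes")))

chi <- chisq.test(tab)   # for 2x2 tables R applies Yates' continuity correction
print(chi$expected)      # counts expected if the two variables were independent
print(chi$p.value)

# Part 2: simple linear regression of fuel efficiency (mpg) on weight (wt)
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))

# The ordinary least squares estimates can also be computed by hand
slope     <- cov(mtcars$wt, mtcars$mpg) / var(mtcars$wt)
intercept <- mean(mtcars$mpg) - slope * mean(mtcars$wt)
print(c(intercept = intercept, slope = slope))
```

A scatter plot with the fitted line can be drawn with `plot(mpg ~ wt, data = mtcars)` followed by `abline(fit)`.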

In the last session we will discuss multiple regression analysis, the concept of control variables, and the difference between correlation and causation. Then we will try to improve the model by revisiting the model assumptions and discussing how to detect (and correct) potential violations. Interaction effects will also be covered in this session.
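As a sketch of what a multiple regression with a control variable and an interaction effect looks like in base R, the example below again uses the built-in `mtcars` data. The choice of variables is arbitrary and serves only to show the syntax.

```r
# Additive model: effect of weight on mpg, controlling for horsepower
fit_add <- lm(mpg ~ wt + hp, data = mtcars)

# Interaction model: wt * hp expands to wt + hp + wt:hp
fit_int <- lm(mpg ~ wt * hp, data = mtcars)
print(summary(fit_int))

# F-test comparing the two nested models: does the interaction improve the fit?
cmp <- anova(fit_add, fit_int)
print(cmp)

# Raw material for checking model assumptions:
# residuals should show no systematic pattern against the fitted values
res_int <- residuals(fit_int)
fit_val <- fitted(fit_int)
print(head(cbind(fitted = fit_val, residual = res_int)))
```

Calling `plot(fit_int)` produces R's standard diagnostic plots (residuals against fitted values, a normal Q-Q plot, and others), which are a common starting point for detecting violations of the model assumptions.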

The course will be taught in R, a very powerful and versatile computing environment. R has the major advantage of being free, open-source software, so students can bring and work on their own computers if they wish. In the first lab session, some of the basics of R will be introduced, yet students are encouraged to participate in the course “Introduction to R” at the ECPR Winter School if they are not yet familiar with the programme.

The course is designed for beginners and no prior statistical (or computing) knowledge is required. However, the course covers many topics in a relatively short period of time. This requires intensive work by students, who are expected to participate in the lab sessions and in addition prepare for the sessions by completing the assigned readings.

Important: This course is an introduction to statistics, not an introduction to R, as some students assumed last year. R is only the tool used in the lab sessions to implement the techniques covered in the lectures. During the lectures R will therefore be used only occasionally; the concepts will instead be discussed theoretically. During the lab sessions, participants then have the chance to apply the techniques directly and, at the same time, to get to know the R programming language better. Still, participants are encouraged to attend the preparatory course “Introduction to R” offered by the Winter School in the week before this course begins.

No statistical knowledge required. Some basic knowledge of R (or any other statistical software package) would be desirable. Therefore, students are advised to take one of the software courses offered by the ECPR Winter School.

| Day | Topic | Details |
|---|---|---|
| 1 | Sampling; Descriptive Statistics | Variables; Randomisation; Sampling; Tables and graphs to describe data; Centre and variability of data; Bivariate descriptive statistics; Sample statistics and population parameters; Lab: Data management, descriptive statistics, basic plots |
| 2 | Probability Distributions; Statistical Inference | Probability distributions; The normal distribution; Point and interval estimation; Confidence intervals; Lab: Probability distributions, sampling distributions, statistical inference |
| 3 | Hypothesis Testing; Comparison of Two Groups | Significance tests; Types of errors; Comparing proportions and means; Lab: Hypothesis testing, group comparison |
| 4 | Association between Categorical Variables; Linear Regression | Contingency tables; Chi-squared test of independence; Detecting and measuring association; Correlation; Least squares; The linear regression model; Assumptions and violations; Lab: Contingency tables, simple linear regression |
| 5 | Multivariate Relationships; Multiple Regression | Association and causality; Control variables; The multiple regression model; Interaction effects; Improving the model; Short overview of other models; Lab: Multivariate regression, interaction effects |

| Day | Readings |
|---|---|
| 1 | Agresti & Finlay, Ch. 1, 2, 3; Dalgaard, Ch. 1, 3 |
| 2 | Agresti & Finlay, Ch. 4, 5; Dalgaard, Ch. 2 |
| 3 | Agresti & Finlay, Ch. 6, 7; Dalgaard, Ch. 4 |
| 4 | Agresti & Finlay, Ch. 8, 9; Dalgaard, Ch. 5 |
| 5 | Agresti & Finlay, Ch. 10, 11; Dalgaard, Ch. 9, 10 |

Software Requirements

If students want to use their own computer, they should download R version 3.0.0 or higher from http://www.r-project.org/. In addition, I recommend downloading RStudio from http://www.rstudio.com/.
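One way to check an existing installation against this requirement is the short base R sketch below. The package named in the commented-out line is only an example (it accompanies the Fox and Weisberg book listed under further readings); the course description does not say which packages will actually be used.

```r
# Print the installed R version
print(R.version.string)

# Compare it with the minimum version required for the course
version_ok <- getRversion() >= "3.0.0"
if (version_ok) {
  message("R version is sufficient for the course.")
} else {
  message("Please update R from http://www.r-project.org/")
}

# Packages are installed from CRAN with install.packages(), which needs an
# internet connection. "car" is an example only, not a stated course requirement.
# install.packages("car")
```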

Hardware Requirements

Any fairly modern computer able to run R should be good enough for this course. Students will need an internet connection if they use their own computer, as we will need to download R packages during the course.

Literature


For the course:

Agresti, Alan and Barbara Finlay (2008): Statistical Methods for the Social Sciences (4th Edition), Upper Saddle River: Prentice Hall.

Dalgaard, Peter (2002): Introductory Statistics with R, New York: Springer.

Further readings:

Fox, John (2008): Applied Regression Analysis and Generalized Linear Models, London: Sage.

Fox, John and Sanford Weisberg (2012): An R Companion to Applied Regression (2nd Edition), London: Sage.

Gujarati, Damodar and Dawn C. Porter (2009): Basic Econometrics (5th Edition), New York: McGraw Hill.

Wooldridge, Jeffrey (2013): Introductory Econometrics: A Modern Approach (5th Edition), Mason: South-Western.

Recommended Courses to Cover Before this One

  • Winter School 2016: Introduction to R