ECPR


Multivariate Statistical Techniques for Comparing Countries

Course Dates and Times

Monday 29 July – Friday 2 August and Monday 5 – Friday 9 August

09:00–10:30 and 11:00–12:30

Bruno Cautrès

bruno.cautres@gmail.com

Sciences Po Paris

“I took this course 11 years ago and it changed my career. Excellent instructor, very useful course; I strongly recommend it!”
— Giulia Sandri, Maître de Conférences/Associate Professor in Political Science
European School of Political and Social Sciences (ESPOL)
Université Catholique de Lille
 

This course offers an overview and practical experience of the major theoretical and practical issues faced during the statistical analysis of cross-national comparative surveys.

It teaches the fundamentals of statistical techniques for analysing such surveys (without going into complex mathematics) and the practical problems encountered when using them.

It presents multivariate methods in their simplest form, along with their use in cross-national comparative analysis.

We cover the most commonly used techniques such as linear regression analysis (and ANCOVA), logistic regression, loglinear models, factor analysis and correspondence analysis.

Typically, students attending the course want to use cross-national datasets (ESS, EVS, Eurobarometers, ISSP, macro-comparative databases) for their research. Their key problem may be the homogeneity/heterogeneity of the statistical relationship between variables across their country cases.

We can investigate the ‘black box’ explanation of country effects through multivariate techniques that test this homogeneity/heterogeneity hypothesis.

More generally, the statistical methods used in this course are relevant for ‘multiple groups’ analysis. Here the groups are countries, but they could be any groups, such as those defined by gender, ethnicity or region.

In just two weeks, this course will give you strong foundational multivariate analysis skills.

It’s ideal preparation for a more advanced course in, for instance, multilevel modelling or structural equation modelling.

ECTS credits for this course and, below, tasks for additional credits:

6 credits As above, plus a PowerPoint presentation of an empirical analysis mini-research project during the final session.

8 credits As above, plus submit a 25,000–35,000-character paper no later than four weeks after the Summer School, which extends the mini-research presentation.


Instructor Bio

Bruno Cautrès is attached to CEVIPOF – Centre de recherches politiques de Sciences Po (Paris), at the Fondation Nationale des Sciences Politiques in Paris.

He is a senior CNRS research fellow with interests in voting behaviour, political attitudes and behaviours, comparative survey research and quantitative techniques.

Bruno is involved in a variety of projects, including the European Social Survey, European Values Study, International Social Survey Programme and European Election Studies; he also participates in the development of election studies in France. His current research programme concerns political trust and attitudes to democracy in France.

@BCautres

A growing practice in the social sciences is the comparison of social and political behaviours between social groups. The focus can be on economic, political, cultural or socio-psychological attitudes and behaviours, at an individual or collective level, and the groups of interest can be defined by gender, ethnicity, age or generation.

One particular type of group is at the centre of this course: nations or countries. Indeed, cross-national and cross-cultural research has gained popularity in recent decades, especially since the development of comparative cross-national databases such as Eurobarometer, EVS, ISSP and ESS.

Thanks to methodological developments in comparative survey designs, such as those achieved by the ESS or equivalent surveys, we have taken a significant step forward. A major new challenge for comparative survey analysis, therefore, is to analyse these comparative datasets with statistical techniques and methods that make it possible to test for country effects.

Unfortunately, many users still simply juxtapose their results country by country, and few really deal with the logic of statistical control for country effects. When they introduce country effects in their explanatory models, it is mostly in the form of dummy coding in regression analysis, which only accounts for the heterogeneity of the dependent variable across countries.
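As a sketch of why dummy coding captures only part of a country effect, consider two countries and a single explanatory variable. The notation below (y, x, D and the Greek parameters) is illustrative and is not taken from the course materials.

```latex
\begin{align*}
y_i &= \alpha + \beta x_i + \gamma D_i + \varepsilon_i,
\qquad D_i =
\begin{cases}
1 & \text{if respondent } i \text{ lives in country B} \\
0 & \text{if respondent } i \text{ lives in country A}
\end{cases} \\[4pt]
\text{Country A:}\quad E[y_i \mid x_i] &= \alpha + \beta x_i \\
\text{Country B:}\quad E[y_i \mid x_i] &= (\alpha + \gamma) + \beta x_i
\end{align*}
```

The dummy shifts the intercept by \gamma, so the average level of the dependent variable may differ between countries, but the slope \beta is constrained to be the same in both.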

But how do we test for heterogeneity/homogeneity of the beta parameters, the ones estimating the effects of the explanatory variables on the dependent one?

How can we know that beta parameters estimated by a logistic regression analysis are statistically different across nations?

How can we know that factorial structures diverge between countries or are similar?

These are the kinds of big issues this course will cover, looking at the most-used methods in statistical comparative analysis.
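For the linear regression case, one classical answer to the question about beta parameters is the Chow test, which compares a pooled regression with separate regressions per country. The following is a minimal sketch in plain Python, assuming one explanatory variable and two countries. The function names `ols_rss` and `chow_f` and the toy data are hypothetical, and the course itself works in SPSS and Stata rather than Python.

```python
from statistics import fmean


def ols_rss(x, y):
    """Fit y = a + b*x by least squares; return (a, b, residual sum of squares)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, rss


def chow_f(x1, y1, x2, y2, k=2):
    """Chow F statistic for equal intercept and slope in two groups.

    k is the number of parameters per regression (intercept + slope = 2).
    """
    _, _, rss1 = ols_rss(x1, y1)
    _, _, rss2 = ols_rss(x2, y2)
    _, _, rss_pooled = ols_rss(x1 + x2, y1 + y2)  # one regression on all cases
    n = len(x1) + len(x2)
    rss_separate = rss1 + rss2
    return ((rss_pooled - rss_separate) / k) / (rss_separate / (n - 2 * k))


# Hypothetical toy data: the slope is roughly 1 in country A and 2 in country B.
x_a = [1, 2, 3, 4, 5, 6]
y_a = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
x_b = [1, 2, 3, 4, 5, 6]
y_b = [0.9, 3.1, 4.8, 7.2, 8.9, 11.1]

f_stat = chow_f(x_a, y_a, x_b, y_b)
df2 = len(x_a) + len(x_b) - 4
print(f"Chow F = {f_stat:.2f} on (2, {df2}) degrees of freedom")
```

A large F relative to the F distribution with (k, n - 2k) degrees of freedom suggests that the coefficients differ between the two countries. The same comparison can be obtained by adding a country dummy and its interaction with x to a single pooled regression.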

Each day, we will focus on a specific method, with strong emphasis on two blocks of methods: modelling techniques (regression – linear and logistic, loglinear and latent class models) and data reduction techniques (factor analysis, multiple correspondence analysis). I will emphasise complementarities between these types of methods.

Modelling techniques invite the user to apply a model to their data structures: an algebraic expression such as the classical linear regression model. This family of methods attaches great importance to hypothesis testing, in particular to the significance of models and their parameters.

But how do you cope with country effects in that case? Tests may be sensitive to sample sizes, for instance when a researcher compares two countries with fairly different N. When beta parameters have higher values in one country than in another, how do you know that this reflects a real country difference?

On the other hand, data reduction techniques are well suited to reducing the space of variables and/or individuals or cases: with big databases like Eurobarometer, ESS or EVS, which have so many indicators, reducing the data space to a few dimensions is a very useful objective. But what constitutes ‘factorial invariance’ across countries? How can we say whether the factorial space is homogeneous across nations?
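One simple descriptive way to compare factor structures between two countries is Tucker's congruence coefficient, computed on the loadings of the same items in each country. This is a sketch under stated assumptions: the loadings below are invented for illustration, the function name `tucker_congruence` is hypothetical, and the index is offered as one common option rather than as the method taught in the course.

```python
from math import sqrt


def tucker_congruence(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two vectors of factor loadings."""
    numerator = sum(a * b for a, b in zip(loadings_a, loadings_b))
    denominator = sqrt(
        sum(a * a for a in loadings_a) * sum(b * b for b in loadings_b)
    )
    return numerator / denominator


# Hypothetical loadings of the same five survey items on one factor,
# estimated separately in two countries.
country_a = [0.72, 0.65, 0.70, 0.58, 0.61]
country_b = [0.70, 0.60, 0.74, 0.55, 0.66]

phi = tucker_congruence(country_a, country_b)
print(f"Congruence coefficient: {phi:.3f}")
```

Values close to 1 indicate that the items load on the factor in a similar pattern in both countries. The coefficient is only descriptive; formal tests of invariance belong to confirmatory factor analysis.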

Factor analysis and correspondence analysis propose certain solutions to these problems. These multivariate techniques can also be used as preliminary analysis before running regression models, as can cluster analysis, a powerful method to find groups across databases.

Finally, how do you correlate individual patterns of political/social attitudes and behaviours with the national macro-level characteristics of countries? The course will finish with an introduction to multilevel regression models.
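As an illustration of how a multilevel model links the two levels, a random-intercept specification can be written as follows. The notation is a common textbook convention, assumed here for illustration: i indexes individuals, j indexes countries, x is an individual-level variable and z a country-level characteristic.

```latex
\begin{align*}
\text{Level 1 (individual):}\quad y_{ij} &= \beta_{0j} + \beta_{1} x_{ij} + e_{ij} \\
\text{Level 2 (country):}\quad \beta_{0j} &= \gamma_{00} + \gamma_{01} z_{j} + u_{0j} \\
\text{Combined:}\quad y_{ij} &= \gamma_{00} + \gamma_{01} z_{j} + \beta_{1} x_{ij} + u_{0j} + e_{ij}
\end{align*}
```

The country-level residual u_{0j} captures the variation between countries that remains after z_j is taken into account, while e_{ij} captures variation between individuals within a country.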

By the end of this course

Though not exhaustive, this course offers a broad overview of some very important methods to help comparatists deal with country effects. It crams a lot into only two weeks; if you were working only with a textbook, you could spend months discovering so many different techniques.

Acknowledging that no method is perfect, the course will show, by theoretical presentations and empirical illustrations, the advantages and limits of each method type. By the end of the course, you should know which methods are best suited to your particular research perspectives and types of data.

Further study

If you want to learn more advanced methods such as confirmatory factor analysis and structural equation modelling, the Summer School offers other courses specialising in these.

This is an introductory level course, even if the methods taught are multivariate.

If you are struggling to enter into the world of multivariate techniques, this course will help you overcome these first difficulties. 

The course requires no skills other than basic descriptive statistics and hypothesis testing. If you need a refresher in these, email the instructor before the course; if needed, the Instructor and/or the TA can offer a remedial session at the very beginning of the Summer School.

Day Topic Details
1 Basic revision of statistics

Lecture: 90 minutes

Lab: 90 minutes

2 Comparing simple statistics across countries: comparisons of means and percentages across countries

Lecture: 90 minutes

Lab: 90 minutes

3 Correlation, causality and comparison across countries

Lecture: 90 minutes

Lab: 90 minutes

4 Regression analysis across countries (1): linear model with dummies

Lecture: 90 minutes

Lab: 90 minutes

5 Regression analysis across countries (2): interactions, discontinuity, Chow test

Lecture: 90 minutes

Lab: 90 minutes

6 Logistic regression analysis and controlling for country effects

Lecture: 90 minutes

Lab: 90 minutes

7 Loglinear analysis of comparative cross-classified data

Lecture: 90 minutes

Lab: 90 minutes

8 Data reduction techniques: factorial invariance across countries (factor analysis and PCA)

Lecture: 90 minutes

Lab: 90 minutes

9 Multilevel regression models for linking micro and macro variables across countries

Lecture: three hours

10 Presentations of the students' work

In lecture room

Day Readings
Note

The reading list distinguishes between compulsory and recommended readings. Recommended readings can be read or browsed to go further or to complement the compulsory readings. They also serve as an extended bibliography to work through after the Summer School.

1
On general perspectives on comparative survey research and some of the statistical problems linked to it: Janet A. Harkness, Fons J. R. van de Vijver, Peter Ph. Mohler (eds.). Cross-cultural survey methods. NY, Wiley, 2003 (compulsory)

Janet A. Harkness, Michael Braun, Brad Edwards, Timothy P. Johnson, Lars Lyberg, Peter Ph. Mohler, Beth-Ellen Pennell, Tom W. Smith. Survey methods in multinational, multiregional and multicultural contexts. John Wiley, 2010, extracts from chapters 1, 2, 26 (recommended)

On the first day, students will also be introduced, if needed, to handling a major comparative database (such as the ESS) in the SPSS environment.

2

Extracts (to be specified) from Alan Agresti, Barbara Finlay. Statistical methods for the social sciences. 4th edition, chapters 5, 6, 7 and 9.

3

Damodar Gujarati. Basic econometrics, 4th ed., pages 37 to 51 (compulsory), pages 58 to 64 (compulsory), pages 65 to 79 (recommended), pages 81 to 87 (compulsory), pages 127-139 (compulsory)

4 and 5

Damodar Gujarati. Basic econometrics, 4th ed., pages 202-215 (compulsory), pages 217-223 (compulsory), pages 248-265 (recommended), pages 297-303 (recommended), pages 304-311 (compulsory)

Alfred DeMaris. Regression with social data: modelling continuous and limited response variables. Wiley, NJ, 2004, pp. 148-154 (compulsory)

Gregory C. Chow, Tests of Equality Between Sets of Coefficients in Two Linear Regressions, Econometrica, vol. 28(3), 1960, p. 591–605 (recommended)

Damodar Gujarati. Use of Dummy Variables in Testing for Equality between Sets of Coefficients in Two Linear Regressions: A Note. The American Statistician, 1970, 24(1), 50-52; and 1970, 24(5), 18-22 (recommended).

If you have time to discover an interesting example of the dummy variable techniques in another field:

Christer Thrane (2004). In defence of the price hedonic model in wine research. Journal of Wine Research, 15(2), pages 123-134

6

Scott Menard. Applied logistic regression analysis. Sage Publications (Quantitative applications in the Social Sciences, n°106), pages 12-24 (compulsory), pages 37-52 (compulsory)

7

Alan Agresti. Categorical data analysis. NY, Wiley, 2003, chap 8 and 9 (recommended)

Alfred DeMaris. Logit modeling. Practical applications. Sage Publications (Quantitative applications in the Social Sciences, n°86), pages 7-28 (compulsory)

McCutcheon, A., and Mills, C. “Categorical data analysis: Log-linear and latent class models”, in E. Scarbrough and E. Tanenbaum (eds.), Research Strategies in the Social Sciences. Oxford University Press, 1998 (recommended)

8

Pennings, Paul, Hans Keman, and Jan Kleinnijenhuis. Doing Research in Political Science: An Introduction to Comparative Methods and Statistics. London, Sage Publications, 1999 (compulsory: pages on factor analysis, to be specified).

Rummel R.J. Understanding factor analysis. The Journal of Conflict Resolution, Vol. 11, No. 4, (Dec., 1967), pp. 444-480 (recommended).

9

Robert F. Dedrick, John M. Ferron, Melinda R. Hess, Kristine Y. Hogarty, Jeffrey D. Kromrey, Thomas R. Lang, John D. Niles, and Reginald S. Lee. Multilevel Modeling: A Review of Methodological Issues and Applications. Review of Educational Research, Spring 2009, Vol. 79, No. 1, pp. 69–102

10

No readings.

Software Requirements

SPSS 20

Stata 14

Hardware Requirements

None. Some sessions will take place in computer labs.

Literature

On substantive and methodological issues in cross-national survey data analysis (beyond the core daily readings), you may find useful material in:

Roger Jowell, Caroline Roberts, Rory Fitzgerald, Gillian Eva. Measuring Attitudes Cross-Nationally. Lessons from the European Social Survey, Sage Publications, 2007.

Janet A. Harkness, Fons J. R. van de Vijver, Peter Ph. Mohler (eds.). Cross-cultural survey methods. NY, Wiley, 2003

Janet A. Harkness, Michael Braun, Brad Edwards, Timothy P. Johnson, Lars Lyberg, Peter Ph. Mohler, Beth-Ellen Pennell, Tom W. Smith. Survey methods in multinational, multiregional and multicultural contexts. John Wiley, 2010.

On the statistical analysis side, an excellent textbook covering most of the course topics is: Alan Agresti, Barbara Finlay. Statistical methods for the social sciences. Prentice Hall, 4th edition, 2008. For regression analysis, the most complete book is certainly: Damodar Gujarati. Basic econometrics. New York: McGraw-Hill, 4th edition, 2004 or 5th edition (with Dawn C Porter, 2009)

Participants are not expected to buy these books, as they can be expensive. They simply complement the core methodological readings. Electronic versions of the core readings will be made available, as far as possible, through the Moodle platform of the Summer School.