
SD104 - Multi-Variate Statistical Analysis and Comparative Cross-National Surveys Data

Instructor Details


Bruno Cautrès

Sciences Po Paris

Instructor Bio

Bruno Cautrès is attached to CEVIPOF – Centre de recherches politiques de Sciences Po (Paris), at the Fondation Nationale des Sciences Politiques in Paris.

He is a senior CNRS research fellow with interests in voting behaviour, political attitudes and behaviours, comparative survey research and quantitative techniques.

Bruno is involved in a variety of projects, including the European Social Survey, the European Values Study, the International Social Survey Programme and the European Election Studies, and he participates in the development of election studies in France. His current research programme concerns political trust and attitudes to democracy in France.


Course Dates and Times

Monday 1 to Friday 5 August and Monday 8 to Friday 12 August 2016

Generally classes are either 09:00-12:30 or 14:00-17:30

30 hours over 10 days

Prerequisite Knowledge

This is an introductory course, even though the methods taught are multivariate. It is designed for social science students who find it difficult to get started with multivariate techniques, and it aims to help them overcome these initial difficulties during the summer school. Because the course is introductory, it requires no prior skills beyond basic descriptive statistics and hypothesis testing. Students who need a refresher on these topics can ask the instructor for documents or advice before the course; if needed, the instructor and/or the TA can also offer a remedial session at the very beginning of the summer school.

Short Outline

This course offers an overview and practical experience of the main theoretical and practical issues raised by the statistical analysis of cross-national comparative surveys. It combines the fundamentals of the statistical techniques used to analyse such surveys (without going into overly complex mathematics) with the practical problems encountered when using them. The originality of the course lies in presenting multivariate methods in the simplest possible form and concentrating on their use for cross-national comparative analysis, an objective that fits its introductory level. The course gives a panorama of the most widely used techniques: linear regression analysis (and ANCOVA), logistic regression, loglinear models, factor analysis and correspondence analysis. Typically, students attending the course wish to use cross-national data sets (ESS, EVS, Eurobarometers, ISSP, macro-comparative databases) for their research, and a key problem for them is the homogeneity or heterogeneity of the statistical relationships between variables across the countries they study. The “black box” of the country effect can be investigated with multivariate techniques that test these homogeneity/heterogeneity hypotheses. More generally, the statistical methods used in this course are relevant for “multiple groups” analysis: here the groups are countries, but they could be any groups, such as gender, ethnic or regional groups. The course is good preparation for later, more specific and advanced courses (for instance on multilevel models or structural equation models), which require a solid grounding in the basics of multivariate analysis of the kind this course delivers in only two weeks.

Long Course Outline

A growing concern and professional practice in the social sciences is the comparison of social and political behaviours between social groups. The focus can be on economic, political, cultural or socio-psychological attitudes and behaviours, at an individual or collective level, and the groups of interest can be defined by gender, ethnicity, age or generation. One particular type of group is at the centre of this course: nations or countries. Cross-national, and also cross-cultural, research has gained popularity in recent decades, especially since the development of comparative cross-national databases such as Eurobarometers, EVS, ISSP or ESS.


Thanks to methodological developments in comparative survey design, such as those achieved by the ESS and equivalent surveys, a significant step forward has been made. A major new challenge for comparative survey analysis is therefore to analyse these comparative data sets with statistical techniques and methods that make it possible to test for country effects. Unfortunately, many users still simply juxtapose their results country by country, and few really engage with the logic of statistical control for country effects. When they do introduce country effects into their explanatory models, it is most often only as dummy-coded effects in a regression analysis, which account only for differences in the level of the dependent variable between countries. But how can we test for the heterogeneity or homogeneity of the beta parameters, which estimate the effects of the explanatory variables on the dependent variable? How can we know that the beta parameters estimated by a logistic regression are statistically different across nations? How can we know whether factorial structures diverge between countries or are similar? These are the kinds of issues this course covers, by looking at several of the methods most used in statistical comparative analysis.

The course focuses each day on a specific method, with a strong emphasis on two blocks of methods: modelling techniques (linear and logistic regression, loglinear and latent class models) and data reduction techniques (factor analysis, multiple correspondence analysis). The complementarities between these types of methods will be emphasized.

Modelling techniques ask the user to apply a model to the structure of their data: the model is an algebraic expression, such as the classical linear regression model. This family of methods attaches great importance to hypothesis testing, in particular regarding the significance of the models and their parameters. But how should country effects be handled in that case? Tests may be sensitive to sample size, for instance when a researcher compares two countries with very different N. When beta parameters have higher values in one country than in another, how do we know that this really reflects a country difference?

Data reduction techniques, on the other hand, are of great help in reducing the space of variables and/or individuals or cases: with large databases such as Eurobarometers, ESS or EVS, which contain so many indicators, reducing the data space to a few dimensions is a very attractive objective. But what constitutes “factorial invariance” across countries? How can we say whether the factorial space is homogeneous across nations? Factor analysis and correspondence analysis offer certain solutions to these problems. The course concentrates mainly on exploratory factor analysis; confirmatory factor analysis, which is also of interest for comparative analysis, will be explained as well, but more briefly.
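The question raised above, whether a beta parameter really differs between two countries, can be illustrated with a minimal sketch. It is not taken from the course materials: the course itself uses Stata, whereas this example uses plain Python, and the function names (ols_slope, slope_difference_z, simulate) and the simulated data are hypothetical. It fits a simple linear regression separately in two countries with deliberately unequal sample sizes and compares the two slopes with a large-sample z-test, which is only one of several possible approaches (an interaction term between country and the explanatory variable in a pooled model is a closely related alternative).

```python
import math
import random


def ols_slope(x, y):
    """Return (slope, standard error) for the simple regression y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, used to estimate the error variance.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se


def slope_difference_z(x_a, y_a, x_b, y_b):
    """Large-sample z-test of H0: the slope is the same in countries A and B."""
    b_a, se_a = ols_slope(x_a, y_a)
    b_b, se_b = ols_slope(x_b, y_b)
    z = (b_a - b_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value, normal approximation
    return b_a, b_b, z, p


def simulate(n, intercept, slope, rng):
    """Simulated (hypothetical) survey data for one country."""
    x = [rng.uniform(0, 10) for _ in range(n)]
    y = [intercept + slope * xi + rng.gauss(0, 1) for xi in x]
    return x, y


rng = random.Random(2016)
# Country A: large sample, true slope 0.50. Country B: small sample, true slope 0.20.
x_a, y_a = simulate(1500, 2.0, 0.50, rng)
x_b, y_b = simulate(300, 3.0, 0.20, rng)

b_a, b_b, z, p = slope_difference_z(x_a, y_a, x_b, y_b)
print(f"slope A = {b_a:.3f}, slope B = {b_b:.3f}, z = {z:.2f}, p = {p:.3g}")
```

A model with country dummies only would let the intercepts (2.0 and 3.0 here) differ while forcing a single common slope; the comparison above targets the slopes themselves. Because the standard error of the difference combines both countries' standard errors, it is driven mainly by the smaller sample, which is one way in which unequal N affects such tests.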


A key aim of this course is to offer a (non-exhaustive) panorama of some very important methods that help comparativists deal with country effects. The scope is not exhaustive, but in only two weeks it saves a great deal of time: working from a textbook alone, a student could spend months discovering so many different techniques. Participants should be aware that this is precisely the objective: to offer a panorama of methods and practical solutions, in a non-exhaustive framework that concentrates on the most used methods. The goal is that, after the summer school, participants know which methods are best suited to their research perspectives and types of data. The summer school also offers other courses specialised in a single method (such as confirmatory factor analysis or structural equation models), which is a different choice. The course will show, through both theoretical presentations and empirical illustrations, the advantages and limits of each method, bearing in mind that no method is perfect.

Day-to-Day Schedule

Day-to-Day Reading List

Software Requirements


Stata 14

Hardware Requirements

None - a computer lab will be used when necessary.


On substantive and methodological issues in the analysis of cross-national survey data (beyond the core daily readings), participants may find useful material in:


Roger Jowell, Caroline Roberts, Rory Fitzgerald, Gillian Eva (eds.). Measuring Attitudes Cross-Nationally: Lessons from the European Social Survey. Sage Publications, 2007.


Janet A. Harkness, Fons J. R. van de Vijver, Peter Ph. Mohler (eds.). Cross-Cultural Survey Methods. New York: Wiley, 2003.


Janet A. Harkness, Michael Braun, Brad Edwards, Timothy P. Johnson, Lars Lyberg, Peter Ph. Mohler, Beth-Ellen Pennell, Tom W. Smith (eds.). Survey Methods in Multinational, Multiregional, and Multicultural Contexts. Wiley, 2010.


For the statistical analysis, an excellent textbook covering most of the topics of the course is: Alan Agresti, Barbara Finlay. Statistical Methods for the Social Sciences. Prentice Hall, 4th edition, 2008. For regression analysis, the most complete book is certainly: Damodar Gujarati. Basic Econometrics. New York: McGraw-Hill, 4th edition, 2004, or 5th edition (with Dawn C. Porter), 2009.


Participants are not expected to buy these books, which can be expensive. They simply complement the methodological readings that form the core readings for the course. Electronic versions of the core readings will be made available as far as possible through the summer school's Moodle platform.


The following other ECPR Methods School courses could be useful in combination with this one in a ‘training track’.

Recommended Courses Before

Introduction to SPSS

Introduction to STATA

Comparative Research Designs

Introduction to Statistics

Survey Designs

Recommended Courses After

Multiple Regression Analysis

Multilevel Structural Equation Modelling

Applied Multilevel Models

Advanced Topics in Applied Regression

Data Visualisation

Structural Equation Modelling

Additional Information


This course description may be subject to subsequent adaptations (e.g. taking into account new developments in the field, participant demands, group size, etc). Registered participants will be informed in due time.

Note from the Academic Convenors

By registering for this course, you confirm that you possess the knowledge required to follow it. The instructor will not teach these prerequisite items. If in doubt, contact the instructor before registering.
