ECPR


Introduction to STATA

Course Dates and Times

Thursday 26 July

13:30-15:00 / 15:30-17:00

Friday 27 July and Saturday 28 July

09:00-10:30 / 11:00-12:30 and 13:30-15:00 / 15:30-17:00

Andrew X. Li

lixiang577@gmail.com

Central European University

STATA is a statistical software package widely used for data analysis in both academia and industry. Compared with R, which is becoming increasingly popular, STATA has a gentler learning curve and is more effective for analyses that use established methods such as panel data methods. This short course focuses on practical knowledge, showing how STATA can be an effective and powerful tool at every step of the data analysis process. We will cover data acquisition, cleaning and manipulation, visualization, and finally statistical analysis and prediction. Throughout the course, the instructor will briefly recap key statistical concepts and methods as and when needed. However, the emphasis of the course is on STATA itself. Participants with no foundation in statistical analysis take this course at their own risk, as they may not be able to make sense of what the software is doing.


Instructor Bio

Andrew is an assistant professor at CEU's Department of International Relations. He obtained his PhD from the National University of Singapore and King’s College London.

His research interests include international political economy, research design, and quantitative methods. He teaches the Research Design and Methods in IR course series at CEU.

@lixiang577

The guiding logic of the course is to give practical knowledge for the whole data analysis workflow:

  1. Data acquisition
  2. Data manipulation/cleaning
  3. Data visualisation
  4. Analysis and prediction
  5. Reporting the results

Day 1 will start with a general introduction to STATA. Participants will see that there are two ways to use STATA: through the software’s graphical user interface or through the command line. This course focuses on the latter, as that is the way preferred by almost all STATA experts. On the first day, we will learn how to get a dataset into STATA for analysis. There are a number of ways to do it, depending on the format of the raw data, including copy and paste. After that, we will briefly look at the types of variables (string, float, etc.) stored in STATA and learn some basic syntax and mathematical calculations.
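As a rough illustration of the kind of commands Day 1 touches on (this is a sketch, not the instructor's own material), the lines below load a dataset, inspect variable types and do some simple arithmetic. It assumes Stata's bundled example dataset "auto" as a stand-in for course data; the CSV file name in the comment is hypothetical.

```stata
* Sketch only: the bundled "auto" dataset stands in for course data.
* To read your own raw data you might instead use something like:
*   import delimited using "mydata.csv", clear    (hypothetical file name)
sysuse auto, clear

* Inspect storage types: make is a string (str18), price and mpg are numeric
describe make price mpg

* Use STATA as a calculator
display 2 + 3 * 4
* shows 14
display sqrt(16)
* shows 4

* Create a new variable (stored as float by default) and summarise it
generate price_k = price / 1000
summarize price_k
```

Typing these one at a time in the Command window, or running them together from a do-file, gives the same result.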

Day 2 will be dedicated to data manipulation, data cleaning and part of visualization. We will learn how to name and label variables and data values, how to generate new variables and replace the values of existing variables, how to order observations and variables, how to preserve and restore datasets, and how to merge or append different datasets into one. This will be followed by the basics of visualization. Students will be introduced to the various types of graphs STATA can generate, and the formats in which these graphs can be exported or saved. The last session (90 minutes) of the day will be reserved for students to revise and practise the commands themselves and clarify any doubts they may have.
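The Day 2 operations could look roughly like the do-file below. This is a minimal sketch under stated assumptions, not the course's actual exercises: it uses Stata's bundled "auto" dataset, and the variable names price_k, heavy and heavy_lbl, the 3000 lb cut-off and the output file name are all made up for illustration. Run it as a do-file, since it relies on temporary files stored in local macros.

```stata
* Sketch only: bundled "auto" dataset; new names below are hypothetical.
sysuse auto, clear

* Name and label a variable
rename mpg miles_per_gallon
label variable miles_per_gallon "Mileage (miles per gallon)"

* Generate a new variable, then replace its values
generate price_k = price / 1000
replace price_k = round(price_k, 0.1)

* Label the values of an indicator variable
generate byte heavy = (weight > 3000) if !missing(weight)
label define heavy_lbl 0 "Light" 1 "Heavy"
label values heavy heavy_lbl

* Order observations and variables
sort price
order make heavy price

* Preserve the data, save a subset to a temporary file, then restore
preserve
keep make price
tempfile prices
save `prices'
restore

* Merge: drop price, then bring it back by matching on make
drop price
merge 1:1 make using `prices'
tabulate _merge
drop _merge

* Append: split the data in two and stack the parts again
preserve
keep if foreign == 1
tempfile foreign_cars
save `foreign_cars'
restore
keep if foreign == 0
append using `foreign_cars'

* A basic graph, exported to PNG (needs a graphical Stata session)
histogram price, frequency
graph export "price_hist.png", replace
```

The preserve and restore pair is what makes it safe to cut the data down temporarily: the full dataset comes back exactly as it was before the subset was saved.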

Day 3 begins with a continuation of data visualization. We will learn how to create more sophisticated graphs in STATA. Students will be introduced to more advanced visualization tools and options such as titles and subtitles, axis labels, marker labels, colour control, etc. We will then move on to data analysis and prediction, where we will learn how to generate summary statistics, how to run an OLS regression, how to conduct a t-test for two sample means and how to perform in-sample predictions. Finally, we will learn how to generate regression tables that look like those in published journal articles. As on Day 2, the last 90-minute session will be reserved for self-study, practice, clarifying doubts and possibly completing an assignment. Students are therefore encouraged to bring their own data that they want to analyse for their research.
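For orientation, the Day 3 topics map onto commands along the following lines. Again this is only a sketch using the bundled "auto" dataset: the choice of variables and the name price_hat are illustrative assumptions. The course description does not say which tool is used for regression tables; the commented lines show one common option, the community-contributed estout package, which has to be installed separately.

```stata
* Sketch only: bundled "auto" dataset; run as a do-file (uses /// continuation).
sysuse auto, clear

* A scatter plot with title, subtitle, axis labels, marker labels and colour
twoway scatter price weight, ///
    title("Price and weight") subtitle("Example automobile data") ///
    xtitle("Weight (lbs.)") ytitle("Price (USD)") ///
    xlabel(2000(1000)5000) ///
    mcolor(navy) mlabel(make) mlabsize(tiny)

* Summary statistics
summarize price weight mpg

* Two-sample t-test: mean price of domestic versus foreign cars
ttest price, by(foreign)

* OLS regression and in-sample (linear) predictions
regress price weight mpg
predict price_hat, xb

* Publication-style table via the community-contributed estout package
* (an assumption; not necessarily what the course uses):
*   ssc install estout
*   esttab using "results.rtf", se replace
```

Because predict is run on the same observations used for estimation, price_hat holds fitted values for every car in the sample.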

Note from the Academic Convenors to prospective participants: by registering for this course, you certify that you possess the prerequisite knowledge required to follow it. The instructor will not teach these prerequisite items again. If you doubt whether you possess that knowledge to a sufficient extent, we suggest you contact the instructor before you proceed with your registration.

This course requires no prior experience with STATA or any other statistical software or programming language. However, participants are expected to have basic theoretical knowledge of statistical concepts and OLS regression. This course is best suited to students who have completed an introductory course on statistical analysis but have not yet had a chance to carry out the analyses using statistical software.

Day Topic

Thursday: Introduce the STATA working environment; Create and save data set; Data management
Friday: Descriptive statistics; Graphical visualization; Measures of association
Saturday: Hypothesis testing/t-test; Regression analysis

Day Topic Details

1 Getting to know STATA

Input data into STATA, basic syntax and mathematical calculations.

2 Data manipulation and visualization

How to:

  • label variables and data values;
  • generate new variables and change existing values;
  • order variables and sort data;
  • merge and append datasets;
  • create basic graphs.

3 Data visualization (more advanced), analysis and prediction

How to:

  • create more sophisticated graphs;
  • generate summary statistics;
  • run OLS regressions;
  • conduct t-tests of two sample means;
  • make predictions;
  • generate regression tables.
Day Readings

1 Cameron, Adrian Colin and Pravin K. Trivedi. Microeconometrics Using Stata. College Station, TX: Stata Press, 2009. Chapter 2.1-2.3, pp. 29-37.

2 Cameron, Adrian Colin and Pravin K. Trivedi. Microeconometrics Using Stata. College Station, TX: Stata Press, 2009. Chapter 2.4-2.6, pp. 38-68.

3 Cameron, Adrian Colin and Pravin K. Trivedi. Microeconometrics Using Stata. College Station, TX: Stata Press, 2009. Chapter 3, pp. 71-112.

Software Requirements

Stata 15

Hardware Requirements

None - this course will be taught in a lab.