In this course you will learn how to use STATA to conduct basic types of statistical analysis. The course walks you through the typical stages of empirical data analysis: obtaining the data, arranging them in the required format, visualizing them, conducting different statistical analyses, and reporting the results in the formats required by professional journals.
The course gives participants practical, hands-on training in the use of STATA for statistical analysis through a mix of examples presented by the instructor and applications and exercises that participants will solve and discuss in class.
Although the course will present the STATA commands for conducting different types of statistical analyses and will briefly discuss their results, it will not teach the theory behind these statistical methods. For example, when showing the STATA commands for linear regression and their results, we will not cover the assumptions of linear regression or how the coefficients are calculated, although we will briefly discuss how to read the main elements of the output. The course therefore assumes that participants have some basic knowledge of statistics, although the results will be interpreted in terms that appeal to the basic intuition behind these concepts.
The participants will also learn how to use the STATA manual and help menu for writing commands that they need for conducting various statistical analyses.
The list of topics that will be covered in the course includes:
- Introduction to the STATA working environment – graphical user interface (GUI), loading data, setting the current working directory, help menu, do-files, log files, commands for describing the variables in a data set
- Creating and saving data sets, do-files, and log files
- Data management: merge databases, keep and drop variables, recode variables, rename variables, create new variables, dummy variables, label variables, apply value labels to variables, add labels, drop labels, replace labels, sort observations, display missing values
- Summary/descriptive statistics – mean, median, mode, standard deviation, mean comparisons, distribution, min and max values, various qualifiers and operators for creating and summarizing variables, cross tables
- Graphical visualization – creating various types of graphs for summarizing the data (histograms, pie charts, bar charts, two-dimensional scatter plots, line plots), graph editing, exporting graphs
- Measures of association – for nominal variables (cross tables, chi-squared test), for ordinal variables (Spearman’s rho, odds ratio), for interval variables (correlation)
- Hypothesis testing: t-tests (one-sample, paired, and independent-samples t-tests assuming equal or unequal variances)
- Regressions: linear regression (command, results, estimating predicted values, interaction effects, plotting effects), saving results tables in journal format. Depending on the participants’ interest, the course may also discuss other types of regression, e.g. logistic and ordered logit.