ECPR

R Basics

Akos Mate
aakos.mate@gmail.com

Centre for Social Sciences

Akos Mate is a research fellow at the Centre for Social Sciences in Hungary. His key research area is the political economy of the European Union and its members’ fiscal governance.

He uses a wide variety of methods in his research, particularly automated text analysis (and various associated machine learning approaches), network analysis and more traditional econometric techniques.

Twitter @aakos_m

Alexandru Moise
alexandru.moise@eui.eu

European University Institute

Alexandru Moise is a postdoctoral researcher at the European University Institute (2020–2025). He received his PhD in political science from Central European University in 2019. Alex's research focus is the political economy of welfare reforms. He looks at how individual perceptions and the quality of party linkages affect health care policy. Within the context of the ERC SOLID project, he looks at how crises affect European integration using a variety of quantitative models. He has been teaching quantitative analysis courses at Johns Hopkins University, the European University Institute, Central European University, Ilija University, and the ECPR’s Summer and Winter Schools for many years.

Twitter @alexdmoise

Course Dates and Times

Friday 26 July 13:00–15:00 and 15:30–18:00

Saturday 27 July 09:00–12:30 and 14:00–17:30

Prerequisite Knowledge

No prior experience with R (or any other programming language) is required. The goal of the course is to introduce R in an accessible fashion, with a heavy emphasis on practical, applicable knowledge.


Short Outline

Due to overwhelming demand, we are running two concurrent R Basics courses: one taught by Akos Mate, one by Alexandru Moise.

Once we know final numbers, we will email you to let you know which group you are in.

R is an extremely versatile programming language that is rapidly becoming the top choice for data analysis tasks in academia and in industry. This short course will show you how R can be a powerful tool in every step of the data analysis process. We will cover:

  • importing data (in various formats)
  • cleaning and manipulating data
  • visualisations
  • statistical analysis.

Since the popularity of R is due to its ever-expanding package ecosystem, I will place a special emphasis on how to get information on packages, and how to get relevant R help. We will also cover how to create reports, export results and generally how to do reproducible research in R, using RStudio.

R is a language developed for statistical analysis, but we will not cover statistical concepts in depth beyond how to implement them in R.

ECTS Credits for this course


Long Course Outline

It is not a stretch to say that R has become one of the main data analysis tools used in and outside of academia. R is an open-source programming language for statistical computing, with an extremely active user base and an expanding universe of packages.

The guiding logic of the course is to give practical knowledge for the whole data analysis workflow:

  1. Importing data
  2. Data manipulation/cleaning
  3. Data visualisation
  4. Analysis
  5. Reporting the results

It might feel strange to switch from SPSS or Stata to R, but the benefits outweigh the effort of climbing the learning curve. Base R allows us to read different data files into R, manipulate them, create various visualisations and run statistical analyses of all sorts (from basic descriptives to time series analysis or multilevel regressions). The real value in learning R is that it integrates the research workflow into one environment. It can also be adapted to a broad range of research, from party politics data to ecological modelling.

Day 1

We start with a general introduction to R and RStudio. We learn how to start coding and set up RStudio to make our workflow as seamless as possible. RStudio is an Integrated Development Environment (IDE) that puts together the R console, a text editor where we write the code, and an object viewer where we can view the data objects we created.

The general introduction will cover how to use R for basic mathematical calculations, and how to create different objects. This part is key because we will cover the base R syntax, how to create / access / remove objects, and how to merge vectors into data frames. These are essential operations for the following sessions.
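As a rough illustration of the operations named above, the following minimal sketch uses base R only. The object names (`country`, `score`, `df`) and the values are made up for this example and are not part of the course materials.

```r
# R as a calculator
2 + 3 * 4
sqrt(16)

# Create objects with the assignment operator `<-`
# (hypothetical, made-up values for illustration)
country <- c("A", "B", "C")   # character vector
score   <- c(2.5, 4.0, 3.5)   # numeric vector

# Access elements by position or by a logical condition
score[2]
score[score > 3]

# Merge the two vectors into a data frame
df <- data.frame(country = country, score = score)
df$score          # access one column
mean(df$score)    # compute on a column

# Remove the stand-alone vectors; the data frame keeps its own copy
rm(country, score)
ls()              # list the objects that still exist
```

The data frame is the object type most of the later steps (import, cleaning, plotting, modelling) operate on, which is why building one by hand is a useful first exercise.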

We also look at how to load data into R from commonly encountered sources, such as .txt, .csv, Excel sheets, and Stata, SPSS and SAS files. After getting data into R, we will perform some basic operations to have a sneak peek at the data. This includes the usual descriptive statistics and creating histograms and scatterplots.
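A minimal sketch of this import-and-inspect step might look as follows. To keep it self-contained it first writes a small made-up CSV file to a temporary location; the file contents, the `survey` object and the column names are assumptions for illustration only. Other formats typically need add-on packages (for instance `readxl::read_excel()` for Excel and `haven::read_dta()`, `haven::read_sav()` and `haven::read_sas()` for Stata, SPSS and SAS), which are mentioned here but not run.

```r
# Write a tiny, made-up CSV file so the example runs anywhere
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,age,income",
             "1,23,1800",
             "2,35,2400",
             "3,41,3100",
             "4,29,2000"), csv_path)

# Import the file into a data frame
survey <- read.csv(csv_path)

# First look at the data
str(survey)       # column types
head(survey)      # first rows
summary(survey)   # descriptive statistics per column

# Quick plots, written to a temporary PDF so the code also runs as a script;
# in RStudio you can drop the pdf()/dev.off() lines to see them in the Plots pane
plot_path <- tempfile(fileext = ".pdf")
pdf(plot_path)
hist(survey$age, main = "Age", xlab = "Age")
plot(survey$age, survey$income, xlab = "Age", ylab = "Income")
dev.off()
```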

Day 2

Day 2 is dedicated to data manipulation and data cleaning, with a hint of data analysis. This is an essential part of every analysis, and it usually takes up the majority of the time. The materials will cover how to set up data in R, the difference between the wide and long data formats, and the more recent push for 'tidy data' in the R community. I will introduce writing loops and functions in R and the 'apply' function family, as well as their ‘tidy’ alternatives.
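The sketch below illustrates two of these ideas with base R only: reshaping a wide table into long format, and replacing an explicit loop with a function from the 'apply' family. All data and names (`wide`, `long`, `scores`, `rescale01`) are hypothetical. The tidyverse counterparts, such as `tidyr::pivot_longer()` and `purrr::map_dbl()`, are not shown because they require extra packages.

```r
# Wide format: one row per country, one column per year (made-up values)
wide <- data.frame(
  country  = c("A", "B"),
  gdp_2019 = c(1.0, 2.0),
  gdp_2020 = c(1.5, 2.5)
)

# Long format: one row per country-year observation
long <- reshape(
  wide,
  direction = "long",
  varying   = c("gdp_2019", "gdp_2020"),  # columns to stack
  v.names   = "gdp",                      # name of the stacked column
  timevar   = "year",
  times     = c(2019, 2020),
  idvar     = "country"
)
long

# A small user-defined function: rescale a numeric vector to the 0-1 range
rescale01 <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}
rescale01(c(1, 2, 3))

# Column means with an explicit loop ...
scores <- data.frame(a = c(1, 2, 3), b = c(10, 20, 40))
means_loop <- numeric(ncol(scores))
for (i in seq_along(scores)) {
  means_loop[i] <- mean(scores[[i]])
}

# ... and the same computation with sapply()
means_apply <- sapply(scores, mean)
```

Both approaches give the same numbers; `sapply()` is shorter and keeps the column names attached to the result.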

As on the previous day, all the activities are accompanied by some degree of data visualisation, since it is often better to show a figure than a disorienting half-page table. At this point we will have enough results to think about getting them out of R. The course uses RMarkdown to show how to create PDF or HTML output of our work. There are also several packages developed for getting results out of the R console.

If you have any particular interests during the course, I will try to cover them in this final session.

Day | Topic | Details
1 | Getting to know R and the basics | Setting up RStudio, the basic R syntax, loading various data into R and having a quick look at the data. Some first steps on visualising data in R.
2 | Data manipulation and visualisation, getting the results out of R | Data manipulation with base R and with the ‘tidyverse’ package family. How to subset, merge and clean data, and how to deal with missing data. Introduction to loops and writing functions in R to make life easier. Using RMarkdown to generate a report of our analysis.
3 | Analysing data in R (t-test, ANOVA, regressions), more visualisation and a quick sneak peek at web scraping, text analysis and network data | Analysing the data with different approaches. Examining the resulting R objects and how to access relevant parts of them.
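To give a flavour of what examining the resulting R objects can mean in practice, here is a minimal sketch using the `mtcars` data set that ships with R. The choice of data set and variables is an assumption made for this illustration and is not necessarily what the course uses.

```r
# Two-sample t-test: fuel consumption by transmission type
tt <- t.test(mpg ~ am, data = mtcars)
tt$p.value        # pull a single number out of the test object
tt$conf.int       # confidence interval for the difference in means

# Linear regression: fuel consumption as a function of weight
fit <- lm(mpg ~ wt, data = mtcars)
names(fit)        # see which components the model object contains
coef(fit)         # estimated coefficients

# The summary is itself an object with accessible parts
fit_summary <- summary(fit)
fit_summary$coefficients   # estimates, standard errors, t and p values
fit_summary$r.squared      # share of variance explained
```

Accessing components this way, rather than copying numbers from the console, is what makes it possible to feed results directly into tables, plots and reports.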

Day | Readings
Note | Main books we will use
1 | Adler: Chapter 5 ‘Overview of the R language’, Chapter 6 ‘R syntax’, Chapter 7 ‘R Objects’ // Wickham: Chapter 3 ‘Data visualization’
2 | Adler: Chapter 9 ‘Functions’, Chapter 12 ‘Preparing data’ // Wickham: Chapter 12 ‘Tidy data’, Chapter 27 ‘R Markdown’ // Healy: Chapter 4 ‘Show the right numbers’
3 | Adler: Chapter 16 ‘Analyzing data’, Chapter 17 ‘Probability distributions’, Chapter 18 ‘Statistical tests’, Chapter 20 ‘Regression Models’ (up to page 412) // Healy: Chapter 6 ‘Work with models’

Software Requirements

R version 3.5.0 (or newer)

RStudio version 1.1.463 (or newer)

R and RStudio are free to use. They work on Windows, macOS and Linux. Both must be installed and working on your laptop.

Download and install R

Download and install RStudio

Hardware Requirements

Bring your own laptop with software installed.

Literature

R is one of the fastest growing languages for statistical analysis and it has a great online community that generates a huge amount of content. Useful online and print resources to continue with learning R:

R tutorials of all sorts

R related blog post and news aggregator site

R blog aggregator site

Follow #rstats on Twitter. Package developers and R industry veterans post regularly, and it is easy to engage with others.

Forum for problem solving with R. Googling your error message will usually land you here.

RStudio resources

RStudio cheat sheets curated by RStudio devs

Print
(in addition to the books used during the course)

Grolemund, Garrett 2014, Hands-On Programming with R, O’Reilly

Wickham, Hadley 2014, Advanced R, Chapman and Hall/CRC (online companion: http://adv-r.had.co.nz/)

Matloff, Norman 2011, The Art of R Programming: A Tour of Statistical Software Design, No Starch Press

Teetor, Paul 2011, R Cookbook, O’Reilly


Additional Information

Disclaimer

This course description may be subject to subsequent adaptations (e.g. taking into account new developments in the field, participant demands, group size, etc.). Registered participants will be informed at the time of change.

By registering for this course, you confirm that you possess the knowledge required to follow it. The instructor will not teach these prerequisite items. If in doubt, please contact us before registering.