Monday 10 to Friday 14 August 2020
2 hours of live teaching per day
Courses will be either morning or afternoon to suit participants’ requirements
This course provides a highly interactive online teaching and learning environment, using state-of-the-art online pedagogical tools. It is designed for a demanding audience (researchers, professional analysts, advanced students) and capped at 16 participants so that the teaching team (the Instructor plus one highly qualified Teaching Assistant) can cater to the specific needs of each individual.
You will learn how to work with big data using R, through various available solutions.
You’ll also learn to gain insights from data using basic machine learning techniques, practised through coding tutorials.
3 credits: Engage fully with class activities
4 credits: Complete a post-class assignment
Akitaka is a Postdoctoral Research Fellow at the Institute for Analytics and Data Science (IADS). Before joining IADS, he was a Research Fellow in Data Science in LSE's Department of Methodology. He earned his PhD in political science at Rice University in Houston.
His research interests lie in data science and politics, in particular in the statistical methodology for scaling political behaviour, and natural language processing of political texts.
The course covers two topics:
• constructing and managing large datasets in R
• machine learning, with a specific focus on providing analytics from large datasets.
For the first topic, big data, we start by asking: What is big data? Why is it difficult to work with? We then learn the best solutions in R for working with big data depending on the size of the data, from locally stored data objects to databases hosted on the cloud.
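As a rough illustration of the database end of that range, the sketch below pushes an aggregation to the database so that only a small summary comes back into memory. It is written in Python rather than R, purely for illustration, and is not taken from the course materials; the in-memory SQLite table `votes`, its columns, and the helper `mean_by_group` are all hypothetical stand-ins for a much larger hosted table.

```python
import sqlite3


def mean_by_group(conn, table, group_col, value_col):
    """Let the database compute group means so only the summary reaches memory."""
    # SQL identifiers cannot be bound as parameters, so they are interpolated.
    # This assumes the (hypothetical) names come from trusted code, not user input.
    query = (
        f'SELECT "{group_col}", AVG("{value_col}") '
        f'FROM "{table}" GROUP BY "{group_col}" ORDER BY "{group_col}"'
    )
    return dict(conn.execute(query).fetchall())


# Toy stand-in for a large remote table (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE votes (party TEXT, score REAL)")
conn.executemany(
    "INSERT INTO votes VALUES (?, ?)",
    [("A", 1.0), ("A", 3.0), ("B", 10.0), ("B", 20.0), ("B", 30.0)],
)
conn.commit()

# Only one row per group is transferred, however many rows the table holds.
summary = mean_by_group(conn, "votes", "party", "score")
print(summary)
```

The same idea applies whatever the client language: the raw rows stay where they are stored, and the analysis environment receives only the aggregated result.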
For the second topic of machine learning, we will learn basic concepts such as:
1) problem definitions
2) objective functions
3) bias-variance tradeoffs
4) parameter tuning.
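These four concepts can be seen together in a small sketch. It is written in Python rather than R, purely for illustration, and is not taken from the course materials: the simulated data, the choice of k-nearest-neighbour regression, and the candidate values of k are all assumptions. The problem is to predict y from x, the objective function is mean squared error, and the tuning parameter k controls the bias-variance tradeoff (small k follows the noise, large k over-smooths).

```python
import math
import random


def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(train, data, k):
    """Objective function: mean squared prediction error on `data`."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in data) / len(data)


def simulate(n):
    """Hypothetical data: a smooth signal plus Gaussian noise."""
    xs = [random.uniform(0, 6) for _ in range(n)]
    return [(x, math.sin(x) + random.gauss(0, 0.3)) for x in xs]


random.seed(0)
train, valid = simulate(200), simulate(200)

# Parameter tuning: score each candidate k on held-out data and keep the best.
candidates = [1, 5, 15, 50, 200]
scores = {k: mse(train, valid, k) for k in candidates}
best_k = min(scores, key=scores.get)

for k in candidates:
    print(f"k={k:3d}  validation MSE={scores[k]:.3f}")
print("chosen k:", best_k)
```

With k = 1 the model reproduces the training data exactly, so its training error is zero even though its validation error is not; with k equal to the training-set size it predicts the overall mean everywhere. Scoring on held-out data is what lets the tuning step pick a value between these extremes.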
Social scientists have traditionally emphasised explanation as the primary purpose of statistical analysis. Machine learning has an overlapping but evidently different orientation. By contrasting the inference-based approach with the prediction-focused approach, you will come to understand the fundamental ideas of machine learning. The application of machine learning techniques to various analytical tasks in the social sciences will follow these theoretical discussions.
The course provides approximately 4 to 5 hours of pre-recorded lectures as well as an online forum on Slack where the Instructor/TA and students can freely discuss the lecture materials.
Approximately two hours of each day will be an online seminar, where we will learn how to apply the concepts and knowledge gained from the pre-course lecture materials through Q&A and live lab work.
In the live lab, you will be given several coding tasks, and asked to code along with the Instructor. Some tasks are left as homework, which will be discussed on the online forum and during the following day’s live lab.
You’ll also be able to book one-to-one consultations with the Instructor/TA in advance, during scheduled office hours.
You will learn how to work in cloud-based computational workspaces, using R as the primary statistical software with RStudio Server and Google Colab. Example scripts and assignments are distributed through GitHub, so you can learn how to use online version control systems for collaboration and research accountability.
The course assumes you have some familiarity with the R statistical language and can conduct basic data handling in R (opening data files, working with data frames). If you don’t have this, take the week-one course Introduction to R.
You should also have basic knowledge of standard statistical analysis in social science, such as linear regression and hypothesis testing. These are covered in the week-one courses Introduction to Inferential Statistics and Big Data Collection and Management in R.