ECPR


Python Programming for Social Scientists: Big Data, Web Scraping and Other Useful Programming Tricks

Course Dates and Times

Thursday 26 July

13:30-15:00 / 15:30-17:00

Friday 27 July and Saturday 28 July

09:00-10:30 / 11:00-12:30 and 13:30-15:00 / 15:30-17:00

Eszter Somos

somoseszter@gmail.com

With more and more data available on the Web, skills to collect and process these data in an automated manner are becoming increasingly valuable. This course will provide an introduction to programming with Python, starting from the basics. Beyond confidently using Python, the class will focus on solving problems around data collection, processing, and analysis. Additionally, we will discuss for what types of problems Python is the right choice and how to further extend your knowledge after the class. The overarching goal is to equip students with enough programming experience to start working in any area of computation and data-intensive research.


Instructor Bio

Eszter Somos is currently an associate at Gravity R&D, a company specializing in the development and maintenance of recommendation systems. She has been doing data mining since 2015, when she switched from being a PhD student at the University of Hull researching autobiographical memory to working for startup companies.

Her main focus is exploratory data analysis, algorithm fine-tuning, and conducting A/B testing in online environments. She is experienced in working with data from various sectors, such as online job markets, travel metasearch, webshops, and video streaming sites.

The course will mostly take the form of hands-on programming, organized into ten 1.5-hour sessions. Use of a computer will be required during most of the lectures. While no prior programming experience is required to follow the class, students will benefit greatly from prior knowledge of Stata, R, or other languages.

Learning outcomes

By the end of the course, students will have experience with techniques which are vital to effective data management:

  • The basic syntax and use of Python as a data analysis tool, including writing and executing scripts to automate common tasks, using the IPython interpreter for interactive exploration of data and code, and using the Jupyter notebook to share and collaborate.
  • Loading data from a variety of common formats such as CSV, HTML, and JSON
  • Manipulating data efficiently with Pandas
  • Basic web scraping
  • Use of web APIs
  • Use of specialized Python packages, such as data visualization libraries
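
As a rough illustration of the "loading data from common formats" outcome, the sketch below reads a small CSV table and a JSON list and combines them. It is not course material: the sample data, the column names, and the helper functions `load_csv_rows` and `load_json_records` are all hypothetical, and it uses only the Python standard library rather than Pandas so that it runs on any Python 3 installation.

```python
import csv
import io
import json

# Hypothetical toy data, standing in for files you would normally open from disk.
CSV_TEXT = "id,age,party\n1,34,A\n2,51,B\n"
JSON_TEXT = '[{"party": "A", "label": "Party A"}, {"party": "B", "label": "Party B"}]'


def load_csv_rows(text):
    """Parse CSV text into a list of dictionaries, one per data row."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


def load_json_records(text):
    """Parse JSON text into Python objects (here, a list of dictionaries)."""
    return json.loads(text)


rows = load_csv_rows(CSV_TEXT)
records = load_json_records(JSON_TEXT)

# Build a lookup table from the JSON records, keyed on the shared "party" field.
labels = {rec["party"]: rec["label"] for rec in records}

for row in rows:
    row["age"] = int(row["age"])  # CSV values arrive as strings
    row["party_label"] = labels.get(row["party"])

for row in rows:
    print(row)
```

With real files, the same functions would typically be fed the result of `open(path).read()`; Pandas offers `read_csv` and `read_json` for the same job once the data gets larger.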


Day       Topic                  Details
Thursday  Python Fundamentals    First steps with Python and the IPython environment.
Friday    Advanced Python        Useful programming tricks and data management.
Saturday  APIs and Web Scraping  How to use the Internet as a data source.

Day 1
  • Why Python? First steps with Python and the IPython notebook
Day 2
  • Working with various data structures, reading/writing files
  • Scraping from the Web
Day 3
  • Parsing scraped data, Pandas basics
  • Data analysis and visualization

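
To give a flavour of what "parsing scraped data" can involve, here is a minimal sketch that pulls the links out of an HTML page. It makes several assumptions: the page content is a hypothetical snippet rather than a real download, the `LinkCollector` class and `extract_links` function are names invented for this example, and it relies on the standard library's `html.parser`, which may not be the tool used in class.

```python
from html.parser import HTMLParser

# Hypothetical HTML snippet standing in for a page downloaded from the Web.
PAGE = """
<html><body>
  <h1>Course list</h1>
  <a href="/courses/python">Python Programming</a>
  <a href="/courses/r">Introduction to R</a>
  <p>No link here.</p>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collect an (href, link text) pair for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current_href = None
        self._text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs
            self._current_href = dict(attrs).get("href")
            self._text_parts = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link
        if self._current_href is not None:
            self._text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            text = "".join(self._text_parts).strip()
            self.links.append((self._current_href, text))
            self._current_href = None


def extract_links(html):
    """Return all (href, text) pairs found in the given HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


links = extract_links(PAGE)
for href, text in links:
    print(href, "->", text)
```

In a real scraping task the HTML string would come from an HTTP request, and the extracted pairs would usually be written to a file or loaded into a table for analysis.
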
Day       Readings
Thursday  10 Reasons Python Rocks for Research (And a Few Reasons it Doesn’t)
          http://www.stat.washington.edu/~hoytak/blog/whypython.html
Friday    No mandatory reading
Saturday  No mandatory reading

Note

Given the practical nature of the class, there is no required reading. If you want to prepare ahead, online Python tutorials can be useful, and the blog post listed in the Thursday readings above gives an idea of why Python is particularly useful for data analysis.

Software Requirements

It is important that you install Python 3 and Anaconda 3.6 before coming to class. Detailed installation instructions can be found here: http://ancsahannak.me/files/setup_instructions_python.pdf

Please make sure to test that the installation works before coming to class; the document includes instructions on how to do that.
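
The setup document remains the authoritative way to test your installation. As a supplementary sketch only, a check along the following lines can confirm that the interpreter is recent enough and that key packages can be found. The function names `python_is_at_least` and `missing_packages` are hypothetical, and the package list is an assumption based on the tools named in the learning outcomes, not an official requirements list.

```python
import importlib.util
import sys


def python_is_at_least(required=(3, 6), version=None):
    """Return True if the given (major, minor) version meets the requirement.

    When no version is passed, the running interpreter's version is used.
    """
    if version is None:
        version = sys.version_info[:2]
    return tuple(version) >= tuple(required)


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed package list: Pandas and IPython are mentioned in the course description.
PACKAGES_TO_CHECK = ["pandas", "IPython"]

print("Python version OK:", python_is_at_least())
print("Missing packages:", missing_packages(PACKAGES_TO_CHECK))
```

An empty list of missing packages together with a `True` version check suggests the basics are in place; anything else is a cue to revisit the installation instructions.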

Hardware Requirements

Participants are required to bring their own laptops.