"What does it mean to do empirical social science? Asking good questions. Digging up novel data. Designing statistical analysis. Writing up results.

For many of us, most of the time, what it means is writing and debugging code. We write code to clean data, to transform data, to scrape data, and to merge data. We write code to execute statistical analyses, to simulate models, to format results, to produce plots. We stare at, puzzle over, fight with, and curse at code that isn’t working the way we expect it to. We dig through old code trying to figure out what we were thinking when we wrote it, or why we’re getting a different result from the one we got the week before."

Code and Data for the Social Sciences: A Practitioner’s Guide (Matthew Gentzkow and Jesse M. Shapiro)

1 Class preparation

Classes will be held in a computer room. You will be able to use either one of the PSE computers or your personal laptop. In either case, please install R and RStudio on your laptop before the first class. You can find installation instructions here. If you already have R and RStudio on your laptop, make sure you have the latest versions. You will use R and RStudio for several classes.

3 Debugging

Throughout this course and in the future, you will spend quite some time debugging your code. The first thing to acknowledge is that, whatever issue you run into, you are probably not the first one to encounter it. You can search for help on these two very helpful forums: Stack Overflow and RStudio Community.

Keep in mind that when you start using R, many issues arise because of packages: not installed, not loaded, etc.
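As an illustration of the two package problems mentioned above, here is a minimal sketch that tells them apart. The helper `package_status()` is hypothetical (it is not part of base R or of any package); it only relies on the base functions `requireNamespace()` and `search()`. The suggested fixes in the comments, `install.packages()` and `library()`, are the standard ones, but which package you need depends on your own error message.

```r
# Hypothetical helper, written for illustration only: report whether a
# package is missing, installed but not attached, or ready to use.
package_status <- function(pkg) {
  # requireNamespace() returns FALSE (without an error) if the package
  # cannot be found in your library.
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return("not installed")             # fix: install.packages(pkg), once
  }
  # Attached packages appear in the search path as "package:<name>".
  if (!(paste0("package:", pkg) %in% search())) {
    return("installed but not loaded")  # fix: library(pkg), in every session
  }
  "loaded"
}

package_status("base")   # base is always attached
package_status("tools")  # ships with R, but is not attached by default
```

In practice, "not installed" usually shows up as an error when calling `library()`, while "installed but not loaded" usually shows up as R being unable to find a function you expected to exist.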

I strongly encourage you to go over this online chapter on debugging once you are familiar with R (especially its subsection on debugging code in RMarkdown: it might be useful for your assignments).

4 Coding Commandments

Over the past decades, economics has relied more and more on data analysis. This has forced economists to learn some coding in order to use statistical software (e.g. R, Stata, Python). Unfortunately, economics students are rarely taught the basics of computer science and coding. We usually learn on the go.

You can find my attempt to gather the Good Coder Commandments here.

5 References

5.1 Cheatsheets

RStudio: Familiarize yourself with RStudio

Keyboard shortcuts in RStudio: Useful list of all the keyboard shortcuts in RStudio

Import data: How to import data with the tidyverse

Tidy data: How to tidy data with tidyr

Transform data: How to manipulate and summarize data with dplyr

Data visualization: How to graph your data with ggplot2

String variables: How to manipulate string variables with stringr

Date and time variables: How to deal with date and time variables with lubridate

Spatial data: How to manipulate spatial data with sf

Functions: How to apply functions with purrr

RMarkdown 1/2: How to generate an RMarkdown document (for your assignments!)

RMarkdown 2/2: Useful reference about RMarkdown syntax and options