Data and code

Get the raw data and the cleaning code that created the datasets used for analysis.

How to?

R vs Python vs Stata

This textbook is coding-language neutral. R, Stata, and Python are by far the three most widely used tools for writing data analysis code.

Social scientists, especially economists, like Stata for its power and sophisticated econometrics capabilities. It has a great interface, and it is very easy to start doing analysis. It has a point-and-click user interface, too. How to set up for Stata?

Social scientists, data scientists, and statisticians like R for its great mix of data management, statistical, and visualization capabilities. It has a large array of machine learning and natural language processing tools, and it is great for web scraping or creating dashboards. It has a neatly assembled set of libraries, called the Tidyverse, which helps in learning elementary tools fast. R is free and open source. How to set up for R?

Python is the number one coding language for computer scientists and is widely used in data science applications from banking and finance to the Internet of Things. Python is great for web scraping, building and maintaining databases, and all machine learning tasks. Python is free and open source. How to set up for Python?


Download all code in R, Stata, Python - release 1.0 (2020-11-21)

Download code one by one from the Case studies page


Download all the data: Data-all - release 1.0 (2020-11-21)

Download data one by one from the Datasets page