Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
Pages
Page Not Found
Page not found. Your pixels are in another canvas.
Datasets summary
Dataset details for case studies
Data Analysis for Business, Economics, and Policy
About the textbook
How to set up your computer for R
Set up system, get R and RStudio, create a project
How to set up your computer for Stata
Set up system, get Stata, create a project
How to use our code in R
Set up system, get R and RStudio, create a project
Posts
Data Chats Podcast
A while back I recorded a podcast with Chris Richardson of Pragmatic Institute, a California-based data science education institute; it is out now. Data Science ...
Data Analysis for Business Analytics
A complete Data Analysis package for Business Analytics
Interpreting a coefficient in a simple OLS regression
Interpreting univariate OLS coefficients
A simplified notation for OLS regression
A simplified regression notation
On picking the Viridis color scheme
The economist meets visual arts
Goalkeepers, random play and multiple testing
Random action may be a well-thought-out strategy. Sometimes you would want to act randomly to make your next action hard to predict. This is particularly imp...
Selection bias – a war story
During the Second World War, in a secret Manhattan building, statisticians and mathematicians were recruited from across the U.S.A. to carry out data analysi...
Some history of sampling
Statistical enumerations of land, people, and property have taken place in many of the better organized empires and states since Babylonian times. These all ...
Variants of random sampling
There are many variations on random sampling that aim to further improve representation or reduce the costs of data collection.
casestudies
Ch02
Finding a good deal among hotels: data preparation
Ch07A Finding a good deal among hotels with simple regression
This case study introduces non-parametric regression (such as lowess), linear regression (OLS), residuals, and goodness of fit (R-squared).
Part I: DATA EXPLORATION
Part II: REGRESSION ANALYSIS
Part III: PREDICTION
Part IV: CAUSAL ANALYSIS
chapters
Part I: DATA EXPLORATION
Chapter 01: Origins of Data This chapter is about data collection and data quality. The chapter starts by introducing key concepts of data. It then describes...
Part II: REGRESSION ANALYSIS
Chapter 07: Simple Regression In this chapter, we introduce simple non-parametric regression and simple linear regression. We discuss non-parametric regressio...
Part III: PREDICTION
Chapter 13: A Framework for Prediction This chapter introduces a framework for prediction. We discuss the distinction between various types of prediction, s...
Part IV: CAUSAL ANALYSIS
Chapter 19: A Framework for Causal Analysis This chapter introduces a framework for causal analysis. The chapter starts by introducing the potential outcomes...
content
datasets
README: airbnb dataset
This is a README file for the airbnb dataset.
README: airline-tickets-usa dataset
This is a README file for the airline-tickets-usa dataset. Used in case study 22A How does a merger between airlines affect prices?
README: arizona-electricity dataset
This is a README file for the arizona-electricity dataset. Used in case study 12B Electricity consumption and temperature
README: cps-earnings dataset
This is a README file for the cps-earnings dataset. Used in the case studies 9A Estimating gender and age differences in earnings and 10A Understanding the...
README: hotels-europe dataset
This is a README file for the hotels-europe dataset that includes information on price and features of hotels in 46 European cities and for 10 different dat...
README: hotels-vienna dataset
This is a README file for the hotels-vienna dataset that includes information on price and features of hotels in Vienna for one date. Used in case studies ...
README: share-health dataset
This is a README file for the share-health dataset. Used in case study 11A Does smoking pose a health risk?
README: wms-management-survey dataset
This is a README file for the wms-management-survey dataset.
README: working-from-home dataset
This is a README file for the working-from-home dataset. Used in case study 20B