# Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

## Datasets summary

Dataset details for case studies

## How to set up your computer for R

Set up system, get R and RStudio, create a project

## How to set up your computer for Stata

Set up system, get Stata, create a project

## How to use our code in R

Set up system, get R and RStudio, create a project

## Data Chats Podcast

A while back I did a podcast with Chris Richardson of Pragmatic Institute, a California-based data science education institute; it is out now. Data Science ...

## Data Analysis for Business Analytics

A complete Data Analysis package for Business Analytics

## Interpreting a coefficient in a simple OLS regression

Interpreting univariate OLS coefficients

## A simplified notation for OLS regression

A simplified regression notation

## On picking the Viridis color scheme

The economist meets visual arts

## Goalkeepers, random play and multiple testing

Random action may be a well-thought-out strategy. Sometimes you would want to act randomly to make your next action hard to predict. This is particularly imp...

## Selection bias – a war story

During the Second World War, in a secret Manhattan building, statisticians and mathematicians were recruited from across the U.S.A. to carry out data analysi...

## Some history of sampling

Statistical enumerations of land, people, and property have taken place in many of the better-organized empires and states since Babylonian times. These all ...

## Variants of random sampling

There are many variations on random sampling that aim to further improve representation or reduce the costs of data collection.

## Ch02

Finding a good deal among hotels: data preparation

## Ch07A Finding a good deal among hotels with simple regression

This case study introduces non-parametric regression (such as lowess) and linear regression (OLS), along with residuals and goodness of fit (R-squared).

## Part I: DATA EXPLORATION

PART I: DATA EXPLORATION

## Part II: REGRESSION ANALYSIS

PART II: REGRESSION ANALYSIS

## Part III: PREDICTION

PART III: PREDICTION

## Part IV: CAUSAL ANALYSIS

PART IV: CAUSAL ANALYSIS

## Part I: DATA EXPLORATION

Chapter 01: Origins of Data. This chapter is about data collection and data quality. The chapter starts by introducing key concepts of data. It then describes...

## Part II: REGRESSION ANALYSIS

Chapter 07: Simple Regression. In this chapter, we introduce simple non-parametric regression and simple linear regression. We discuss nonparametric regressio...

## Part III: PREDICTION

Chapter 13: A Framework for Prediction. This chapter introduces a framework for prediction. We discuss the distinction between various types of prediction, s...

## Part IV: CAUSAL ANALYSIS

Chapter 19: A Framework for Causal Analysis. This chapter introduces a framework for causal analysis. The chapter starts by introducing the potential outcomes...

## datasets

This is a README file for the airbnb dataset.

This is a README file for the airline-tickets-usa dataset. Used in case study 22A: How does a merger between airlines affect prices?

This is a README file for the arizona-electricity dataset. Used in case study 12B: Electricity consumption and temperature

This is a README file for the cps-earnings dataset. Used in the case studies 9A Estimating gender and age differences in earnings and 10A Understanding the...

This is a README file for the hotels-europe dataset that includes information on price and features of hotels in 46 European cities and for 10 different dat...

This is a README file for the hotels-vienna dataset that includes information on price and features of hotels in Vienna for one date. Used in case studies ...