Chapters
Each chapter provides a summary, an outline, slides, and case study links.
Table of Contents
Downloads: Full contents (PDF), Index (PDF), Sample Chapters 10 & 14
Slides: For LaTeX versions, contact us.
PART I: DATA EXPLORATION
Chapter 01: Origins of Data
This chapter is about data collection and data quality.
chapter outline → slides CH01A CH01B CH01C
| Section | Title |
|---|---|
| 1.1 | What Is Data? |
| 1.2 | Data Structures |
| 1.A | CASE STUDY – Finding a Good Deal among Hotels: Data Collection |
| 1.3 | Data Quality |
| 1.B | CASE STUDY – Comparing Online and Offline Prices: Data Collection |
| 1.C | CASE STUDY – Management Quality: Data Collection |
| 1.4 | How Data Is Born: The Big Picture |
| 1.5 | Collecting Data from Existing Sources |
| 1.6 | Surveys |
| 1.7 | Sampling |
| 1.8 | Random Sampling |
| 1.9 | Big Data |
| 1.10 | Good Practices in Data Collection |
| 1.11 | Ethical and Legal Issues of Data Collection |
| 1.12 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
Chapter 02: Preparing Data for Analysis
This chapter is about preparing data for analysis: how to start working with data.
chapter outline → slides CH02A CH02B CH02C
| Section | Title |
|---|---|
| 2.1 | Types of Variables |
| 2.2 | Stock Variables, Flow Variables |
| 2.3 | Types of Observations |
| 2.4 | Tidy Data |
| 2.A | CASE STUDY – Finding a Good Deal among Hotels: Data Preparation |
| 2.5 | Tidy Approach for Multi-dimensional Data |
| 2.B | CASE STUDY – Displaying Immunization Rates across Countries |
| 2.6 | Relational Data and Linking Data Tables |
| 2.C | CASE STUDY – Identifying Successful Football Managers |
| 2.7 | Entity Resolution: Duplicates, Ambiguous Identification, and Non-entity Rows |
| 2.8 | Discovering Missing Values |
| 2.9 | Managing Missing Values |
| 2.10 | The Process of Cleaning Data |
| 2.11 | Reproducible Workflow: Write Code and Document Your Steps |
| 2.12 | Organizing Data Tables for a Project |
| 2.13 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
| 2.U1 | Under the Hood: Naming Files |
Chapter 03: Exploratory Data Analysis
The chapter starts by explaining why exploratory data analysis is important.
chapter outline → slides CH03A CH03B CH03C CH03D CH03U1
| Section | Title |
|---|---|
| 3.1 | Why Do Exploratory Data Analysis? |
| 3.2 | Frequencies and Probabilities |
| 3.3 | Visualizing Distributions |
| 3.A | CASE STUDY – Finding a Good Deal among Hotels: Data Exploration |
| 3.4 | Extreme Values |
| 3.5 | Good Graphs: Guidelines for Data Visualization |
| 3.6 | Summary Statistics for Quantitative Variables |
| 3.B | CASE STUDY – Comparing Hotel Prices in Europe: Vienna vs. London |
| 3.7 | Visualizing Summary Statistics |
| 3.C | CASE STUDY – Measuring Home Team Advantage in Football |
| 3.8 | Good Tables |
| 3.9 | Theoretical Distributions |
| 3.D | CASE STUDY – Distributions of Body Height and Income |
| 3.10 | Steps of Exploratory Data Analysis |
| 3.11 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
| 3.U1 | Under the Hood: More on Theoretical Distributions |
| Bernoulli Distribution | |
| Binomial Distribution | |
| Uniform Distribution | |
| Power-Law Distribution | |
Chapter 04: Comparison and Correlation
Most methods of data analysis are based on comparing values of one variable, y, across observations with different values of another variable, x, or more such variables. This chapter introduces simple methods of such comparison.
chapter outline → slides CH04A
| Section | Title |
|---|---|
| 4.1 | The y and the x |
| 4.A | CASE STUDY – Management Quality and Firm Size: Describing Patterns of Association |
| 4.2 | Conditioning |
| 4.3 | Conditional Probabilities |
| 4.4 | Conditional Distribution, Conditional Expectation |
| 4.5 | Conditional Distribution, Conditional Expectation with Quantitative x |
| 4.6 | Dependence, Covariance, Correlation |
| 4.7 | From Latent Variables to Observed Variables |
| 4.8 | Sources of Variation in x |
| 4.9 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
| 4.U1 | Under the Hood: Inverse Conditional Probabilities, Bayes’ Rule |
Chapter 05: Generalizing from Data
This chapter introduces the conceptual issues with generalizing results from our data to the general pattern we care about and methods of statistical inference.
chapter outline → slides CH05A
| Section | Title |
|---|---|
| 5.1 | Why Generalize from Data? |
| 5.2 | Repeated Samples, Estimands, and Estimators |
| 5.3 | Sampling Distributions and Standard Error |
| 5.4 | Confidence Intervals |
| 5.A | CASE STUDY – What Likelihood of Loss to Expect on a Stock Portfolio? |
| 5.5 | Estimating SE: The Bootstrap |
| 5.6 | Estimating SE: Standard Error Formulas |
| 5.7 | External Validity |
| 5.8 | Assessing External Validity in Practice |
| 5.9 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
Chapter 06: Testing Hypotheses
This chapter introduces the logic and practice of testing hypotheses.
chapter outline → slides CH06A CH06B
| Section | Title |
|---|---|
| 6.1 | The Logic of Testing Hypotheses |
| 6.A | CASE STUDY – Comparing Online and Offline Prices: Testing the Difference |
| 6.2 | Null Hypothesis, Alternative Hypothesis |
| 6.3 | The t-Test |
| 6.4 | Making a Decision; False Negatives, False Positives |
| 6.5 | The p-Value |
| 6.6 | Steps of Hypothesis Testing |
| 6.7 | One-Sided Alternatives |
| 6.B | CASE STUDY – Testing the Likelihood of Loss on a Stock Portfolio |
| 6.8 | Testing Multiple Hypotheses |
| 6.9 | p-Hacking |
| 6.10 | Testing Hypotheses with Big Data |
| 6.11 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
PART II: REGRESSION ANALYSIS
Chapter 07: Simple Regression
In this chapter, we introduce simple non-parametric regression and simple linear regression.
chapter outline → slides CH07A
| Section | Title |
|---|---|
| 7.1 | When and Why Do Simple Regression Analysis? |
| 7.2 | Regression: Definition |
| 7.3 | Non-parametric Regression |
| 7.A | CASE STUDY – Finding a Good Deal among Hotels with Simple Regression |
| 7.4 | Linear Regression: Introduction |
| 7.5 | Linear Regression: Coefficient Interpretation |
| 7.6 | Linear Regression with a Binary Explanatory Variable |
| 7.7 | Coefficient Formula |
| 7.8 | Predicted Dependent Variable and Regression Residual |
| 7.9 | Goodness of Fit, R-Squared |
| 7.10 | Correlation and Linear Regression |
| 7.11 | Regression Analysis, Regression toward the Mean, Mean Reversion |
| 7.12 | Regression and Causation |
| 7.13 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
| 7.U1 | Under the Hood: Derivation of the OLS Formulae for the Intercept and Slope Coefficients |
| 7.U2 | Under the Hood: More on Residuals and Predicted Values with OLS |
Chapter 08: Complicated Patterns and Messy Data
The first part of this chapter covers how linear regression analysis can accommodate nonlinear patterns.
chapter outline → slides CH08A CH08B CH08C
| Section | Title |
|---|---|
| 8.1 | When and Why Care about the Shape of the Association between y and x? |
| 8.2 | Taking Relative Differences or Log |
| 8.3 | Log Transformation and Non-positive Values |
| 8.4 | Interpreting Log Values in a Regression |
| 8.A | CASE STUDY – Finding a Good Deal among Hotels with Nonlinear Function |
| 8.5 | Other Transformations of Variables |
| 8.B | CASE STUDY – How is Life Expectancy Related to the Average Income of a Country? |
| 8.6 | Regression with a Piecewise Linear Spline |
| 8.7 | Regression with Polynomial |
| 8.8 | Choosing a Functional Form in a Regression |
| 8.9 | Extreme Values and Influential Observations |
| 8.10 | Measurement Error in Variables |
| 8.11 | Classical Measurement Error |
| 8.C | CASE STUDY – Hotel Ratings and Measurement Error |
| 8.12 | Non-classical Measurement Error and General Advice |
| 8.13 | Using Weights in Regression Analysis |
| 8.14 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
| 8.U1 | Under the Hood: Details of the Log Approximation |
| 8.U2 | Under the Hood: Deriving the Consequences of Classical Measurement Error |
Chapter 09: Generalizing Results of a Regression
This chapter discusses the methods of generalizing results of a linear regression from our data to the general pattern we care about.
chapter outline → slides CH09A CH09B
| Section | Title |
|---|---|
| 9.1 | Generalizing Linear Regression Coefficients |
| 9.2 | Statistical Inference: CI and SE of Regression Coefficients |
| 9.A | CASE STUDY – Estimating Gender and Age Differences in Earnings |
| 9.3 | Intervals for Predicted Values |
| 9.4 | Testing Hypotheses about Regression Coefficients |
| 9.5 | Testing More Complex Hypotheses |
| 9.6 | Presenting Regression Results |
| 9.7 | Data Analysis to Help Assess External Validity |
| 9.B | CASE STUDY – How Stable is the Hotel Price–Distance to Center Relationship? |
| 9.8 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
| 9.U1 | Under the Hood: The Simple SE Formula for Regression Intercept |
| 9.U2 | Under the Hood: The Law of Large Numbers for β̂ |
| 9.U3 | Under the Hood: Deriving SE(β̂) with the Central Limit Theorem |
| 9.U4 | Under the Hood: Degrees of Freedom Adjustment for the SE Formula |
Chapter 10: Multiple Linear Regression
This chapter introduces multiple regression.
chapter outline → slides CH10A CH10B
| Section | Title |
|---|---|
| 10.1 | Multiple Regression: Why and When? |
| 10.2 | Multiple Linear Regression with Two Explanatory Variables |
| 10.3 | Multiple Regression and Simple Regression: Omitted Variable Bias |
| 10.A | CASE STUDY – Understanding the Gender Difference in Earnings |
| 10.4 | Multiple Linear Regression Terminology |
| 10.5 | Standard Errors and Confidence Intervals in Multiple Linear Regression |
| 10.6 | Hypothesis Testing in Multiple Linear Regression |
| 10.7 | Multiple Linear Regression with Three or More Explanatory Variables |
| 10.8 | Nonlinear Patterns and Multiple Linear Regression |
| 10.9 | Qualitative Right-Hand-Side Variables |
| 10.10 | Interactions: Uncovering Different Slopes across Groups |
| 10.11 | Multiple Regression and Causal Analysis |
| 10.12 | Multiple Regression and Prediction |
| 10.B | CASE STUDY – Finding a Good Deal among Hotels with Multiple Regression |
| 10.13 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
| 10.U1 | Under the Hood: A Two-Step Procedure to Get the Multiple Regression Coefficient |
Chapter 11: Modeling Probabilities
This chapter introduces probability models that have a binary dependent variable.
chapter outline → slides CH11A CH11B
| Section | Title |
|---|---|
| 11.1 | The Linear Probability Model |
| 11.2 | Predicted Probabilities in the Linear Probability Model |
| 11.A | CASE STUDY – Does Smoking Pose a Health Risk? |
| 11.3 | Logit and Probit |
| 11.4 | Marginal Differences |
| 11.5 | Goodness of Fit: R-Squared and Alternatives |
| 11.6 | The Distribution of Predicted Probabilities |
| 11.7 | Bias and Calibration |
| 11.B | CASE STUDY – Are Australian Weather Forecasts Well Calibrated? |
| 11.8 | Refinement |
| 11.9 | Using Probability Models for Other Kinds of y Variables |
| 11.10 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
| 11.U1 | Under the Hood: Saturated Models |
| 11.U2 | Under the Hood: Maximum Likelihood Estimation and Search Algorithms |
| 11.U3 | Under the Hood: From Logit and Probit Coefficients to Marginal Differences |
Chapter 12: Regression with Time Series Data
In this chapter we discuss the opportunities and challenges brought about by regression analysis of time series data and how to address those challenges.
chapter outline → slides CH12A CH12B
| Section | Title |
|---|---|
| 12.1 | Preparation of Time Series Data |
| 12.2 | Trend and Seasonality |
| 12.3 | Stationarity, Non-stationarity, Random Walk |
| 12.A | CASE STUDY – Returns on a Company Stock and Market Returns |
| 12.4 | Time Series Regression |
| 12.5 | Trends, Seasonality, Random Walks in a Regression |
| 12.B | CASE STUDY – Electricity Consumption and Temperature |
| 12.6 | Serial Correlation |
| 12.7 | Dealing with Serial Correlation in Time Series Regressions |
| 12.8 | Lags of x in a Time Series Regression |
| 12.9 | The Process of Time Series Regression Analysis |
| 12.10 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
| 12.U1 | Under the Hood: Testing for Unit Root |
PART III: PREDICTION
Chapter 13: A Framework for Prediction
This chapter introduces a framework for prediction.
chapter outline → slides CH13A
| Section | Title |
|---|---|
| 13.1 | Prediction Basics |
| 13.2 | Various Kinds of Prediction |
| 13.A | CASE STUDY – Predicting Used Car Value with Linear Regressions |
| 13.3 | The Prediction Error and Its Components |
| 13.4 | The Loss Function |
| 13.5 | Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) |
| 13.6 | Bias and Variance of Predictions |
| 13.7 | The Task of Finding the Best Model |
| 13.8 | Finding the Best Model by Best Fit and Penalty: The BIC |
| 13.9 | Finding the Best Model by Training and Test Samples |
| 13.10 | Finding the Best Model by Cross-Validation |
| 13.11 | External Validity and Stable Patterns |
| 13.12 | Machine Learning and the Role of Algorithms |
| 13.13 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
Chapter 14: Model Building for Prediction
This chapter discusses how to build regression models for prediction and how to evaluate the predictions they produce.
chapter outline → slides CH14A CH14B
| Section | Title |
|---|---|
| 14.1 | Steps of Prediction |
| 14.2 | Sample Design |
| 14.3 | Label Engineering and Predicting Log y |
| 14.A | CASE STUDY – Predicting Used Car Value: Log Prices |
| 14.4 | Feature Engineering: Dealing with Missing Values |
| 14.5 | Feature Engineering: What x Variables to Have and in What Functional Form |
| 14.B | CASE STUDY – Predicting Airbnb Apartment Prices: Selecting a Regression Model |
| 14.6 | We Can’t Try Out All Possible Models |
| 14.7 | Evaluating the Prediction Using a Holdout Set |
| 14.8 | Selecting Variables in Regressions by LASSO |
| 14.9 | Diagnostics |
| 14.10 | Prediction with Big Data |
| 14.11 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
| 14.U1 | Under the Hood: Text Parsing |
| 14.U2 | Under the Hood: Log Correction |
Chapter 15: Regression Trees
This chapter introduces the regression tree, an alternative to linear regression for prediction purposes that can find the most important predictor variables and their interactions and can approximate any functional form automatically.
chapter outline → slides CH15A
| Section | Title |
|---|---|
| 15.1 | The Case for Regression Trees |
| 15.2 | Regression Tree Basics |
| 15.3 | Measuring Fit and Stopping Rules |
| 15.A | CASE STUDY – Predicting Used Car Value with a Regression Tree |
| 15.4 | Regression Tree with Multiple Predictor Variables |
| 15.5 | Pruning a Regression Tree |
| 15.6 | A Regression Tree is a Non-parametric Regression |
| 15.7 | Variable Importance |
| 15.8 | Pros and Cons of Using a Regression Tree for Prediction |
| 15.9 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
Chapter 16: Random Forest and Boosting
This chapter introduces two ensemble methods based on regression trees: the random forest and boosting.
chapter outline → slides CH16A
| Section | Title |
|---|---|
| 16.1 | From a Tree to a Forest: Ensemble Methods |
| 16.2 | Random Forest |
| 16.3 | The Practice of Prediction with Random Forest |
| 16.A | CASE STUDY – Predicting Airbnb Apartment Prices with Random Forest |
| 16.4 | Diagnostics: The Variable Importance Plot |
| 16.5 | Diagnostics: The Partial Dependence Plot |
| 16.6 | Diagnostics: Fit in Various Subsets |
| 16.7 | An Introduction to Boosting and the GBM Model |
| 16.8 | A Review of Different Approaches to Predict a Quantitative y |
| 16.9 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
Chapter 17: Probability Prediction and Classification
This chapter introduces the framework and methods of probability prediction and classification analysis for binary y variables.
chapter outline → slides CH17A
| Section | Title |
|---|---|
| 17.1 | Predicting a Binary y: Probability Prediction and Classification |
| 17.A | CASE STUDY – Predicting Firm Exit: Probability and Classification |
| 17.2 | The Practice of Predicting Probabilities |
| 17.3 | Classification and the Confusion Table |
| 17.4 | Illustrating the Trade-Off between Different Classification Thresholds: The ROC Curve |
| 17.5 | Loss Function and Finding the Optimal Classification Threshold |
| 17.6 | Probability Prediction and Classification with Random Forest |
| 17.7 | Class Imbalance |
| 17.8 | The Process of Prediction with a Binary Target Variable |
| 17.9 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
| 17.U1 | Under the Hood: The Gini Node Impurity Measure and MSE |
| 17.U2 | Under the Hood: On the Method of Finding an Optimal Threshold |
Chapter 18: Forecasting from Time Series Data
This chapter discusses forecasting: prediction from time series data for one or more time periods in the future.
chapter outline → slides CH18A CH18B
| Section | Title |
|---|---|
| 18.1 | Forecasting: Prediction Using Time Series Data |
| 18.2 | Holdout, Training, and Test Samples in Time Series Data |
| 18.3 | Long-Horizon Forecasting: Seasonality and Predictable Events |
| 18.4 | Long-Horizon Forecasting: Trends |
| 18.A | CASE STUDY – Forecasting Daily Ticket Volumes for a Swimming Pool |
| 18.5 | Forecasting for a Short Horizon Using the Patterns of Serial Correlation |
| 18.6 | Modeling Serial Correlation: AR(1) |
| 18.7 | Modeling Serial Correlation: ARIMA |
| 18.B | CASE STUDY – Forecasting a Home Price Index |
| 18.8 | VAR: Vector Autoregressions |
| 18.9 | External Validity of Forecasts |
| 18.10 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
| 18.U1 | Under the Hood: Details of the ARIMA Model |
| 18.U2 | Under the Hood: Auto-Arima |
PART IV: CAUSAL ANALYSIS
Chapter 19: A Framework for Causal Analysis
This chapter introduces a framework for causal analysis.
chapter outline → slides CH19A
| Section | Title |
|---|---|
| 19.1 | Intervention, Treatment, Subjects, Outcomes |
| 19.2 | Potential Outcomes |
| 19.3 | The Individual Treatment Effect |
| 19.4 | Heterogeneous Treatment Effects |
| 19.5 | ATE: The Average Treatment Effect |
| 19.6 | Average Effects in Subgroups and ATET |
| 19.7 | Quantitative Causal Variables |
| 19.A | CASE STUDY – Food and Health |
| 19.8 | Ceteris Paribus: Other Things Being the Same |
| 19.9 | Causal Maps |
| 19.10 | Comparing Different Observations to Uncover Average Effects |
| 19.11 | Random Assignment |
| 19.12 | Sources of Variation in the Causal Variable |
| 19.13 | Experimenting versus Conditioning |
| 19.14 | Confounders in Observational Data |
| 19.15 | From Latent Variables to Measured Variables |
| 19.16 | Bad Conditioners: Variables Not to Condition On |
| 19.17 | External Validity, Internal Validity |
| 19.18 | Constructive Skepticism |
| 19.19 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
Chapter 20: Designing and Analyzing Experiments
This chapter discusses the most important questions about designing an experiment and analyzing data from an experiment to estimate the average effect of an intervention.
chapter outline → slides CH20A CH20B
| Section | Title |
|---|---|
| 20.1 | Randomized Experiments and Potential Outcomes |
| 20.2 | Field Experiments, A/B Testing, Survey Experiments |
| 20.A | CASE STUDY – Working from Home and Employee Performance |
| 20.B | CASE STUDY – Fine Tuning Social Media Advertising |
| 20.3 | The Experimental Setup: Definitions |
| 20.4 | Random Assignment in Practice |
| 20.5 | Number of Subjects and Proportion Treated |
| 20.6 | Random Assignment and Covariate Balance |
| 20.7 | Imperfect Compliance and Intent-to-Treat |
| 20.8 | Estimation and Statistical Inference |
| 20.9 | Including Covariates in a Regression |
| 20.10 | Spillovers |
| 20.11 | Additional Threats to Internal Validity |
| 20.12 | External Validity, and How to Use the Results in Decision Making |
| 20.13 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
| 20.U1 | Under the Hood: LATE: The Local Average Treatment Effect |
| 20.U2 | Under the Hood: The Formula for Sample Size Calculation |
Chapter 21: Regression and Matching with Observational Data
In this chapter we discuss how to condition on potential confounder variables in practice, and how to interpret the results when our question is causal.
chapter outline → slides CH21A
| Section | Title |
|---|---|
| 21.1 | Thought Experiments |
| 21.A | CASE STUDY – Founder/Family Ownership and Quality of Management |
| 21.2 | Variables to Condition on, Variables Not to Condition On |
| 21.3 | Conditioning on Confounders by Regression |
| 21.4 | Selection of Variables and Functional Form in a Regression for Causal Analysis |
| 21.5 | Matching |
| 21.6 | Common Support |
| 21.7 | Matching on the Propensity Score |
| 21.8 | Comparing Linear Regression and Matching |
| 21.9 | Instrumental Variables |
| 21.10 | Regression-Discontinuity |
| 21.11 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
| 21.U1 | Under the Hood: Unobserved Heterogeneity and Endogenous x in a Regression |
| 21.U2 | Under the Hood: LATE is IV |
Chapter 22: Difference-in-Differences
This chapter introduces difference-in-differences analysis, or diff-in-diffs for short, and its use in understanding the effect of an intervention.
chapter outline → slides CH22A
| Section | Title |
|---|---|
| 22.1 | Conditioning on Pre-intervention Outcomes |
| 22.2 | Basic Difference-in-Differences Analysis: Comparing Average Changes |
| 22.A | CASE STUDY – How Does a Merger between Airlines Affect Prices? |
| 22.3 | The Parallel Trends Assumption |
| 22.4 | Conditioning on Additional Confounders in Diff-in-Diffs Regressions |
| 22.5 | Quantitative Causal Variable |
| 22.6 | Difference-in-Differences with Pooled Cross-Sections |
| 22.7 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
Chapter 23: Methods for Panel Data
This chapter introduces the most widely used regression methods to uncover the effect of an intervention when observational time series (tseries) data or cross-section time-series (xt) panel data is available with more than two time periods.
chapter outline → slides CH23A CH23B
| Section | Title |
|---|---|
| 23.1 | Multiple Time Periods Can Be Helpful |
| 23.2 | Estimating Effects Using Observational Time Series |
| 23.3 | Lags to Estimate the Time Path of Effects |
| 23.4 | Leads to Examine Pre-trends and Reverse Effects |
| 23.5 | Pooled Time Series to Estimate the Effect for One Unit |
| 23.A | CASE STUDY – Import Demand and Industrial Production |
| 23.6 | Panel Regression with Fixed Effects |
| 23.7 | Aggregate Trend |
| 23.B | CASE STUDY – Immunization against Measles and Saving Children |
| 23.8 | Clustered Standard Errors |
| 23.9 | Panel Regression in First Differences |
| 23.10 | Lags and Leads in FD Panel Regressions |
| 23.11 | Aggregate Trend and Individual Trends in FD Models |
| 23.12 | Panel Regressions and Causality |
| 23.13 | First Differences or Fixed Effects? |
| 23.14 | Dealing with Unbalanced Panels |
| 23.15 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |
Chapter 24: Appropriate Control Groups for Panel Data
This chapter discusses how data analysts can select a subset of the untreated observations in the data that are the best to learn about the counterfactual, and when that needs to be a conscious choice instead of using all available observations in the data.
chapter outline → slides CH24A CH24B
| Section | Title |
|---|---|
| 24.1 | When and Why to Select a Control Group in xt Panel Data |
| 24.2 | Comparative Case Studies |
| 24.3 | The Synthetic Control Method |
| 24.A | CASE STUDY – Estimating the Effect of the 2010 Haiti Earthquake on GDP |
| 24.4 | Event Studies |
| 24.B | CASE STUDY – Estimating the Impact of Replacing Football Team Managers |
| 24.5 | Selecting a Control Group in Event Studies |
| 24.6 | Main Takeaways |
| Practice Questions | |
| Data Exercises | |
| References and Further Reading | |