Chapter 10: Multiple Linear Regression
Understanding associations while controlling for multiple factors
Gender wage gaps. There is a substantial difference in the average earnings of women and men in all countries. You want to understand more about the potential origins of that difference, focusing on employees with a graduate degree. You have data on a large sample of employees with their earnings and characteristics like age and degree type. How can you uncover gender differences that are not due to differences in these other characteristics?
Finding hotel deals. You've analyzed hotel prices in a city to find hotels that are underpriced relative to their distance from the city center. But hotels also differ in quality features related to price. How can you find hotels that are underpriced relative to all their features?
What You'll Learn
This chapter introduces multiple linear regression, the most widely used method to uncover patterns of associations between variables. You'll learn:
- Why and when to use multiple regression
- How to interpret coefficients in the presence of multiple explanatory variables
- The concept of omitted variable bias and why it matters
- Statistical inference with multiple regression
- How to include categorical variables and interactions
- Applications to causal analysis and prediction
Chapter Structure
This chapter is organized into 4 pages, with 7 case studies using real data (A1-A6 and B1):
Page 1: Foundation
Sections 10.1-10.3
- Why and when to use multiple regression (Section 10.1)
- Multiple linear regression with two explanatory variables (Section 10.2)
- Multiple regression and simple regression: Omitted variable bias (Section 10.3)
Case Study A1: Understanding the Gender Difference in Earnings
Multiple linear regression
What you'll master: Core concepts of multiple regression and why controlling for other variables matters
Page 2: Statistical Inference
Sections 10.4-10.6
- Multiple linear regression terminology (Section 10.4)
- Standard errors and confidence intervals (Section 10.5)
- Hypothesis testing in multiple linear regression (Section 10.6)
Case Study A2: Understanding the Gender Difference in Earnings
Statistical inference
What you'll master: Statistical inference in multiple regression and interpreting uncertainty
Page 3: Extensions
Sections 10.7-10.8
- Multiple linear regression with three or more explanatory variables (Section 10.7)
- Nonlinear patterns and multiple linear regression (Section 10.8)
Case Study A3: Understanding the Gender Difference in Earnings
Nonlinear patterns and multiple linear regression
What you'll master: Working with many variables and capturing nonlinear relationships
Page 4: Applications
Sections 10.9-10.12
- Qualitative right-hand-side variables (Section 10.9)
- Interactions: Different slopes across groups (Section 10.10)
- Multiple regression and causal analysis (Section 10.11)
- Multiple regression and prediction (Section 10.12)
Case Studies:
- A4: Gender Earnings (qualitative variables)
- A5: Gender Earnings (interactions)
- A6: Gender Earnings (causal interpretation)
- B1: Hotel Prices (prediction with multiple regression)
What you'll master: Categorical variables, interactions, and applying multiple regression to causal questions and prediction
Learning Objectives
By the end of this chapter, you will be able to:
- Identify questions best answered with multiple regression from available data
- Estimate multiple linear regression coefficients, then present and interpret them
- Estimate appropriate standard errors, create confidence intervals, and test coefficients
- Select variables to include in a multiple regression guided by the purpose of analysis
- Understand the relationship between multiple regression results and causal effects
- Use multiple regression for prediction and residual analysis
- Include categorical variables using dummy variables
- Use interaction terms to allow different relationships across groups
Case Studies
This chapter includes 7 case studies using real data (A1-A6 and B1), distributed across the 4 pages:
Case Studies A1-A6: Gender Earnings Gap
- Data: Current Population Survey (CPS), USA, 2014
- Sample: 18,241 employees with graduate degrees (ages 24-65)
- Question: Understanding the gender wage gap
The six case studies progressively build understanding:
- A1 (Page 1): Basic multiple regression with age
- A2 (Page 2): Statistical inference on coefficients
- A3 (Page 3): Nonlinear age patterns
- A4 (Page 4): Education categories
- A5 (Page 4): Gender × age interactions
- A6 (Page 4): Causal interpretation and many covariates
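To make the first step of this progression concrete, the sketch below compares the unconditional gender difference with the difference conditional on age. It is only an illustration under stated assumptions: the eight observations are made up and noise-free (they are not the CPS sample), the coefficients -0.10 and 0.02 are arbitrary, and `ols` is a hypothetical helper that solves the normal equations in plain Python, not the regression routine used in the chapter's code.

```python
def ols(X, y):
    """Hypothetical helper: OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Made-up, noise-free data: (female, age). Women are younger on average here.
people = [(0, 30), (0, 40), (0, 50), (0, 60), (1, 25), (1, 30), (1, 35), (1, 40)]
ln_wage = [3.0 - 0.10 * f + 0.02 * a for f, a in people]

# Simple regression: ln wage on female only (unconditional difference)
simple = ols([[1, f] for f, _ in people], ln_wage)
# Multiple regression: ln wage on female and age (difference conditional on age)
multiple = ols([[1, f, a] for f, a in people], ln_wage)

print(f"unconditional difference: {simple[1]:.3f}")
print(f"difference conditional on age: {multiple[1]:.3f}")
```

In this made-up sample women are 12.5 years younger on average, so the simple regression attributes part of the age-related earnings difference to gender (a coefficient of about -0.35), while the regression that also includes age recovers the -0.10 built into the data.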
Case Study B1: Hotel Prices
- Data: ~217 hotels in Vienna, November 2017
- Sample: Hotels with 3-4 stars within 8 miles of city center
- Question: Finding underpriced hotels using multiple features
- Location: Page 4
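The idea behind this case study, ranking hotels by how far their price falls below what their features predict, can be sketched as follows. This is a simplified illustration, not the chapter's analysis: the six hotels are hypothetical, only distance and stars are used as features, price enters in levels, and `ols` is a hypothetical plain-Python helper that solves the normal equations.

```python
def ols(X, y):
    """Hypothetical helper: OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical hotels: (name, price, distance from center in miles, stars)
hotels = [
    ("A", 132, 1, 3), ("B", 109, 2, 3), ("C", 91, 3, 3),
    ("D", 158, 1, 4), ("E", 110, 2, 4), ("F", 123, 3, 4),
]

X = [[1, dist, stars] for _, _, dist, stars in hotels]
y = [price for _, price, _, _ in hotels]
coef = ols(X, y)  # [intercept, slope on distance, slope on stars]

# Residual = actual price minus the price predicted from the features
residuals = [
    (name, price - sum(b * x for b, x in zip(coef, row)))
    for (name, price, _, _), row in zip(hotels, X)
]

# The most negative residuals are the candidate deals
deals = sorted(residuals, key=lambda item: item[1])
for name, resid in deals[:2]:
    print(f"hotel {name}: residual {resid:.1f}")
```

Hotel E tops the list because it is priced well below other hotels with the same distance and star rating. A residual this negative may also reflect features missing from the regression, so it marks a candidate to inspect, not a guaranteed bargain.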
Code & Data
All case studies include:
- R Code: Complete, reproducible analysis with tidyverse
- Python Code: Python equivalents
- Stata Code: For regression-focused analyses
- Datasets: Cleaned and ready to use
- Codespaces: One-click cloud coding environment
Click any "Open in Codespace" button to:
1. Launch a pre-configured coding environment in your browser
2. Run all analyses without installing anything
3. Modify code and experiment with the data
4. See exactly how tables and figures were created
No setup required: just click and code!
AI Practice Tasks
Each page includes interactive AI practice tasks where you can:
- Copy prompts to use with AI assistants
- Get personalized explanations of concepts
- Generate practice problems tailored to your learning
- Check your understanding with worked examples
How to use:
1. Click "Copy & Open in AI Chat" on any AI task
2. Work through the explanation or problem
3. Return to the textbook to continue learning
Time Estimates
- Quick overview: 2-3 hours (read all pages, skim examples)
- Deep learning: 6-8 hours (work through all examples and AI tasks)
- With hands-on coding: 10-12 hours (replicate all analyses)
- Complete mastery: 15+ hours (coding + practice problems + extensions)
Recommended pace: 1-2 pages per study session
Part II: Regression Analysis Context
Chapter 10 is part of the Regression Analysis section of the textbook:
Chapter 7: Simple Regression
Foundation β one explanatory variable
Chapter 8: Complicated Patterns and Messy Data
Nonlinear patterns and robust methods
Chapter 9: Generalizing Results of a Regression
Statistical inference in simple regression
Chapter 10: Multiple Linear Regression (you are here)
Multiple explanatory variables and controlling for covariates
Chapter 11: Modeling Probabilities
Binary outcomes and logistic regression
Chapter 12: Regression with Time Series Data
Temporal patterns and forecasting
Prerequisites
Required knowledge:
- Chapter 7: Simple Regression (essential)
- Chapter 9: Generalizing Results of a Regression (essential)
- Basic statistics: mean, variance, correlation, standard deviation
- Understanding of confidence intervals and hypothesis tests
Helpful but not essential:
- Chapter 8: Complicated Patterns and Messy Data (for nonlinear models)
- Linear algebra basics (not covered in this book)
- Matrix notation (not used in this chapter)
Do not skip Chapters 7 and 9! Multiple regression builds directly on simple regression concepts. Without understanding simple regression, you will struggle with:
- What regression coefficients mean
- How to interpret standard errors and confidence intervals
- The logic of hypothesis testing
- The difference between correlation and causation
Study Strategies
For First-Time Learners
- Read sequentially: Don't skip ahead; concepts build on each other
- Pause at examples: Try to interpret results before reading the interpretation
- Use AI tasks actively: They reinforce learning better than passive reading
- Focus on intuition first: Understand "why" before memorizing formulas
- Return to review boxes: They summarize key concepts
For Review or Reference
- Start with review boxes: Get the key concepts quickly
- Jump to specific sections: Use the detailed table of contents
- Check the glossary: Quick definitions of all terms
- Review case study summaries: See applications without details
For Instructors
- Assign pages progressively: 4 natural units for homework/discussion
- Use AI tasks as assignments: Students submit their AI conversations
- Focus on case study interpretation: Better than just running code
- Emphasize review boxes: Core concepts students must master
- Page 4 is comprehensive: May need two class sessions to cover fully
What Makes This Chapter Unique
Compared to other textbooks:
- Integrated case studies: Same dataset across 6 studies showing progression
- Practical focus: Always connects theory to real applications
- Modern tools: Robust standard errors, emphasis on interpretation
- Honest about causality: Clear about what regression can and cannot show
- Visual learning: Extensive use of graphs and intuitive explanations
- Accessible math: Formulas included but explained intuitively
Key Concepts Preview
By the end of this chapter, you'll understand:
Core concepts:
- Multiple linear regression equation and interpretation
- Conditional vs. unconditional differences
- Controlling for covariates
- Omitted variable bias
Statistical concepts:
- Standard errors in multiple regression
- Confidence intervals and hypothesis tests
- F-tests for joint hypotheses
- Multicollinearity and its consequences
Practical tools:
- Dummy variables for categories
- Interaction terms for different slopes
- Nonlinear patterns in multiple regression
- Prediction and residual analysis
Big picture:
- When multiple regression helps with causality
- When to focus on prediction vs. causal inference
- How to select variables for different purposes
- Limits of observational data
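Among the core concepts previewed above, omitted variable bias has a compact algebraic form that is worth seeing early. The notation below is an assumption made for this preview (simple-regression slope $\beta$, multiple-regression slopes $\beta_1$ and $\beta_2$, and $\delta$ for the slope of $x_2$ on $x_1$); Section 10.3 develops the idea in full.

$$
\begin{aligned}
y^E &= \alpha + \beta x_1 && \text{(simple regression)} \\
y^E &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 && \text{(multiple regression)} \\
x_2^E &= \gamma + \delta x_1 && \text{(regression of } x_2 \text{ on } x_1\text{)} \\
\beta - \beta_1 &= \delta \beta_2 && \text{(omitted variable bias)}
\end{aligned}
$$

Substituting the third line into the second gives $y^E = (\beta_0 + \beta_2 \gamma) + (\beta_1 + \delta \beta_2) x_1$, so the simple-regression slope equals $\beta_1 + \delta \beta_2$. The two slopes on $x_1$ coincide only if $x_2$ is unrelated to $x_1$ ($\delta = 0$) or unrelated to $y$ conditional on $x_1$ ($\beta_2 = 0$).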
Ready to Start?
Begin with Page 1: Foundation
Or explore:
- Page 2: Statistical Inference
- Page 3: Extensions
- Page 4: Applications
- View Glossary: Quick reference for all terms
Chapter complete! All 4 pages covering Sections 10.1-10.12 are now available.
- Keep a notebook: Write down key insights and questions
- Work through formulas: Don't just read them, calculate examples
- Compare specifications: Notice how results change with different models
- Think causally: Always ask "what's omitted?" when interpreting coefficients
- Visualize results: Draw graphs to understand patterns
- Discuss with peers: Explaining concepts helps you learn
- Apply to your data: Think how methods apply to your research
Book Information
Full Title: Data Analysis for Business, Economics, and Policy
Authors: Gábor Békés & Gábor Kézdi
Publisher: Cambridge University Press (2021)
Interactive Edition: 2025
Resources:
- Main textbook site
- Code repository
- Datasets
- Instructor resources