Additional Reading Suggestions
Life did not stop when we finished the manuscript. Actually, we keep finding great stuff. So let us make some suggestions for additional reading, chapter by chapter.
Part I
Chapter 01
- On surveys, a great review is “How to Run Surveys: A Guide to Creating Your Own Identifying Variation and Revealing the Invisible”, an NBER working paper by Stefanie Stantcheva.
Chapter 06
- On p-hacking, a fantastic story about a body of research in social psychology was written up in the New York Times Magazine in 2017. The review of methods that started in 2012 soon led to the birth of the data investigation team Data Colada in 2013, founded by Profs Uri Simonsohn, Leif Nelson and Joe Simmons. They also wrote a paper on the p-curve, a tool to analyze a body of literature. Read any of the other Data Colada pieces on challenges to reproducibility. Amazing stuff.
Part II
Chapter 09
- Regarding external validity, one way to check robustness is to take out 1% of the data and repeat the exercise. The simple take is to do it many times at random, plus many times at the edges of the distribution of key variables. The smart take is suggested by Tamara Broderick, Ryan Giordano, and Rachael Meager in “An Automatic Finite-Sample Robustness Metric: Can Dropping a Little Data Change Conclusions?” Hard-core statistics. (Preprint.)
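
The "simple take" above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated, the "exercise" is a one-variable OLS slope, and the helper names (`ols_slope`, `drop_random`, `drop_tail`) are made up for this example. It is not the influence-based metric of Broderick, Giordano, and Meager, which finds the most damaging small subset to drop without brute-force re-estimation.

```python
import random

def ols_slope(x, y):
    """Slope of a simple linear regression of y on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def drop_random(x, y, share=0.01, reps=200, seed=0):
    """Drop a random `share` of rows `reps` times; return (min, max) slope."""
    rng = random.Random(seed)
    n = len(x)
    k = max(1, round(share * n))  # number of rows to drop each time
    slopes = []
    for _ in range(reps):
        dropped = set(rng.sample(range(n), k))
        keep = [i for i in range(n) if i not in dropped]
        slopes.append(ols_slope([x[i] for i in keep], [y[i] for i in keep]))
    return min(slopes), max(slopes)

def drop_tail(x, y, key, share=0.01, top=True):
    """Drop the top (or bottom) `share` of rows ranked by `key`; return slope."""
    n = len(x)
    k = max(1, round(share * n))
    order = sorted(range(n), key=lambda i: key[i], reverse=top)
    keep = order[k:]  # everything except the k most extreme rows
    return ols_slope([x[i] for i in keep], [y[i] for i in keep])

# Simulated data, purely illustrative: the true slope is 2.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(1000)]
y = [2 * xi + rng.gauss(0, 1) for xi in x]

full = ols_slope(x, y)
lo, hi = drop_random(x, y)
top_x = drop_tail(x, y, key=x, top=True)
bottom_x = drop_tail(x, y, key=x, top=False)

print(f"full sample slope:      {full:.3f}")
print(f"random 1% drops, range: [{lo:.3f}, {hi:.3f}]")
print(f"drop top 1% of x:       {top_x:.3f}")
print(f"drop bottom 1% of x:    {bottom_x:.3f}")
```

If the estimate moves a lot across these runs, the conclusion probably leans on a handful of observations. Ranking by the outcome, or by any other key variable, only requires passing a different `key`.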
Chapter 10
- On Simpson’s paradox, one article where it is front and center is about identifying who migrates, published in the [Journal of Development Economics](https://www.sciencedirect.com/science/article/abs/pii/S0304387824001081?via%3Dihub): Michael A. Clemens and Mariapia Mendola, “Migration from developing countries: Selection, income elasticity, and Simpson’s paradox,” Journal of Development Economics, Volume 171, 2024.
Part III
Chapter 16
- On partial dependence plots, you may check out a very useful review of R’s pdp package as well as Christoph Molnar’s Interpretable ML book.
- On a similar house price prediction project, Julia Silge does a super nice job going through the steps and showing graphs, making great use of text. Boosted trees, tidymodels, and more. Check out her post and video: Predict housing prices in Austin TX with tidymodels and xgboost.
- On why random forests work, a useful paper by Alicia Curth, Alan Jeffares, and Mihaela van der Schaar.
Part IV
Chapter 19
- On DAGs and potential outcomes, a deep discussion for social scientists: Imbens, Guido W. 2020. “Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics.” Journal of Economic Literature, 58 (4): 1129-79. An amazing review that includes Twitter quotes.
- Beetroot juice is said to be great: see a review study and another review. For example, one RCT with beetroot juice finds that dietary inorganic nitrate acutely reduces blood pressure (see the study and a review in a medical journal).
Chapter 20
- On A/B testing, there are some neat ideas in a presentation by Harlan Harris, with code in R.