Beyond: Directions to frontier


What’s beyond? I. Advanced stuff

Our textbook is not a pamphlet; as you know, it runs to 750 pages. But comprehensive as it is, it will obviously miss a great deal of methods, tools, and ideas. Here we collect some brief pointers to more advanced ideas that could be useful for an analyst.

What’s beyond? II. New ideas, changed views

When we were students, textbooks seemed like collections of ultimate and absolute truth. What's in a textbook must be right. Well, not really.

Sure, there are views that are pretty stable. The derivative of \(\ln(x)\) is \(1/x\). Smoking is bad for your health. But some widely held views do change. Back in the eighties, butter was thought to be bad and you were supposed to get margarine instead. It turned out that, while too much butter is not good, margarine is worse overall.

Similarly, in statistics, machine learning, and causal inference, we have known for a long time that OLS is BLUE. But the fact that we can get negative weights when estimating a time and unit fixed effects model is pretty new; it was discovered around the time we were writing the manuscript for the first edition.

Some stuff in the textbook is close to the frontier of knowledge. But this frontier may shift. So here we'll collect a bunch of very short discussions of emerging ideas.

Beyond

Part I

Chapter 01

  • We talk about surveys quite a bit in Chapter 01, giving some practical advice as well. But, of course, there is more. In many jobs, private and public, you will be asked to design surveys. A great review is Stefanie Stantcheva's NBER working paper “How to Run Surveys: A Guide to Creating Your Own Identifying Variation and Revealing the Invisible”, with ideas and cautionary tales.

Chapter 06

  • On p-hacking, a fantastic story is the one about a body of research in social psychology, written up in the New York Times Magazine in 2017. A review of methods that started in 2012 soon led to the creation of the data investigation team Data Colada in 2013 by Profs Uri Simonsohn, Leif Nelson, and Joe Simmons. They also wrote a paper on the p-curve, a tool to analyze a body of literature. Read other Data Colada pieces on the challenges of reproducibility as well. Amazing stuff.
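
To get a feel for what a p-curve does, here is a minimal sketch (not the Data Colada implementation): collect the statistically significant p-values reported in a literature and look at how they are distributed below the 0.05 threshold. The p-values below are made up for illustration.

```python
# Minimal sketch of the p-curve idea: how are the significant p-values
# reported in a literature distributed below the 0.05 threshold?
import numpy as np

# hypothetical set of reported p-values collected from published studies
reported_p = np.array([0.001, 0.004, 0.012, 0.023, 0.031,
                       0.038, 0.041, 0.044, 0.047, 0.049])

significant = reported_p[reported_p < 0.05]

# bin the significant p-values into five equal-width bins (0-0.01, ..., 0.04-0.05)
bins = np.linspace(0, 0.05, 6)
counts, _ = np.histogram(significant, bins=bins)

for lo, hi, n in zip(bins[:-1], bins[1:], counts):
    print(f"p in ({lo:.2f}, {hi:.2f}]: {n} studies")

# Rough reading: mass piled up near zero suggests evidential value;
# mass piled up just below 0.05 is a warning sign of p-hacking.
```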

Part II

Chapter 09

  • Regarding external validity, one way to check robustness is to take out 1% of the data and repeat the exercise. The simple take is to do it many times at random, plus many times by trimming the edges of the distribution of key variables. The smart take is suggested by Tamara Broderick, Ryan Giordano, and Rachael Meager in “An Automatic Finite-Sample Robustness Metric: Can Dropping a Little Data Change Conclusions?” Hard-core statistics. Preprint
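
A minimal sketch of the simple take, on simulated data (variable names and numbers are hypothetical, and this is not the Broderick–Giordano–Meager metric): drop a random 1% many times, re-estimate the regression, and see how much the key coefficient moves.

```python
# Drop a random 1% of observations many times and track the slope estimate.
import numpy as np

rng = np.random.default_rng(20)
n = 5000
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(scale=2.0, size=n)   # true slope = 0.5
X = np.column_stack([np.ones(n), x])

def ols_slope(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

full_sample_slope = ols_slope(X, y)

slopes = []
for _ in range(500):
    keep = rng.random(n) > 0.01          # drop roughly 1% at random
    slopes.append(ols_slope(X[keep], y[keep]))

print(f"full sample slope: {full_sample_slope:.3f}")
print(f"range after dropping 1%: [{min(slopes):.3f}, {max(slopes):.3f}]")
# If conclusions flip within this range, the result is fragile. Broderick,
# Giordano, and Meager's metric instead searches for the worst-case 1% to drop.
```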

Part III

Chapter 14

Chapter 16

Chapter 17

  • As machine learning methods developed, one key desire has been interpretability. What are the key predictors? How do features relate to the outcome? A massive literature emerged using SHAP, an algorithm based on the game-theoretic idea of Shapley values. One example is Christoph Molnar's Interpretable Machine Learning book. It looks like a great tool. However, a paper published in early 2024 showed severe limitations of the method: in the PNAS paper “Impossibility Theorems for Feature Attribution”, Bilodeau, Jaques, Koh, and Kim advised against relying on it.
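
To illustrate the game-theoretic idea, here is a from-scratch sketch of Shapley-value feature attribution for a tiny hypothetical model. The SHAP library uses much faster approximations; the model, baseline, and values below are made up.

```python
# Exact Shapley-value attribution by brute force: average each feature's
# marginal contribution over all orderings, with "absent" features set to a baseline.
from itertools import permutations
import numpy as np

def model(x):
    # hypothetical fitted model: a simple nonlinear prediction rule
    return 3.0 * x[0] + 2.0 * x[1] * x[2]

baseline = np.array([0.5, 0.5, 0.5])    # e.g. feature means in the training data
x_explain = np.array([1.0, 2.0, 0.0])   # the observation we want to explain
n_features = len(x_explain)

phi = np.zeros(n_features)
perms = list(permutations(range(n_features)))
for order in perms:
    current = baseline.copy()
    prev_pred = model(current)
    for i in order:
        current[i] = x_explain[i]        # "switch on" feature i
        new_pred = model(current)
        phi[i] += new_pred - prev_pred   # marginal contribution of feature i
        prev_pred = new_pred
phi /= len(perms)

print("attributions:", phi)
# Efficiency property: attributions sum to f(x) - f(baseline)
print("sum check:", phi.sum(), model(x_explain) - model(baseline))
```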

Part IV

Chapter 19

  • On DAGs and potential outcomes, a deep discussion for social scientists: Imbens, Guido W. 2020. “Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics.” Journal of Economic Literature 58 (4): 1129–79. LINK to paper. An amazing review that even includes Twitter quotes.

Chapter 20

  • On A/B testing, some neat ideas in a presentation by Harlan Harris, with code in R.
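
For a flavor of the basic calculation, here is a minimal sketch of a two-proportion z-test for an A/B comparison of conversion rates; the counts are made up and this is not taken from the Harris presentation.

```python
# Two-proportion z-test comparing conversion rates in control (A) and treatment (B).
from math import sqrt
from scipy.stats import norm

conversions_a, visitors_a = 120, 2400   # control
conversions_b, visitors_b = 156, 2350   # treatment

p_a = conversions_a / visitors_a
p_b = conversions_b / visitors_b
p_pool = (conversions_a + conversions_b) / (visitors_a + visitors_b)

se = sqrt(p_pool * (1 - p_pool) * (1 / visitors_a + 1 / visitors_b))
z = (p_b - p_a) / se
p_value = 2 * norm.sf(abs(z))           # two-sided p-value

print(f"control rate {p_a:.3f}, treatment rate {p_b:.3f}")
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```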

Chapter 24

  • Fixed effects. Difference in differences. Event studies. These core methods have just experienced their largest research revival in two decades. Since we finished the first edition manuscript at the beginning of 2020, several very important papers have been published on potential pitfalls of the two-way fixed effects model. There are several key new results. First, weights matter: for example, unit and time (panel) fixed effects results may be incorrect in the sense that some observations enter with negative weight. Second, more attention must be paid to selecting control observations, especially when the intervention is staggered (deployed at different times on different units) and/or heterogeneous in terms of dose. A small simulation sketch after this list illustrates the problem. [MORE]

  • Synthetic controls and difference in differences evolved separately, but recently a marriage of convenience has been taking place. A 2021 AER paper by Arkhangelsky, Athey, Hirshberg, Imbens, and Wager, “Synthetic Difference-in-Differences”, offered a way to think about them together. Also, Susan Athey has a great YouTube video presenting it. [MORE] On synthetic controls, see also Abadie and Vives-i-Bastida, “Synthetic Controls in Action”.
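
As referenced in the fixed effects bullet above, here is a small, made-up simulation sketch of why two-way fixed effects can go wrong under staggered adoption when effects grow over time. It illustrates the pitfall only; it is not any particular paper's estimator, and all numbers are invented.

```python
# Staggered adoption with dynamic (growing) treatment effects:
# the TWFE coefficient can land far from the average effect among the treated.
import numpy as np

rng = np.random.default_rng(7)
n_units, n_periods = 40, 8
early = np.arange(n_units) < 20          # early adopters treated from t = 2
adopt = np.where(early, 2, 6)            # late adopters treated from t = 6

rows = []
for i in range(n_units):
    alpha_i = rng.normal()
    for t in range(n_periods):
        treated = int(t >= adopt[i])
        # dynamic effect: grows with time since adoption
        effect = 1.0 + 0.5 * (t - adopt[i]) if treated else 0.0
        y = alpha_i + 0.2 * t + effect + rng.normal(scale=0.1)
        rows.append((i, t, treated, effect if treated else np.nan, y))

i_id = np.array([r[0] for r in rows])
t_id = np.array([r[1] for r in rows])
d = np.array([r[2] for r in rows], dtype=float)
y = np.array([r[4] for r in rows])

# TWFE: regress y on the treatment dummy plus unit and time dummies
unit_dummies = (i_id[:, None] == np.arange(n_units)).astype(float)
time_dummies = (t_id[:, None] == np.arange(1, n_periods)).astype(float)
X = np.column_stack([d, unit_dummies, time_dummies])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

true_att = np.nanmean([r[3] for r in rows])
print(f"TWFE estimate: {beta[0]:.2f}, average effect among treated: {true_att:.2f}")
# With staggered timing and effects that grow over time, the TWFE coefficient
# can be far from the average treated effect, because already-treated units
# serve as controls and some comparisons enter with negative weights.
```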