Popular but high quality books on data
- Nate Silver, The Signal and the Noise (2012) A great book on prediction from a leading expert in polling and sports statistics. A big-picture view covering statistical and other kinds of prediction; a must-read for all who want to do predictive analytics.
- Hans Rosling et al., [Factfulness: Ten Reasons We’re Wrong About the World–and Why Things Are Better Than You Think](https://www.gapminder.org/factfulness-book/) (2018) A book summarizing decades of public advocacy by the late doctor and epidemiologist Hans Rosling and his collaborators for understanding the world around us by making sense of cross-country data. A must-read for everyone, really.
- David Salsburg, The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century (2002) A great book about the history of statistics and statistical ideas, with many great stories. A must-read for statistics nerds.
- Philip Tetlock, Superforecasting: The Art and Science of Prediction A great book summarizing some of the research and ideas of one of the leading experts in prediction. Not explicitly about statistical predictions; more of a big-picture read for those who want to evaluate predictions.
- Nassim Nicholas Taleb: Fooled by Randomness: The Hidden Role of Chance in Life and in the Markets
- Seth Stephens-Davidowitz, Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are
- David Spiegelhalter, The Art of Statistics
- Cathy O’Neil, Weapons of Math Destruction
Books related to decision-making with data
- Daniel Kahneman, Thinking, Fast and Slow (2011) A great book summarizing a life’s research by the psychologist who won the Nobel Prize in economics.
- Stefan Szymanski, Money and Soccer, and Simon Kuper and Stefan Szymanski, Soccernomics. Two great books on understanding football via data.
- Michael Pollan: In Defense of Food (2008) A great book from an investigative journalist on what we should eat and why, with a very good description of what nutrition research can and cannot uncover using observational data.
- Michael Lewis, Moneyball
- Andrew Leigh: Randomistas: How Radical Researchers Are Changing Our World
- Cole Nussbaumer Knaflic: Storytelling with Data
- Kieran Healy, Data Visualization: A Practical Introduction
- Claus Wilke, Fundamentals of Data Visualization
- Alberto Cairo, How Charts Lie: Getting Smarter about Visual Information
More advanced stuff
More advanced books - analytics, statistics and data science
- Ajay Agrawal, Joshua Gans and Avi Goldfarb, Prediction Machines: The Simple Economics of Artificial Intelligence
- Eric Siegel, Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die
- Michael Luca and Max H. Bazerman (2020) The Power of Experiments: Decision Making in a Data-Driven World
- Judea Pearl, The Book of Why - intermediate book on causality
- Roger Peng, The Art of Data Science
- Bradley Efron and Trevor Hastie, Computer Age Statistical Inference - intense stat book with Big Data in mind
- Christopher M. Bishop, Pattern Recognition and Machine Learning
- Nina Zumel and John Mount, Practical Data Science with R - nice collection of coding/analysis ideas, often covered in our courses.
- Rob J. Hyndman and George Athanasopoulos, Forecasting: Principles and Practice - useful time series book
- John D. Kelleher and Brendan Tierney, Data Science (MIT Press, 2018)
Blogs and more
Interesting, non-technical articles
- Roger Peng on data science principles
- McKinsey’s non-technical discussion of machine learning
- David Donoho 50 years of data science
- Susan Athey in Science (2017) Beyond prediction: Using big data for policy problems
- American Statistical Association on research and the p-value: Statistical Significance and the Dichotomization of Evidence
- Time series forecasting competition materials https://www.m4.unic.ac.cy/
- Roger Peng on good data science
- NYT Upshot on polling errors
- 538 on nutrition. https://fivethirtyeight.com/features/you-cant-trust-what-you-read-about-nutrition/
- Nick Barrowman on why data is not independent of judgement: Why Data Is Never Raw
Podcasts, blogs to follow
- https://simplystatistics.org/ - A statistics blog by Rafa Irizarry, Roger Peng, and Jeff Leek
- http://nssdeviations.com/ - The Data Science Podcast. Roger Peng and Hilary Parker talk about the latest in data science and data analysis in academia and industry. [recommended]
- http://andrewgelman.com/ - Statistical Modeling, Causal Inference, and Social Science
Practice data and code
- Nice collection of data collections - https://www.columnfivemedia.com/100-best-free-data-sources-infographic
- Weekly newsletter - tinyletter.com/data-is-plural
- Nicely searchable source - public.enigma.com/#data-connections
- Nice educational collection of coding http://Idre.ucla.edu
- Very nice initiative for collaborative data projects. Includes many datasets with info. https://data.world/
- A collection of ML/AI papers with code. Mostly very technical - paperswithcode.com/
- Amazing collection by Hadley Wickham - DS Stats337
- U Washington Data Lab - Interviews on business data viz Enterprise-analysis-interviews