Data Analysis for Business, Economics, and Policy

**A comprehensive textbook on data analysis for students of business, applied economics, and public policy that uses case studies with real-world data.**

Why use this book?

Data analysis is a process. It starts with formulating a question and collecting appropriate data, or assessing whether the available data can help answer the question. Then comes cleaning and organizing the data, tedious but essential tasks that affect the results of the analysis as much as any other step in the process. Exploratory data analysis gives context to the eventual results and helps decide the details of the analytical method to be applied. The main analysis consists of choosing and implementing the method to answer the question, with potential robustness checks. Along the way, correct interpretation and effective presentation of the results are crucial. Carefully crafted data visualizations help summarize our findings and convey key messages. The final task is to answer the original question, with potential qualifications and directions for future inquiries.

Our textbook equips future data analysts with the most important tools, methods, and skills they need throughout the entire process of data analysis to answer data-focused, real-life questions. We cover all the fundamental methods that help along the way. The textbook is divided into four parts covering data wrangling and exploration, regression analysis, prediction with machine learning, and causal analysis. We explain when, why, and how the various methods work, and how they are related to each other. MORE on content

To cover all of the steps that are necessary to carry out an actual data analysis project, we lean on 47 fully developed case studies. While each case study focuses on the particular method discussed in its chapter, it illustrates all elements of the process, from question through analysis to conclusion. MORE on case studies

We share all the raw and cleaned data we use in the case studies. We also share the code that cleans the data and produces all results, tables, and graphs in Stata, R, and Python, so students can tinker with our code and compare the solutions across the different software packages. MORE on data and code

This textbook was written to be a complete course in data analysis. It can serve university students in graduate programs as a core text in applied statistics and econometrics, quantitative methods, or data analysis. It may also complement online courses that teach specific methods by giving more context and explanation. Undergraduate courses can also make use of this textbook, even though its workload exceeds the typical undergraduate workload. Finally, the textbook can serve as a handbook for practitioners to guide them through all steps of real-life data analysis. MORE on why use this book?

About authors

Gábor Békés

Gábor Békés is an Assistant Professor at the Department of Economics and Business of the Central European University and director of the MS in Business Analytics program. He is a research affiliate at the Center for Economic Policy Research (CEPR). He has published in top economics journals on multinational firm activities and productivity, business clusters, and innovation spillovers. He has managed international data collection projects on firm performance and supply chains. He has done both policy advising (the European Commission, ECB) and private sector consultancy (in finance, business intelligence, and real estate). He has taught graduate-level data analysis and economic geography courses since 2012. Personal website

Gábor Kézdi

Gábor Kézdi is a Research Associate Professor at the University of Michigan’s Institute for Social Research. He has published in top journals in economics, statistics, and political science on topics including household finances, health, education, demography, and ethnic disadvantages and prejudice. He has managed several data collection projects in Europe; currently, he is co-investigator of the Health and Retirement Study in the U.S. He has consulted for various governmental and non-governmental institutions on the disadvantage of the Roma minority and the evaluation of social interventions. He has taught data analysis, econometrics, and labor economics from undergraduate to Ph.D. levels since 2002 and has supervised a number of MA and Ph.D. students. Personal website

Case studies

We have briefly summarized all 47 of our case studies and provided the data and code for each. For details, check out Material for case studies.

You may also download a pdf with a summary of case studies.

To see the contents of the textbook, you may simply download the Table of contents.

For more details, check out Chapter summaries.

You can access sample chapters, too. HERE - TO ADD

Data and code

We provide access to all the data and code we used. You can download the data by topic, or get everything together in a zipped file. You can also get the code piece by piece or in a zipped file.

Code is available in R, Stata (except for machine learning) and Python (mostly).

For more, check out the Data and code page.

Teaching material

We provide 24 slideshows, one for each of the 24 chapters. For educational purposes, you can download, use, and modify them, provided you give us credit.

We provide answers to practice questions for instructors at the Cambridge UP website.