Teaching Guide

This teaching guide discusses the intended target audience, then offers advice on teaching the book in undergraduate and graduate programs, along with some tentative example sequences. It closes with advice on using auxiliary material.

Target audience

Who is this book for?

This textbook was written to be a complete course in data analysis. It introduces and discusses the most important concepts and methods in exploratory data analysis, regression analysis, machine learning and causal analysis. Thus, readers don’t need to have a background in those areas.

Tools and insights from Econometrics and Data Science

The textbook includes formulae to define methods and tools, but it explains all formulae in plain English, both when a formula is introduced and again when it is used in a case study. Thus, understanding formulae is not necessary to learn data analysis from this textbook. They are of great help, though, and we encourage all students and practitioners to work with formulae whenever possible. The mathematics background required to understand these formulae is modest, at the level of basic calculus.

This textbook could be useful for university students in graduate programs as a core text in applied statistics and econometrics, quantitative methods, or data analysis. The textbook is best used as a core text for non-research degree Masters programs or as part of the curriculum in PhD or research Masters programs. It may also complement online courses that teach specific methods, giving more context and explanation. Undergraduate courses can also make use of this textbook, even though the workload on students exceeds the typical undergraduate workload. Finally, the textbook can serve as a handbook for practitioners to guide them through all steps of real-life data analysis.

Expected background

The textbook is comprehensive in that no prior knowledge of statistics is required beyond high school math (such as basic calculus). Some knowledge of matrix algebra and probability is useful, but we cover, albeit briefly, what is needed.

No programming experience is needed, because students can learn coding during the course. Moreover, we are developing courses that can be taught alongside the textbook to teach coding for data analysis in R, Python, or Stata. The first versions of these courses are expected by Spring 2022.

The case studies cover a wide range of fields that can cater to the diverse interests of an undergraduate group. For these reasons the book is suitable for both graduate and undergraduate courses.

Undergraduate and graduate programs

Using the book for undergraduate teaching

The textbook can be used in undergraduate programs with a focus on business, economics, finance, quantitative social studies, or public policy. It turns out that half of the inspection copy requests were related to undergraduate programs.

We believe that the textbook may be covered over two academic years. Parts I and II would be the subjects in one academic year, and parts III and IV, or subsets of them, may be covered in subsequent courses.

In more academic programs where students are prepared for a PhD, a formal econometrics course should follow this book, covering derivations and advanced methods.

In more business-focused programs, selected chapters should be taught, with a focus on case studies and interpretation. Still, we believe that business and management students should also learn the basics of machine learning and causal analysis.

Example sequences: undergraduate programs

Undergraduate programs in social sciences and business

  • Program examples: 4-year US-style undergraduate programs and 3-year European-style undergraduate programs in social sciences, management, PPE
  • Duration: 1 academic year (2 semesters / 3 trimesters)
  • Sequence: The first two parts of the textbook: I Data Exploration and II Regression Analysis. The courses may include some bits from additional chapters.

Undergraduate programs focusing on economics or quantitative social sciences

  • Program examples: 4-year US-style undergraduate programs with an economics, quantitative social science, or business major; 3-year European-style specialized undergraduate programs in economics or quantitative social sciences
  • Duration: 1 academic year (2 semesters / 3 trimesters)
  • Sequence: Cover the book over multiple semesters. In the first year, ten chapters may be covered (with some sections skipped) from the first two parts of the textbook: I Data Exploration and II Regression Analysis. In subsequent courses, the rest of the material and the second half of the book may be used, too: III Prediction and IV Causal Analysis.

Using the book for graduate teaching

The textbook may be ideal material for applied graduate programs. Our goal has been to help students acquire a working knowledge of the entire process of data analysis, learning the most important methods. In particular, our textbook prepares students to carry out real-life data analysis without taking more courses.

The textbook is thus well suited for Masters-level programs, such as MBA, MA Economics (non-PhD track), MSc in Business Economics/Management, MA in Public Policy, MSc in Finance, MA in Health Policy, MA in Quantitative Social Sciences, or Business Analytics programs. Moreover, it is well suited for academic programs that are less focused on formal statistical and econometric theory, such as a PhD in Management or a PhD in Public Policy.

The textbook was designed with applied Masters-level programs in mind, and for these programs the whole textbook may be taught within an academic year. A parallel course on coding could be taught, too. As some students will arrive with some knowledge of descriptive statistics and regression analysis, the first 6-8 chapters may be covered faster.

More academic and two-year Masters programs may consider using this textbook in the first year, followed by formal courses in statistical and econometric theory for those targeting a PhD program in the future. Another option is to use this textbook to teach applied empirical methods in the second year. In our experience, students appreciate it when their first data analysis courses equip them with the skills and knowledge to start working on actual empirical projects. Moreover, applied courses tend to increase students’ motivation to learn more by taking more advanced and more formal statistics, econometrics, and predictive analysis / machine learning courses later.

Example sequences: graduate programs

Graduate applied business, economics, or social science programs

Graduate business analytics programs

  • Program examples: MSc in Business Analytics, MSc in Data Science for Business, MSc in Business Statistics
  • Duration: 1 academic year (2 semesters / 3 trimesters)
  • Sequence: The first three parts of the textbook: I Data Exploration, II Regression Analysis, III Prediction, perhaps adding the first two chapters of Part IV Causal Analysis (the framework and running experiments).

A single core data course in graduate business/management programs

  • Duration: one semester
  • Sequence: I Data Exploration and II Regression Analysis. It may make sense to cut from the heavier material in chapters 5 and 6 and, instead, devote time to the framework of prediction (Chapter 13) and designing experiments (Chapter 20).

Business, policy, or social science PhD programs, first year

PhD in economics, second-year applied econometrics / data analysis courses

  • Duration: 1 academic year (2 semesters / 3 trimesters)
  • Sequence: Add a course on prediction based on Part III. Add an applied course with a case study focus using bits from Part II Regression Analysis and Part IV. The prediction course may be complemented with other data science / forecasting books (James, Witten, Hastie and Tibshirani, or Hyndman and Athanasopoulos).

Instructors can use the entire book or parts of it

The book is designed to teach a year-long course in data analysis. It starts from the basics, such as data collection and descriptive statistics, and goes through regression analysis all the way to predictive algorithms and causal analysis. Its curated content should provide a sound basis for many types of analysis.

As an academic year is approximately 25-30 weeks long, the textbook with its 24 chapters may be covered on a week-by-week basis. Indeed, most chapters can be covered within a week, leaving extra time for some of the chapters and for review. Often, instructors have their own material and will use selected chapters as background reading. Another option is to start with some of our case studies. One-third of the book is devoted to the case studies, with motivation, data description, explanations, and the results themselves.

Comprehensive, deep and practical

Some advice on using auxiliary material

The book allows for individual learning, facilitating flexible teaching

The textbook is suitable for individual learning as well as traditional classroom teaching, online teaching, or any hybrid version. It aims to include everything students need for individual learning, including intuitive explanations, practical advice, practice questions, and the opportunity to work with code. All of this also helps instructors design their courses in flexible ways. In particular, they can assign entire chapters, or parts of chapters, as individual reading. That allows them to spend precious contact time on the especially important or interesting topics and to use teaching methods other than lecturing (e.g. going over case studies together with the students). The additional resources help instructors, too: they can assign homework or administer quizzes based on the practice questions, and they can have students replicate the case study results, possibly with small modifications, in homework assignments.

A few tips

Assign reading in advance. The textbook is suited for individual learning. Instructors can make active use of that feature and assign readings to students. Indeed, they can assign an entire chapter in advance, run simple quizzes at the beginning of class to make sure students have studied the material, and then focus on questions and case studies.

Run code in class. Instructors can reproduce all case study results by running code, in front of the class or together with students, in both online and offline settings. In our experience, students appreciate seeing how the sausage is made, and it empowers them to tinker with and run code on their own. In a traditional setting with lectures and practice sessions, practice session instructors should always run code in class. But lectures can include demonstrations of running code, too.

Make students replicate results. Each chapter ends with five data exercises, some of which are simple variations on the main case study. Students should start with the case study code and modify it to get the new results. In our experience, students like replicating the results and doing simple modifications, because this helps them with coding as well as with understanding the conceptual aspects of data analysis.

MCQ. Multiple-choice questions offered on the publisher’s website may be used for start-of-class quizzes or as exam questions.