The Linear Plan

This page exists because the book is a re-arrangement of an existing course, and a careful reader will want to know what was moved, why, and where the original lives. It also serves as a working document while the book is in draft: it explains the editorial logic of the chapter order so it can be challenged.

The arc in one paragraph

Start with what an LLM actually is and which one to use. Then learn to use chat models well — for documentation, joins, reports, graphs. Move to the terminal and to agentic CLIs (Claude Code), which let AI act on real files in real projects. Apply that workflow to text data — interview transcripts, sentiment. Then to research workflows — control variables, instrumental variables. Then to APIs and automation, which let you stop clicking and start scheduling. End with a capstone that exercises everything against one messy real-world question.

How weekly course material maps to chapters

| Course source | Chapter(s) |
| --- | --- |
| Week 0 — AI for coding prep | Ch. 1 (Foundations) |
| Week 1 — LLM review | Ch. 2 (Foundations) |
| Week 2 — Documentation | Ch. 5 (Working in chat) |
| Week 3 — Reporting | Ch. 8–9 (Working in chat) |
| Week 4 — Agentic AI with CLIs | Ch. 14 (CLI workflows) |
| Week 5 — Advanced CLI workflows | Ch. 15 (CLI workflows) |
| Week 6 — Data to report | Ch. 16 (CLI workflows) |
| Week 7 — Text as data | Ch. 20 (Text as data) |
| Week 8 — Sentiment analysis | Ch. 21 (Text as data) |
| Week 9 — AI research: controls | Ch. 23 (AI in research) |
| Week 10 — AI research: IV | Ch. 24 (AI in research) |
| Capstone (3 sessions) | Ch. 31–34 (Capstone part) |

How knowledge-base articles fold in

The course website’s “Knowledge Base” sidebar holds reusable reference pages. In the book they stop being a separate sidebar and become chapters, placed where you first need them.

| Knowledge-base page | Folded into |
| --- | --- |
| Which AI model | Ch. 3 — right after the LLM review |
| Glossary of LLM terms | Ch. 4 — closes Part I |
| Documentation fundamentals | Ch. 6 — alongside the documentation chapter |
| Joining data tables | Ch. 7 — before reports require joined data |
| Creating graphs | Ch. 10 — closes the chat-workflow part |
| Terminal basics | Ch. 11 — opens the CLI part |
| Installing AI CLI tools | Ch. 12 — install before use |
| VS Code + Copilot setup | Ch. 13 — alternative IDE-based workflow |
| Designing larger analytics projects | Ch. 17 — once CLIs are working |
| Reproducible research | Ch. 18 — closes the CLI part |
| NLP basics | Ch. 19 — opens text-as-data |
| PDF guide | Ch. 22 — closes text-as-data |
| Get AI API keys | Ch. 25 — opens the APIs part |
| APIs intro | Ch. 26 |
| LLM APIs in Python | Ch. 27 |
| APIs under the hood | Ch. 28 |
| Walkthrough: World Bank + FRED | Ch. 29 |
| Walkthrough: FBref | Ch. 30 |
| Beyond — what to read next | Reference |

What stays in the website but not the book

  • The slideshows are linked from the website but live elsewhere; the book mentions them in passing.
  • Case-study pages (with their data dictionaries and sample code) live in the website’s case-studies/ directory and are referenced from the book; the Reference part summarises each in one page.
  • The week-by-week schedule and dates belong to the live course, not the book.

Editorial principles for the draft

  1. One chapter, one idea. Where a course week stuffed two ideas into one page (e.g. text-analysis intro + sentiment), the book splits them into two chapters.
  2. Reference pages stop being reference pages. A reader on a linear path doesn’t want side-doors; they want the next chapter to assume what the previous one taught.
  3. Labs preserve original wording. The narrative is rewritten for a single voice; the labs keep the original assignment text so a student can submit them with confidence.
  4. The capstone is a part, not an appendix. It is the payoff for the whole book. Three chapters, one per session, with a brief that lets a self-learner do it solo or a team do it together.
  5. APIs come late. Most students will not need APIs until the capstone. Chapters 25–30 sit between the research-methods part and the capstone deliberately.

Editorial decisions for this edition

These were open questions in the first draft. They are now locked for the Spring 2026 edition. Subsequent editions can revisit, but a single edition holds these constant.

  • Code language: Python only. Earlier course editions juggled Python and R, which made every chapter longer than it needed to be and every example shallower than it could be. The book is Python-only; principles transfer.
  • CLI agent: Claude Code, with brief footnotes for alternatives. The CLI chapters are written against Claude Code — installation, CLAUDE.md, skills, headless mode. Cursor / Codex / Aider get one-paragraph mentions where they differ usefully. The principles transfer.
  • Format: web-first. The book is built and read on the web. A print edition is the print editor’s problem to derive from a frozen snapshot; we do not pre-emptively cripple the web version with print-friendly compromises (graphs keep their full width, hero sections stay, embeds stay).
  • Versioning: yearly spring edition. The book is updated once a year, in spring, dated and labelled — e.g. Spring 2026, 26 April 2026. The model snapshot lives in versions.qmd and is the first place to update each year. Inter-edition errata go in the GitHub issue tracker.
  • Econometrics review: brief, with a textbook redirect. Parts V and VII brush against DiD, IV, and selection-on-observables. The book covers the workflow — how AI helps, where it bluffs — and assumes the reader knows the method, or is willing to follow the redirect to Békés & Kézdi (2021) for it.
  • Chapter length: short. Many short single-idea chapters, not few long bundled ones. As content gets fleshed out, prefer splitting over merging.