# The Linear Plan
This page exists because the book is a re-arrangement of an existing course, and a careful reader will want to know what was moved, why, and where the original lives. It also serves as a working document while the book is in draft: it explains the editorial logic of the chapter order so it can be challenged.
## The arc in one paragraph
Start with what an LLM actually is and which one to use. Then learn to use chat models well — for documentation, joins, reports, graphs. Move to the terminal and to agentic CLIs (Claude Code), which let AI act on real files in real projects. Apply that workflow to text data — interview transcripts, sentiment. Then to research workflows — control variables, instrumental variables. Then to APIs and automation, which let you stop clicking and start scheduling. End with a capstone that exercises everything against one messy real-world question.
## How weekly course material maps to chapters
| Course source | Chapter(s) |
|---|---|
| Week 0 — AI for coding prep | Ch. 1 (Foundations) |
| Week 1 — LLM review | Ch. 2 (Foundations) |
| Week 2 — Documentation | Ch. 5 (Working in chat) |
| Week 3 — Reporting | Ch. 8–9 (Working in chat) |
| Week 4 — Agentic AI with CLIs | Ch. 14 (CLI workflows) |
| Week 5 — Advanced CLI workflows | Ch. 15 (CLI workflows) |
| Week 6 — Data to report | Ch. 16 (CLI workflows) |
| Week 7 — Text as data | Ch. 20 (Text as data) |
| Week 8 — Sentiment analysis | Ch. 21 (Text as data) |
| Week 9 — AI research: controls | Ch. 23 (AI in research) |
| Week 10 — AI research: IV | Ch. 24 (AI in research) |
| Capstone (3 sessions) | Ch. 31–34 (Capstone part) |
## How knowledge-base articles fold in
The course website’s “Knowledge Base” sidebar holds reusable reference pages. In the book they stop being a separate sidebar and become chapters, placed where you first need them.
| Knowledge-base page | Folded into |
|---|---|
| Which AI model | Ch. 3 — right after the LLM review |
| Glossary of LLM terms | Ch. 4 — closes Part I |
| Documentation fundamentals | Ch. 6 — alongside the documentation chapter |
| Joining data tables | Ch. 7 — before reports require joined data |
| Creating graphs | Ch. 10 — closes the chat-workflow part |
| Terminal basics | Ch. 11 — opens the CLI part |
| Installing AI CLI tools | Ch. 12 — install before use |
| VS Code + Copilot setup | Ch. 13 — alternative IDE-based workflow |
| Designing larger analytics projects | Ch. 17 — once CLIs are working |
| Reproducible research | Ch. 18 — closes the CLI part |
| NLP basics | Ch. 19 — opens text-as-data |
| PDF guide | Ch. 22 — closes text-as-data |
| Get AI API keys | Ch. 25 — opens APIs part |
| APIs intro | Ch. 26 |
| LLM APIs in Python | Ch. 27 |
| APIs under the hood | Ch. 28 |
| Walkthrough: World Bank + FRED | Ch. 29 |
| Walkthrough: FBref | Ch. 30 |
| Beyond — what to read next | Reference |
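The two tables above make a checkable claim: course weeks and knowledge-base pages together should account for every chapter exactly once. A quick sanity check one might run while re-ordering chapters (the sets below are simply transcribed from the tables; nothing here is part of the book's toolchain):

```python
# Chapters sourced from weekly course material (Weeks 0-10).
course_chapters = {1, 2, 5, 8, 9, 14, 15, 16, 20, 21, 23, 24}

# Chapters folded in from knowledge-base pages.
kb_chapters = {3, 4, 6, 7, 10, 11, 12, 13, 17, 18,
               19, 22, 25, 26, 27, 28, 29, 30}

# Capstone part, Ch. 31-34.
capstone_chapters = set(range(31, 35))

# No chapter is claimed by both sources...
assert course_chapters.isdisjoint(kb_chapters)

# ...and together they cover Ch. 1-30 with no gaps.
assert course_chapters | kb_chapters == set(range(1, 31))

total = len(course_chapters | kb_chapters | capstone_chapters)
print("mapping is complete:", total, "chapters")  # → mapping is complete: 34 chapters
```

A check like this is cheap insurance while the chapter order is still being challenged: any renumbering that drops or double-books a chapter fails the assertions immediately.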
## What stays on the website but not in the book
- The slideshows are linked from the website but live elsewhere; the book mentions them in passing.
- Case-study pages (with their data dictionaries and sample code) live in the website’s `case-studies/` and are referenced from the book; the Reference part summarises each in one page.
- The week-by-week schedule and dates belong to the live course, not the book.
## Editorial principles for the draft
- One chapter, one idea. Where a course week stuffed two ideas into one page (e.g. text-analysis intro + sentiment), the book splits them into two chapters.
- Reference pages stop being reference pages. A reader on a linear path doesn’t want side-doors; they want the next chapter to assume what the previous one taught.
- Labs preserve original wording. The narrative is rewritten for a single voice; the labs keep the original assignment text so a student can submit them with confidence.
- The capstone is a part, not an appendix. It is the payoff for the whole book. Three chapters, one per session, with a brief that lets a self-learner do it solo or a team do it together.
- APIs come late. Most students will not need APIs until the capstone. Chapters 25–30 sit between the research-methods part and the capstone deliberately.
## Editorial decisions for this edition
These were open questions in the first draft. They are now locked for the Spring 2026 edition. Subsequent editions can revisit, but a single edition holds these constant.
- Code language: Python only. Earlier course editions juggled Python and R, which made every chapter longer than it needed to be and every example shallower than it could be. The book is Python-only; principles transfer.
- CLI agent: Claude Code, with brief footnotes for alternatives. The CLI chapters are written against Claude Code — installation, `CLAUDE.md`, skills, headless mode. Cursor / Codex / Aider get one-paragraph mentions where they differ usefully. The principles transfer.
- Format: web-first. The book is built and read on the web. A print edition is the print editor’s problem to derive from a frozen snapshot — we do not pre-emptively cripple the web version with print-friendly compromises (no width-limited graphs, hero sections stay, embeds stay).
- Versioning: yearly spring edition. The book is updated once a year, in spring, dated and labelled — e.g. Spring 2026, 26 April 2026. The model snapshot lives in `versions.qmd` and is the first place to update each year. Inter-edition errata go in the GitHub issue tracker.
- Econometrics review: brief, with a textbook redirect. Parts V and VII brush against DiD, IV, and selection-on-observables. The book covers the workflow — how AI helps, where it bluffs — and assumes the reader knows the method, or is willing to follow the redirect to Békés & Kézdi (2021) for it.
- Chapter length: short. Many short single-idea chapters, not few long bundled ones. As content gets fleshed out, prefer splitting over merging.