The Linear Plan

This page exists because the book is a re-arrangement of an existing course, and a careful reader will want to know what was moved, why, and where the original lives. It also serves as a working document while the book is in draft: it explains the editorial logic of the chapter order so it can be challenged.

The arc in one paragraph

Start with what an LLM actually is and which one to use. Then learn to use chat models well — for documentation, joins, reports, graphs. Move to the terminal and to agentic CLIs (Claude Code), which let AI act on real files in real projects. Apply that workflow to text data — interview transcripts, sentiment. Then to research workflows — control variables, instrumental variables. Then to APIs and automation, which let you stop clicking and start scheduling. End with a capstone that exercises everything against one messy real-world question.

How weekly course material maps to chapters

| Course source | Chapter(s) |
| --- | --- |
| Week 0 — AI for coding prep | Ch. 1 (Foundations) |
| Week 1 — LLM review | Ch. 2 (Foundations) |
| Week 2 — Documentation | Ch. 5 (Working in chat) |
| Week 3 — Reporting | Ch. 8–9 (Working in chat) |
| Week 4 — Agentic AI with CLIs | Ch. 14 (CLI workflows) |
| Week 5 — Advanced CLI workflows | Ch. 15 (CLI workflows) |
| Week 6 — Data to report | Ch. 16 (CLI workflows) |
| Week 7 — Text as data | Ch. 20 (Text as data) |
| Week 8 — Sentiment analysis | Ch. 21 (Text as data) |
| Week 9 — AI research: controls | Ch. 23 (AI in research) |
| Week 10 — AI research: IV | Ch. 24 (AI in research) |
| Capstone (3 sessions) | Ch. 31–34 (Capstone part) |

How knowledge-base articles fold in

The course website’s “Knowledge Base” sidebar holds reusable reference pages. In the book they stop being a separate sidebar and become chapters, placed where you first need them.

| Knowledge-base page | Folded into |
| --- | --- |
| Which AI model | Ch. 3 — right after the LLM review |
| Glossary of LLM terms | Ch. 4 — closes Part I |
| Documentation fundamentals | Ch. 6 — alongside the documentation chapter |
| Joining data tables | Ch. 7 — before reports require joined data |
| Creating graphs | Ch. 10 — closes the chat-workflow part |
| Terminal basics | Ch. 11 — opens the CLI part |
| Installing AI CLI tools | Ch. 12 — install before use |
| VS Code + Copilot setup | Ch. 13 — alternative IDE-based workflow |
| Designing larger analytics projects | Ch. 17 — once CLIs are working |
| Reproducible research | Ch. 18 — closes the CLI part |
| NLP basics | Ch. 19 — opens text-as-data |
| PDF guide | Ch. 22 — closes text-as-data |
| Get AI API keys | Ch. 25 — opens the APIs part |
| APIs intro | Ch. 26 |
| LLM APIs in Python | Ch. 27 |
| APIs under the hood | Ch. 28 |
| Walkthrough: World Bank + FRED | Ch. 29 |
| Walkthrough: FBref | Ch. 30 |
| Beyond — what to read next | Reference |

What stays in the website but not the book

  • The slideshows are linked from the website but live elsewhere; the book mentions them in passing.
  • Case-study pages (with their data dictionaries and sample code) live in the website’s case-studies/ directory and are referenced from the book; the Reference part summarises each in one page.
  • The week-by-week schedule and dates belong to the live course, not the book.

Editorial principles for the draft

  1. One chapter, one idea. Where a course week stuffed two ideas into one page (e.g. text-analysis intro + sentiment), the book splits them into two chapters.
  2. Reference pages stop being reference pages. A reader on a linear path doesn’t want side-doors; they want the next chapter to assume what the previous one taught.
  3. Labs preserve original wording. The narrative is rewritten for a single voice; the labs keep the original assignment text so a student can submit them with confidence.
  4. The capstone is a part, not an appendix. It is the payoff for the whole book. Three chapters, one per session, with a brief that lets a self-learner do it solo or a team do it together.
  5. APIs come late. Most students will not need APIs until the capstone. Chapters 25–30 sit between the research-methods part and the capstone deliberately.

Editorial decisions for this edition

These were open questions in the first draft. They are now locked for the Spring 2026 edition. Subsequent editions can revisit, but a single edition holds these constant.

  • Code language: Python only. Earlier course editions juggled Python and R, which made every chapter longer than it needed to be and every example shallower than it could be. The book is Python-only; principles transfer.
  • CLI agent: Claude Code, with brief footnotes for alternatives. The CLI chapters are written against Claude Code — installation, CLAUDE.md, skills, headless mode. Cursor / Codex / Aider get one-paragraph mentions where they differ usefully. The principles transfer.
  • Format: web-first. The book is built and read on the web. A print edition is the print editor’s problem to derive from a frozen snapshot; we do not pre-emptively cripple the web version with print-friendly compromises (graphs keep their full width, hero sections stay, embeds stay).
  • Versioning: yearly spring edition. The book is updated once a year, in spring, dated and labelled — e.g. Spring 2026, 26 April 2026. The model snapshot lives in versions.qmd and is the first place to update each year. Inter-edition errata go in the GitHub issue tracker.
  • Econometrics review: brief, with a textbook redirect. Parts V and VII brush against DiD, IV, and selection-on-observables. The book covers the workflow — how AI helps, where it bluffs — and assumes the reader knows the method, or is willing to follow the redirect to Békés & Kézdi (2021) for it.
  • Chapter length: short. Many short single-idea chapters, not few long bundled ones. As content gets fleshed out, prefer splitting over merging.