Week 04: Agentic AI with CLIs like Claude Code
From chat to terminal - AI that works directly with your files

About the class
So far we’ve used AI through chat interfaces - copying code back and forth, describing our data, pasting error messages. This week we introduce agentic AI: AI that can see your files, run your code, and iterate on results directly. The tool we use, Claude Code, is a command-line interface (CLI) tool - it runs in your terminal. You type requests, the AI responds with code, runs it, and shows results - all in one place. It can assemble knowledge with a great deal of automation.
Instead of describing your data to an AI, you can say “look at these CSV files and tell me what’s here.” Instead of copying error messages, Claude Code sees them and fixes the code itself.
Tool Options
We use Claude Code in this class, but the concepts apply to similar tools:
| Tool | Provider | Notes |
|---|---|---|
| Claude Code | Anthropic | What we use. Requires Claude Pro or API. |
| Gemini CLI | Google | Similar workflow, uses Gemini models. |
| Codex CLI | OpenAI | OpenAI’s command-line tool. |
| Copilot in VS Code | GitHub/OpenAI | AI-powered code completion in VS Code. |
| Cursor / Windsurf | Third-party | IDE-based, similar agentic features. |
| Antigravity | Google | IDE with AI coding assistance. |
There are differences in capabilities and focus, but the core concepts are the same.
Everything you learn today transfers to these alternatives. The prompts, the workflow, the verification habits - all the same. We’ll use Claude Code as we go, but you can substitute any of the other tools. These tools keep evolving, so focus on the process more than the specifics of any one tool.
Learning Objectives
By the end of this session, students will:
- Understand the difference between chat-based AI and agentic AI tools
- Install and configure Claude Code on their machine
- Use Claude Code to explore, understand, and analyze a multi-file dataset
- Experience the workflow difference: iteration speed, context awareness, debugging
Before class
Setup Required - Do This Before Class
1. Installation and setup
Follow the installation guide: Installing AI CLI Tools
We use Claude Code, but Gemini CLI or Codex CLI also work. The guide covers all options for Windows, Mac, and Linux.
Claude Code lives in the Terminal. New to the terminal? Start with Terminal Basics
2. Ensure Python is installed
Claude Code can run Python code, so Python must be installed:
- Download from python.org/downloads
- Check “Add Python to PATH” during installation
- Install packages:
python -m pip install pandas numpy matplotlib
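To confirm the installation worked, you can check that each package is visible to the Python interpreter you run from the terminal. This is a minimal sketch using only the standard library; the helper name `check_packages` is our own and not part of any tool.

```python
import importlib.util
import sys


def check_packages(names):
    """Return {package_name: True/False} depending on whether it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, found in check_packages(["pandas", "numpy", "matplotlib"]).items():
        print(f"{name:12s} {'OK' if found else 'MISSING'}")
```

If a package shows as MISSING, rerun the install command above with the same `python` you used to run this check.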
3. Ensure you have a GitHub account
- and know the basics of working with git
Class Plan
🤝 Review on readme docs (20 min)
Review Readme assignment
- broad discussion of experience
What worked, what was different?
- First run
- Extended context window run 1 (create a system prompt)
- Extended context window run 2 (with slideshows)
Problems with AI generated reports.
Read each other’s documents.
- control?
- style?
Part 1: Why CLI with Agentic AI? (30 min)
Why CLI Tools? The Core Benefits
Think about the friction in using chat-based AI for data analysis:
- Copy CSV/code → Paste to ChatGPT
- Upload files directly - expensive for free plans
- See error → Copy error → Paste back to ChatGPT
- Repeat…
CLI tools eliminate this friction:
Files are already there
- Claude Code sees your CSV files, R scripts, and outputs directly
- No uploading, no copy-paste, no describing your data structure
- Just: “Look at these files and calculate average occupancy by city”
Code runs immediately
- AI writes code and executes it in one step
- Sees errors and fixes them automatically
- You get results, not just code snippets
Context stays intact
- Remembers your entire project structure
- Understands how files relate to each other (hotels.csv joins with cities.csv)
- Keeps track of what you’ve done across multiple steps
Iteration is fast
- “Something looks wrong” → AI investigates without you copying data
- “Try it a different way” → Runs new approach immediately
- No manual round-trips for every change
This covers the basics to get started. For advanced workflows (custom commands, git integration, large-scale data handling), see Advanced CLI Workflows.
What Stays the Same
Underlying intelligence: Both use state-of-the-art models (Claude Sonnet/Opus 4.5 vs. Gemini 3.0 Pro/Flash) with comparable reasoning for statistical analysis and code generation
Natural language interface: Describe tasks in plain language: “Clean this dataset and create a summary statistics table” or “Debug why my regression results look wrong”
Iterative refinement: Both support conversational debugging—ask the CLI to fix errors, try different estimation methods, or refine exhibit formatting through multiple rounds
Your preferences respected: Both learn and apply your coding standards (tidyverse, viridis, clean functions) once configured through their respective .md instruction files
Multimodal input for data work: Both handle various data formats: CSV, Excel, PDFs, images of tables/charts. Useful for extracting data from PDFs, reproducing charts from papers, or generating code from hand-drawn sketches
Discussion: How Would You Do This?
Task: “What’s the average hotel occupancy by city in this dataset?”
You have a folder with several CSV files. You want to answer this question using AI assistance.
Think-pair-share (10 min):
- What steps would you take using ChatGPT or Claude.ai (chat interface)?
- What information would you need to give the AI?
- What could go wrong? How would you handle errors?
Discuss with a neighbor, then share with the class.
The friction: Every step requires manual copy-paste. Context gets lost. Errors require round-trips.
Part 2: Hands-on - First steps (25 min)
Setting Up Claude Code
set up folder
Create a git project folder for the Austrian Hotels data: austrian-hotels-data
get the data
Download the Austrian Hotels dataset from here.
launch Claude Code
Open your terminal, navigate to your data folder:
cd path/to/austrian-hotels-data
claude
Task 1: setup
Ask to set up a data analysis project folder with data already there.
Have a look, and ask for changes if you’d like
Task 2: data
Get all csv files in the /data folder
Ask for a summary of data tables
Interact to learn more about the data
- What files are in this folder? Give me a quick overview.
- Show me 5 sample rows from each CSV file.
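For comparison, here is roughly what such an overview amounts to when done by hand. It is a sketch using only the standard library; the function name `preview_csvs` and the folder name `data` are assumptions for illustration, and the code Claude Code writes for you will likely differ.

```python
import csv
from pathlib import Path


def preview_csvs(folder, n_rows=5):
    """Map each CSV file name in `folder` to (column names, first n_rows rows)."""
    previews = {}
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader, [])  # empty list if the file has no header row
            rows = [row for _, row in zip(range(n_rows), reader)]
        previews[path.name] = (header, rows)
    return previews


if __name__ == "__main__":
    for name, (header, rows) in preview_csvs("data").items():
        print(name, header)
        for row in rows:
            print("   ", row)
```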
Task 3: relations
- ask to look at data as linked tables, and tell you keys to join them
- Read the hotels and cities files. How are they related? What’s the join key?
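One way to sanity-check a proposed join key yourself is to list the columns two tables share and see whether each is unique in either table. The sketch below uses pandas on toy data; the column names (`hotel_id`, `city_id`, `city_name`) are hypothetical stand-ins, not the actual schema of the dataset.

```python
import pandas as pd


def candidate_keys(left, right):
    """List columns present in both tables and whether each is unique per table."""
    shared = [col for col in left.columns if col in right.columns]
    return pd.DataFrame(
        {
            "column": shared,
            "unique_in_left": [left[col].is_unique for col in shared],
            "unique_in_right": [right[col].is_unique for col in shared],
        }
    )


# Toy stand-ins for the hotels and cities tables (columns are hypothetical).
hotels = pd.DataFrame({"hotel_id": [1, 2, 3], "city_id": [10, 10, 20]})
cities = pd.DataFrame({"city_id": [10, 20], "city_name": ["Vienna", "Graz"]})

print(candidate_keys(hotels, cities))
```

A shared column that is unique in one table but repeated in the other suggests a many-to-one relationship, which is what you would expect between hotels and cities.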
Task 4: simple analysis
- Look at the CSV files here and calculate average occupancy by city
- write python code, run it, and show results
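The code Claude Code produces for this task will probably look something like the following sketch: join hotels to cities, then group and average. The tables and column names here are made-up toy data for illustration; the real files may be organized differently.

```python
import pandas as pd

# Toy data; column names and values are hypothetical.
hotels = pd.DataFrame(
    {
        "hotel_id": [1, 2, 3, 4],
        "city_id": [10, 10, 20, 20],
        "occupancy_rate": [0.80, 0.60, 0.50, 0.70],
    }
)
cities = pd.DataFrame({"city_id": [10, 20], "city_name": ["Vienna", "Graz"]})

# Attach the city name to every hotel, then average occupancy within each city.
merged = hotels.merge(cities, on="city_id", how="left")
avg_occupancy = (
    merged.groupby("city_name")["occupancy_rate"]
    .mean()
    .sort_values(ascending=False)
)
print(avg_occupancy)
```

Reading the generated code against a pattern like this helps you spot when the AI has picked a different key or averaged the wrong column.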
Task 5 Discussions
- How does this compare to using ChatGPT?
- Where do you see the biggest advantages? Any downsides or risks?
- What analysis questions could you ask with this data?
Part 3: Hands-on - Cleaning
Task 1: Class discussion on potential problems
- what are the crucial steps when cleaning tabular data
Task 2: Cleaning with Claude Code
- Use Claude Code to find problems.
- Create clean versions of the data tables in a new folder, /data_cleaned
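Typical problems to look for include duplicate keys, missing values, and values outside a plausible range. The sketch below counts each of these with pandas; the function name `quality_report`, the columns, and the bounds are assumptions chosen for illustration.

```python
import pandas as pd


def quality_report(df, key, bounds=None):
    """Count common problems: duplicate keys, missing values, out-of-range values."""
    report = {
        "rows": len(df),
        "duplicate_keys": int(df[key].duplicated().sum()),
        "missing_by_column": df.isna().sum().to_dict(),
    }
    for col, (low, high) in (bounds or {}).items():
        # Missing values are reported separately, so exclude them here.
        outside = ~df[col].between(low, high) & df[col].notna()
        report[f"{col}_out_of_range"] = int(outside.sum())
    return report


# Toy table with one duplicate id, one missing value, and two implausible values.
hotels = pd.DataFrame(
    {
        "hotel_id": [1, 2, 2, 4],
        "stars": [3, 5, 5, 7],
        "occupancy_rate": [0.8, None, 0.6, 1.4],
    }
)
report = quality_report(
    hotels, key="hotel_id", bounds={"stars": (1, 5), "occupancy_rate": (0, 1)}
)
print(report)
```

A report like this only tells you where the problems are. How to resolve them (drop, correct, or flag) is a judgment call worth discussing before letting the AI decide.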
Part 4: Hands-on - Joining tables
Task 1: quick review on joins
- Review of key concepts: Joining Tables Guide
Task 2 joining tables
Join hotels and cities - “Join the hotels and cities data. How many hotels are in each province?”
Aggregate occupancy - “What’s the average occupancy rate by city? Show me a table sorted highest to lowest.”
Find patterns - “Which 5-star hotels have the lowest average daily rate? Something seems off - investigate.”
Tips:
- If something looks wrong, ask “Why did that happen?” or “Check the row counts”
- Ask Claude Code to show intermediate steps: “Show me the data after the join, before aggregating”
- If you get an error, just wait - Claude Code will often fix it automatically
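The "check the row counts" tip can be made concrete with two options of pandas `merge`: `indicator=True` adds a `_merge` column showing which rows found a match, and `validate` raises an error if the key relationship is not what you expect. This is a sketch on toy data; the column names and values are hypothetical.

```python
import pandas as pd

# Toy data: hotel 3 points to a city_id that is absent from the cities table.
hotels = pd.DataFrame({"hotel_id": [1, 2, 3], "city_id": [10, 20, 99]})
cities = pd.DataFrame(
    {"city_id": [10, 20, 30], "province": ["Vienna", "Styria", "Tyrol"]}
)

joined = hotels.merge(
    cities,
    on="city_id",
    how="left",               # keep every hotel, matched or not
    indicator=True,           # adds a _merge column: both / left_only / right_only
    validate="many_to_one",   # fails loudly if city_id repeats in cities
)

print("rows before:", len(hotels), "rows after:", len(joined))
print(joined["_merge"].value_counts())

# Hotels whose city_id has no match in the cities table
unmatched = joined[joined["_merge"] == "left_only"]
print(unmatched)
```

With an inner join, the unmatched hotel would disappear silently, which is why comparing row counts before and after the join matters.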
Task 3, manual check
- open a created data table and look into it.
- how would you test and debug?
Part 5: Hands-on Generate New Data
The Power Move: AI Creates Data
One of the most useful capabilities of Claude Code is generating realistic simulated data. This is exactly how the Austrian Hotels dataset was created - by an earlier version of Claude!
Task 1: Creating Booking Channel Data
- Let us create a new data table:
I want to create a new CSV file called hotel_bookings.csv that shows
what percentage of each hotel's bookings come from different channels
Write Python code to generate this, using the hotels_modified.csv as input.
- How could you be more specific to get realistic patterns?
- How would you check?
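One basic check is that the channel percentages for each hotel sum to 100. The sketch below generates such shares with the standard library and a fixed seed. The channel names and the gamma-distributed weights are arbitrary choices for illustration and do not reflect real booking patterns or how the course data was actually generated.

```python
import random

# Hypothetical channel names, chosen only for this example.
CHANNELS = ["direct", "online_travel_agency", "travel_agent", "corporate"]


def booking_shares(hotel_ids, seed=42):
    """Return one row per hotel with channel percentages that sum to 100."""
    rng = random.Random(seed)  # fixed seed makes the simulated data reproducible
    rows = []
    for hotel_id in hotel_ids:
        weights = [rng.gammavariate(2.0, 1.0) for _ in CHANNELS]
        total = sum(weights)
        shares = [round(100 * w / total, 1) for w in weights]
        # Put the rounding error into the last channel so the row sums to 100.
        shares[-1] = round(100 - sum(shares[:-1]), 1)
        rows.append({"hotel_id": hotel_id, **dict(zip(CHANNELS, shares))})
    return rows


rows = booking_shares([1, 2, 3])
for row in rows:
    print(row)
```

The rows could then be written to hotel_bookings.csv with `csv.DictWriter`. To get more realistic patterns, you would tell the AI which hotel characteristics should shift the shares, then verify that the generated data actually shows those shifts.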
Part 6: Debugging and Iteration
When Things Go Wrong
Claude Code isn’t perfect. Common issues:
- Wrong join type - Ask: “How many rows before and after the join? Did we lose data?”
- Missing values - Ask: “Are there NaN values? Where did they come from?”
- Unexpected results - Ask: “Walk me through the calculation step by step”
- Code errors - Often Claude Code fixes these automatically. If not, just describe what went wrong.
- Hallucinations - AI may generate plausible but incorrect code or stats. Always verify.
Trust but Verify
Always check:
- Row counts after joins
- Summary statistics (do means make sense?)
- A few random rows (do values look realistic?)
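The last two checks take only a few lines of pandas. In the sketch below, the summary table and its numbers are invented toy values standing in for whatever the AI produced, so treat the column names as assumptions.

```python
import pandas as pd

# Toy result table; all values are made up for illustration.
summary = pd.DataFrame(
    {
        "city_name": ["Vienna", "Graz", "Linz", "Salzburg"],
        "n_hotels": [120, 35, 28, 60],
        "avg_occupancy": [0.71, 0.64, 0.58, 0.69],
    }
)

# Do the means make sense? Occupancy is a share, so it must lie in [0, 1].
in_range = bool(summary["avg_occupancy"].between(0, 1).all())
print("occupancy within [0, 1]:", in_range)
print(summary["avg_occupancy"].describe())

# Eyeball a few random rows; random_state makes the draw repeatable.
spot_check = summary.sample(n=2, random_state=0)
print(spot_check)
```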
Good habit: Ask Claude Code to explain what it did:
Explain the code you just wrote. What assumptions did you make?
Operation tips
- use git projects
- as CLI tools are file-based, it’s easier to manage with git
- also gives you version control and backup (if something goes wrong you can pull back to a working state)
- start with
- higher level review: what do I have in the folder?
- thinking and plan first
- execute
resources
Bottom Line for Data Analysis
CLI tools shine when you have:
- Complex multi-file projects with data pipelines (raw → clean → analysis → exhibits)
- Need for reproducible workflows that others can run
- Large datasets or documents requiring substantial context
- Iterative analysis where the AI should test and debug autonomously
The chat interface remains better for:
- Quick one-off questions or code snippets
- Exploratory conversations about methodology
- Situations where you want tight control over each step
Discussion Questions
End of Class Reflection:
Workflow change: How did using Claude Code feel different from chat-based AI? What was faster? What was harder?
Trust calibration: When did you trust Claude Code’s output? When did you double-check? How do you decide?
Use cases: For what tasks would you now prefer Claude Code over ChatGPT/Claude.ai? When would you still use chat?
Data generation: What did you learn from creating synthetic data? How could this help in your own projects?
Assignment
Due: Sunday 23:55 before Week 5
Summary: Using Claude Code, create a new data table that joins to the Austrian Hotels dataset, then perform an analysis that answers an interesting question.
Resources
Claude Code & Terminal:
- Terminal Basics - Essential commands
- Install Claude Code - Installation guide
- Claude Code Documentation - Official docs
Additional references
- Short showcase of what is possible now (Jan 2026) - must watch, 15 mins
Course Reference:
- Technical Terms Glossary - Key AI/data concepts
- Joining Tables Guide - Join types explained
- Advanced CLI Workflows - Power features for experienced users
Austrian Hotels Dataset:
- Dataset Overview - All files and documentation
- Data Schema - Table descriptions and relationships
Python Basics (if needed):
- Python for Data Analysis - Free online book
- pandas documentation: pandas.pydata.org
Some personal comments
- This is where AI gets genuinely useful for data work. Chat interfaces are great for learning and quick questions, but agentic AI changes how you actually do the work.
- The booking channel data we use in class? It was generated by Claude Code while preparing this course. Meta, isn’t it?
- Don’t worry if setup takes time. Getting your environment right is a one-time cost that pays off quickly.