Week 05: Advanced CLI Workflows
Power features for reproducible research with Claude Code and Gemini CLI

Overview
Last week we introduced agentic AI with Claude Code: running AI in the terminal that can see files, run code, and iterate. This week we go deeper: custom skills, project-specific instructions, git integration, and autonomous execution. These features turn CLI tools from clever assistants into reproducible research companions.
The focus is on Claude Code, with notes on Gemini CLI equivalents where relevant.
Learning Outcomes
By the end of the session, students will:
- Create custom skills (reusable workflows) for their data analysis projects
- Set up project-specific instructions with CLAUDE.md/GEMINI.md
- Use git integration for version-controlled, traceable analysis
- Understand autonomous execution modes and when to use them
Preparation / Before Class
Prerequisites
Required:
- Working Claude Code installation from Week 04
- Basic familiarity with running Claude Code in a project directory
- A project folder with at least one data file and one script
Recommended Reading:
- Advanced CLI Workflows reference page: the full reference guide for today's topics
- Installing CLI Tools: if you need to catch up
Class Material
Custom Skills and Reusable Workflows (25 min)
What are skills?
Skills are reusable instruction sets that Claude Code can load automatically or on demand. Each skill lives in a .claude/skills/<skill-name>/SKILL.md file.
Scope:
- Personal: ~/.claude/skills/<skill-name>/SKILL.md (available in all your projects)
- Project: .claude/skills/<skill-name>/SKILL.md (this repo only)
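The two scopes differ only in the base directory. A minimal Python sketch of that path rule, assuming the layout described above; the helper name skill_path and its scope argument are hypothetical and not part of Claude Code:

```python
from pathlib import Path


def skill_path(name: str, scope: str, project_root: Path) -> Path:
    """Return where SKILL.md would live for a personal or project skill.

    Hypothetical helper: it only mirrors the directory layout described
    in this section and does not call Claude Code.
    """
    if scope == "personal":
        base = Path.home()       # ~/.claude/skills/... (all your projects)
    elif scope == "project":
        base = project_root      # <repo>/.claude/skills/... (this repo only)
    else:
        raise ValueError("scope must be 'personal' or 'project'")
    return base / ".claude" / "skills" / name / "SKILL.md"
```

For example, skill_path("clean-data", "project", Path(".")) points at the file created in the hands-on exercise.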
Hands-on: Create a /clean-data skill
Create .claude/skills/clean-data/SKILL.md in your project:
---
name: clean-data
description: Run the data cleaning pipeline for this project.
---
Run the data cleaning pipeline:
1. Check for missing values in raw data files
2. Run data_cleaning.R script
3. Verify output file dimensions
4. Generate a data quality report
Now typing /clean-data in Claude Code executes this workflow consistently.
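Steps 1 and 3 of the skill are easier to verify when they are backed by a small script the skill can call. The following is a rough sketch using only the Python standard library; the function names, the choice of CSV input, and the rule that an empty cell or the string NA counts as missing are all assumptions for illustration, not part of the course pipeline:

```python
import csv
from pathlib import Path


def count_missing(csv_path: Path) -> dict[str, int]:
    """Count missing cells per column (assumed: empty string or 'NA')."""
    with csv_path.open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        counts = {column: 0 for column in (reader.fieldnames or [])}
        for row in reader:
            for column in counts:
                value = row.get(column)
                # Short rows yield None for the columns they lack.
                if value is None or value.strip() in {"", "NA"}:
                    counts[column] += 1
    return counts


def dimensions(csv_path: Path) -> tuple[int, int]:
    """Return (data rows, columns), not counting the header row."""
    with csv_path.open(newline="", encoding="utf-8") as handle:
        rows = [row for row in csv.reader(handle) if row]  # skip blank lines
    if not rows:
        return (0, 0)
    return (len(rows) - 1, len(rows[0]))
```

A skill step could then say "run this script on the raw file and on the cleaned output, and report both results", which gives the data quality report concrete numbers to compare.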
Discussion: What repetitive analysis tasks could you turn into skills?
Gemini CLI equivalent: MCP (Model Context Protocol) servers, which are external processes that expose tools to the AI.
Project-Specific Instructions: CLAUDE.md (20 min)
The idea: Instead of repeating preferences in every prompt, write them once in a CLAUDE.md file. Claude Code reads it automatically.
Example for a data analysis project:
## Code Style
- Use tidyverse syntax for R code
- Use pandas for Python data manipulation
- Prefer ggplot2 with viridis color scheme
## Data Standards
- All dates in ISO 8601 format (YYYY-MM-DD)
- Column names: lowercase with underscores
## Analysis Preferences
- Always check for missing values before analysis
- Report sample sizes in all tables
Hierarchical loading: Global (~/.claude/CLAUDE.md) → Project (./CLAUDE.md) → Subdirectory. More specific settings override general ones.
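The loading order can be pictured as a walk from the most general file to the most specific one. The sketch below only illustrates that order; it is not how Claude Code is implemented, and the function name instruction_files and its parameters are hypothetical:

```python
from pathlib import Path


def instruction_files(home: Path, project_root: Path, workdir: Path,
                      filename: str = "CLAUDE.md") -> list[Path]:
    """List existing instruction files, most general first.

    Illustrative only: later entries are the more specific ones, which
    the section says take precedence over earlier ones.
    """
    candidates = [home / ".claude" / filename, project_root / filename]
    # Walk down from the project root to the current working directory.
    current = project_root
    for part in workdir.relative_to(project_root).parts:
        current = current / part
        candidates.append(current / filename)
    return [path for path in candidates if path.is_file()]
```

Calling it with your home directory, the repository root, and a subfolder such as analysis/ shows which files would apply there and in what order.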
Hands-on: Create a CLAUDE.md for one of the course case studies (e.g., Austrian Hotels or Football Interviews).
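Gemini CLI equivalent: GEMINI.md, the same concept with the same global, project, and subdirectory hierarchy (the global file lives under ~/.gemini/ rather than ~/.claude/).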
Git Integration (20 min)
Why it matters: Version control + AI = traceable, reproducible analysis. Claude Code understands git natively.
Branch management:
Create a new branch for the robustness checks analysis,
implement the checks, and prepare a summary of changes
Claude Code will create the branch, make changes, commit with descriptive messages, and provide a diff summary.
Understanding project history:
Look at the git history for data_cleaning.R.
Why did we change the outlier detection threshold?
Claude Code reads commit messages and diffs to understand past decisions, which is useful when revisiting old projects.
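The same history is available without an AI tool, which is handy for checking the answer you get. A minimal sketch that wraps standard git log options from Python; the function names are hypothetical, and which git commands Claude Code itself runs is not stated on this page:

```python
import subprocess
from pathlib import Path


def history_command(file_name: str, max_commits: int = 20) -> list[str]:
    """Build a git log command showing messages and diffs for one file."""
    return [
        "git", "log",
        f"--max-count={max_commits}",
        "--follow",               # keep following the file across renames
        "--patch",                # include the diff of each commit
        "--date=short",
        "--format=%h %ad %s",     # short hash, date, subject line
        "--", file_name,
    ]


def file_history(repo: Path, file_name: str, max_commits: int = 20) -> str:
    """Run the command inside the repository and return its text output."""
    result = subprocess.run(
        history_command(file_name, max_commits),
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Running file_history(Path("."), "data_cleaning.R") in the project prints the commits that touched the script, so you can confirm where a threshold actually changed.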
Hands-on: Use Claude Code to create a branch, make an analysis change, and commit with a meaningful message. Review the diff together.
Autonomous Execution (15 min)
The concept: Both tools can run commands without constant approval, which is useful for trusted, repetitive workflows.
Gemini CLI "Yolo Mode" (--yolo or -y):
gemini -p "Install missing packages, clean the data, run all analyses" --yolo
Claude Code headless mode:
claude -p "your prompt"
When to use autonomous execution:
- Trusted, repetitive pipelines you've run before
- Well-defined tasks with clear success criteria
When NOT to:
- New or unfamiliar code
- Tasks that modify shared resources
- Anything you haven't verified manually first
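One way to keep an autonomous run reviewable is to start it only from a clean git working tree, so that everything the tool changes shows up in git diff afterwards. The sketch below is one possible guard, not a recommended or official workflow; the function names are hypothetical, and the only CLI flags used are the ones shown on this page (claude -p, gemini -p with --yolo) plus git status --porcelain:

```python
import subprocess
from pathlib import Path


def headless_command(prompt: str, tool: str = "claude") -> list[str]:
    """Build the non-interactive command line shown in this section."""
    if tool == "claude":
        return ["claude", "-p", prompt]
    if tool == "gemini":
        return ["gemini", "-p", prompt, "--yolo"]
    raise ValueError("tool must be 'claude' or 'gemini'")


def working_tree_is_clean(repo: Path) -> bool:
    """True when git reports no modified, staged, or untracked files."""
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return status.stdout.strip() == ""


def run_if_clean(repo: Path, prompt: str, tool: str = "claude"):
    """Refuse to start an autonomous run on top of uncommitted work."""
    if not working_tree_is_clean(repo):
        raise RuntimeError("Commit or stash your changes before an autonomous run.")
    # Assumes the chosen CLI is installed and on PATH.
    return subprocess.run(headless_command(prompt, tool),
                          cwd=repo, capture_output=True, text=True)
```

After the run, git diff and git status show exactly what changed, and git restore or a branch reset can undo it if the result is wrong.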
Discussion: What are the risks of autonomous execution in a research context? How do you balance speed with verification?
End of Week Discussion Points
- How could custom skills improve reproducibility in your research?
- What instructions would you put in your CLAUDE.md for your thesis/project?
- When is autonomous execution appropriate vs. risky in data analysis?
- How does git integration change the way you think about AI-assisted analysis?
Assignment
Due: Before Week 6
Use Claude Code to solve an interesting problem and report on the experience.