Week 05: Advanced CLI Workflows

Power features for reproducible research with Claude Code and Gemini CLI

Published: February 10, 2026


Overview

Last week we introduced agentic AI with Claude Code – an AI agent that runs in the terminal, reads your project files, runs code, and iterates on the results. This week we go deeper: custom skills, project-specific instructions, git integration, and autonomous execution. These features turn CLI tools from clever assistants into reproducible research companions.

The focus is on Claude Code, with notes on Gemini CLI equivalents where relevant.

Learning Outcomes

By the end of the session, students will:

  • Create custom skills (reusable workflows) for their data analysis projects
  • Set up project-specific instructions with CLAUDE.md / GEMINI.md
  • Use git integration for version-controlled, traceable analysis
  • Understand autonomous execution modes and when to use them

Preparation / Before Class

🔧 Prerequisites

Required:

  • Working Claude Code installation from Week 04
  • Basic familiarity with running Claude Code in a project directory
  • A project folder with at least one data file and one script

Recommended Reading:

Class Material

🛠️ Custom Skills and Reusable Workflows (25 min)

What are skills?

Skills are reusable instruction sets that Claude Code can load automatically or on demand. Each skill lives in a .claude/skills/<skill-name>/SKILL.md file.

Scope:

  • Personal: ~/.claude/skills/<skill-name>/SKILL.md (available in all your projects)
  • Project: .claude/skills/<skill-name>/SKILL.md (this repo only)

Hands-on: Create a /clean-data skill

Create .claude/skills/clean-data/SKILL.md in your project:

---
name: clean-data
description: Run the data cleaning pipeline for this project.
---

Run the data cleaning pipeline:
1. Check for missing values in raw data files
2. Run data_cleaning.R script
3. Verify output file dimensions
4. Generate a data quality report

Now typing /clean-data in Claude Code loads these instructions, so the workflow follows the same steps every time you run it.
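If you create several skills, writing each SKILL.md by hand gets repetitive. The following is a minimal sketch of a scaffolding helper, assuming only the file layout and front matter fields shown above. The function name write_skill and its parameters are hypothetical, not part of Claude Code.

```python
from pathlib import Path
import tempfile


def write_skill(project_root: Path, name: str, description: str,
                intro: str, steps: list[str]) -> Path:
    """Write .claude/skills/<name>/SKILL.md with front matter and numbered steps."""
    skill_file = project_root / ".claude" / "skills" / name / "SKILL.md"
    skill_file.parent.mkdir(parents=True, exist_ok=True)

    # Number the steps 1., 2., 3. ... as in the hand-written example.
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    content = (
        "---\n"
        f"name: {name}\n"
        f"description: {description}\n"
        "---\n\n"
        f"{intro}\n"
        f"{numbered}\n"
    )
    skill_file.write_text(content, encoding="utf-8")
    return skill_file


if __name__ == "__main__":
    # Demo in a throwaway directory so no real project is touched.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_skill(
            Path(tmp),
            name="clean-data",
            description="Run the data cleaning pipeline for this project.",
            intro="Run the data cleaning pipeline:",
            steps=[
                "Check for missing values in raw data files",
                "Run data_cleaning.R script",
                "Verify output file dimensions",
                "Generate a data quality report",
            ],
        )
        print(path.read_text(encoding="utf-8"))
```

Pass your actual project directory as project_root to create a project-scoped skill, or your home directory for a personal one.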

Discussion: What repetitive analysis tasks could you turn into skills?

Gemini CLI equivalent: MCP (Model Context Protocol) servers – external processes that expose tools to the AI.

πŸ“‹ Project-Specific Instructions: CLAUDE.md (20 min)

The idea: Instead of repeating preferences in every prompt, write them once in a CLAUDE.md file. Claude Code reads it automatically.

Example for a data analysis project:

## Code Style
- Use tidyverse syntax for R code
- Use pandas for Python data manipulation
- Prefer ggplot2 with viridis color scheme

## Data Standards
- All dates in ISO 8601 format (YYYY-MM-DD)
- Column names: lowercase with underscores

## Analysis Preferences
- Always check for missing values before analysis
- Report sample sizes in all tables
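The Data Standards in the example above are specific enough to check mechanically, which is a useful test of whether an instruction is well written. Here is a small sketch of such a check, assuming "lowercase with underscores" means lowercase letters and digits separated by single underscores. The function names are hypothetical.

```python
import re
from datetime import datetime

# Assumed reading of "lowercase with underscores": starts with a letter,
# then lowercase letters/digits, with single underscores between parts.
COLUMN_RE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")
ISO_DATE_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def is_valid_column_name(name: str) -> bool:
    return COLUMN_RE.fullmatch(name) is not None


def is_iso_date(value: str) -> bool:
    """True only for zero-padded YYYY-MM-DD strings that are real calendar dates."""
    if ISO_DATE_RE.fullmatch(value) is None:
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def check_table(columns: list[str], date_values: list[str]) -> list[str]:
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = [f"bad column name: {c!r}" for c in columns
                if not is_valid_column_name(c)]
    problems += [f"bad date: {d!r}" for d in date_values if not is_iso_date(d)]
    return problems


if __name__ == "__main__":
    print(check_table(["hotel_id", "Price"], ["2026-02-10", "10/02/2026"]))
```

A rule that cannot be turned into a check like this (for example "use sensible names") is probably too vague to help the AI either.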

Hierarchical loading: Global (~/.claude/CLAUDE.md) → Project (./CLAUDE.md) → Subdirectory. More specific settings override general ones.
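To see which instruction files apply to a given working directory, it can help to list them in order. This sketch only mirrors the general-to-specific ordering described above. It is an illustration, not Claude Code's actual loader, and the function name instruction_files is hypothetical.

```python
from pathlib import Path
import tempfile


def instruction_files(home: Path, project_root: Path, cwd: Path,
                      filename: str = "CLAUDE.md") -> list[Path]:
    """Return existing instruction files, most general first, most specific last."""
    candidates = [home / ".claude" / filename, project_root / filename]

    # Walk from the project root down to cwd, one directory level at a time.
    # relative_to raises ValueError if cwd is not inside project_root.
    current = project_root
    for part in cwd.relative_to(project_root).parts:
        current = current / part
        candidates.append(current / filename)

    return [p for p in candidates if p.is_file()]


if __name__ == "__main__":
    # Build a throwaway layout: global, project, and subdirectory files.
    with tempfile.TemporaryDirectory() as tmp:
        home = Path(tmp) / "home"
        project = Path(tmp) / "project"
        subdir = project / "analysis"
        for folder in (home / ".claude", project, subdir):
            folder.mkdir(parents=True, exist_ok=True)
            (folder / "CLAUDE.md").write_text("# instructions\n", encoding="utf-8")

        for path in instruction_files(home, project, subdir):
            print(path.relative_to(tmp))
```

The last file in the list is the most specific one, so under the rule above its settings win when two files disagree.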

Hands-on: Create a CLAUDE.md for one of the course case studies (e.g., Austrian Hotels or Football Interviews).

Gemini CLI equivalent: GEMINI.md – same concept and the same hierarchy, with the global file at ~/.gemini/GEMINI.md.

🔀 Git Integration (20 min)

Why it matters: Version control + AI = traceable, reproducible analysis. Claude Code can run git commands directly in your repository, so branches, commits, and diffs become part of the conversation.

Branch management:

Create a new branch for the robustness checks analysis,
implement the checks, and prepare a summary of changes

Claude Code will create the branch, make changes, commit with descriptive messages, and provide a diff summary.
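It is worth knowing which git commands sit behind a request like this, so you can review what the agent did. The sketch below spells out one plausible sequence. The exact commands Claude Code runs may differ, and the branch name, file name, and commit message in the usage note are placeholders.

```python
import subprocess


def branch_workflow(branch: str, files: list[str], message: str) -> list[list[str]]:
    """Return the git commands for: new branch, stage files, commit, summarise."""
    return [
        ["git", "switch", "-c", branch],   # create and check out the branch (git >= 2.23)
        ["git", "add", "--", *files],      # stage only the named files
        ["git", "commit", "-m", message],  # commit with a descriptive message
        ["git", "show", "--stat", "HEAD"], # summary of what the commit changed
    ]


def run_all(commands: list[list[str]], cwd: str = ".") -> None:
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, cwd=cwd, check=True)
```

For example, run_all(branch_workflow("robustness-checks", ["robustness.R"], "Add robustness checks")) would perform the whole sequence inside a git repository. Building the command list separately from running it lets you print and inspect the plan first.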

Understanding project history:

Look at the git history for data_cleaning.R.
Why did we change the outlier detection threshold?

Claude Code reads commit messages and diffs to understand past decisions – useful when revisiting old projects.
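You can pull the same history yourself to verify the explanation you are given. This is a minimal sketch using git log with --follow, which keeps tracking the file across renames. The helper names are hypothetical, and it assumes you run it inside the repository.

```python
import subprocess

# Short hash, commit date, and subject line, separated by tabs (%x09).
LOG_FORMAT = "%h%x09%ad%x09%s"


def history_command(path: str) -> list[str]:
    """Build the git log command for one file's history."""
    return ["git", "log", "--follow", "--date=short",
            f"--format={LOG_FORMAT}", "--", path]


def parse_log(output: str) -> list[dict[str, str]]:
    """Turn tab-separated git log output into a list of records, newest first."""
    entries = []
    for line in output.splitlines():
        if not line.strip():
            continue
        commit, date, subject = line.split("\t", 2)
        entries.append({"commit": commit, "date": date, "subject": subject})
    return entries


def file_history(path: str, cwd: str = ".") -> list[dict[str, str]]:
    """Run git log for the file and return the parsed entries."""
    result = subprocess.run(history_command(path), cwd=cwd, check=True,
                            capture_output=True, text=True)
    return parse_log(result.stdout)
```

Calling file_history("data_cleaning.R") lists every commit that touched the script. Informative commit subjects are what make a question like the one above answerable, for you and for the AI.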

Hands-on: Use Claude Code to create a branch, make an analysis change, and commit with a meaningful message. Review the diff together.

🚀 Autonomous Execution (15 min)

The concept: Both tools can run commands without constant approval – useful for trusted, repetitive workflows.

Gemini CLI “Yolo Mode” (--yolo or -y):

gemini -p "Install missing packages, clean the data, run all analyses" --yolo

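Claude Code headless mode (-p runs a single prompt non-interactively and prints the result; tools that normally need approval must be allowed in advance, for example with --allowedTools):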

claude -p "your prompt"
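Autonomous runs are easier to trust when each one leaves a record. Below is a sketch of a wrapper that builds the two commands shown above and appends the prompt, exit code, and output of every run to a log file. The log format and the function names are assumptions for illustration, not features of either tool.

```python
import json
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def headless_command(prompt: str, tool: str = "claude") -> list[str]:
    """Build the non-interactive command for the chosen CLI."""
    if tool == "claude":
        return ["claude", "-p", prompt]
    if tool == "gemini":
        return ["gemini", "-p", prompt, "--yolo"]
    raise ValueError(f"unknown tool: {tool!r}")


def run_logged(command: list[str], log_path: str) -> dict:
    """Run a command and append one JSON line describing the run to log_path."""
    started = datetime.now(timezone.utc).isoformat()
    result = subprocess.run(command, capture_output=True, text=True)
    record = {
        "started_utc": started,
        "command": command,
        "returncode": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }
    # Append rather than overwrite, so the file accumulates a run history.
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record
```

A call such as run_logged(headless_command("Summarise data_cleaning.R"), "runs.jsonl") runs Claude Code once and adds one line to runs.jsonl. Committing that log alongside the analysis gives you a trail of what was asked and what came back.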

When to use autonomous execution:

  • Trusted, repetitive pipelines you’ve run before
  • Well-defined tasks with clear success criteria

When NOT to:

  • New or unfamiliar code
  • Tasks that modify shared resources
  • Anything you haven’t verified manually first

Discussion: What are the risks of autonomous execution in a research context? How do you balance speed with verification?

End of Week Discussion Points

  • How could custom skills improve reproducibility in your research?
  • What instructions would you put in your CLAUDE.md for your thesis/project?
  • When is autonomous execution appropriate vs. risky in data analysis?
  • How does git integration change the way you think about AI-assisted analysis?

Assignment

Assignment 5: Advanced CLI Workflows

Due: Before Week 6

Use Claude Code to solve an interesting problem and report on the experience.

Full Assignment Details