Data Analysis with AI 02: Prompting

From Prompting to Context Engineering

Gábor Békés (CEU)

2026-01-19

The AI course

This slideshow is part of my data analysis with AI material.

Check out the course website gabors-data-analysis.com/ai-course/

About me and this slideshow

  • I am an economist, not an AI developer, expert, guru, or evangelist
  • I am an active AI user in teaching and research
  • I teach a series of Data Analysis courses based on my textbook
    • This project is closely related to concepts and material in the book, but it can also be used on its own.
  • This slideshow was created to help students and instructors doing data analysis in education, research, public policy, or business
  • Enjoy.

From Prompting to Context Engineering

The Big Picture: 2024 → 2026

  • 2024: “Prompt engineering” was the key skill
  • 2025: Rise of “context engineering”
  • 2026: Context engineering + agentic workflows

What changed?

  • Models became better at following instructions (literally!)
  • Longer context windows (200k → 1M tokens)
  • Tool use and agentic capabilities became standard

What is Context Engineering?

“The delicate art and science of filling the context window with just the right information for the next step.” — Andrej Karpathy

It’s not just about which words to use, but about which configuration of context will produce the desired behavior.

Context = More than Prompts

Prompting (old view)

  • System prompt
  • User message
  • Maybe a few examples

Context Engineering (new view)

  • Instructions & constraints
  • Reference materials
  • Memory / history
  • Tool definitions
  • Retrieved documents (RAG)
  • Environmental state

Core Prompting Strategies (Updated)

Prompting Still Matters

  • Prompting is the foundation of context engineering
  • Think of it as giving instructions to someone else, like a research assistant (= the RA approach)
  • But modern models are more literal — they do exactly what you ask

Key Change: Precise Instruction Following

Claude 4.x and GPT-4.1+ models:

  • Take instructions literally
  • Don’t “go above and beyond” unless asked
  • If you say “suggest changes” → they suggest, not implement

Implication: Be more explicit about what you want.

“Create a dashboard” → might give minimal output

“Create a fully-featured analytics dashboard with filters, charts, and interactive elements” → what you probably wanted

Six Key Prompting Strategies

  1. Write clear instructions
  2. Provide reference materials
  3. Break complex tasks into subtasks
  4. Request step-by-step reasoning
  5. Use external tools
  6. Test systematically

Source: OpenAI guide to prompt engineering

Still valid, but implementation has evolved.

1. Clear Instructions

General Principles

  • Be specific about style and requirements
  • Include relevant context and constraints
  • Use example formats when needed
  • New: Explain why — motivation helps

Data Analysis Apps 📊

  • Define audience (academic/business/public)
  • Define desired statistical approaches
  • Clarify output format (tables, graphs, reports)
  • Specify level of detail expected

1. Clear Instructions: Implementation

General Tactics ⚡

  • Adopt personas
  • Specify output format explicitly
  • Define scope clearly
  • Ask for specific extensions
  • Use XML tags for structure

Example Prompts 💡

  • “Write a report for government officials”
  • “Format correlation matrices as heatmaps”
  • “Show full statistical tests including p-values”
  • “Go beyond the basics — include sensitivity checks”

1. Clear Instructions: XML Tags

Claude specifically benefits from XML tags to structure inputs:

<context>
You are analyzing panel data on firm productivity.
</context>

<data_description>
Variables: firm_id, year, revenue, employees, industry
Panel: 2010-2020, N=5000 firms
</data_description>

<task>
Estimate a fixed effects regression with year dummies.
Report results in a publication-ready table.
</task>

2. Provide References

General Tactics ⚡

  • Share relevant documentation
  • Include descriptions
  • Require citations
  • Much improved vs 2024

Data Analysis Apps 📊

  • Reference statistical methodology
  • Share domain context and assumptions
  • Provide data quality metrics
  • Upload data dictionaries

2. Provide References: What Changed

  • 2024: Limited context, frequent hallucinations
  • 2026: 200k-1M token contexts, much better grounding

New best practice: Upload full documentation

  • Entire codebooks
  • Methodology sections from papers
  • Sample outputs you want to replicate
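
For example, a grounded request might look like this (file names illustrative):

I have uploaded codebook.pdf and the methodology section of our
working paper. Using only these documents: (1) list the variables
needed for the wage regression, (2) quote the codebook definition
of each, and (3) flag anything the paper uses that the codebook
does not define.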

3. Break Complex Tasks

General Tactics ⚡

  • Divide into logical steps
  • Build complexity gradually
  • Validate intermediate steps

Data Analysis Applications 📊

  • Cleaning → EDA → Analysis
  • Cross-sectional OLS → panel FE → event study
  • Does a simple OLS of y on x make sense?

3. Break Complex Tasks: Why Still Important

Even with massive context windows, breaking tasks helps:

  • Not because of context limits
  • But because focused tasks produce higher quality
  • Easier to catch errors at each step
  • Better for iteration

4. Step by Step Reasoning

General Tactics ⚡

  • Request step-by-step reasoning
  • Explain assumptions
  • Validate interim results

Data Analysis Tactics 📊

  • Show data exploration process
  • Justify method selection
  • Document assumption checks
  • Explain statistical decisions
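
A prompt along these lines (wording illustrative) makes the plan visible before any code is run:

Before estimating anything, explain your plan step by step:
1. Which estimator do you propose, and why?
2. Which assumptions does it require, and how will we check them?
3. What output will you report?
Wait for my confirmation before writing code.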

4. Step by Step: Reasoning Models

Big development: Dedicated reasoning models

  • OpenAI o3, o4-mini
  • Claude with extended thinking
  • Gemini 3 with thinking levels

These models “think before responding” — internal chain of thought

5. External Tools

General Tactics ⚡

  • Use code execution (Python, R)
  • Interactive tools: canvas, artifacts
  • Project folders for context
  • MCP for tool integration

Data Analysis Tactics 📊

  • Ask for a direct solution with code execution
  • Upload data files
  • Use web search for recent methods
  • Connect to databases via MCP
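
With code execution enabled, ask for verified results rather than untested code. For example (file name illustrative):

Here is sales.csv [upload]. Load it, report summary statistics
and missing values, then plot monthly revenue. Run the code and
show the actual output, not just the script.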

6. Test Systematically

General Tactics ⚡

  • Validate against known results
  • Check validity
  • Compare methods

Data Analysis Tactics 📊

  • Cross-validation
  • Benchmark datasets
  • Statistical assumption verification
  • Sensitivity analysis
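
For instance, a systematic-testing prompt might look like this (variable names illustrative):

Re-run the regression with these checks: plot residuals against
fitted values, test for heteroskedasticity, drop X2 as a
sensitivity check, and report whether the coefficient on X1 is
stable across specifications.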

What’s New in 2025-2026

Model-Specific Prompting

Different models, different strategies:

Model            Key insight
Claude 4.x       Precise instruction following; use XML tags
GPT o3/o4-mini   Reasoning automatic; state objective + format
GPT-4.1/5        Classic prompting; break into subtasks
Gemini 3         Keep it simple; use default temperature

Reasoning Models: o3 and o4-mini

Released April 2025. Key features:

  • Internal chain-of-thought (you don’t see it)
  • Better at math, coding, multi-step logic
  • Can “think with images”
  • Less need for “think step-by-step” hacks

Prompting tip: State objective + output format. Reasoning is automatic.
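
A minimal sketch in Python, assuming the OpenAI SDK’s Responses API (model name, effort level, and prompt are illustrative):

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# With a reasoning model, state the objective and the output format;
# the step-by-step reasoning happens internally.
response = client.responses.create(
    model="o4-mini",                 # illustrative model choice
    reasoning={"effort": "medium"},  # low / medium / high
    input="Plan a difference-in-differences analysis of a minimum "
          "wage change. Output: a numbered plan, then R code.",
)
print(response.output_text)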

Claude 4.x Best Practices

From Anthropic’s guidance:

  1. Be explicit — models take you literally
  2. Use XML tags — structure your inputs
  3. Provide motivation — explain why you want something
  4. Request verbosity — model is more concise by default
  5. Extended thinking — enable for complex problems

Gemini 3 Best Practices

Released November 2025. Google’s guidance:

  1. Simplify prompts — Gemini 3 rewards clarity over elaborate scaffolding
  2. Keep temperature at 1.0 — reasoning optimized for default; lower values can cause loops
  3. Use thinking levels — LOW/MEDIUM/HIGH instead of chain-of-thought hacks
  4. Put directive after input — place instructions after the data block
  5. Use grounding for facts — connect to search for current information
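
A minimal sketch with the google-genai Python SDK; thinking_level follows the guidance above, but the exact field and model names are assumptions and may differ:

from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

codebook = open("codebook.txt").read()  # illustrative input file
directive = "Summarize the key variables in a two-column list."

response = client.models.generate_content(
    model="gemini-3-pro-preview",            # illustrative model name
    contents=codebook + "\n\n" + directive,  # directive after the data block
    config=types.GenerateContentConfig(
        temperature=1.0,  # keep the default
        thinking_config=types.ThinkingConfig(thinking_level="high"),
    ),
)
print(response.text)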

Gemini 3: Unique Strengths

What Gemini 3 does especially well:

  • Massive context: 1M tokens standard (2M in some tiers)
  • Multimodal reasoning: Excellent on image + text tasks
  • Structured output: Infers JSON/table formats with minimal cues
  • Cost efficiency: Flash tier very cheap for high-volume tasks

For data analysis: Great for document analysis, long codebooks, and visual data (charts, graphs)

Extended Thinking

Claude and other models now support “extended thinking”:

  • Model explicitly reasons through problem
  • Visible chain of thought
  • Much better for complex analysis

When to use: Complex reasoning, multi-step problems, data analysis decisions

Cost: Higher latency, more tokens
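
A minimal sketch with the Anthropic Python SDK (model name, budgets, and prompt are illustrative):

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # illustrative model name
    max_tokens=4096,
    # Extended thinking: give the model an explicit reasoning budget
    thinking={"type": "enabled", "budget_tokens": 2048},
    messages=[{
        "role": "user",
        "content": "Should I cluster standard errors by firm or by "
                   "industry in this panel? Reason it through first.",
    }],
)

# The reply interleaves visible thinking blocks with the final answer
for block in response.content:
    if block.type == "thinking":
        print("[thinking]", block.thinking)
    elif block.type == "text":
        print("[answer]", block.text)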

Agentic Workflows

Models can now:

  • Execute code directly
  • Use multiple tools in sequence
  • Browse the web
  • Read/write files
  • Operate autonomously for longer tasks

This changes how we work with AI.

Agentic Coding Tools

Command line tools:

  • Claude Code
  • OpenAI Codex

IDEs (Integrated Development Environments):

  • GitHub Copilot: Seamless integration in VS Code
  • Cursor: AI-native code editor
  • Antigravity: Google’s IDE with generous quota limits

See Claude Code Best Practices

Context Engineering in Practice

For data analysis, think about:

  1. System context: Your role, constraints, style
  2. Data context: Variable descriptions, data dictionary
  3. Method context: Statistical approach, assumptions
  4. Output context: Format, audience, length
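
Putting the four layers together, mirroring the XML-tag style above (all details illustrative):

<system_context>
Academic economics research; concise, code-first answers in R.
</system_context>

<data_context>
Firm-year panel, 2010-2020; see the attached data dictionary.
</data_context>

<method_context>
Fixed effects OLS, standard errors clustered by firm.
</method_context>

<output_context>
Publication-ready table, 3 columns, plus a short interpretation.
</output_context>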

Context Rot and Pollution

Long conversations degrade performance:

  • Context rot: Performance degrades as context fills
  • Context pollution: Irrelevant info distracts model
  • Context confusion: Model loses track of instructions

Solution: Start fresh for new tasks, use memory tools

Next-level solution: Claude Code’s auto-compact keeps your sessions running indefinitely by automatically summarizing the conversation when you approach the context limit.

Coming in Week 03: System Prompts & Skills

We’ll dive deeper into:

  • Building custom system prompts for data analysis
  • Creating Skills (Claude) and Gems (Gemini)
  • Reusable prompt templates for common tasks
  • Project-level context management

See Week 03: System Prompts and Skills

Practical Advice for Data Analysis

Iteration Over One-Shot

More important than ever:

  1. Start with clear objective
  2. Get initial output
  3. Refine with specific feedback
  4. Validate results
  5. Document process

Example Workflow: Regression Analysis

Turn 1: "Here's my data [upload]. Describe the variables 
        and check for issues."

Turn 2: "Run OLS of Y on X1, X2 with robust SEs. 
        Show diagnostics."

Turn 3: "Now add fixed effects. Compare results."

Turn 4: "Create publication-ready table. 
        AER style, 3 columns."

What to Put in System Prompts

For data analysis work:

You are helping with academic research in economics. 

Preferences:
- R with tidyverse
- viridis color scheme
- Publication-quality output
- Show code and explain decisions
- Flag potential issues proactively

Current project: [brief description]

More on this in Week 03

Should We Say Please/Thank You?

Answer: Doesn’t matter for output quality. Do what feels natural.

Summary

Key Takeaways

  1. Context engineering > prompt engineering
  2. Modern models are more literal — be explicit
  3. Reasoning models handle complexity automatically
  4. Tools and agents expand what’s possible
  5. Iteration remains essential
  6. Structure your inputs (XML tags, clear sections)

Resources

Date stamp

This version: 2026-01-19 (v0.6.0)

Previous versions: v0.3.2 (2025-05-28), v0.1.2 (2025-04-21)

bekesg@ceu.edu