Which AI model shall we choose?
What follows is my personal take as of 2026-01-12.
Basics
Generative AI based on Large Language Models (genAI) is great for many tasks. In this course we only focus on aspects of Data Analysis:
- designing analysis
- writing code
- data wrangling such as joining tables, sample design, variable transformations
- exploratory data analysis
- modelling, regressions, machine learning
- causal inference
- creating tables and graphs
- writing reports and presentations
Basics
- Most models have free and paid tiers. The free tiers work well as a kind of Google search replacement. For serious work you’ll be better off with a paid tier.
- As of now the leading models are Google’s Gemini 3.0, OpenAI’s GPT-5.2, and Anthropic’s Claude Opus 4.5.
- Each model has faster (cheaper) and thinking (more expensive) variants. For data analysis work, the thinking variants are recommended.
- All major models have a deep research mode.
- All major models support tool use (e.g. web browsing, code execution) and agentic patterns (multi-step workflows with memory and tool use).
- Open-weight models are also quite good, for example DeepSeek and Q. You can even run a 7bn-parameter model on an expensive computer.
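The "agentic pattern" mentioned above (a multi-step workflow with memory and tool use) can be illustrated with a toy loop. This is only a sketch under strong assumptions: no real model or vendor API is involved, `scripted_planner` is a hypothetical stand-in for the model's decision about which tool to call next, and the two tools are ordinary Python functions invented for the example.

```python
from typing import Callable, Optional

# Hypothetical tools: plain functions the "agent" is allowed to call.
def count(values: list) -> int:
    return len(values)

def mean(values: list) -> float:
    return sum(values) / len(values)

TOOLS: dict = {"count": count, "mean": mean}

def scripted_planner(memory: list) -> Optional[str]:
    """Stand-in for the model (assumption): pick the next tool, or None when done."""
    plan = ["count", "mean"]
    return plan[len(memory)] if len(memory) < len(plan) else None

def run_agent(data: list, planner: Callable = scripted_planner, max_steps: int = 5) -> list:
    """Loop: ask the planner for a tool, run it, store the result in memory."""
    memory = []
    for _ in range(max_steps):  # hard cap so the loop always terminates
        tool_name = planner(memory)
        if tool_name is None:
            break
        result = TOOLS[tool_name](data)
        memory.append({"tool": tool_name, "result": result})
    return memory

steps = run_agent([2.0, 4.0, 6.0])
for step in steps:
    print(step["tool"], "->", step["result"])
```

In a real product the planner is the model itself and the tools are things like web search or a code sandbox; the loop structure (decide, act, remember, repeat) is the part this sketch is meant to show.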
Flagship models: 2026 January 12
Landscape overview
| Feature | GPT-5.2 | o3 (reasoning) | Gemini 3 Deep Think | Claude Opus 4.5 |
|---|---|---|---|---|
| Design focus | Flagship multimodal with native reasoning | Deep reasoning for complex tasks | 1M token context with extended thinking | Coding & complex instruction following |
| Multimodality | Text + images + audio + video (I/O) | Text + tool calls | Text + images + video | Text + images + Computer Use 2.0 |
| Browsing / tools | Integrated search + MCP support | Auto-search for up-to-date facts | Grounded in Google Search | Deep integration with MCP & OS |
| Default style | Balanced, adaptive | Concise, source-cited | Thorough, exploratory | Professional, highly structured |
| Context window | 400k tokens | 128k tokens | 1M tokens | 500k tokens |
| Strengths | All-rounder, agentic workflows | Step-wise analysis, audit trail | Long-context research | Coding agents, OS automation |
| Weak spots | Expensive for simple tasks | No native media I/O | Slower response times | Slower than Haiku/Sonnet |
Models for data analysis
For data-analysis projects, the recommended approach is:
- Reasoning models (o3, Deep Think) for complex analysis requiring accuracy
- GPT-5.2 for agentic workflows and multimodal tasks
- Gemini 3 for long-context document analysis (entire datasets, reports)
- Claude Opus 4.5 for coding and complex technical writing
All major models now support tool use (MCP) and agentic patterns.
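Whether an "entire dataset" fits into a model's context can be roughly estimated before uploading. The sketch below uses the context window sizes from the table above. The 4-characters-per-token ratio and the 20% headroom for the prompt and the answer are assumptions for illustration, not vendor figures; real tokenizers differ, and numeric tables often use more tokens per character than plain English text.

```python
import math

# Context window sizes (tokens) as listed in the comparison table.
CONTEXT_WINDOWS = {
    "GPT-5.2": 400_000,
    "o3": 128_000,
    "Gemini 3 Deep Think": 1_000_000,
    "Claude Opus 4.5": 500_000,
}

CHARS_PER_TOKEN = 4      # ASSUMPTION: rough heuristic, not a real tokenizer
RESERVED_FRACTION = 0.2  # ASSUMPTION: headroom for instructions and the reply

def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)

def models_that_fit(text: str) -> list:
    """Return the models whose usable window covers the estimated tokens."""
    needed = estimate_tokens(text)
    return [
        name
        for name, window in CONTEXT_WINDOWS.items()
        if needed <= window * (1 - RESERVED_FRACTION)
    ]

# Made-up CSV content: a header plus 100,000 short rows.
sample = "id,price,quantity\n" + "1,9.99,3\n" * 100_000
print(estimate_tokens(sample), models_that_fit(sample))
```

If the estimate is close to a limit, it is safer to sample or aggregate the data first than to rely on the heuristic.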
Research & Writing
- Synthesis Over Summarization: AI tools increasingly synthesize multi-source inputs into structured insights rather than paraphrasing single documents.
- Security & Privacy: Modern workspaces rely on isolated execution contexts; strong non-training guarantees apply primarily to paid and enterprise tiers.
- Multimodal Capability: AI can interpret charts, screenshots, and handwritten notes and incorporate them into drafts.
Workspaces for data analysis – comparison
| Workspace | 2026 Key Features |
|---|---|
| Anthropic Claude Artifacts | • Creates interactive applications (tutors, calculators) within the output window. • Real-time iteration on complex document structures. |
| OpenAI ChatGPT Canvas | • Advanced frontier models with contextual persistence for tone and style. • Inline editing with granular control over specific sections. |
| Google NotebookLM | • Interactive Audio Overviews with user interruption and questioning. • Grounded citations linked directly to uploaded source segments. |
| Perplexity Pages | • Multi-source synthesis using live web retrieval. • Inline citation and consistency checking against sources. |
Data Analysis details
- Sandboxed Execution: Code runs in secure, ephemeral environments with no local system access.
- Statistical Rigor: Strong support for Python-based libraries (e.g. pandas, scikit-learn) for exploratory and predictive analysis.
- Direct Integration: AI can manipulate data directly within spreadsheets or dedicated analysis windows.
- Limits: Reproducibility, package versions, and state persistence remain constrained relative to local workflows.
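One way to soften the reproducibility limit is to ask the assistant to run (or to run yourself) a small snippet that records the interpreter and package versions next to the results. This is a minimal sketch using only the Python standard library; the function name `snapshot_environment` and the package list are illustrative choices, and whether a given sandbox has these packages installed is not guaranteed.

```python
import platform
from importlib import metadata

def snapshot_environment(packages: list) -> dict:
    """Record the Python version and the installed version of each package."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Keep a visible marker instead of failing the whole snapshot.
            versions[name] = "not installed"
    return versions

# Example package names taken from the text; adjust to your own analysis.
snapshot = snapshot_environment(["pandas", "scikit-learn"])
for name, version in snapshot.items():
    print(f"{name}=={version}")
```

Saving this output with the tables and graphs makes it easier to rerun the same analysis locally later with matching versions.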
Data Analysis details
| Workspace | 2026 Key Features for Analysis |
|---|---|
| ChatGPT Data Analysis | • Executes Python in managed compute environments for multi-file datasets. • Assisted data cleaning and predictive modeling workflows. |
| Claude Analysis | • High-fidelity SVG and lightweight interactive output. • Fast iteration on statistical tables with publication-ready formatting. |
| Google Gemini in Sheets | • Multimodal cleaning: converts screenshots or PDFs into structured tables. • Natural-language formula generation and transformations. |
| Microsoft Copilot in Excel | • Native Python-in-Excel for statistical scripts inside spreadsheets. • Automated pivots, summaries, and forecasting via prompts. |
Coding Assistance
- Agent-Assisted Workflows: AI can coordinate multi-step tasks such as refactoring or bug fixing across large codebases, with human review.
- Environment Security: Code is tested in secure sandboxes before changes are proposed.
- Interconnected Tools: Integrated with development and collaboration platforms (e.g. Jira, Slack).
Coding Assistance — comparison
| Workspace | 2026 Key Features for Developers |
|---|---|
| Anthropic Claude Code | • Native VS Code extension with agent-style workflows and inline diffs. • Supports complex multi-file edits and testing assistance. |
| GitHub Copilot | • Uses Extensions to interact with external dev tools (Azure, Slack, Jira). • Deep context from local and remote repositories. |
| Cursor | • AI-first editor with awareness of project-wide dependencies. • Strong support for iterative refactors across files. |
| Windsurf (Codeium) | • Cascade mode for orchestrating large-scale refactors. • Robust free tier for students and individual users. |
Security note: SOC2 compliance is common; strict zero-retention guarantees typically apply to enterprise or explicitly configured accounts.
What changed from 2025 Q2
- No need to think much about which model to use.
- The leading models have similar capabilities, but they do differ. Not really sure how…
Feedback
Dear Reader, I have limited experience. Suggestions are welcome; please post an issue.