Data Analysis with AI: Concepts

Large Language Models: Key Concepts

Gábor Békés (CEU)

2026-02-16

Data Analysis with AI

About me and this slideshow

  • I am an economist, not an AI developer, expert, guru, or evangelist
  • I am an active AI user in teaching and research
  • I teach a series of Data Analysis courses based on my textbook
    • This project is closely related to concepts and material in the book, but can be consumed alone (or with a drink)
  • This slideshow was created to help students and instructors in economics, social science, and public policy who do data analysis. No CS background required.
  • Enjoy.

Hello

Use of Artificial Intelligence

Why

  • Teaching Data Analysis courses + prepping for 2nd edition of Data Analysis textbook

  • This is a class to

    • discuss and share ideas on AI use
    • gain experience and confidence
    • find useful use cases
    • learn a bit more about LLMs and their impact
  • Try out different ways to approach a problem

This class – approach

  • focus on data analysis steps familiar from statistics and econometrics: research question, variable selection, data cleaning, regression, reporting
  • move from execution as the key skill to design and debugging
  • (extra) talk about topics I care about in data analysis

This class – self-help

  • AI is both an amazing help and scary as SH#T
  • a self-help group to openly discuss experience and trauma
  • get you some experience with selected tasks
  • get you a class you can put on your CV

Data Analysis with AI 1 – topics and case studies

  • Week 1: Review LLMs – An FT graph
  • Week 2: EDA and data documentation – World Values Survey (WVS)
  • Week 3: Analysis and report creation – World Values Survey (WVS), VS Code + GitHub Copilot
  • Week 4: Data manipulation, wrangling – Synthetic Hotels, Claude Code
  • Week 5: Regression and diagnostics – Synthetic Hotels
  • Week 6: Reporting and presentation – Earnings

Data Analysis with AI 2 – topics and case studies

(in progress)

  • Week 7: Text analysis and information extraction – Post-match interviews
  • Week 8: Different ways of sentiment analysis – Post-match interviews
  • Week 9: AI as research companion 1: Control variables
  • Week 10: AI as research companion 2: Instrumental Variables
  • Week 11: AI as research companion 3: Difference in differences
  • Week 12: TBD

Data Analysis with AI 1 – Applications and focus areas

  • Chat – conversational interface
  • Data Analysis – direct code execution / shared canvases
  • Context window management
  • Tools to connect to sources (GitHub, Google Drive)
  • VS Code and GitHub Copilot
  • CLI and Claude Code

Data Analysis with AI 2 (next course) – Applications and focus areas

  • Talk to AI via API calls
  • Skills and context management
    • “my system prompt” (user specific)
    • skill use and generation (gems / prompt templates): Gabor’s exploratory data analysis skill (sharable)
  • Deep research

Intro to the concept of LLMs

This class is not an LLM class

Many great resources available online.

This is the best I have seen:

3blue1brown Neural Network series

Website with more

Assignment: watch them all.

For a full reading list: Beyond: Readings & Resources

LLM Development Timeline: From text to LLM

LLM Development Timeline: LLM variants and improvements

Key Milestones in LLM Development I

  • Neural Language Models (2003): First successful application of neural networks to language modeling, establishing the statistical foundations for predicting word sequences based on context.

  • Word Embeddings (2013): Development of Word2Vec and distributed representations, enabling words to be mapped into vector spaces where semantic relationships are preserved mathematically.

  • Transformer Architecture (2017): Introduction of the Transformer model with self-attention mechanisms, eliminating sequential computation constraints and enabling efficient parallel processing.

Key Milestones in LLM Development II

  • Pretraining + Fine-tuning (2018): BERT – emergence of the two-stage paradigm where models are first pretrained on vast unlabeled text, then fine-tuned for specific downstream tasks.

  • ChatGPT (2022): Release of a conversational AI interface that demonstrated unprecedented natural language capabilities to the general public, driving mainstream adoption.

  • Reinforcement Learning from Human Feedback (2023): Refinement of models through human preferences, aligning AI outputs with human values and reducing harmful responses.

References

  • [1]: Bengio, Y., Ducharme, R., Vincent, P., & Jauvin, C. (2003). “A Neural Probabilistic Language Model.” Journal of Machine Learning Research.
  • [2]: Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). “Distributed Representations of Words and Phrases and their Compositionality.” Advances in Neural Information Processing Systems.
  • [3]: Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). “Attention Is All You Need.” Advances in Neural Information Processing Systems.
  • [4]: Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.” arXiv preprint.
  • [5]: OpenAI. (2022). “ChatGPT: Optimizing Language Models for Dialogue.” OpenAI Blog.
  • [6]: Anthropic. (2023). “Constitutional AI: Harmlessness from AI Feedback.” arXiv preprint.

What are Large Language Models?

  • Statistical models predicting next words (tokens)
  • Transform text/image/video into mathematical space
  • Scale (training data) matters enormously
  • Pattern recognition at massive scale
  • Think of it as a very sophisticated autocomplete trained on every economics paper, textbook, and StackOverflow answer

LLMs as Prediction Machines

  • Statistical Framework: Like a regression that learns patterns from data — but instead of predicting prices from covariates, it predicts the next word from all preceding words
    • Input → Black Box → Predicted Output
  • Key Difference: Works with unstructured text data
  • Training Process: Self-supervised learning at scale – the text itself provides the labels
  • Training Material: “Everything” (all internet + many books)
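
A toy sketch of the core mechanic, next-token prediction, with a made-up vocabulary and made-up probabilities (not a real model): the model scores every candidate next token and samples one.

# Toy next-token prediction (made-up probabilities, not a real model)
context <- "The estimated coefficient on education is"
vocab   <- c("positive", "negative", "significant", "banana")
probs   <- c(0.45, 0.15, 0.38, 0.02)             # model-assigned, sums to 1
next_token <- sample(vocab, size = 1, prob = probs)
cat(context, next_token)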

Context window

  • 1 token ≈ 4 characters; 4 tokens ≈ 3 words (in English)

  • Varies by models and keeps growing

  • ChatGPT at launch (2022): a 4,000-token window

  • 2026 models: 200k–1M tokens (see Model Comparison)

  • A 200k-token window holds your entire WVS codebook, data dictionary, and a full conversation about regression specifications

  • Tokens matter – more context, more relevant answers

  • Over the limit: the model may hallucinate or drift off-topic.
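
A back-of-the-envelope token counter using the 1 token ≈ 4 characters rule above; this is a rough English-text heuristic, not any provider's actual tokenizer.

# Rough token estimate: 1 token ~ 4 characters (English heuristic)
estimate_tokens <- function(text) ceiling(nchar(text) / 4)

prompt <- "Run a log-wage regression on the WVS extract; report robust SEs."
estimate_tokens(prompt)                       # ~16 tokens
codebook <- paste(rep(prompt, 10000), collapse = " ")
estimate_tokens(codebook)                     # ~162,000 tokens: most of a 200k window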

Context window – the great differentiator

  • Context window = your chat + uploads + retrieved materials

  • LLMs work much better with knowledge in context window

  • Think of it this way:

  • Inside the context window: grounded knowledge

  • Outside: a good but often vague recollection + internet search

Inference

Inference means generating output based on input and learned patterns

  • LLMs generate text by predicting next tokens based on context
  • Quality of inference depends on:
    • Model size (parameters)
    • Training data
    • Fine-tuning methods
    • Context window (relevant info)

See Glossary for this and 30+ other technical terms

Reasoning Models

  • Standard Models:
    • “Fast thinking”. Predicts the next word immediately.
    • Good for creative writing, simple queries.
  • Reasoning Models:
    • “Slow thinking” — like working through a proof on a whiteboard before giving the answer.
    • Self-Correction: Can “backtrack” if it detects a logical error.
    • Compute-Time Tradeoff: Spending more time/tokens on thinking yields higher accuracy.
    • For data analysis: better at multi-step tasks like checking regression assumptions or designing identification strategies.

Reasoning Models: What’s Actually Happening

Reasoning steps are approximate, not logically guaranteed. The model generates plausible reasoning chains — it can still make errors within them.

Think of it as a student showing their work on an exam. The steps look logical, but you still need to check the answer.

Prompt: "I have panel data on firm exports.
Should I use fixed effects or random effects?"

Standard model: "Use fixed effects." (no explanation)

Reasoning model thinks through:
→ "Panel data, firm-level... are firm characteristics
   correlated with the independent variable?"
→ "Likely yes — firm size, location are time-invariant
   but correlated with export decisions"
→ "Hausman test would confirm, but FE is safer default"
→ Answer: "Fixed effects, because..." (with reasoning)

Warning: The reasoning chain is helpful but not infallible. Always verify the logic — the model may get the econometric reasoning wrong while sounding confident.
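
If you want to verify the econometrics yourself rather than trust the chain of thought, here is a minimal sketch with the plm package (the firms data frame and its variables are hypothetical):

# Hausman test: FE vs RE (plm package; data and variable names hypothetical)
library(plm)
pdat <- pdata.frame(firms, index = c("firm_id", "year"))
fe <- plm(log(exports) ~ firm_size + tariff, data = pdat, model = "within")
re <- plm(log(exports) ~ firm_size + tariff, data = pdat, model = "random")
phtest(fe, re)   # small p-value: RE inconsistent, prefer fixed effects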

What’s new (2025-26)

More will be covered in Data Analysis with AI 2.

  • Reasoning-First Models: “Thinking before answering” is now standard for complex tasks.
  • Agentic Orchestration: From single chatbots to multi-agent systems coordinating tasks automatically.
  • Native Multimodality: Models trained natively on text, images, audio, and video — no longer “OCR-ing” images but “seeing” chart pixels directly.
  • From Chat to Work Canvas: Moving away from linear chat. Working in shared artifacts (Canvases, Projects).

Current model details: Which AI?

Working with LLMs

Cyborgs vs Centaurs

The Centaur and Cyborg approaches, based on Co-Intelligence: Living and Working with AI by Ethan Mollick

Co-Intelligence

The Jagged Frontier of LLM Capabilities

  • Many tasks could, in principle, be done by an LLM
  • Uncertainty about how well the LLM will do them – the “Jagged Frontier”
  • Some are unexpectedly easy, others surprisingly hard
  • Testing the frontier for data analysis – this class + Data Analysis and AI Lab

Image created with Claude.ai

The Centaur Approach

  • Clear division between human and LLM tasks
  • Strategic task allocation based on strengths
  • Human maintains control and oversight
  • LLM used as a specialized tool
  • Quality through specialization
  • Better for high-stakes decisions

Image created in detailed photorealistic style by Ralph Losey with ChatGPT4 Visual Muse version

The Cyborg Approach

  • Deep integration between human and LLM
  • Continuous interaction and feedback
  • Iterative refinement of outputs
  • Learning from each interaction
  • Faster iteration cycles
  • More creative solutions emerge

Image created in detailed photorealistic style by Ralph Losey with ChatGPT4 Visual Muse version

Analysis Approaches: Centaur vs Cyborg

Plan
  • Centaur 🧑‍💻: 👤 Design research question & identification strategy; 🤖 Suggest variables
  • Cyborg 🦾: 👤🤖 Interactive brainstorming; 👤🤖 Collaborative refinement

Data Prep
  • Centaur 🧑‍💻: 👤 Define cleaning rules; 🤖 Execute cleaning code; 👤 Validate
  • Cyborg 🦾: 👤🤖 Iterative cleaning; 👤🤖 Joint discovery and modification

Analysis
  • Centaur 🧑‍💻: 👤 Choose methods; 🤖 Implement code; 👤 Validate results
  • Cyborg 🦾: 👤🤖 Exploratory conversation; 👤🤖 Dynamic adjustment; 👤🤖 Continuous validation

Reporting
  • Centaur 🧑‍💻: 👤 Outline findings; 🤖 Draft sections; 👤 Finalize
  • Cyborg 🦾: 👤🤖 Co-writing process; 👤🤖 Real-time feedback; 👤🤖 Iterative improvement

The Orchestrator Approach (2026)

  • Humans define high-level goals and intents (start)
  • AI Agents (AI system units) autonomously execute tasks
  • Continuous monitoring and adjustment by humans
  • Focus on strategy and oversight and review (end)
  • Example: “Run a diff-in-diff on state-level policy changes” → agents handle data retrieval, cleaning, estimation, and draft a report

Image created by ChatGPT 5.2

Evolution of Workflow: From Centaur to Orchestrator

Era       Model          Role of Human
2023-24   Centaur        Doer/checker. Half human, half AI. Human writes code, AI fixes it.
2024-25   Cyborg         Integrated. Constant feedback loop.
2026+     Orchestrator   Manager. Human defines intent; agents execute, test, and report.

Prompt(ing): 2023–2025

  • In 2023-24, there was a great deal of belief in prompt engineering as a skill
  • In 2025 there are still useful concepts and ideas 📍 Week 2
  • But not many tricks.
  • For a highly relevant response, provide all important details and context.

Prompting 2026

  • Prompting (User Focus):
    • Design & Interaction.
    • Start general + iterate OR specify strict constraints.
  • Context Engineering (Developer Focus):
    • Success is about what goes into the context window, not just how you phrase the question.

Practical Guidelines

  1. Start with clear task boundaries (Centaur)
  2. Gradually increase integration (Cyborg)
  3. Many workflows combine both approaches
  4. Higher stakes = more control
  5. Always validate critical outputs
  6. Build experience in AI use 📍 this class

Practical Guidelines (2026)

  1. Current LLMs are very good but not perfect
     • Major gains in coding: AI can write 80-90% of code for standard tasks
     • Still some hallucinations and errors
  2. You can now outsource some tasks with light supervision
  3. Cyborg and orchestrator approaches are both in use
  4. Supervision and review remain critical – management skills

Under the Hood

You don’t need to build models. But understanding a few mechanisms helps you use them better.

System Prompts

  • System Prompt: The hidden instruction layer defining the AI’s persona and constraints.
    • User sees: “Analyze this data.”
    • Model sees: “You are a research economist. Use R or Stata. Never interpret correlation as causation. Format output as publication-ready tables. User: Analyze this data.”
  • Role:
    • Ensures consistency (Persona)
    • Safety guardrails (Constitution)
    • Format enforcement (XML/JSON)

See Glossary: System Prompt
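
Schematically, chat APIs receive the system prompt as a separate message stacked before yours. A generic sketch of the usual message format (provider-agnostic; not any specific vendor's API):

# Generic chat-message payload: the system prompt rides along with every request
messages <- list(
  list(role = "system",
       content = "You are a research economist. Use R. Never interpret correlation as causation."),
  list(role = "user",
       content = "Analyze this data.")
)
# The model sees both messages; the chat UI shows you only the user part.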

How Training Improved

Three key methods helped LLMs get much better:

  • RLHF (Reinforcement Learning from Human Feedback): Models learn from human preferences → more helpful, less harmful responses
  • Mixture of Experts (MoE): Only part of the model activates per query → efficiency. This is why models can have trillions of parameters and still respond fast
  • Knowledge Distillation: Smaller “student” models trained to mimic larger “teacher” models → why free tiers can still be decent

See Glossary for RLHF, MoE, and other terms

RAG: Retrieval-Augmented Generation

  • LLMs can handle huge context windows, but you cannot dump everything in
  • RAG: Dynamic retrieval of relevant parts from a large collection of documents
  • The trade-off: Long context is best for deep reasoning over a specific set of files. RAG is best for finding needles across a massive library of information
  • When you upload a codebook to Claude or ChatGPT, the platform may use RAG behind the scenes to find the most relevant parts
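
A toy sketch of the retrieval step: embed the query and the document chunks, rank chunks by cosine similarity, and put only the top matches into the context window. Here embed() is a stand-in that returns random vectors; a real system would call an embedding model.

# Toy RAG retrieval: rank codebook chunks by similarity to the query
cosine <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

set.seed(1)
embed <- function(text) rnorm(8)   # stand-in for a real embedding model
chunks <- c("Q27: trust in institutions", "Q45: income decile", "Q88: voting")
chunk_vecs <- lapply(chunks, embed)
query_vec  <- embed("Which variable measures trust?")

sims <- sapply(chunk_vecs, cosine, b = query_vec)
chunks[order(sims, decreasing = TRUE)][1:2]   # top-2 chunks enter the context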

Tool Use and MCP

  • The Problem: LLMs are isolated from your data (files, databases, tools)
  • The Solution: MCP (Model Context Protocol) — an open standard, like a “USB-C for AI applications”
  • MCP lets your AI connect to local files, GitHub, and databases — no more copy-pasting code into chat
  • Secure by design: user controls what the model can see and do

More in Data Analysis with AI 2

What is an Agent?

An agent is an AI system that can:

  • Receive a goal (“Analyze CPS wage data”)
  • Break it into steps autonomously
  • Use tools (run code, read files, search the web)
  • Evaluate its own output and retry if needed

Think of it as an RA who can work independently overnight. You give the task in the evening, review the output in the morning.

Key difference from chat: In chat, YOU drive every step. With an agent, the AI drives — you set the goal and review.
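
The control flow, reduced to a toy loop. plan(), run_tool(), and looks_ok() below are toy stand-ins for the real LLM and tool calls; the point is the plan–execute–evaluate–retry structure.

# Toy agent loop: plan, execute with tools, self-check, retry on failure
plan     <- function(goal) c("load data", "run regression", "format table")
run_tool <- function(step) list(step = step, ok = runif(1) > 0.3)
looks_ok <- function(result) result$ok

run_agent <- function(goal, max_tries = 3) {
  for (step in plan(goal)) {               # the LLM breaks the goal into steps
    for (attempt in seq_len(max_tries)) {  # the retry loop is the self-correction
      result <- run_tool(step)             # e.g. run code, read a file, search
      if (looks_ok(result)) break
    }
  }
  invisible("done")
}
run_agent("Analyze CPS wage data")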

Single Agent vs Multi-Agent

Single Agent

  • One RA does all steps in sequence
  • Reads your system prompt, cleans data, runs regression, formats table
  • Simpler to set up
  • Good for straightforward tasks
  • Risk: loses focus on long tasks

Multi-Agent System

  • Several specialized RAs, coordinated by a lead
  • “Data Cleaning RA”, “Regression RA”, “Table-Formatting RA”
  • Each sees only what it needs (context isolation)
  • Better quality on complex tasks
  • This is what Claude Code and Cursor do behind the scenes

Agentic AI: A Worked Example

You say to an agent: “Analyze the gender wage gap in CPS data”

1. Agent reads your system prompt
   → knows you want R, tidyverse, robust SEs
2. Agent searches for & downloads CPS extract
3. Agent cleans data (drops missing wages, creates
   log(wage), experience², education dummies)
4. Agent runs Mincer regression: log(wage) ~ female +
   educ + exper + exper²
5. Agent adds controls, checks robustness
6. Agent produces publication-ready table
7. Agent writes a short summary of findings

YOU: Review output, check coefficients make sense, verify sample size, iterate.

The agent does 30 minutes of mechanical work in 2 minutes. Your job shifts from doing to reviewing and directing. This is the Orchestrator model in practice.
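
Step 4 in code: a minimal sketch of what the agent would hand back, assuming a hypothetical cps data frame with these variable names, with robust SEs via sandwich/lmtest:

# Mincer regression from step 4 (hypothetical 'cps' data frame)
library(lmtest)
library(sandwich)

m <- lm(log(wage) ~ female + educ + exper + I(exper^2), data = cps)
coeftest(m, vcov = vcovHC(m, type = "HC1"))   # robust SEs, per the system prompt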

Agentic AI: A Worked Example (Multi-Agent)

Multi-agent system:

1. Root Agent
   → Reads your system prompt
   → Breaks down task into sub-tasks
   → Assigns to specialized agents
2. Data Agent
   → Finds CPS data, cleans it, creates variables
3. Analysis Agent
   → Runs regression, checks assumptions
4. Reporting Agent
   → Formats table, writes summary

These agents are coordinated (orchestrated) by the Root Agent, which also manages the context window and ensures each agent has what it needs to do its job.

Agentic Risks

  • Infinite Loops: Agents getting stuck trying to fix a bug, burning tokens
  • Shadow Operations: An agent spinning up resources and forgetting to turn them off
  • Tool Hallucination: Invoking a tool (e.g., delete_database) that doesn’t exist or using it wrongly
  • Security: Terminal use — never give an agent unrestricted access
  • Silent data changes: An agent could silently drop observations with missing values, changing your sample without telling you

Cost & Token Economics

Why does AI cost what it costs?

  • Input tokens: What you send (prompt + uploaded files). Cheap.
  • Output tokens: What the model generates. More expensive (~3-5x input).
  • Thinking tokens: Reasoning models “think” internally. Expensive – you pay for tokens you never see. (A back-of-the-envelope calculator follows below.)
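
A rough cost calculator. The prices below are illustrative placeholders, not any provider's actual rates; check current pricing pages.

# Rough API cost per request (prices are made-up placeholders, USD per 1M tokens)
cost_usd <- function(in_tok, out_tok, in_price = 3, out_price = 15) {
  (in_tok * in_price + out_tok * out_price) / 1e6
}
cost_usd(in_tok = 50000, out_tok = 2000)        # one codebook-heavy prompt
20 * cost_usd(in_tok = 50000, out_tok = 2000)   # re-uploading it for 20 messages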

Practical implications: save tokens

  • Long conversations cost more → start fresh when switching tasks
  • Uploading a 200-page codebook every message adds up → use Projects to persist context
  • Reasoning models cost 5-20x more than standard → use them for hard tasks
  • Free tiers have daily/weekly limits tied to token budgets

See Glossary: Tokens

What can go wrong

Stochastic Parrot

Image created in detailed photorealistic style by Ralph Losey with ChatGPT4 Visual Muse version

Stochastic Parrots

  • Stochastic = when prompted repeatedly, LLMs may give different answers

  • Parrot = LLMs can repeat information without understanding

  • Philosophy = to what extent do they understand the state of the world?

  • List of words often used by LLMs

Data Analysis

  • To what extent does re-running the same prompt yield the same result? 📍 this class
  • How good are the predictions? 📍 this class

Hallucination: Prediction Errors

Type I Error (False Positive)

  • Generating incorrect but plausible information
  • Example: Inventing a plausible citation – right authors, right journal, but the paper doesn’t exist

Type II Error (False Negative)

  • Failing to generate correct information
  • Example: Missing a key instrumental variables paper relevant to your research question

Economic Impact of errors

  • Cost of verification (humans, AI), risk assessment

Hallucination of references

  • AI suggests references that don’t exist, or facts that are not true
  • This used to be a big problem. It is much smaller now, but still there – and will never fully disappear
  • Newer models are trained to prefer truth over plausibility
  • Newer models search online for facts
  • Content in the context window is followed closely

Big debate on errors and hallucination

The issue is important in medicine

Medical research

Hallucination is reduced: ChatGPT in April 2025

Hallucination is reduced: ChatGPT in January 2026

Reproducibility & Non-determinism

Same prompt, different day → different output.

  • LLMs are stochastic: they sample from probability distributions
  • Run the same regression prompt twice → may get different variable selection advice
  • Temperature setting controls randomness (lower = more deterministic, but never fully)
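
What temperature does, in one toy calculation: it rescales the model's scores before sampling. Low temperature sharpens the distribution toward the top answer; high temperature flattens it (numbers are made up):

# Temperature rescales next-token probabilities before sampling (toy numbers)
softmax_temp <- function(logits, temp) {
  z <- exp(logits / temp)
  z / sum(z)
}
logits <- c(fixed = 2.0, random = 1.0, pooled = 0.2)   # made-up model scores
round(softmax_temp(logits, temp = 0.2), 3)   # low temp: near-deterministic
round(softmax_temp(logits, temp = 1.5), 3)   # high temp: answers vary run to run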

Why this matters for data analysis

  • Your results should be reproducible. AI-assisted results may not be
  • Mitigation: Save the exact code AI generates, not just the prompt
  • Always re-run the code yourself to verify the output
  • Pin your analysis to the generated code, not to the AI conversation

Automation Bias: The Polished Output Trap

AI output looks professional and confident — even when wrong.

  • A beautifully formatted regression table can have the wrong sample size
  • A plausible-sounding causal claim can confuse correlation and causation
  • Clean LaTeX formatting makes everything look “right”

The danger for data analysts

  • The more polished the output, the less carefully we check it
  • This is automation bias: trusting the machine because it looks competent
  • Mitigation: Check the numbers, not the formatting
  • Same skepticism as a junior RA’s first draft

Data Leakage & Privacy

What happens to data you paste into AI chat?

  • Free tiers: Your data may be used for model training (check ToS)
  • Paid tiers: Generally NOT used for training (Claude Pro, ChatGPT Plus)
  • Enterprise: Strongest guarantees — zero data retention

See Which AI? — Security for tier comparisons

For social science researchers

  • Never paste PII (names, addresses, survey respondent IDs) into chat
  • Sensitive survey data (health, income, political views) requires extra care
  • IRB implications: Human subjects – using AI tools may require IRB approval
  • Mitigation: Anonymize before uploading. Use synthetic data for testing.
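
A minimal sketch of pseudonymizing before anything leaves your machine; the survey data frame and its columns are hypothetical:

# Pseudonymize respondent IDs before uploading (hypothetical 'survey' data)
survey <- data.frame(respondent_id = c("HU-1042", "HU-1043"),
                     name   = c("Anna Kiss", "Bela Nagy"),
                     income = c(420000, 310000))

survey$pseudo_id     <- paste0("R", seq_len(nrow(survey)))
survey$name          <- NULL   # drop direct identifiers
survey$respondent_id <- NULL
survey   # safer to share; keep the ID-to-pseudonym key offline, never in chat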

Don’t Outsource Your Learning

AI as power tool

AI is a power tool — not a replacement for thinking.

  • If AI writes all your code, you never learn to debug
  • If AI picks your method, you never learn research design
  • If AI interprets your coefficients, you never build intuition

A practical test

If the AI disappears tomorrow, can you still do your job? If not, you’re relying too much.

Don’t Outsource Your Learning 2

The self-discipline challenge

  • It’s tempting to accept the first answer — resist this
  • Use AI to accelerate, not to skip the learning curve
  • Students who rely too much on AI early struggle when problems get novel
  • The goal: you should be able to do the work without AI, just slower

This is hard

  • Create scenarios where AI is not available

AI use cases

You have already seen many use cases

Some more ideas

Economics research

Some business

AI Use Cases: Student response

Same as in 2025 Q2

  • Code-related tasks: used for debugging, finding errors, generating code for visuals/graphs, optimizing code, and explaining code functionality.
  • Writing and editing: used for proofreading, editing writing, generating references/bibliography lists, checking spelling, and for letters.
  • Information processing: used for summarising content/information, searching for literature or researchers’ names, and finding sources.

New for 2026 Q1

  • Learning and understanding: Used to explain concepts not understood from class, find tutorials for more coverage on topics, address confusion in material, and break down complex assignment texts into clear steps.
  • General assistance: Used as a general “assistant” for tasks like thesis work, projects, data analysis, data cleaning, interview preparation, and translation.

How I use it

Tools

  • All the time: ChatGPT 5.2 (Canvas), Claude 4.5 (Projects), Gemini 3.0 – all paid tiers.
  • I often switch between models
  • GitHub Copilot in VS Code and RStudio (less)
  • Claude Code (a lot) + Claude Cowork

Approach

  • Idea generation and development
  • Code generation and debugging
  • Less so in writing

How I plan to use it

Experimentation

  • OpenRouter and OpenCode
  • Open-weights models (Qwen 3, Kimi) on a local computer
  • Learn more on building agents

What were my bad experiences with AI?

Topics

  • Background work
  • Coding
  • Discussion of topics, results

My bad experience

  • AI-written text typically has
    • good grammar
    • a convincing structure
    • bland and unoriginal content
  • One paragraph or one page is hard to tell apart from human writing
  • Over 10 pages or 10 papers, it is easy to spot

Practical topics

AI in center vs Human in center

AI centric

  • Data cleaning code
  • Formatting tables and charts
  • Explaining error messages
  • Translating between R/Stata/Python
  • First draft of descriptive analysis
  • Literature search (with verification)

Human centric

  • Choosing your research question
  • Causal identification strategy
  • Interpreting coefficients
  • Policy recommendations
  • Ethical judgment calls
  • Anything you can’t verify yourself

AI in center vs Human in center 2

Two approaches

AI-centric Let AI plan, execute, and report. You review and supervise all steps.

Human-centric You think. AI suggests. Iterate on plan. Execution is mixed. Full review, multiple rounds likely.

Rule of thumb

Rule of thumb: If you can’t tell whether the AI output is right or wrong, you shouldn’t be using AI for that task yet.

Deep Research Mode

All major platforms now offer a “deep research” or “research” mode:

  • AI spends minutes (not seconds) on your question
  • Searches dozens of sources, synthesizes, cites
  • Produces a structured report with references

For economics students

  • Literature review: “What papers use diff-in-diff to study minimum wage effects after 2020?”
  • Dataset discovery: “What publicly available panel datasets cover firm-level innovation in Europe?”
  • Prior art check: “Has anyone analyzed the effect of X on Y using Z method?”

Caveats

  • Always verify the citations exist (hallucination risk!)
  • Better as a starting point than a final answer
  • Great for thesis brainstorming and background research

Ethics and Law

Ethics

AI was created by using (stealing?) human knowledge

Is it okay to use “Everything” as training material?

AI in research

Use of Artificial Intelligence in AER

Note

Artificial intelligence software, such as chatbots or other large language models, may not be listed as an author. If artificial intelligence software was used in the preparation of the manuscript, including drafting or editing text, this must be briefly described during the submission process, which will help us understand how authors are or are not using chatbots or other forms of artificial intelligence. Authors are solely accountable for, and must thoroughly fact-check, outputs created with the help of artificial intelligence software.

AI in research: Elsevier

Two key points from Elsevier’s generative AI policies for journals

  • report for transparency
  • supervise, take responsibility

Use of Artificial Intelligence in classes

You gotta stay a learning human

Conclusions and discussion

AI is widely adopted in business

Source: McKinsey Digital

To learn more

Gabor’s current take I

Should study

  • You have to learn stuff even if AI can also do it.
    • Good writing
    • Core coding
  • Be a well-rounded, educated human
  • Because to supervise AI you need to know what to look for

Use of AI – need to report?

  • My view in 2024. Report what you have done
  • My view in 2025. No need to report, AI is now like internet search (or electricity)

Gabor’s current take II

Your place with AI

  • AI as input, supervision, debugging, responsibility.

  • Without core knowledge you can’t interact

  • Strong knowledge and experience help with debugging

Future: more opportunities

  • Cheaper data analysis = more use cases

Status

  • This is version 0.6.1
  • Last updated: 2026-02-16

bekesg@ceu.edu