Doing Data Analysis with AI

A short course

Author

Gábor Békés, Central European University (Austria, EU)

Published

May 26, 2025

Gabor’s Data Analysis with AI

What’s this

This course will equip students who are already versed in core data analysis methods with the experience to harness AI technologies to improve productivity (yes, this is a classic LLM sentence). But, yeah, the idea is to help students who have studied data analysis / econometrics / quantitative methods think about how to include AI in their analytics routine, and to spend time sharing experiences.

As AI becomes more and more powerful, it is also important to provide a platform to discuss human agency in data analysis. So a key element of the course is for the instructor to lead discussions on the role of AI and humans in various aspects of data analysis.

Course description

Content

The course focuses on using large language models (LLMs), such as OpenAI’s ChatGPT, Anthropic’s Claude.ai, Mistral’s Le Chat, and Google’s Gemini, to carry out tasks in data analysis. It includes topics like data extraction and wrangling, data exploration and descriptive statistics, and creating reports, as well as turning text into data.

We use three case studies: (1) a simulated set of data tables on hotels in Austria, (2) the World Values Survey, and (3) a series of interview texts.

The course material includes weekly practice assignments.

Background

You need a background in data analysis / econometrics; a good introductory course is enough. I, of course, suggest Chapters 1-12 and 19 of Data Analysis for Business, Economics and Policy (Cambridge UP, 2021). Full slideshows, data, and code are open source, but consider buying the book. In particular, the course builds on Chapters 1-6, 7-10, and 19 of Data Analysis, but any other introductory econometrics course plus basic data science knowledge is OK.

Students are expected to have some basic coding knowledge in Python or R (Stata is also mostly fine).

Relevance

AI is everywhere and has become essential: most analytic work will be using it. It’s like the Internet a while back. It does not solve all problems, but almost all intellectual tasks will rely on inputs from it.

Learning Outcomes

Key outcomes. By the end of the course, students will be able to

  • Use genAI with experience and confidence to carry out key tasks in data analysis.
  • Build AI into their coding practice, including data wrangling, description, reporting, and text analysis.
  • Have a sense of the use cases where AI assistance is (1) greatly useful, (2) helpful, (3) currently problematic.
  • Have a sense of the use cases where AI assistance is OK to use as is vs. where it needs strong human supervision.
  • Know which resources to follow for updates.

Target audience

This course is aimed at 3rd-year (maybe 2nd-year) BA and MA students in any program with the required background: Economics, Quantitative Social Science, Political Science, Sociology, History. To be frank, all students should learn data analysis and be comfortable using AI.

But anyone with an adequate background can use it.

Assignments

Assignments are available for all classes.

Important to note for assignments:

  • Use AI, but do not submit something that was created by AI. AI is your assistant.
  • One of the goals of the course is to practice this.

Week01: LLM Review

What are LLMs, and how does the magic happen? A brief, non-technical intro. How to work with LLMs, plus ideas on applications. Includes suggested readings, podcasts, and videos.

Content

Which AI? See my take on current models (as of May 2025).

Week02: Data and code discovery and documentation with AI

Learn how to write clear and professional code and data documentation. LLMs are a great help once you know the basics.

Case study: World Values Survey. Data is at WVS

Content

Week03: Writing Reports

You have your data and task, and need to write a short report. We compare different ways of working with an LLM, from a one-shot prompt to iteration.

Case study: World Values Survey. Data is at WVS

Content

Week04: Data wrangling, joining tables

When asked what I should add to my textbook, David Card, the Nobel Prize-winning empirical economist, told me that I should spend time on joining tables. Here we go.

Case study: simulated Austrian hotels. Data is at hotels
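
To make "joining tables" concrete, here is a minimal sketch using only Python's built-in sqlite3 module. The two tables, their column names, and all values are hypothetical and made up for illustration; they are not the course's simulated hotel data, and the class itself may well use other tools. The sketch shows a left join that keeps every hotel, plus a check for bookings that match no hotel.

```python
import sqlite3

# Hypothetical toy tables, invented for illustration only
# (not the course's simulated Austrian hotels data).
hotels = [(1, "Alpenblick", "Vienna"), (2, "Donauhof", "Salzburg"), (3, "Bergzeit", "Graz")]
bookings = [(101, 1, 2), (102, 1, 3), (103, 2, 1), (104, 9, 4)]  # hotel_id 9 has no match

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE hotels (hotel_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
con.execute("CREATE TABLE bookings (booking_id INTEGER PRIMARY KEY, hotel_id INTEGER, nights INTEGER)")
con.executemany("INSERT INTO hotels VALUES (?, ?, ?)", hotels)
con.executemany("INSERT INTO bookings VALUES (?, ?, ?)", bookings)

# Left join: keep every hotel, even those without any bookings.
left_join = con.execute(
    """
    SELECT h.name,
           COUNT(b.booking_id)        AS n_bookings,
           COALESCE(SUM(b.nights), 0) AS nights
    FROM hotels AS h
    LEFT JOIN bookings AS b ON b.hotel_id = h.hotel_id
    GROUP BY h.hotel_id
    ORDER BY h.hotel_id
    """
).fetchall()

# Anti-join check: bookings whose hotel_id is missing from the hotels table.
orphans = con.execute(
    """
    SELECT b.booking_id
    FROM bookings AS b
    LEFT JOIN hotels AS h ON h.hotel_id = b.hotel_id
    WHERE h.hotel_id IS NULL
    """
).fetchall()

print(left_join)
print(orphans)
```

Checking for unmatched keys before and after a join is a cheap habit, and it is the kind of step worth verifying yourself when an AI assistant writes the join for you.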

Content

Week05: Text as data 1 – intro lecture

No course of mine can escape football (soccer). Here we look at post-game interviews to learn the basics of text analysis and apply LLMs to what they do best: context-dependent learning. This is a two-class series. The first class is more of an intro to natural language processing.

Case study: football post-game interviews. Data is at interviews
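
As a rough idea of what the basics of text analysis can look like before any LLM is involved, here is a minimal sketch of tokenizing and counting words with the Python standard library. The interview snippets and the tiny stop-word list are invented for illustration; they are assumptions, not the course's interview data or its actual method.

```python
import re
from collections import Counter

# Hypothetical interview snippets, invented for illustration only.
interviews = [
    "We played well, but we did not take our chances.",
    "The team played hard. We deserved the win today.",
]

# A tiny, hand-picked stop-word list (an assumption, not a standard list).
STOP_WORDS = {"we", "the", "but", "did", "not", "our", "a", "an", "and"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens (apostrophes kept)."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts(texts: list[str]) -> Counter:
    """Count tokens across all texts, skipping stop words."""
    counts = Counter()
    for text in texts:
        counts.update(t for t in tokenize(text) if t not in STOP_WORDS)
    return counts


counts = word_counts(interviews)
print(counts.most_common(3))
```

Simple counts like these ignore context entirely, which is exactly the gap that context-dependent models are meant to fill.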

Content

Week06: Text as data 2 – practice

Second class: now we are in action. How do LLMs compare to humans?

Case study: football post-game interviews. Data is at interviews

Content

Week07: Creating simulations with apps and dashboards

TBA

Week08: AI as research companion

TBA

Learn more

I’m adding material to the learn-more folder. You can start with the beyond page.


Rights and acknowledgement

You can use it to teach and learn freely.

Attribution: Békés, Gábor: “Doing Data Analysis with AI: a short course”, available at github.com/gabors-data-analysis/da-w-ai/, v0.5, 2025-05-14

License: CC BY-NC-SA 4.0 – share, attribute, non-commercial (contact me for corporate gigs)

Textbook: Please check out the textbook behind all this, and buy it if you can. If you are interested in teaching from it, contact Cambridge UP or me.

Thanks

Thanks: Developed mostly by me, Gábor Békés. Thanks a million to the two wonderful human RAs, Ms Zsuzsanna Vadle and Mr Kenneth Colombe, both PhD students. Thanks to Claude.ai, which helped a great deal in creating the simulated dataset. ChatGPT and Claude.ai helped create the slideshows and educated me on NLP. This is a beautiful example of collaboration with great young people while heavily benefiting from advanced AI.

Thanks to CEU’s teaching grant, which allowed me to pay people and AI.

Questions and suggestions

This material is based on my course at CEU in Vienna, Austria.

If you have questions or suggestions, or are interested in learning more, just fill in this form.

And now, this.

AI use is very costly in terms of energy. Yes, it is becoming cheaper. But humanity is also using much more of it.