Week 05: using text as data

## Overview

In this lesson, students will be introduced to sentiment analysis, specifically applied to evaluating general positivity or negativity in football managers' statements about match outcomes.

### Learning Outcomes

By the end of the session, students will:

- Gain hands-on experience with sentiment analysis.
- Understand the complexities and limitations of sentiment analysis.

### Materials

- General sentiment (positive/negative) rating scale [HERE](/week05/assets/sentiment-scale.md)
- CSV files:
  - `student_test_5`: download [HERE as xlsx](/week05/assets/student_test_5.xlsx) or [HERE as csv](/week05/assets/student_test_5.csv) (or from Moodle)

## Assignment review

* Fancy graphs != good graphs (good graph <- careful design)
* Precise interpretation >> BS
* Less is more
* Show only what you understand deeply

## Lecture: NLP basics

- **Topic:** Introduction to Sentiment Analysis
- **Key points:**
  - Importance of text analysis and its applications
  - Introduction to Natural Language Processing (NLP): definition and applications
  - Key concepts in text analysis:
    - Tokenization
    - Preprocessing techniques
    - Feature extraction
  - Sentiment analysis: detecting emotion and tone in text
  - Practical examples from football managers' post-match interviews
  - Limitations and challenges in text analysis, emphasizing contextual interpretation and ambiguity

[Slides](https://gabors-data-analysis.com/courses/da-w-ai-2025/da-w-ai-05-text-to-data#/title-slide)

[domain lexicon](/data/interviews/domain_lexicon.csv)

## Practical Activity

### Manual vs AI Sentiment Analysis Activity

- **Objective:** Practice manually rating football manager statements as positive or negative.
- **Steps:**
  1. Review the general sentiment rating scale provided [HERE](/week05/assets/sentiment-scale.md)
  2. Individually analyze and rate the **5 provided test statements** from `student_test_5.csv`.
  3. Now use AI to rate them.
  4. Try to build a better domain lexicon.

Discuss your experience: how AI helps and what could go wrong.
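
To make the tokenization and domain-lexicon ideas concrete, here is a minimal sketch of lexicon-based sentiment scoring in Python. Everything in it is an assumption made for illustration: the words and weights in `LEXICON` are hypothetical and are not taken from the course's `domain_lexicon.csv`, the example statements are invented, and the one-word negation rule is a deliberately crude stand-in for real preprocessing.

```python
import re

# Hypothetical mini-lexicon (word -> score). Illustrative only;
# these entries are NOT the contents of the course's domain_lexicon.csv.
LEXICON = {
    "great": 1.0,
    "proud": 1.0,
    "deserved": 0.5,
    "disappointed": -1.0,
    "poor": -1.0,
    "sloppy": -0.5,
}

# Words that flip the sign of the token that immediately follows them.
NEGATIONS = {"not", "no", "never"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_score(text, lexicon=LEXICON):
    """Sum lexicon scores over tokens, flipping the sign right after a negation."""
    tokens = tokenize(text)
    total = 0.0
    for i, token in enumerate(tokens):
        score = lexicon.get(token, 0.0)  # unknown words contribute nothing
        if i > 0 and tokens[i - 1] in NEGATIONS:
            score = -score
        total += score
    return total


def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Invented example statements, not taken from the course data.
statements = [
    "We were great today, I am proud of the players.",
    "I am not disappointed with the point.",
    "We were sloppy and poor in the second half.",
    "We move on to the next game.",
]

for statement in statements:
    score = lexicon_score(statement)
    print(f"{label(score):>8}  {score:+.1f}  {statement}")
```

The last statement scores zero because none of its words are in the lexicon, which is one way this approach can go wrong: coverage depends entirely on the word list, and sarcasm or context is invisible to it.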
### Prediction of score

* Modeling choices of results
* Think about *how* you would do it first
* Check how AI thinks about it: rate the examples and look at the explanations
* Take the 5 examples and compare your predictions with the AI predictions

### Discussion: Validation and Sentiment Analysis

- **Objective:** Discuss validation techniques used in sentiment analysis.
- **Topics for discussion:**
  - Differences between manual and AI ratings
  - Ground truth
  - Introduction to validation methods:
    - If there is ground truth: build a confusion matrix and calculate accuracy
    - If there is no ground truth: measure **agreement** between humans and AI, and test the difference
  - AI is average, but...
  - AI with persona?
  - AI biased?

## End of Week Discussion points

* How precise is AI in sentiment analysis?
* How did *you* compare to AI in terms of scores? How did any difference make you feel?
* Can you think of a past project where AI could have helped you upgrade it?
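
The validation methods named in the discussion on validation (confusion matrix and accuracy when ground truth exists, agreement when it does not) can be sketched in plain Python. The ratings below are invented for illustration, and Cohen's kappa is used here as one common chance-corrected agreement measure; it is an assumption, not necessarily the measure used in the course.

```python
from collections import Counter


def confusion_matrix(truth, predicted, labels):
    """Rows are true labels, columns are predicted labels, in the order of `labels`."""
    counts = Counter(zip(truth, predicted))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(truth, predicted):
    """Share of items where the prediction equals the true label."""
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)


def cohen_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for agreement expected by chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    all_labels = set(rater_a) | set(rater_b)
    expected = sum(freq_a[lab] * freq_b[lab] for lab in all_labels) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented ratings for five statements (not real course data).
ground_truth = ["pos", "neg", "pos", "neg", "pos"]
ai_rating = ["pos", "neg", "neg", "neg", "pos"]
human_rating = ["pos", "neg", "pos", "pos", "pos"]

# Case 1: ground truth is available.
print(confusion_matrix(ground_truth, ai_rating, labels=["pos", "neg"]))
print(accuracy(ground_truth, ai_rating))

# Case 2: no ground truth, so compare the human and AI ratings directly.
print(round(cohen_kappa(human_rating, ai_rating), 3))
```

With only five statements these numbers are very noisy, so they are better treated as a starting point for discussion than as a verdict on the AI or the human rater.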