Interview Analysis Code

## Overview Code examples for sentiment analysis of football manager interviews, supporting both R and Python workflows. **Related to:** Week 5-6 (Text as Data) ## API Setup ### `api-key.R` Template for setting up OpenAI API credentials in R. ```r # Set your API key Sys.setenv(OPENAI_API_KEY = "sk-....") ``` **Important:** Copy to `my-openai-api-key.R` and add to `.gitignore` ## R Scripts ### `sentiment-analysis.R` Complete sentiment analysis workflow using OpenAI API. **Features:** - API key setup and validation - Batch processing with progress tracking - Sentiment classification function (-2 to +2 scale) - Error handling and retry logic - Comparison between multiple API runs **Key functions:** - `classify_text()`: Sends text to OpenAI for sentiment rating - Handles API errors with exponential backoff - Saves results to CSV for further analysis ### `domain_lexicon.r` Creates football-specific sentiment lexicon. **Output:** `domain_lexicon.csv` with: - 100+ football-specific terms - Positive terms (goals, win, excellent): scores 1.2-1.9 - Negative terms (lose, poor, mistake): scores -1.9 to -1.2 - Neutral terms (match, team, play): score 0 ## Python Scripts ### `classify_manager_sentiment.py` Python equivalent of R sentiment analysis. **Features:** - OpenAI API integration using `openai` library - Logging and progress tracking with `tqdm` - Robust error handling - Same rating guidelines as R version **Dependencies:** ```python pip install openai pandas tqdm python-dotenv ``` ### `manager_sentiment_api.md` Documentation and code template for Python API usage. ## Usage Notes ### API Key Management - Store keys in environment variables - Never commit keys to version control - Budget ~$5 for course exercises ### Rate Limiting - Both scripts include retry logic - Exponential backoff for failed requests - Monitor usage through OpenAI dashboard ### Reproducibility - Set `temperature=0` for consistent results - Save intermediate results - Document model versions used ## Output Files Scripts generate: - `manager_sentiment_results.csv`: Individual text ratings - `sentiment_comparison.csv`: Comparison between runs - `classification.log`: Detailed processing logs ## Integration with Course Data These scripts work with: - `/data/interviews/interview-texts-only.xlsx` (input) - `/data/interviews/domain_lexicon.csv` (lexicon) - Student rating files for comparison analysis

Gábor Békés and Gábor Kézdi