## Overview
Code examples for sentiment analysis of football manager interviews, supporting both R and Python workflows.
**Related to:** Week 5-6 (Text as Data)
## API Setup
### `api-key.R`
Template for setting up OpenAI API credentials in R.
```r
# Set your API key
Sys.setenv(OPENAI_API_KEY = "sk-....")
```
**Important:** Copy this template to `my-openai-api-key.R`, put your real key in the copy, and add the copy to `.gitignore` so it is never committed.
## R Scripts
### `sentiment-analysis.R`
Complete sentiment analysis workflow using OpenAI API.
**Features:**
- API key setup and validation
- Batch processing with progress tracking
- Sentiment classification function (-2 to +2 scale)
- Error handling and retry logic
- Comparison between multiple API runs
**Key functions and behavior:**
- `classify_text()`: sends text to the OpenAI API for a sentiment rating
- API errors are retried with exponential backoff
- Results are saved to CSV for further analysis
### `domain_lexicon.r`
Creates football-specific sentiment lexicon.
**Output:** `domain_lexicon.csv` with:
- 100+ football-specific terms
- Positive terms (goals, win, excellent): scores 1.2 to 1.9
- Negative terms (lose, poor, mistake): scores -1.9 to -1.2
- Neutral terms (match, team, play): score 0
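
To illustrate how a lexicon like this can be applied, here is a minimal Python sketch that averages the scores of matched words. The function name `lexicon_score` and the dictionary are hypothetical: the terms are taken from the examples above, but the exact scores are assumed values inside the documented ranges, not values read from `domain_lexicon.csv`.

```python
import re

# Illustrative entries only: terms come from the examples above, but the
# exact scores are assumed values inside the documented ranges.
EXAMPLE_LEXICON = {
    "win": 1.5,
    "excellent": 1.9,
    "lose": -1.5,
    "mistake": -1.2,
    "match": 0.0,
    "team": 0.0,
}


def lexicon_score(text, lexicon):
    """Return the mean score of tokens found in the lexicon, or 0.0 if none match."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[t] for t in tokens if t in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


print(lexicon_score("An excellent win for the team", EXAMPLE_LEXICON))
```

Averaging over matched tokens means neutral terms pull the score toward zero. Whether that is desirable depends on the analysis, and the actual script may aggregate differently.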
## Python Scripts
### `classify_manager_sentiment.py`
Python equivalent of the R sentiment analysis workflow in `sentiment-analysis.R`.
**Features:**
- OpenAI API integration using `openai` library
- Logging and progress tracking with `tqdm`
- Robust error handling
- Same rating guidelines as R version
**Dependencies:**
```bash
pip install openai pandas tqdm python-dotenv
```
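
Because the rating scale runs from -2 to +2, it can help to validate whatever the model returns before storing it. The sketch below is an assumption about how that might be done, not the script's actual code: `parse_rating` is a hypothetical helper, and it assumes the reply contains the rating as its first integer.

```python
import re

# The five ratings allowed by the -2 to +2 scale.
VALID_RATINGS = {-2, -1, 0, 1, 2}


def parse_rating(reply):
    """Return the first integer in `reply` if it is a valid rating, else None."""
    match = re.search(r"[-+]?\d+", reply)
    if match is None:
        return None
    value = int(match.group())
    return value if value in VALID_RATINGS else None


print(parse_rating("Rating: -1"))
```

Returning `None` for unparseable or out-of-range replies lets the caller decide whether to retry the request or flag the text for manual review.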
### `manager_sentiment_api.md`
Documentation and code template for Python API usage.
## Usage Notes
### API Key Management
- Store keys in environment variables
- Never commit keys to version control
- Budget ~$5 for course exercises
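
As a minimal sketch of the first point, a Python script can read the key from the `OPENAI_API_KEY` environment variable and fail early with a clear message if it is missing. The helper name `get_api_key` is hypothetical, and the `env` parameter exists only so the function can be exercised without touching the real environment.

```python
import os


def get_api_key(env=os.environ):
    """Return the OPENAI_API_KEY value, or raise a clear error if it is unset."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it or load it from a local .env file"
        )
    return key
```

Checking for the key once at startup avoids discovering a missing credential partway through a batch of requests.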
### Rate Limiting
- Both the R and Python classification scripts include retry logic
- Failed requests are retried with exponential backoff
- Monitor usage through OpenAI dashboard
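
Exponential backoff can be sketched in a few lines. This is a generic illustration and not the code from either script: `with_backoff`, its parameters, and the small random jitter are all assumptions. The `sleep` argument is injectable so the behavior can be checked without actually waiting.

```python
import random
import time


def with_backoff(call, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Run `call()` and retry on any exception, doubling the wait each time."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the last error
            # Wait base_delay * 2^attempt seconds plus a little random jitter.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
```

A caller would wrap a single request, for example `with_backoff(lambda: send_request(text))`, where `send_request` stands in for whatever function performs the API call. In practice it is usually better to retry only on rate-limit or transient network errors instead of on every exception.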
### Reproducibility
- Set `temperature=0` to reduce run-to-run variation (outputs can still differ slightly between runs)
- Save intermediate results
- Document model versions used
## Output Files
Scripts generate:
- `manager_sentiment_results.csv`: Individual text ratings
- `sentiment_comparison.csv`: Comparison between runs
- `classification.log`: Detailed processing logs
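
One simple way to compare two runs is the share of texts that received the same rating in both. The sketch below assumes each run is available as a mapping from a text identifier to a rating. The function name `agreement_rate` and the identifiers are hypothetical, and the scripts may compute a different comparison for `sentiment_comparison.csv`.

```python
def agreement_rate(run_a, run_b):
    """Share of texts rated in both runs that received the same rating."""
    shared = run_a.keys() & run_b.keys()
    if not shared:
        return 0.0
    same = sum(1 for text_id in shared if run_a[text_id] == run_b[text_id])
    return same / len(shared)


# Made-up ratings for three hypothetical texts, for illustration only.
first_run = {"t1": 1, "t2": -2, "t3": 0}
second_run = {"t1": 1, "t2": -1, "t3": 0}
print(agreement_rate(first_run, second_run))
```

Exact agreement is a strict measure on a five-point scale. Mean absolute difference or a rank correlation may be more informative when ratings differ by only one step.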
## Integration with Course Data
These scripts work with:
- `/data/interviews/interview-texts-only.xlsx` (input)
- `/data/interviews/domain_lexicon.csv` (lexicon)
- Student rating files for comparison analysis