creating-code-sentiment-analysis

Writing code to run sentiment analysis via APIs.

Here are some prompts we created. Follow them. Need debugging.

First download interview-texts-only.xlsx from data/interviews

Write a Python script for sentiment analysis of post-match manager quotes using OpenAI’s GPT API. Follow these detailed specifications:

🎯 Task Overview

  • Classify each quote on a sentiment scale from -2 to 2 based on the manager’s tone, not the game result.

  • The script should process the xlsx file and produce a new CSV with sentiment scores.

📁 Input File

  • Load interview-texts-only.xlsx .

  • Use utf-8 encoding.

  • Skip malformed lines when reading.

🧠 GPT Classification Logic

  • Use OpenAI’s gpt-4.1-2025-04-14 model (or fallback-compatible GPT-4 model).

  • Construct a detailed prompt that:

Task:

Please read each text carefully and rate the overall sentiment of the manager’s statement as positive or negative. Your rating should reflect the manager’s expressed tone, not your judgment of the match.

Rating Scale:

Score Meaning
2 Strongly positive sentiment (clear optimism, satisfaction, praise).
1 Mildly positive sentiment (generally positive, slight reservations).
0 Neutral or unclear sentiment.
-1 Mildly negative sentiment (general disappointment, frustration).
-2 Strongly negative sentiment (clear criticism, significant disappointment).

Final Notes:

  • Use 0 if unsure or if sentiment is mixed without clear dominance.
  • Retry up to 3 times if API call fails, using exponential backoff (e.g., wait 1s, 2s, 4s).

  • If the response is not an integer, return None.

🔐 API Authentication

  • Use the openai Python package (v1+).

  • Load API key securely from .env file with key OPENAI_API_KEY.

  • If the key is missing, exit with an error.

🧪 Processing & Output

  • Iterate through all rows using tqdm for progress.

  • For each row, call the GPT model and collect the result.

  • Save results (text_id, score) to manager_sentiment_results.csv.

  • At end, print and log score frequency table.

🪵 Logging

  • Use Python’s logging module.

  • Configure it to:

    • Log to both console (stdout) and to file classification.log (overwrite mode).

    • Use format: [timestamp] [level] [message]

    • Set level to INFO.

  • Log key events:

    • Start of API call

    • API responses

    • Number of rows processed

    • Missing columns

    • Final distribution of scores

📦 Dependencies

  • pandas

  • tqdm

  • openai

  • dotenv

🔁 Extras

  • Use functions for:

    • Loading CSV,

    • Classifying text,

    • Error handling.

  • Ensure script can be run directly with #!/usr/bin/env python3 at the top.

Let me know if you want me to generate the exact prompt message string or turn this into a README section.