ahsankhizar5/text-sentiment-analysis

🎭 Text Sentiment Analysis

A machine learning pipeline to classify IMDB movie reviews as positive or negative using NLP preprocessing, TF-IDF vectorization, and a Logistic Regression model.


🧠 Overview

This project builds a text sentiment analysis model using the IMDB Reviews Dataset. The pipeline covers text preprocessing, TF-IDF vectorization, model training, evaluation, and live prediction on user-supplied reviews.
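To make the stages above concrete, here is a minimal end-to-end sketch using scikit-learn. It is not the project's actual script: the four toy reviews, the variable names, and the use of `make_pipeline` are assumptions chosen to keep the example small and self-contained.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny hand-written stand-in for the IMDB reviews (illustrative only).
reviews = [
    "a wonderful, moving film with great acting",
    "great story and wonderful performances",
    "terrible plot and awful acting",
    "an awful, boring waste of time",
]
labels = ["positive", "positive", "negative", "negative"]

# Vectorization and classification chained into a single estimator.
pipeline = make_pipeline(TfidfVectorizer(lowercase=True), LogisticRegression())
pipeline.fit(reviews, labels)

# Live prediction on an unseen review.
prediction = pipeline.predict(["awful and boring"])[0]
print(prediction)
```

With the real dataset, the toy lists would be replaced by the review texts and sentiment labels loaded from the CSV file.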


📁 Dataset

This project uses the IMDB Reviews Dataset:

➡️ Download from Kaggle

Steps:

  1. Download and extract the dataset.
  2. Make sure the file is named IMDB_Dataset.csv, renaming it if necessary.
  3. Place it in the project root directory.

⚠️ The dataset is not included in the repo due to GitHub file size limits.
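Once the file is in place, loading it might look like the sketch below. The column names `review` and `sentiment` are an assumption about the downloaded CSV, and `load_reviews` is a hypothetical helper, not a function from this repo.

```python
import pandas as pd


def load_reviews(csv_path="IMDB_Dataset.csv"):
    """Load the reviews CSV and return (texts, labels) as two lists."""
    df = pd.read_csv(csv_path)
    # Assumed column names; adjust them to match your download.
    missing = {"review", "sentiment"} - set(df.columns)
    if missing:
        raise ValueError(f"Missing expected columns: {sorted(missing)}")
    # Drop rows where either the text or the label is absent.
    df = df.dropna(subset=["review", "sentiment"])
    return df["review"].tolist(), df["sentiment"].tolist()
```

Failing early on unexpected columns gives a clearer error than a `KeyError` later in the pipeline.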


🛠️ Tech Stack

  • Python 3
  • Pandas
  • Scikit-learn
  • NLTK
  • TF-IDF Vectorizer
  • Logistic Regression

🚀 Features

  • Clean and normalize text using NLP techniques
  • Convert reviews into numerical features using TF-IDF
  • Train and evaluate a logistic regression model
  • Save trained model and vectorizer for reuse
  • Predict sentiment of custom reviews in real time

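As an illustration of the cleaning and normalization step listed above, here is a minimal sketch. The exact steps used in `sentiment_analysis.py` are not documented here, so the rules below (lowercasing, stripping HTML tags and non-letters, removing stop words) are assumptions. The small `STOP_WORDS` set is a stand-in for the much larger English stop-word list that NLTK provides.

```python
import re

# Small illustrative stop-word set (an assumption); NLTK's English list
# is considerably larger.
STOP_WORDS = {"a", "an", "the", "is", "was", "it", "this", "to", "of", "and"}


def clean_text(text):
    """Lowercase, strip HTML tags and non-letters, and drop stop words."""
    text = re.sub(r"<[^>]+>", " ", text.lower())  # remove tags such as <br />
    text = re.sub(r"[^a-z\s]", " ", text)         # keep letters and whitespace
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


print(clean_text("This movie was GREAT!<br />Loved it."))
```

The cleaned strings can then be passed to the TF-IDF vectorizer in place of the raw reviews.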
📊 Results

| Metric   | Score  |
|----------|--------|
| Accuracy | 85.13% |
| F1-Score | 85%    |
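For reference, metrics of this kind can be computed with scikit-learn as sketched below. The labels are made-up values for illustration, not the project's test split, and the README does not state which F1 averaging was used, so the binary F1 with `pos_label="positive"` is an assumption.

```python
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical ground-truth and predicted labels (illustrative only).
y_true = ["positive", "negative", "positive", "negative", "positive"]
y_pred = ["positive", "negative", "negative", "negative", "positive"]

accuracy = accuracy_score(y_true, y_pred)
# Binary F1 for the "positive" class; the averaging choice is an assumption.
f1 = f1_score(y_true, y_pred, pos_label="positive")

print(f"Accuracy: {accuracy:.2%}")
print(f"F1-Score: {f1:.2%}")
```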

📦 Project Structure


text-sentiment-analysis/
├── IMDB_Dataset.csv
├── sentiment_analysis.py
├── sentiment_model.pkl
├── tfidf_vectorizer.pkl
└── README.md
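The two `.pkl` files hold the saved model and vectorizer. A minimal sketch of how such files could be written and read with Python's `pickle` module follows. Whether the script uses `pickle` or another serializer such as `joblib` is not stated, and `save_artifacts` and `load_artifacts` are hypothetical helper names.

```python
import pickle


def save_artifacts(model, vectorizer,
                   model_path="sentiment_model.pkl",
                   vectorizer_path="tfidf_vectorizer.pkl"):
    """Write the fitted model and vectorizer to disk."""
    with open(model_path, "wb") as fh:
        pickle.dump(model, fh)
    with open(vectorizer_path, "wb") as fh:
        pickle.dump(vectorizer, fh)


def load_artifacts(model_path="sentiment_model.pkl",
                   vectorizer_path="tfidf_vectorizer.pkl"):
    """Read the model and vectorizer back from disk."""
    with open(model_path, "rb") as fh:
        model = pickle.load(fh)
    with open(vectorizer_path, "rb") as fh:
        vectorizer = pickle.load(fh)
    return model, vectorizer
```

The vectorizer has to be saved alongside the model because new reviews must be transformed with the same vocabulary and weights the model was trained on. Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.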


▶️ Getting Started

  1. Clone the repo

    git clone https://github.com/ahsankhizar5/text-sentiment-analysis.git
    cd text-sentiment-analysis
  2. Install dependencies

    pip install -r requirements.txt
  3. Run the script

    python sentiment_analysis.py
  4. Enter your own review for live prediction!


📌 Example

📝 Try your own review:
Enter a movie review: this seems to be bad one
Predicted Sentiment: Negative 😞
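The prediction step behind output like this can be sketched as follows. This is an assumption-laden illustration rather than the repo's code: `predict_sentiment` is a hypothetical helper, the toy training data stands in for the IMDB reviews, the label strings are assumed to be `positive` and `negative`, and the emoji used for the positive case is a guess (only the negative one appears above).

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def predict_sentiment(review, model, vectorizer):
    """Return a display string for the predicted sentiment of one review."""
    features = vectorizer.transform([review])  # reuse the fitted vocabulary
    label = str(model.predict(features)[0]).lower()
    # Label strings and the positive emoji are assumptions.
    return "Positive 😊" if label == "positive" else "Negative 😞"


# Toy training data standing in for the real dataset (illustrative only).
train_texts = ["good fun movie", "really good acting",
               "bad boring movie", "really bad acting"]
train_labels = ["positive", "positive", "negative", "negative"]

vectorizer = TfidfVectorizer()
model = LogisticRegression()
model.fit(vectorizer.fit_transform(train_texts), train_labels)

result = predict_sentiment("this seems to be bad one", model, vectorizer)
print("Predicted Sentiment:", result)
```

In an interactive script, the review text would typically come from `input("Enter a movie review: ")`, and the model and vectorizer would be loaded from the saved `.pkl` files instead of being trained on the spot.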

📑 License

MIT License


🤝 Contact

For queries or collaboration, feel free to reach out: Ahsan Khizar (GitHub, LinkedIn)

“Code is not just about solving problems. It’s about building trust, clarity, and real-world impact — one line at a time.” — Ahsan Khizar
