AymenGabsi/rag_deepseek_groq_doc_qa

🐋 DeepSeek-R1 - Document RAG (Retrieval-Augmented Generation)

📌 Overview

This project implements a Retrieval-Augmented Generation (RAG) system for document-based question answering using:

  • DeepSeek-R1 (via Groq) for fast & efficient AI-powered responses.
  • ChromaDB for storing and retrieving document embeddings.
  • HuggingFace Embeddings for vectorization.
  • Streamlit for an interactive UI.

🔹 Users can upload a PDF, process it into a vector database, and query it for answers.


🚀 Features

  • 📄 Upload and process PDF documents into a vector store.
  • 🔍 Retrieve relevant document sections using ChromaDB.
  • 💬 Generate AI-powered answers using DeepSeek-R1 via Groq.
  • 🎛 Interactive UI built with Streamlit.

📂 Project Structure

/project-folder
│-- main.py                # Streamlit UI for user interaction
│-- rag_utility.py         # Core logic for document processing & retrieval
│-- config.json            # Stores the Groq API key
│-- requirements.txt       # Required dependencies

🛠 Installation & Setup

1️⃣ Clone the Repository

git clone https://github.com/AymenGabsi/rag_deepseek_groq_doc_qa.git
cd rag_deepseek_groq_doc_qa

2️⃣ Set Up a Virtual Environment

python -m venv venv
source venv/bin/activate   # For macOS/Linux
venv\Scripts\activate     # For Windows

3️⃣ Install Dependencies

pip install --upgrade pip setuptools wheel
pip install -r requirements.txt

4️⃣ Configure API Keys

  • Open config.json and replace your_groq_api_key with your actual Groq API Key:
{
  "GROQ_API_KEY": "your_groq_api_key"
}
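
As a minimal sketch of how the key in config.json could be read at startup (the helper name `load_groq_api_key` is hypothetical and is not taken from rag_utility.py), the snippet below loads the file, rejects the untouched placeholder, and exports the key as the `GROQ_API_KEY` environment variable, which Groq client libraries typically read:

```python
import json
import os
from pathlib import Path


def load_groq_api_key(config_path="config.json"):
    """Read GROQ_API_KEY from a JSON config file and export it to the environment.

    Hypothetical helper for illustration; the project's own loader may differ.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    api_key = config.get("GROQ_API_KEY", "")
    # Reject the untouched placeholder so a missing key fails early and clearly.
    if not api_key or api_key == "your_groq_api_key":
        raise ValueError(f"Set a real GROQ_API_KEY in {config_path}")
    # Make the key visible to client libraries that read this variable.
    os.environ["GROQ_API_KEY"] = api_key
    return api_key
```

Since config.json holds a secret, it is usually worth keeping the real file out of version control.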

▶️ Running the Application

After setting up, launch the Streamlit app:

streamlit run main.py

This will open a web interface where you can:

  1. Upload a PDF
  2. Ask questions about its content
  3. Receive AI-generated answers powered by DeepSeek-R1

🏗 How It Works

  1. Upload a PDF file.
  2. Extract text and convert it into vector embeddings using HuggingFace.
  3. Store embeddings in ChromaDB for efficient retrieval.
  4. Retrieve relevant sections from ChromaDB when a user asks a question.
  5. Send the retrieved sections, together with the question, to DeepSeek-R1 (via Groq) to generate the final answer.
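
The steps above can be sketched with the standard library alone. This is a toy illustration, not the project's implementation: a bag-of-words vector stands in for HuggingFace embeddings, a plain Python list stands in for ChromaDB, and the final Groq call is replaced by building the prompt that would be sent. All function names and the chunk sizes are assumptions made for the example.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping word windows (sizes are illustrative)."""
    words = text.split()
    step = chunk_size - overlap
    starts = range(0, max(len(words) - overlap, 1), step)
    chunks = [" ".join(words[i:i + chunk_size]) for i in starts]
    return [c for c in chunks if c]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, store, k=2):
    """Return the k stored chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to the LLM in step 5."""
    context = "\n\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Steps 1-3: take extracted text, chunk it, embed it, keep (chunk, vector) pairs.
document = (
    "ChromaDB stores document embeddings for retrieval. "
    "Streamlit renders the upload form and the chat box. "
    "Groq serves the DeepSeek-R1 model that writes the final answer."
)
store = [(c, embed(c)) for c in chunk_text(document, chunk_size=8, overlap=2)]

# Steps 4-5: retrieve the closest chunk and build the prompt for the model.
question = "Where are the embeddings stored?"
prompt = build_prompt(question, retrieve(question, store, k=1))
print(prompt)
```

In the real application the `embed` and `store` roles are played by the embedding model and the vector database, which compare meaning rather than exact word overlap.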

📌 Technologies Used


🤖 Future Enhancements

  • 🔄 Support more document formats (TXT, DOCX, HTML, etc.)
  • 📊 Improve retrieval accuracy using hybrid search (BM25 + embeddings).
  • 🎨 Enhance UI with real-time answer explanations.
  • 🚀 Optimize response speed by caching retrieval results.
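
To illustrate the hybrid search idea, here is a rough standard-library sketch, not part of the current code: Okapi BM25 scores the documents by keyword match, and reciprocal rank fusion merges that ranking with a second one. The fusion method is one possible choice among several, and the `dense_order` list is a hypothetical placeholder for the ranking that embedding similarity would produce.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (len(docs) - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def ranking(scores):
    """Document indices ordered from best to worst score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def reciprocal_rank_fusion(rankings, k=60):
    """Merge rankings; each one adds 1 / (k + rank) to a document's fused score."""
    fused = Counter()
    for order in rankings:
        for rank, doc_id in enumerate(order, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=lambda doc_id: fused[doc_id], reverse=True)


docs = [
    "groq serves the deepseek model",
    "chromadb stores embeddings",
    "streamlit renders the ui",
]
keyword_order = ranking(bm25_scores("chromadb embeddings", docs))
dense_order = [1, 0, 2]  # hypothetical ranking from embedding similarity
hybrid_order = reciprocal_rank_fusion([keyword_order, dense_order])
```

Rank fusion avoids having to put BM25 scores and cosine similarities on a common scale, which is why it is a common starting point for hybrid retrieval.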

📜 License

This project is licensed under the MIT License.


⭐ Contributing

Contributions are welcome! Feel free to fork this repository, submit issues, or create pull requests.

📧 For inquiries, reach out via GitHub Issues.

🚀 Happy Coding!

About

📄⚡ A RAG (Retrieval-Augmented Generation) Document QA system using DeepSeek and Groq for fast and efficient querying over documents. Inspired by this tutorial, this project demonstrates how to set up a high-performance AI-powered question-answering system.
