An open-source document intelligence API. Upload any PDF and ask questions: DocuMind finds the most relevant passages, reranks them for accuracy, and returns a cited answer with source page numbers.
- Law firm uploads contracts → asks "what are the termination clauses?"
- Student uploads textbook → asks "explain Newton's third law"
- Company uploads HR policy → employees ask questions about leave rules
PDF → pdfplumber (load + chunk) → HuggingFace embeddings → Qdrant vector store → Cohere reranking → Groq LLaMA 3.3 → answer + citations
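The retrieve-then-rerank flow in this pipeline can be illustrated with a minimal, dependency-free sketch. This is not DocuMind's code: the bag-of-words `embed` function stands in for the HuggingFace embedding model, the in-memory sort stands in for Qdrant, and the term-coverage `rerank` stands in for Cohere Rerank. The `Chunk` class and all function names are hypothetical. The point is only to show the two stages and how page numbers travel with each chunk so the final answer can cite them.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    page: int  # source page number, kept so the answer can cite it


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[Chunk], k: int) -> list[Chunk]:
    """Stage 1: vector search, returning the k chunks closest to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c.text)), reverse=True)[:k]


def rerank(query: str, candidates: list[Chunk], top_n: int) -> list[Chunk]:
    """Stage 2: toy reranker scoring by the fraction of query terms covered."""
    terms = set(embed(query))

    def coverage(c: Chunk) -> float:
        return len(terms & set(embed(c.text))) / len(terms) if terms else 0.0

    return sorted(candidates, key=coverage, reverse=True)[:top_n]


chunks = [
    Chunk("Either party may terminate this agreement with 30 days written notice.", page=4),
    Chunk("Payment is due within 15 days of the invoice date.", page=2),
    Chunk("Termination for breach takes effect immediately.", page=5),
]
question = "how can a party terminate the agreement"

# Retrieve a wider candidate set first, then keep only the best after reranking.
top = rerank(question, retrieve(question, chunks, k=2), top_n=1)

# The page-tagged context is what would be handed to the LLM for a cited answer.
context = "\n".join(f"[page {c.page}] {c.text}" for c in top)
print(context)
```

In the real pipeline the first stage would return more candidates than the LLM needs, and the reranker would trim that set down; the toy scorers here only mimic that shape.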
- Python, FastAPI
- LangChain — LLM application framework
- Qdrant — vector database (local, no Docker needed)
- HuggingFace — free local embeddings (all-MiniLM-L6-v2)
- Cohere Rerank — improves retrieval accuracy
- Groq API — free, fast LLM inference (LLaMA 3.3 70B)
- Qdrant over ChromaDB => production-grade, better performance
- Cohere reranking => separates basic RAG from accurate RAG
- HuggingFace embeddings => free, no API cost, runs locally
- `GET /health`: health check
- `POST /upload`: upload a PDF and process it
- `POST /query`: ask a question, get answer + sources
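As a rough illustration of calling the query endpoint from a script, the sketch below builds a `POST /query` request with the standard library. It assumes the server runs at the default local address and that the endpoint accepts a JSON body with a `question` field. That field name and the helper names are guesses, not taken from the project, so check the interactive docs at `/docs` for the actual request schema.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # local address used by uvicorn by default


def build_query_request(question: str) -> urllib.request.Request:
    """Build (but do not send) a POST /query request."""
    # Assumption: the endpoint takes JSON with a "question" field.
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str) -> dict:
    """Send the question and decode the JSON reply (server must be running)."""
    with urllib.request.urlopen(build_query_request(question), timeout=60) as resp:
        return json.load(resp)


req = build_query_request("What are the termination clauses?")
print(req.get_method(), req.full_url)
```

Calling `ask(...)` only works once a PDF has been uploaded and the server is up; building the request separately makes it easy to inspect what would be sent.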
- Clone the repo
- Create virtual environment: `python -m venv venv`
- Activate: `venv\Scripts\activate` (Windows) or `source venv/bin/activate` (macOS/Linux)
- Install: `pip install -r requirements.txt`
- Copy `.env.example` to `.env` and add your API keys
- Run: `uvicorn api:app`
- Open `http://127.0.0.1:8000/docs` to test
- GROQ_API_KEY — console.groq.com
- GOOGLE_API_KEY — aistudio.google.com
- COHERE_API_KEY — dashboard.cohere.com
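A quick way to catch a missing key before starting the server is a small check like the one below. This is a sketch, not part of the project: `missing_keys` is a hypothetical helper, and it inspects the process environment only, so it assumes the values from `.env` have already been loaded or exported.

```python
import os

REQUIRED_KEYS = ("GROQ_API_KEY", "GOOGLE_API_KEY", "COHERE_API_KEY")


def missing_keys(env=None) -> list[str]:
    """Return the names of required keys that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


# Report anything missing without stopping the program.
for name in missing_keys():
    print(f"Missing {name}: add it to .env")
```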
- Scanned/image PDFs not supported (no OCR)
- Table extraction is basic (complex tables may lose structure)
- No hybrid search (vector + keyword) — planned improvement
- Single user only - concurrent access requires Qdrant server mode