michaeltsige/Devdocs-Assistant

DevDocs Assistant

A lightweight retrieval-augmented documentation assistant built with FastAPI, LangChain, ChromaDB, and Ollama.
It allows developers to query their own documentation or project notes using local language models and embeddings.

Overview

DevDocs Assistant provides an API that retrieves relevant document context and generates concise answers using an Ollama model.
It is designed to run locally and fully offline once the models are pulled and your vector database is built. It works with any installed Ollama chat model, paired with an Ollama embedding model.

Features

  • Retrieval-Augmented Generation (RAG): Combines document retrieval with language model reasoning.
  • Local Embeddings: Uses Ollama’s nomic-embed-text model for vector creation.
  • Modular Model Selection: Defaults to TinyLlama but supports any installed Ollama model.
  • FastAPI Backend: Clean and easy API for integration with frontends or other apps.
  • Persistent Vector Store: Uses ChromaDB to store and retrieve embedded document chunks.

Installation and Setup

  1. Clone the repository

    git clone https://github.com/michaeltsige/Devdocs-Assistant.git
    cd Devdocs-Assistant
    
  2. Create and activate a virtual environment

    python3 -m venv .venv
    source .venv/bin/activate
    
  3. Install dependencies

    pip install -r requirements.txt
    
  4. Ensure Ollama is running

    Install Ollama from ollama.com and pull the models you need (tinyllama is used here as the chat model):

    ollama pull tinyllama
    ollama pull nomic-embed-text
    
  5. Build or update the vector store

    Before starting the API, run build_index.py to preprocess and embed your docs:

    python build_index.py

    This creates the vectorstore/ directory used for retrieval.

  6. Run the API server

    python -m uvicorn app:app --reload --host 0.0.0.0 --port 8000
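
Step 5 above embeds your documentation as chunks. The project's build_index.py is not reproduced here; the following is only a minimal standard-library sketch of overlapping fixed-size chunking. The name chunk_text and the chunk_size and overlap values are hypothetical, chosen for illustration, and are not taken from the repository.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap by `overlap` characters.

    Hypothetical helper: the sizes are illustrative defaults, not the project's settings.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each iteration
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
        start += step
    return chunks
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some text twice.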

Usage

Send a POST request to /ask with your question as a JSON body:

curl -X POST "http://localhost:8000/ask" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is FastAPI?"}'

Response:

{ "answer": "FastAPI is a high-performance web framework for building APIs with Python." }

You can verify the app is running by visiting the base URL:

GET http://localhost:8000/

Returns:

{ "message": "DevDocs Assistant is running successfully!" }
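
The same /ask call can be made from Python. This is a minimal sketch using only the standard library; it assumes the server is reachable at http://localhost:8000 and that the response contains an "answer" field, as in the example above. The function names build_ask_request and ask are hypothetical helpers, not part of the project.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: server started as in step 6


def build_ask_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for /ask without sending it."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/ask",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = BASE_URL, timeout: float = 60.0) -> str:
    """Send the question and return the "answer" field of the JSON response."""
    request = build_ask_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["answer"]
```

With the server running, ask("What is FastAPI?") should return the answer text as a string. The generous timeout is a guess; local models can take a while to respond on modest hardware.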

Architecture

  • Embeddings: Generated using OllamaEmbeddings(model="nomic-embed-text").
  • Vector Database: Managed by ChromaDB, persisting embeddings to vectorstore/.
  • Retriever: Fetches top-k relevant document chunks (k=5 by default).
  • Prompt Template: Combines the question and the retrieved context into a single prompt.
  • Language Model: Uses ChatOllama with TinyLlama (or any chosen Ollama model).
  • API: Built with FastAPI for simplicity and scalability.
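
The retrieve-then-prompt flow described above can be illustrated without any dependencies. This is a toy sketch, not the project's code: the real pipeline uses OllamaEmbeddings and ChromaDB, whereas here the vectors are hand-written and ranking uses cosine similarity purely for illustration (the metric Chroma applies depends on how the collection is configured). The function names and the prompt wording are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], indexed_chunks: list[tuple[str, list[float]]], k: int = 5) -> list[str]:
    """Return the texts of the k chunks most similar to the query vector."""
    ranked = sorted(
        indexed_chunks,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine retrieved context and the question into one prompt (hypothetical wording)."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Toy index: (chunk text, hand-written 2-D "embedding").
index = [
    ("FastAPI is a web framework.", [1.0, 0.0]),
    ("Chroma stores vectors.", [0.0, 1.0]),
    ("Uvicorn serves ASGI apps.", [0.7, 0.7]),
]
retrieved = top_k([1.0, 0.1], index, k=2)
prompt = build_prompt("What is FastAPI?", retrieved)
```

In this toy index the query vector points mostly along the first axis, so the FastAPI chunk ranks first and the Uvicorn chunk second; the resulting prompt is what would be handed to the chat model.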

Configuration

You can modify these parameters in app.py or your build script:

  • model="tinyllama" → change to any Ollama chat model (e.g., llama3, mistral, phi3)
  • collection_name="devdocs" → rename for separate projects
  • search_kwargs={"k": 5} → adjust the number of retrieved documents
  • temperature=0.3 → lower values give more consistent answers; higher values give more varied ones
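
If you would rather not edit source files for each project, one option is to gather these values in a single settings object. The sketch below is an assumption about how that could look, not something the repository provides: the Settings class, load_settings, and the DEVDOCS_* environment variable names are all hypothetical. Only the default values come from the list above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Defaults mirror the parameters listed above."""
    model: str = "tinyllama"
    collection_name: str = "devdocs"
    k: int = 5
    temperature: float = 0.3


def load_settings(env=None) -> Settings:
    """Read overrides from a mapping (defaults to os.environ).

    The DEVDOCS_* variable names are hypothetical.
    """
    env = os.environ if env is None else env
    defaults = Settings()
    return Settings(
        model=env.get("DEVDOCS_MODEL", defaults.model),
        collection_name=env.get("DEVDOCS_COLLECTION", defaults.collection_name),
        k=int(env.get("DEVDOCS_TOP_K", defaults.k)),
        temperature=float(env.get("DEVDOCS_TEMPERATURE", defaults.temperature)),
    )
```

The resulting fields would then be passed wherever app.py and the build script currently hard-code model, collection_name, search_kwargs, and temperature.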

Deployment

  • Local: Run with uvicorn as shown above.
  • Cloud or containerized: Mount the vectorstore/ directory and ensure an Ollama server is running and reachable from the app.
  • Frontend integration: The /ask endpoint can easily power a Streamlit, React, or mobile frontend.

Contributing

Contributions are welcome!
You can help by improving retrieval logic, adding caching, supporting more model configurations, or refining prompt templates.

About

Intelligent documentation assistant using RAG with Ollama, ChromaDB, and FastAPI
