Goodfriend Whyte edited this page May 23, 2026 · 2 revisions

Welcome to the provenance-energy-rag-chatbot wiki!

Trustworthy Domain-Specific RAG Chatbot With Provenance

Welcome to the project wiki for the Trustworthy Domain-Specific RAG Chatbot With Provenance.

This project is a document-grounded Retrieval-Augmented Generation (RAG) chatbot for technical support in the solar and energy equipment domain. Users upload technical manuals and ask questions; the chatbot retrieves relevant passages, generates an answer grounded in them, and shows citations and source cards so that every response can be checked against its source.

Project Goal

The goal is to build a reliable AI assistant that helps engineers and technical users quickly find information in uploaded documents such as:

  • inverter manuals
  • solar/PV module manuals
  • battery manuals
  • charge controller manuals
  • fan manuals
  • troubleshooting guides
  • maintenance documents
  • fault-code tables

The assistant should not guess when the answer is not available in the uploaded documents. Instead, it should clearly say that there is not enough information to answer reliably.
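The refusal rule above could be enforced with a simple evidence gate placed in front of the LLM call. The following is a minimal sketch, not the project's actual implementation: `RetrievedChunk`, `gate_evidence`, the `min_score` threshold of 0.35, and the assumption that scores lie in [0, 1] with higher meaning more relevant are all hypothetical. If the vector store reports distances rather than similarities, the comparison would need to be inverted.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

REFUSAL_MESSAGE = (
    "There is not enough information in the uploaded documents "
    "to answer reliably."
)


@dataclass
class RetrievedChunk:
    """Hypothetical retrieval result; field names are assumptions."""
    text: str
    score: float  # assumed: higher means more relevant, range 0..1


def gate_evidence(
    chunks: List[RetrievedChunk], min_score: float = 0.35
) -> Tuple[List[RetrievedChunk], Optional[str]]:
    """Keep only chunks at or above min_score.

    Returns (evidence, None) when at least one chunk passes, otherwise
    ([], REFUSAL_MESSAGE) so the caller can skip the LLM call entirely.
    """
    evidence = [c for c in chunks if c.score >= min_score]
    if not evidence:
        return [], REFUSAL_MESSAGE
    return evidence, None
```

With this shape, the answer generator is only invoked when `gate_evidence` returns a non-empty evidence list; otherwise the refusal text is returned to the user unchanged. The threshold would need tuning against real manuals.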

What The App Can Do

  • Upload PDF, DOCX, TXT, and Markdown documents
  • Extract and clean document text
  • Split documents into searchable chunks
  • Store document chunks in a persistent vector database
  • Retrieve relevant passages for a user question
  • Generate answers grounded in retrieved evidence
  • Show citations and source cards
  • Display filename, page number, section, chunk ID, and relevance score
  • Refuse to answer in document mode when the retrieved evidence does not support an answer
  • Search exact fault codes without calling the LLM
  • Cache repeated responses to reduce unnecessary API calls
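The last two capabilities, exact fault-code search and response caching, can be illustrated without any LLM or vector store. The sketch below is an assumption about how they might work, not the project's code: the fault-code pattern (one or two letters followed by two to four digits, such as `E05`), the `Chunk` fields, and the function names `build_fault_index`, `lookup_fault_code`, and `cached_answer` are all hypothetical.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumed code shape: 1-2 letters then 2-4 digits, e.g. "E05" or "F102".
FAULT_CODE = re.compile(r"\b[A-Z]{1,2}\d{2,4}\b")


@dataclass(frozen=True)
class Chunk:
    """Hypothetical stored chunk with the provenance fields shown on source cards."""
    chunk_id: str
    filename: str
    page: int
    text: str


def build_fault_index(chunks: List[Chunk]) -> Dict[str, List[Chunk]]:
    """Map each fault code to the chunks that literally mention it."""
    index: Dict[str, List[Chunk]] = defaultdict(list)
    for chunk in chunks:
        for code in set(FAULT_CODE.findall(chunk.text.upper())):
            index[code].append(chunk)
    return index


def lookup_fault_code(question: str, index: Dict[str, List[Chunk]]) -> List[Chunk]:
    """Return chunks mentioning a fault code found in the question, else []."""
    hits: List[Chunk] = []
    for code in FAULT_CODE.findall(question.upper()):
        for chunk in index.get(code, []):
            if chunk not in hits:
                hits.append(chunk)
    return hits


_response_cache: Dict[str, str] = {}


def cached_answer(question: str, generate: Callable[[str], str]) -> str:
    """Call generate() once per question, ignoring case and extra whitespace."""
    key = " ".join(question.lower().split())
    if key not in _response_cache:
        _response_cache[key] = generate(question)
    return _response_cache[key]
```

A question such as "What does E05 mean?" would be answered directly from the matching chunks, each of which already carries the filename and page needed for a source card. An in-memory dictionary is the simplest possible cache; a real deployment would likely also key on the set of indexed documents so that answers are invalidated when manuals change.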

Technology Stack

Backend

  • Python
  • FastAPI
  • uv
  • Pydantic
  • ChromaDB
  • SentenceTransformers
  • OpenAI-compatible LLM provider support
  • pytest
  • ruff

Frontend

  • Streamlit
  • httpx
  • uv

Main User Flow

Upload manual
     ↓
Extract text
     ↓
Clean and chunk text
     ↓
Create embeddings
     ↓
Store chunks in ChromaDB
     ↓
Ask technical question
     ↓
Retrieve relevant chunks
     ↓
Generate grounded answer
     ↓
Show citations and source cards
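The "Clean and chunk text" step in the flow above is where provenance is attached, since each chunk must remember which file and page it came from. The following is a rough sketch under stated assumptions: word-based windows with overlap are only one possible chunking strategy, and `clean_text`, `chunk_pages`, the `Chunk` fields, the chunk ID format, and the default sizes are all hypothetical rather than taken from the project.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    """Hypothetical chunk record carrying the metadata a source card needs."""
    chunk_id: str
    filename: str
    page: int
    text: str


def clean_text(raw: str) -> str:
    """Collapse runs of whitespace (including newlines) into single spaces."""
    return re.sub(r"\s+", " ", raw).strip()


def chunk_pages(
    filename: str, pages: List[str], max_words: int = 120, overlap: int = 20
) -> List[Chunk]:
    """Split per-page text into overlapping word windows.

    pages[0] is treated as page 1. Chunks never span pages, so every
    chunk maps to exactly one page number.
    """
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    step = max_words - overlap
    chunks: List[Chunk] = []
    for page_no, raw in enumerate(pages, start=1):
        words = clean_text(raw).split()
        for start in range(0, len(words), step):
            window = words[start:start + max_words]
            chunk_id = f"{filename}:p{page_no}:c{len(chunks)}"
            chunks.append(Chunk(chunk_id, filename, page_no, " ".join(window)))
            if start + max_words >= len(words):
                break  # the window already reached the end of the page
    return chunks
```

Each resulting `Chunk` could then be embedded and stored, with `chunk_id`, `filename`, and `page` saved as metadata so that retrieved passages can be rendered as source cards. Keeping chunks within page boundaries trades some context for unambiguous page citations.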