Getting Started

Python FastAPI MongoDB Voyage AI sentence-transformers

Everything you need to run your first RAG parameter sweep experiment.

Shortest path: QUICKSTART.md — install and first sweep. This guide adds step-by-step detail.

Documentation map: docs/README.md


✅ Prerequisites

| Requirement | Version | Notes |
|---|---|---|
| Python | 3.12+ | Install via python.org or pyenv install 3.12.2 |
| Node.js | 22+ | Install via nodejs.org or nvm install 22 |
| MongoDB Atlas | Free tier (M0) | Required; see Cloud Account Setup |
| Voyage AI | Optional | Only for Voyage models; see Cloud Account Setup |
| Docker Desktop + HF_TOKEN | Optional | Self-hosted SIE only (a remote gateway needs no Docker); see SIE Provider Setup |

New to Atlas or Voyage? Start with Cloud Account Setup — account creation, connection string, search indexes, API key, and Tier 1 billing (~15 min).

Using SIE (open-source BGE-M3 embeddings)? See SIE Provider Setup: set SIE_ENABLED=true to switch the provider on, then set SIE_ENDPOINT (plus SIE_API_KEY if the gateway requires one) for a remote gateway, or run the optional local Docker setup.


📦 Install

git clone https://github.com/neomatrix369/rag-params-finder.git
cd rag-params-finder

# Python environment
uv venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
uv pip install -e .

# Frontend
cd frontend && npm install && cd ..

⚙️ Configure

1. Set environment variables

cp .env.example .env

Edit .env — minimum for sweeps:

# Required (all sweeps)
MONGODB_URI=mongodb+srv://<user>:<pass>@<cluster>.mongodb.net/rag_params_finder?retryWrites=true&w=majority

# Required for Voyage sweep only — see cloud-setup.md checklist
VOYAGE_API_KEY=vo-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# Uncomment Tier 1 limits in .env.example (comment out free-tier defaults first)
VOYAGE_RPM_LIMIT=2000
VOYAGE_TPM_LIMIT=16000000

SERVER_URL=http://localhost:8001

Full variable reference: Troubleshooting → Environment Variables. Optional Atlas Admin API keys enable cluster tier + storage quota in the dashboard — see .env.example.
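Before starting the server, it can help to confirm that the variables for your sweep path are actually set. The sketch below is a hypothetical helper, not part of the project's CLI: the name missing_env_vars and the per-sweep grouping are assumptions drawn from the comments in the example above. It reads the process environment, so export the values from .env first.

```python
import os

# Variables each sweep path needs, taken from the .env example above.
# The grouping is an assumption made for illustration.
REQUIRED = {
    "local": ["MONGODB_URI"],
    "voyage": ["MONGODB_URI", "VOYAGE_API_KEY"],
}


def missing_env_vars(sweep, env=None):
    """Return the required variables that are unset or blank for a sweep path."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED[sweep] if not env.get(name, "").strip()]


if __name__ == "__main__":
    for sweep in REQUIRED:
        missing = missing_env_vars(sweep)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{sweep}: {status}")
```

An empty list means the sweep path has everything this check knows about; it does not validate the connection string or the API key.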

2. Search indexes (required before sweep)

The example configs use dense, sparse, and hybrid retrieval, so each needs two indexes on the chunks collection: a vector index, vector_index_384 (local) or vector_index_1024 (Voyage or SIE), and text_search_index.

M0 free tier: do this manually in Atlas UI before running a sweep — see Cloud Account Setup → step 6. M0 allows 3 search indexes cluster-wide; unknown indexes from other projects consume quota.

M10+ paid tier: server creates indexes on startup — check uvicorn logs.

Verify and fix quota issues (any tier):

rag-params-finder indexes list              # known vs unknown; count vs M0 limit
rag-params-finder indexes reset             # drop unknown indexes + ensure required
rag-params-finder indexes reset --all       # drop all chunks indexes + recreate

The server preflights search indexes when you submit a sweep: it derives required index names from your YAML (embedding dimensions + sparse/hybrid retrieval), checks cluster capacity, and rejects the experiment with HTTP 422 if indexes are missing or quota is exhausted — before any embedding work starts.
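The preflight logic described above can be sketched roughly as follows. This is an illustration only, not the server's code: the function names, the retrieval mode labels, and the return shape are assumptions, and the real check also considers cluster tier.

```python
M0_SEARCH_INDEX_LIMIT = 3  # M0 allows 3 search indexes cluster-wide


def required_indexes(dimensions, retrieval_modes):
    """Derive index names from embedding dimensions and retrieval modes."""
    names = {f"vector_index_{dim}" for dim in dimensions}
    # Sparse and hybrid retrieval both rely on the full-text index.
    if {"sparse", "hybrid"} & set(retrieval_modes):
        names.add("text_search_index")
    return sorted(names)


def preflight(required, existing, limit=M0_SEARCH_INDEX_LIMIT):
    """Return (ok, reason). `existing` lists every search index on the cluster."""
    missing = [name for name in required if name not in existing]
    if not missing:
        return True, "all required indexes exist"
    if len(existing) + len(missing) > limit:
        return False, f"quota exhausted: no room to create {missing}"
    return False, f"missing indexes: {missing}"


if __name__ == "__main__":
    needed = required_indexes([384], ["dense", "sparse", "hybrid"])
    print(needed)
    print(preflight(needed, ["text_search_index", "leftover_index"]))
```

A failed check here corresponds to the HTTP 422 rejection: the sweep stops before any embedding work starts.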


📄 Add Your Documents

Place source documents in input_data/ (gitignored):

mkdir -p input_data/pdfs
cp /path/to/my-document.pdf input_data/pdfs/

Supported formats: .pdf, .txt, .md, .csv

Reference files or directories in your config YAML:

data_paths:
  - ./input_data/pdfs/my-document.pdf   # individual file
  - ./input_data/papers/                # directory — scanned recursively
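
To preview which files a data_paths list would pick up, a small script along these lines can help. It is a sketch under assumptions: resolve_data_paths is a hypothetical name, and matching extensions case-insensitively and silently skipping missing paths are choices made here, not documented behaviour of the tool.

```python
from pathlib import Path

SUPPORTED = {".pdf", ".txt", ".md", ".csv"}  # formats listed above


def resolve_data_paths(data_paths):
    """Expand files and directories into a sorted list of supported files."""
    found = set()
    for entry in data_paths:
        path = Path(entry)
        # Directories are scanned recursively; files are checked directly.
        candidates = path.rglob("*") if path.is_dir() else [path]
        for candidate in candidates:
            if candidate.is_file() and candidate.suffix.lower() in SUPPORTED:
                found.add(candidate)
    return sorted(found)


if __name__ == "__main__":
    for file in resolve_data_paths(["./input_data/pdfs", "./input_data/papers"]):
        print(file)
```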

🚀 Start the Server and Dashboard

Option A — Manual (two terminals)

# Terminal 1: FastAPI server
uvicorn server.main:app --reload --port 8001

# Terminal 2: React dashboard (optional)
cd frontend && npm run dev

Option B — Docker (one command)

Requires Docker Desktop. The CLI still runs on the host, so run uv pip install -e . there as well.

./start-services.sh
  • Server: http://localhost:8001 (OpenAPI docs at /docs)
  • Dashboard: http://localhost:5374
  • Dev hot reload: docker compose -f docker-compose.yml -f docker-compose.dev.yml up --build

See Troubleshooting → Docker if startup fails.
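
With either option, a quick reachability check confirms that both services answer before you submit a sweep. The sketch below uses only the standard library and the two URLs given above; is_up and wait_until_up are hypothetical helper names, and a 2xx or 3xx response is assumed to mean "running".

```python
import time
import urllib.error
import urllib.request


def is_up(url, timeout=2.0):
    """Return True if the URL answers with an HTTP 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False


def wait_until_up(url, attempts=10, delay=1.0):
    """Poll the URL until it responds or the attempts run out."""
    for _ in range(attempts):
        if is_up(url):
            return True
        time.sleep(delay)
    return False
```

For example, wait_until_up("http://localhost:8001/docs") checks the server and wait_until_up("http://localhost:5374") checks the dashboard.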


▶️ Run Your First Experiment

First, complete the checklist for your sweep path in Cloud Account Setup → Before you run a sweep.

# Local sweep — checklist items 1–5 (no Voyage)
rag-params-finder run --config configs/example-mongodb-local.yaml

# Voyage sweep — checklist items 1–9
rag-params-finder run --config configs/example-mongodb-voyage.yaml

# SIE sweep — SIE_ENABLED=true + SIE_ENDPOINT (+ SIE_API_KEY if remote); see sie-setup.md
rag-params-finder run --config configs/example-mongodb-sie.yaml

# Submit and detach (check dashboard for status instead)
rag-params-finder run --config configs/example-mongodb-local.yaml --detach

The CLI will:

  • Submit the config to the server (experiment name gets a timestamp suffix automatically)
  • Display the experiment ID and generated run IDs
  • Poll run progress live unless --detach is used

Open http://localhost:5374 to watch live progress and explore results.

Long sweeps: pause and resume without losing completed runs:

rag-params-finder pause <experiment-id>    # stop after current phase
rag-params-finder resume <experiment-id>   # continue remaining combos

Or use the Pause / Resume buttons on the experiment detail screen in the dashboard.
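
Conceptually, resuming means running only the parameter combinations that have no completed run yet. The sketch below illustrates that idea with a plain grid; it is not the project's scheduler, and the parameter names chunk_size and top_k are placeholders chosen for the example.

```python
from itertools import product


def sweep_combos(grid):
    """Expand a parameter grid into one dict per combination, in a stable order."""
    keys = sorted(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


def remaining_combos(grid, completed):
    """Return the combinations that do not appear in `completed`."""
    done = {tuple(sorted(combo.items())) for combo in completed}
    return [c for c in sweep_combos(grid) if tuple(sorted(c.items())) not in done]


if __name__ == "__main__":
    grid = {"chunk_size": [256, 512], "top_k": [3, 5]}  # placeholder parameters
    finished = sweep_combos(grid)[:3]  # pretend three runs completed before the pause
    print(remaining_combos(grid, finished))
```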


🤖 Pre-downloading Local Models (Optional)

When using provider: local, sentence-transformers downloads models from Hugging Face on first use (each of the two models below has roughly 23M parameters, about 90 MB on disk). To avoid a startup delay on your first run, download them ahead of time:

python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('all-MiniLM-L6-v2')"
python -c "from sentence_transformers import CrossEncoder; CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')"

Models are cached in ~/.cache/huggingface/hub/ after the first download.
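
To check whether a model is already cached, you can look for its folder in the hub cache. The sketch below assumes the standard Hugging Face hub layout (a models--org--name folder per repository) and the HF_HUB_CACHE and HF_HOME overrides; the helper names are hypothetical. It also assumes the short name all-MiniLM-L6-v2 resolves to the sentence-transformers organisation.

```python
import os
from pathlib import Path


def hub_cache_dir():
    """Locate the hub cache, honouring HF_HUB_CACHE and HF_HOME if set."""
    if os.environ.get("HF_HUB_CACHE"):
        return Path(os.environ["HF_HUB_CACHE"])
    hf_home = os.environ.get("HF_HOME", "~/.cache/huggingface")
    return Path(hf_home).expanduser() / "hub"


def model_cache_path(repo_id, cache_dir=None):
    """Map a repo id such as 'org/name' to its 'models--org--name' cache folder."""
    cache_dir = hub_cache_dir() if cache_dir is None else Path(cache_dir)
    return cache_dir / ("models--" + repo_id.replace("/", "--"))


if __name__ == "__main__":
    for repo_id in ("sentence-transformers/all-MiniLM-L6-v2",
                    "cross-encoder/ms-marco-MiniLM-L-6-v2"):
        path = model_cache_path(repo_id)
        print(repo_id, "cached" if path.is_dir() else "not cached", path)
```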


👉 Next Steps