All notable changes to rag-params-finder will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
For contributors: Update this file during development, not at release time. Add entries under `## [Unreleased]` as you work. When creating a release, `./scripts/release.sh` will prompt you to move items to the new version section.
- `server/db/mongodb_uri.py`: cloud vs local MongoDB URI detection (`is_atlas_uri`, `parse_atlas_cluster_name`), shared by index bootstrap and Atlas storage quota lookup
- Slice 21 (SIE Skateboard): SIE (Superlinked Inference Engine) as a third embedding provider; `embedder_factory.py` single dispatch point for voyage/local/sie; `sie_embedder.py` (BGE-M3, Stella-v5, SPLADE-v3 via remote gateway `SIE_ENDPOINT` + `SIE_API_KEY` or optional self-hosted Docker on `:8720`); `sie_guard.py` preflight before SIE sweeps; `aim_logger.py` Aim experiment run logging (no-op on failure); `POST /api/v1/sweep` Tier 1 ranked sweep endpoint (caller supplies `corpus` list; falls back to topic string); `GET /health` extended with `sie`, `version` fields; `configs/example-mongodb-sie.yaml`; 58 tests
- Slice 25 (Atlas Local dev mode): `./start-services.sh --local` / `RAG_LOCAL_ATLAS=1`; `mongodb-atlas-local` Docker container; search indexes auto-provisioned on server boot for local URI (no Atlas UI manual step)
- Documentation navigation: root QUICKSTART.md, docs/README.md persona index; slice tracker at docs/slices/PROGRESS.md
- Slice 14 Docker Compose: `./start-services.sh`, `stop-services.sh`, `setup.sh`; `docker-compose.yml` + `docker-compose.dev.yml`; server/frontend Dockerfiles; `/healthz` MongoDB ping; `scripts/health-check.sh`
- Contributor docs: Optional `code-review-graph` MCP guidance in Development Guide and README Contributing section (not required for end users)
- Agent docs: Graph-first exploration workflow in AGENTS.md and CLAUDE.md
- Slice 20 toolchain hardening: unified `./scripts/quality-gates.sh` mirroring CI; `check_integrity.py`, `pip-audit.sh`
- CI: scoped 80% coverage gate, ESLint, bandit SAST, pip-audit, gitleaks secrets scan job
- Repo lint: shellcheck (`scripts/*.sh`), actionlint (GitHub Actions), markdownlint (`.markdownlint.json`) in pre-commit, `scripts/repo-lint.sh`, and CI `repo-lint` job
- Pre-push hook: fast gates on every `git push` (`scripts/pre-push-gates.sh` → `quality-gates.sh --quick`: pytest, frontend verify, gitleaks; via `install-git-hooks.sh`)
- Repo hygiene: `.gitleaks.toml`, `.nvmrc`, `.editorconfig`, `.gitattributes`, Dependabot
- Frontend: ESLint + `eslint-plugin-security` wired in CI and pre-commit
- Slice 24 (Port standardisation): dashboard dev server `:5374` (was Vite default `:5173`); SIE gateway `:8720` (was `:8080`); backend `:8001` unchanged
- MongoDB backend unification (Slice 25B): `./start-services.sh --local` and `mongodb start|stop|reset|status` replace `scripts/local-atlas.sh`; `cloud-setup.md` merged into `docs/user-guide/mongodb-setup.md`; local Atlas overlay in `docker-compose.yml` via `RAG_SERVER_MONGODB_URI` env override
- Pre-push hook (2026-05-28): replaced `pre-commit run --all-files` on push with `quality-gates.sh --quick` so push runs pytest and frontend verify, not only lint hooks
- Upgraded urllib3, starlette, idna, langchain-core via uv dependency overrides
- Pre-commit: gitleaks config, frontend lint hook, bandit hook (`uv run bandit … -ll`, aligned with CI)
- `start-services.sh`: pass `--profile local-atlas` before `docker compose up` (was incorrectly appended to `up` args, causing `unknown flag: --profile`)
- Local Atlas MongoDB connection: `mongo_client_kwargs()` applies TLS only for cloud `*.mongodb.net` URIs; fixes SSL handshake failures against `mongodb-local`
- CI: read Node version from repo-root `.nvmrc` (`setup-node` resolves paths from checkout root, not `frontend/` working directory)
- CI: `astral-sh/setup-uv` v4 → v7 (Dependabot)
- CLI: `httpx.Client(timeout=…)` default timeout on constructor (SAST/runtime parity)
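
The TLS fix listed above (TLS only for cloud `*.mongodb.net` URIs) can be illustrated with a minimal sketch. The helper names `looks_like_atlas_uri` and `tls_kwargs_for` are hypothetical stand-ins, not the project's actual `is_atlas_uri` or `mongo_client_kwargs()` implementations, and the sketch assumes the first host in the URI is representative.

```python
from urllib.parse import urlsplit


def looks_like_atlas_uri(uri: str) -> bool:
    """Hypothetical check: treat any *.mongodb.net host as a cloud cluster."""
    host = urlsplit(uri).hostname or ""
    return host.endswith(".mongodb.net")


def tls_kwargs_for(uri: str) -> dict:
    """Return client keyword arguments: TLS for cloud hosts, nothing for local ones."""
    # Assumption: a local container does not terminate TLS, so forcing it
    # there is what would trigger a handshake failure.
    return {"tls": True} if looks_like_atlas_uri(uri) else {}
```

The returned dict is meant to be splatted into a client constructor, for example `MongoClient(uri, **tls_kwargs_for(uri))`.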
0.11.0 - 2026-05-23
- Weighted averaging metric (`query_avg_score`) for query-level fairness: prevents queries with many results from dominating average scores
- Tiebreaker explanation UI when multiple configs achieve same max score: shows why one config was ranked #1
- Configurable tiebreaker via `TIEBREAKER_METRIC` env var (default: `query_avg`, legacy: `chunk_avg`)
- Detailed results tab with chunk size/overlap badges: map individual results back to hyperparameter configs
- Query text display in detailed results — see which query each result answered
- Sweep dimensions collapsible panel with Cartesian product calculation (e.g., "1 model × 5 methods × 3 sizes × 2 overlaps = 30 configs")
- Backend sorting now uses 4-level tiebreaker (max_score DESC, query_avg_score DESC, chunk_size ASC, overlap ASC)
- Config ranking uses weighted query_avg_score by default instead of unweighted chunk average
- Hyperparameters and Detailed Results tabs now have clear explanatory headers
- Confusing UI when multiple configs achieved same max score with no explanation why one was "best"
- Detailed Results tab didn't show chunk size/overlap → couldn't map results back to configs
- Missing query text in Detailed Results → couldn't tell which query each result answered
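
The ranking changes in this release can be sketched as follows. This is an illustration only: it assumes `query_avg_score` is the mean of per-query mean scores (so each query counts once), and the dictionary field names are hypothetical rather than the project's actual schema.

```python
from collections import defaultdict
from statistics import mean


def query_avg_score(results):
    """Average the per-query means so a query with many results counts only once."""
    by_query = defaultdict(list)
    for r in results:
        by_query[r["query"]].append(r["score"])
    return mean(mean(scores) for scores in by_query.values())


def rank_configs(configs):
    """Order by max_score DESC, query_avg_score DESC, chunk_size ASC, overlap ASC."""
    return sorted(
        configs,
        key=lambda c: (-c["max_score"], -c["query_avg_score"], c["chunk_size"], c["overlap"]),
    )
```

With three results for one query scoring 1.0 and a single result for another scoring 0.0, the unweighted chunk average is 0.75, while this query-level average is 0.5.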
0.10.0 - 2026-05-23
- Unified retriever configuration format (`retrievers` list): cleaner sweep expansion
- Auto-migration from old `retrieval.methods` format to new `retrievers` format
- Support for multiple rerankers in sweep (each as separate dimension, not chained)
- Each entry in `retrievers` list is one sweep dimension (one retriever per run, not a pipeline)
- Dense/sparse/hybrid + rerankers now treated uniformly in sweep expansion
- Maintained backward-compatible DB fields (`retrieval_method`, `retrieval_provider`, `retrieval_model`)
0.9.1 - 2026-05-23
- Option A scoped logging across server, CLI, and dashboard (`[rag-params-finder] [Scope] ...` format)
- Elapsed + ETA display on experiment progress card (linear projection from completed runs)
- Timezone-aware UTC timestamps (PyMongo `tz_aware=True`): fixes browser elapsed/duration misparse
- `started_at` timestamp set on first run (excludes queue time from duration and ETA)
- Atlas cluster tier specs in vector DB stats (tier, provider, region via `resolve_tier_specs()`)
- Duration stat shows "—" while running or paused (was incorrectly calculated during execution)
- Single control button location in header (removed duplicate pause/resume/cancel from banners)
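
The "linear projection from completed runs" mentioned above could look roughly like this. The function name and signature are hypothetical; the sketch assumes elapsed time is measured from `started_at`, so queue time is already excluded.

```python
from typing import Optional


def eta_seconds(elapsed_s: float, completed_runs: int, total_runs: int) -> Optional[float]:
    """Linear projection: average time per completed run times the runs remaining."""
    if completed_runs <= 0:
        return None  # nothing has finished yet, so there is no rate to project from
    remaining = max(total_runs - completed_runs, 0)
    return (elapsed_s / completed_runs) * remaining
```

For example, 4 of 10 runs finished in 120 seconds gives 30 seconds per run and a projected 180 seconds for the remaining 6.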
0.9.0 - 2026-05-23
- Search index preflight validation before sweep starts: HTTP 422 on submit if indexes insufficient
- `search_index_plan` module: derives required indexes from config and assesses M0 quota capacity
- `search_index_guard` module: cluster snapshot + `ensure_indexes` retry with mismatch detection
- `indexes list` CLI command: shows known vs unknown indexes cluster-wide
- `indexes reset` CLI command: drop unknown indexes or rebuild chunks indexes (`--all` flag)
- M0 3-index cluster quota guard: prevents sweeps from exceeding free-tier limits
- 17 pytest scenarios for search index planning, capacity assessment, and preflight guards
- Experiments fail fast before embedding if required indexes are missing (saves API calls and time)
0.8.1 - 2026-05-20
- 39 pytest regression tests total
- Embedder dispatch tests (local vs Voyage routing)
- Retriever index selection tests (`get_index_name` by dimension)
- Model registry validation tests (provider/model cross-checks)
- Database stats aggregation tests (cluster grouping, experiment footprint)
- Kimchi adapter tests (CAST payload format, runtime dimensions)
0.8.0 - 2026-05-13 to 2026-05-20
- Kimchi embedding provider (CAST OpenAI-compatible embeddings via `/v1/embeddings`)
- Runtime dimension detection for Kimchi models (no hardcoded dimensions)
- Kimchi model registry entries with `contextualized` flag for routing
- 4-model Kimchi example config demonstrating provider-agnostic sweep
0.7.1 - 2026-05-19
- Dedicated thread pools (`SWEEP_EXECUTOR`, `HEAVY_READ_EXECUTOR`) to prevent API blocking during sweeps
- Decoupled dashboard poll intervals: list 2s, vector DB stats 60s, Search Explorer 15s (constants in `frontend/src/constants.ts`)
- Batched db-stats aggregations to reduce MongoDB query load
- PollingIndicator anti-jitter (`showDelayMs=600`, `minVisibleMs=1000`): sync badge no longer flickers on fast polls
- Experiment list loads within seconds during active sweeps (was blocked by db-stats aggregations)
- Vector DB stats may lag but do not block the list API
0.7.0 - 2026-05-19
- Vector DB stats dashboard with cluster-grouped storage metrics (chunks count, storage MB, cluster quota)
- Atlas storage quota via Admin API (`resolve_tier_specs()` + optional `MONGODB_STORAGE_LIMIT_MB` override)
- Experiment pause/resume control (`_SweepControl` threading events, cooperative sweep halt)
- Boot reconciliation for orphaned runs: marks stale `running` experiments as `partial` or `complete` on server restart
- Voyage model catalog expansion: 12 models total (voyage-4 series, domain models, voyage-3 legacy)
- voyage-context-3 contextualized embedding API: 32K token window with automatic segment splitting
- Collapsible UI panels with `localStorage` persistence (vector DB stats, experiment rows)
- `resume_sweep()` skips already-completed param signatures (resumes from next incomplete combination)
- Experiment status `paused` is non-terminal (dashboard polls continue until resumed or cancelled)
0.6.0 - 2026-05-19
- Experiment deletion with confirmation (CLI + dashboard)
- CLI `delete` command with `--force` flag to skip interactive prompt
- Dashboard ConfirmDeleteModal with experiment details and deletion warning
- Cascade cleanup across all collections (experiments, run_status, chunks, results)
- Deletion statistics display (documents deleted per collection)
- Running experiments cannot be deleted — API returns 400 error (must cancel first)
- Delete button disabled for running experiments with tooltip explanation
0.5.0 - 2026-05-17
- LoadingFeedbackPanel with byte-level progress tracking via ReadableStream
- Pagination to experiments list (10 items/page)
- Pagination to runs table (10 runs/page)
- Pagination to configs table (5/page)
- PollingIndicator for background syncs (subtle "Syncing..." badge)
- DashboardShell + AppPageChrome unified layout (shared header/nav/title/back-button across screens)
- ExperimentProgressCard reusable component (circular progress with default/compact variants)
- Activity feed in LoadingFeedbackPanel (fetch milestones: start → headers → chunks → complete)
- Full progress panel for initial loads; subtle polling badge for background refreshes
- Polling indicator only shows after first load completes (`initialLoadDone` flag per screen)
0.4.1 - 2026-05-05
- Architecture Decision Records (ADRs) in `docs/adr/` (two-process architecture, dual providers, MongoDB Atlas choice)
- Slice specifications in `docs/slices/` (detailed acceptance criteria, verification steps)
- Pre-commit hooks (ruff, mypy) via `.pre-commit-config.yaml`
- Comprehensive badges across README and docs (build status, coverage, license, Python/Node versions)
- Quality gates baseline documentation (lint, type check, test counts, bundle size)
0.4.0 - 2026-05-17
- Fixed chunker (character-window slicing with configurable overlap)
- Token chunker (LangChain `TokenTextSplitter` with cl100k_base encoding)
- Sentence chunker (NLTK `sent_tokenize` with character-budget grouping)
- Semantic chunker (sentence-transformers cosine similarity grouping; chunk_size as hard cap, overlap ignored)
- Sparse retrieval (Atlas BM25 via `$search`)
- Hybrid retrieval (RRF merge with k=60, combines dense + sparse)
- 5 example configs covering all features (all chunkers, all retrievers, local + Voyage)
- Atlas Full Text Search index setup docs in `CLAUDE.local.md`
- `search()` dispatcher conditionally embeds query (only for dense/hybrid, not sparse)
- Semantic chunker always uses `all-MiniLM-L6-v2` (provider-agnostic, keeps chunking independent of embedding config)
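
For readers unfamiliar with the hybrid retrieval entry above, Reciprocal Rank Fusion with k=60 can be sketched as below. This is a generic illustration of RRF, not the project's retriever code; the function name and the use of plain document ID lists are assumptions.

```python
def rrf_merge(dense_ids, sparse_ids, k=60, top_n=None):
    """Fuse two ranked ID lists: each list contributes 1 / (k + rank) per document."""
    scores = {}
    for ranking in (dense_ids, sparse_ids):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    merged = sorted(scores, key=lambda d: (-scores[d], d))
    return merged if top_n is None else merged[:top_n]
```

A document that appears in both lists accumulates two contributions, so it tends to outrank one that is high in a single list only.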
0.3.0 - 2026-05-02
- Local embedding with sentence-transformers (`all-MiniLM-L6-v2`, 384-dim, ~23MB)
- Local reranking with CrossEncoder (`cross-encoder/ms-marco-MiniLM-L-6-v2`, ~23MB)
- Explicit provider routing (`provider` field in YAML config drives end-to-end dispatch)
- Separate vector indexes per dimension (`vector_index_1024` for Voyage, `vector_index_384` for local)
- No API key required for local provider (fully offline embedding + reranking)
- Model registry (`server/core/model_registry.py`): unified catalog for embedding + reranker models
- Pydantic validators cross-check model names match declared provider (fast-fail at config parse time)
- Provider flows through `RunParams` → orchestrator → embedder/reranker (explicit routing, no runtime lookups)
0.2.0 - 2026-05-02
- Reranking with Voyage `rerank-2.5-lite` (refines dense search: top-20 candidates → top-5 final)
- Cartesian product sweep expansion (N models × M methods × P sizes × Q overlaps → N×M×P×Q runs)
- Live status tracking with phase indicators (QUEUED → PARSING → CHUNKING → EMBEDDING → STORING → QUERYING → RERANKING → COMPLETE)
- Multiple queries from persona JSON files (loop over all questions per run, store `persona_id` and `focus`)
- CLI `--watch` flag for live progress (Rich Live table polling runs every 2s)
- ExperimentDetailScreen with drill-down (phase dots, run table, polling until terminal status)
- `run_sweep()` + `run_single()` split (sweep management vs pipeline execution, Single Responsibility)
- `on_error: continue/stop` allows partial completion without losing all results
- Experiment status `partial` distinguishes "some failed" from "all failed" or "all complete"
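
The Cartesian product sweep expansion listed above amounts to something like the following sketch. The function name and the keys of the returned dictionaries are hypothetical and do not reflect the project's `RunParams` fields.

```python
from itertools import product


def expand_sweep(models, methods, sizes, overlaps):
    """Produce one run per combination: N models x M methods x P sizes x Q overlaps."""
    return [
        {"model": m, "method": r, "chunk_size": s, "overlap": o}
        for m, r, s, o in product(models, methods, sizes, overlaps)
    ]
```

One model, five methods, three chunk sizes, and two overlaps expand to 1 × 5 × 3 × 2 = 30 runs.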
0.1.0 - 2026-05-02
- End-to-end RAG parameter sweep pipeline (parse PDF → chunk → embed → store → query → search → results)
- Voyage AI voyage-3.5-lite embedding (1024-dim, $0.06/1M tokens)
- Recursive text chunker (LangChain `RecursiveCharacterTextSplitter`)
- DENSE vector search with MongoDB Atlas (cosine similarity)
- FastAPI server with `/experiments` endpoints (POST submit, GET list, GET by ID)
- React dashboard with experiments list (status badges, run counts, clickable rows)
- CLI submission via `rag-params-finder run --config <yaml>`
- BackgroundTasks for async sweep execution (no Celery required for MVP)
0.0.1 - 2026-04-15
- Project skeleton and initial architecture
- MongoDB Atlas vector search integration (PyMongo client, collection helpers)
- Basic experiment orchestration pipeline (PDF parser, chunker dispatcher, orchestrator stub)
- CLI submission framework (Typer app, config loader, API client)