QueryWise — a text-to-SQL application with a semantic metadata layer. Users ask natural language questions, an LLM generates SQL using business context, executes against their database, and returns human-readable answers. The conversational Assistant enables editing the semantic layer in plain language. Answers become durable, shareable artifacts: saved queries (pinned SQL + typed params), charts, and workspace dashboards.
- Backend: Python 3.12, FastAPI, SQLAlchemy (async), asyncpg, pgvector, Alembic
- Frontend: React 19, TypeScript, Vite, Mantine UI, React Query, React Router
- Databases: PostgreSQL 16 with pgvector extension (app metadata), PostgreSQL 16 (sample/target DB), Google BigQuery, Databricks
- LLM: Provider-agnostic (Anthropic Claude, OpenAI, Ollama, Azure OpenAI)
# Full stack with Docker (preferred)
docker compose up
# Frontend: http://localhost:5173
# Backend: http://localhost:8000
# API docs: http://localhost:8000/docs
# The IFRS 9 sample DB is auto-configured on startup (connection, introspection, metadata seeding).
# To disable: set AUTO_SETUP_SAMPLE_DB=false in .env
# For manual seeding (if auto-setup is disabled):
python backend/scripts/seed_ifrs9_metadata.pyThe sample-db contains an IFRS 9 banking schema with 6 tables: counterparties, facilities, exposures, ecl_provisions, collateral, staging_history. Connection string (from Docker): postgresql://sample:sample_dev@sample-db:5432/sampledb.
The same container hosts a second database, opsdb — a deliberately hostile operational-style schema (no FKs, tenant_id scoping, soft deletes, int status codes, lookup tables, business-logic views, a dead customers_bak table) used to exercise the semantic layer compiler. Connection string: postgresql://sample:sample_dev@sample-db:5432/opsdb. The container runs with pg_stat_statements preloaded; populate query logs with python backend/scripts/run_ops_workload.py. Fixtures: backend/tests/fixtures/ops_seed.sql (+ ops_extensions.sql). Init scripts only apply on a fresh volume (docker compose down -v).
Auto-setup (AUTO_SETUP_SAMPLE_DB=true, default): On first docker compose up, the backend automatically creates the connection, introspects the schema, seeds all metadata (10 glossary terms, 8 metrics, 43 dictionary entries across 12 columns, 1 knowledge document), and launches background embedding generation. Logic in app/services/setup_service.py, called from main.py lifespan hook. Idempotent — safe to restart.
Startup sequence (in main.py lifespan):
ensure_embedding_dimensions()— checks vector column dimensions matchEMBEDDING_DIMENSION, resizes + nulls stale embeddings if mismatched (handles provider switching)auto_setup_sample_db()— connection, introspection, seeds, then launches background embedding generation (non-blocking)
For manual seeding (if auto-setup disabled): python backend/scripts/seed_ifrs9_metadata.py
Run from backend/:
pip install -e ".[llm,dev,bigquery,databricks,lineage]" # Install all deps (add export,observability,jobs as needed)
alembic upgrade head # Run migrations
uvicorn app.main:app --reload # Dev server on :8000
pytest # Run tests
ruff check . # Lint
ruff format . # Format
mypy . # Type checkRun from frontend/:
npm install # Install deps
npm run dev # Dev server on :5173
npm run build # Production build (tsc + vite)
npm run lint # ESLint- Python: Ruff, 100 char line length, Python 3.11 target, rules: E, F, I, N, UP, B
- TypeScript: ESLint, strict mode, no explicit
any - Async everywhere: All DB operations, HTTP calls, and LLM calls are async
- Pytest: asyncio_mode="auto", test paths at
tests/
backend/
├── scripts/ # Seed scripts (seed_ifrs9_metadata.py)
backend/app/
├── api/v1/endpoints/ # FastAPI route handlers (all under /api/v1)
├── api/v1/schemas/ # Pydantic request/response models
├── connectors/ # Database connector plugin system (PostgreSQL, BigQuery, Databricks)
├── db/models/ # SQLAlchemy ORM models (UUID PKs, timestamps)
├── llm/agents/ # LLM agents (composer, validator, interpreter, error handler)
├── llm/providers/ # LLM provider implementations (anthropic, openai, ollama)
├── llm/prompts/ # System/user prompt templates
├── llm/utils.py # Shared LLM utilities (JSON repair for local models)
├── mcp/ # FastMCP server mounted at /mcp (streamable HTTP) — reuses services
├── semantic/ # Core IP: context builder, schema linker, glossary resolver, knowledge resolver
├── services/ # Business logic (query pipeline, connection mgmt, embeddings, knowledge import)
└── utils/ # SQL sanitizer
frontend/src/
├── api/ # Axios API clients (one per resource)
├── components/layout/ # AppShell with sidebar navigation
├── hooks/ # React Query hooks
├── pages/ # Route pages (Query, SavedQueries, Dashboards, Connections, Glossary, Metrics, Dictionary, Knowledge, History)
└── types/ # TypeScript interfaces matching backend schemas
| Variable | Default | Description |
|---|---|---|
DATABASE_URL |
postgresql+asyncpg://querywise:querywise_dev@localhost:5432/querywise |
App metadata DB |
ENCRYPTION_KEY |
dev-encryption-key-change-in-production |
Fernet key for connection strings (used by the env secrets backend) |
SECRETS_BACKEND |
env |
Secrets backend for connection-string encryption (env/aws/gcp/azure/vault) |
LOG_LEVEL |
INFO |
Log level |
LOG_FORMAT |
console |
Log format (console, or json for log aggregation) |
ENABLE_METRICS |
true |
Expose Prometheus metrics at GET /metrics |
OTEL_ENABLED |
false |
Enable OpenTelemetry tracing (console exporter, or OTLP if endpoint set) |
OTEL_EXPORTER_OTLP_ENDPOINT |
— | OTLP/HTTP traces endpoint (e.g. http://jaeger:4318/v1/traces) |
JOB_BACKEND |
inprocess |
Background job runner (inprocess asyncio, or arq Redis) |
REDIS_URL |
redis://localhost:6379/0 |
Redis URL (used when JOB_BACKEND=arq) |
RATE_LIMIT_ENABLED |
true |
Enforce MAX_QUERIES_PER_MINUTE on /query endpoints |
DEFAULT_LLM_PROVIDER |
anthropic |
LLM provider (anthropic, openai, ollama, azure_openai) |
DEFAULT_LLM_MODEL |
claude-sonnet-4-20250514 |
Default model for SQL generation |
EMBEDDING_MODEL |
text-embedding-3-small |
OpenAI embedding model |
CORS_ORIGINS |
["http://localhost:5173"] |
Allowed CORS origins |
AUTO_SETUP_SAMPLE_DB |
true |
Auto-create sample DB connection + seed metadata on startup |
SAMPLE_DB_CONNECTION_STRING |
postgresql://sample:sample_dev@sample-db:5432/sampledb |
Sample DB for auto-setup |
OLLAMA_BASE_URL |
http://host.docker.internal:11434 |
Ollama server URL (use http://ollama:11434 for Docker Ollama) |
OLLAMA_MODEL |
llama3.1:8b |
Ollama model for completions |
OLLAMA_EMBEDDING_MODEL |
nomic-embed-text |
Ollama model for embeddings |
EMBEDDING_DIMENSION |
1536 |
Vector dimension (1536 for OpenAI, 768 for Ollama nomic-embed-text) |
ANTHROPIC_API_KEY |
— | Required if using Anthropic |
OPENAI_API_KEY |
— | Required if using OpenAI (completions + embeddings) |
AZURE_OPENAI_ENDPOINT |
— | Azure OpenAI endpoint (required for azure_openai provider) |
AZURE_OPENAI_API_KEY |
— | Azure OpenAI key |
AZURE_OPENAI_API_VERSION |
2024-10-21 |
Azure OpenAI API version |
AZURE_OPENAI_DEPLOYMENT |
— | Azure OpenAI embedding deployment name |
DISABLE_AUTH |
false |
Local-dev escape hatch — treat every request as the default admin (no login). Never enable in production |
AUTH_PROVIDER |
local |
Interactive login backend: local (password + magic-link), magic_link, or oidc (registered seam, not yet implemented) |
JWT_SECRET |
dev-jwt-secret-change-in-production |
HS256 signing secret for session + magic-link JWTs |
JWT_ACCESS_TTL_MINUTES |
720 |
Session lifetime (minutes) |
MAGIC_LINK_TTL_MINUTES |
15 |
Magic-link token lifetime (minutes) |
AUTH_COOKIE_NAME |
qw_session |
Session cookie name (HTTP-only) |
AUTH_COOKIE_SECURE |
false |
Set true behind TLS (HTTPS-only cookie) |
AUTH_COOKIE_SAMESITE |
lax |
Session cookie SameSite (lax/strict/none) |
DEFAULT_ORG_SLUG |
default |
Slug of the auto-created default organization |
DEFAULT_ADMIN_EMAIL |
admin@querywise.local |
Bootstrapped admin user (created on boot + in migration 004) |
DEFAULT_ADMIN_PASSWORD |
— | If set, the bootstrapped admin gets this local-login password |
When using Ollama, all completions and embeddings go through Ollama — no OpenAI/Anthropic fallback. Two deployment modes:
# 1. Install and start Ollama on your Mac
brew install ollama
ollama serve
# 2. Pull required models
ollama pull llama3.1:8b
ollama pull nomic-embed-text
# 3. Set .env
DEFAULT_LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://host.docker.internal:11434
OLLAMA_MODEL=llama3.1:8b
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
EMBEDDING_DIMENSION=768
# 4. Start stack (Ollama is NOT in Docker — backend reaches it via host.docker.internal)
docker compose up# .env
DEFAULT_LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://ollama:11434
OLLAMA_MODEL=llama3.1:8b
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
EMBEDDING_DIMENSION=768
# Start stack with Ollama Docker profile
docker compose --profile ollama-docker up
# Pull required models inside the container
docker compose exec ollama ollama pull llama3.1:8b
docker compose exec ollama ollama pull nomic-embed-textWhy Option A is faster on macOS: Docker on Mac runs inside a Linux VM with no GPU passthrough. Native Ollama uses Apple Metal for GPU-accelerated inference. Expect ~5-10x faster responses.
nomic-embed-text produces 768-dimension vectors. Set EMBEDDING_DIMENSION=768 in .env. Migration 002_configurable_embedding_dim handles initial column creation. On subsequent provider switches, ensure_embedding_dimensions() (in setup_service.py, called from main.py lifespan) detects dimension mismatches at startup, resizes all vector columns, and nulls stale embeddings so they regenerate in the background.
Embeddings are generated in background asyncio tasks (non-blocking):
- On startup: after auto-setup seeds,
launch_background_embeddings()fires a background task - On introspect: background task launched after schema introspection
- On CRUD: each create/update of glossary term, metric, sample query, or knowledge document embeds inline
- Progress tracking: in-memory tracker (
embedding_progress.py), exposed atGET /api/v1/embeddings/status, displayed as a frontend progress banner (auto-polls every 2s, auto-hides when complete)
If the embedding model is unavailable (not pulled, or Ollama is down), the query pipeline falls back to keyword-only context matching instead of crashing. Vector search failures in schema_linker.py trigger a session rollback and keyword fallback. Embedding-based search resumes automatically once the model is available.
OllamaProvider(app/llm/providers/ollama_provider.py) useshttpxto call Ollama REST API- Completions use
format: "json"to force JSON output mode repair_json()inapp/llm/utils.pyhandles common local model JSON issues (markdown fences, Python booleans, trailing commas)- Embeddings: tries
/api/embed(Ollama 0.4+), falls back to/api/embeddings(legacy) automatically get_embedding_provider()follows the configured provider — Ollama embeds locally, Anthropic falls back to OpenAI
- Connectors: Extend
BaseConnectorABC inapp/connectors/, register inconnector_registry.py. Built-in: PostgreSQL (asyncpg), BigQuery (google-cloud-bigquery, lazy-loaded), Databricks (databricks-sql-connector, lazy-loaded). BigQuery uses service account JSON stored encrypted in connection_string field. Databricks uses JSON config (server_hostname,http_path,access_token,catalog) stored encrypted; supports both Unity Catalog (INFORMATION_SCHEMA) and Hive metastore (SHOW/DESCRIBE fallback) - LLM Providers: Extend
BaseLLMProviderABC inapp/llm/providers/, register viaprovider_registry - API routes: All under
/api/v1, defined inapp/api/v1/endpoints/, aggregated inapp/api/v1/router.py - ORM models: UUID primary keys,
created_at/updated_attimestamps, pgvectorVECTOR(settings.embedding_dimension)for embeddings - Services: Business logic in
app/services/, never in endpoints directly - Knowledge: Import text/HTML content, auto-detect HTML, section-aware chunking (450 words, 80 overlap), vector + keyword search for relevant chunks injected into LLM prompt. URL fetching server-side via
httpx. Service inapp/services/knowledge_service.py - SQL safety: Read-only transactions enforced at connector level, static SQL blocklist in
app/utils/sql_sanitizer.py(includes BigQuery-specificEXPORT DATA/LOAD DATAand Databricks-specificCOPY INTO/OPTIMIZE/VACUUMblocks)
Foundational layers added under app/core/ and app/jobs/. All optional
dependencies degrade gracefully — the app boots without structlog /
prometheus_client installed.
- Secrets (
app/core/secrets.py):SecretsProviderABC behind connection-string encryption. Defaultenvbackend = Fernet (preserves original behaviour);aws/gcp/azure/vaultare registered seams (register_secrets_backend).connection_service.pycallsget_secrets_provider(). - Telemetry (
app/core/telemetry.py):configure_logging()(structlog→stdlib fallback,console/json), per-requestX-Request-IDviaObservabilityMiddleware, Prometheus metrics atGET /metrics(setup_metrics), and OpenTelemetry tracing viaconfigure_tracing()+start_span()(no-op whenOTEL_ENABLED=false). The query pipeline (query_service.py) emits spans for build_context → compose → validate → execute → interpret. Request id flows into logs +AppErrorresponses. - Rate limiting (
app/core/rate_limit.py): in-memorySlidingWindowRateLimiterwired to/queryendpoints viainstall_rate_limiting(enforcesMAX_QUERIES_PER_MINUTE). Swap the store for Redis for multi-replica deploys. - Jobs (
app/jobs/):JobQueueABC withInProcessJobQueue(asyncio, default) andArqJobQueue(Redis). Jobs are registered by name inregistry.py;launch_background_embeddingssubmits"generate_embeddings"throughget_job_queue(). For arq, run a worker:JOB_BACKEND=arq arq app.jobs.worker.WorkerSettings(embedding progress then lives in the worker process). - Health (
app/api/v1/endpoints/health.py):GET /health/live(process) andGET /health/ready(DB + job queue + LLM provider, 503 on failure) for K8s probes. - LLM endpoints: Azure OpenAI provider (
azure_openai) added so the pipeline can run inside a customer VPC; registered inprovider_registry. - Tests/CI: unit tests in
backend/tests/(no DB/LLM needed);.github/workflows/ci.ymlinstalls.[llm,dev,observability,lineage]and runs pytest (gating) + ruff/mypy/frontend build (advisory until pre-existing lint debt is cleared). The lineage tests needsqlglot(the[lineage]extra) andpytest.importorskippast it otherwise. Optional deps:pip install -e ".[observability,jobs]".
Real users, teams, roles, and ownership. Single-tenant per deployment; isolation is by workspace_id (a Team) within the auto-created default Organization. organization_id is carried on every core table so a future managed-SaaS fleet needs no migration. Migration 004 creates the identity tables, seeds the default org/workspace/admin, backfills all existing rows, and promotes the free-text created_by/user_id columns to real User FKs.
- Identity models (
app/db/models/):Organization,User,Team(= workspace),Membership(roleadmin|editor|viewer, ranked inROLE_RANK),ApiKey(only the SHA-256 hash stored). - Primitives (
app/core/security.py): PBKDF2 password hashing (stdlib), HS256 JWTs with apurposeclaim (session/magic_link), and API-key gen/hash. Dependency-light + unit-tested. - Request plumbing (
app/core/auth.py):get_current_user(API key → Bearer → HTTP-onlyqw_sessioncookie),get_org_context→AuthContext(active workspace viaX-Workspace-Idheader, else earliest membership), andrequire_role(...).DISABLE_AUTH=trueshort-circuits to the bootstrapped admin for local dev. - Login (
app/services/auth_service.py,app/api/v1/endpoints/auth.py): password + magic-link, both issuing a session-cookie JWT. Magic-link delivery (email/Slack) lands in Phase 4 — the token is logged and, outside production, returned byPOST /auth/magic-link.app/core/auth_providers.pyis a name-keyed seam (local/magic_link/oidc); OIDC is registered but not implemented. - AuthZ in services (per the existing convention):
connection_servicescopes by org+workspace and enforces role; metadata endpoints authorize through the connection (the cascade root) viaapp/api/v1/deps.py(require_connection_read/write,require_column_read/write). Non-request entry points — startup auto-setup, the MCP server, the seed script viaDISABLE_AUTH— act underidentity_service.system_context()(admin in the default workspace). - Endpoints:
/auth/*(login, register, magic-link request/verify, logout, me, providers),/teams+/teams/{id}/members(admin-managed),/api-keys(per-user, plaintext shown once). - Heads-up: once auth is enforced, the current (pre-auth) frontend gets 401s — run with
DISABLE_AUTH=trueuntil the Phase 1 frontend (login + auth context + workspace switcher) lands.
One-shot answers become saved, owned, re-runnable, shareable objects. Two milestones; migrations 005 (artifacts) and 006 (dashboards).
- Models (
app/db/models/):SavedQuery(pinned SQL + typedparams+version/status),Chart(viz config per saved query),ResultSnapshot(result persistence that doubles as the cache),Dashboard+DashboardTile. - Scoping: saved queries / charts / snapshots are connection-scoped (carry
organization_id+connection_id, authorize through the connection viarequire_connection_read/write), matching the semantic-layer convention.Dashboardis the first workspace-scoped artifact (workspace_id); its endpoints useget_org_context+ctx.require_role(...)directly (liketeams.py), anddashboard_service._assert_accessmirrorsconnection_service._assert_access. - Re-runs & cache (
app/services/saved_query_service.py):render_sqlsubstitutes{{param}}placeholders with type-safe, escaped SQL literals (defense-in-depth atop the read-only blocklist), thenrun_saved_queryis cache-first — aResultSnapshotkeyed bysha256(final_sql + params + connection_id)withinRESULT_CACHE_TTL_SECONDS(default 300), with arefreshoverride. Execution reusesquery_service.execute_raw_sql. - Dashboards (
app/services/dashboard_service.py): tiles run viarun_saved_query(so they inherit connection auth + the cache). Dashboard-level filters reuse the param system — filter values are passed as supplied params and a tile only consumes the{{name}}s its SQL references._finalizerefreshes server-sideonupdatetimestamps after UPDATEs to avoid async lazy-load errors during response serialization. - Export: client-side CSV/JSON in the frontend; backend CSV/JSON/XLSX for saved queries (XLSX needs the optional
exportextra →openpyxl). - Endpoints:
/connections/{id}/saved-queries(+/run,/clone,/export,/charts),/dashboards(+/tiles,/layout,/tiles/{id}/run). - Frontend: Recharts (
components/charts/ChartView.tsx) for viz;react-grid-layoutfor the dashboard grid; shared typedcomponents/common/ParamInputs.tsxfor params/filters. Charts are managed inside the saved-query view (no separate Charts page). Note: the frontend container's anonymousnode_modulesvolume means new deps (recharts, react-grid-layout) needdocker compose exec frontend npm installor an image rebuild.
Makes the semantic layer discoverable and trustworthy. Two milestones; migrations 007 (certification + versioning) and 008 (catalog lineage). Column profiling is deferred to a later milestone.
- Certification lifecycle (
app/services/versioning_service.py): metrics, glossary terms, sample queries, and saved queries carrystatus(draft|in_review|certified|deprecated), an integerversion, andcertified_by_id/certified_at. Transitions go through one governed endpoint per entity (POST /connections/{id}/{entity}/{eid}/status); the state machine (_ALLOWED_TRANSITIONS) and role gate (_ROLE_FOR_TARGET) live in the service — editor submits-for-review/reverts, admin certifies/deprecates. Certifying runs a lightweight SQL check (check_sql_safety+ a sqlglot parse). One service handles all four entity types via_SNAPSHOT_FIELDS/_SQL_FIELDmaps. - Versioning & changelog (
SemanticVersionmodel): every content edit (PUT →record_edit, bumps version) and status transition appends an append-only snapshot. Exposed atGET .../{entity}/{eid}/versions(+/{version});versioning_service.diffgives a field-level diff. UI: sharedfrontend/src/components/common/{CertificationBadge,StatusActions,VersionHistory}.tsx, wired into the Metrics/Glossary/SavedQueries pages. - Catalog search (
app/services/catalog_service.py,app/api/v1/endpoints/catalog.py):GET /connections/{id}/catalog/searchruns a hybrid search across tables, columns, metrics, glossary, sample/saved queries, and knowledge — reusing the existing pgvector embeddings + the keyword scorer (semantic/relevance_scorer.py), no tsvector. Hits merge into a uniformCatalogHit; certified items are boosted (rank_hits).GET .../catalog/facetsreturns schemas/owners/tags/type+status counts. Connection-scoped viarequire_connection_read. Frontend:pages/CatalogPage.tsx(search + facet sidebar + detail/lineage drawer). - Lineage (
app/services/lineage_service.py,ArtifactDependencymodel): saved-querypinned_sqland metricsql_expressionare parsed with sqlglot (optional[lineage]extra; lazy import, degrades to a no-op if absent) into table/column edges, recomputed on create/update (best-effort, never blocks the write). Per-artifact "what this touches" atGET .../{saved-queries|metrics}/{id}/lineage; impact view "what depends on this table" atGET .../catalog/lineage?table=&column=. Connector type → sqlglot dialect viadialect_for. - Endpoints:
/connections/{id}/catalog/{search,facets,lineage}, plus/status,/versions,/versions/{v}, and/lineagesub-resources on the metric/glossary/sample-query/saved-query routers. - Heads-up: existing rows migrate to
status='draft',version=1. The saved-query PUT routes anystatuschange through the governed lifecycle (no raw status writes). sqlglot is a new optional dep — install the[lineage]extra (or rebuild the backend image) for lineage to populate.
Attacks the cold-start problem: point QueryWise at an operational DB with an empty semantic layer and get reviewable draft objects. Migration 013.
- Engine (
app/semantic_compiler/): self-contained package (dataclasses + pure functions, no FastAPI/ORM imports — standalone-CLI extractable). Collectors gather evidence (catalog via the connector,pg_stats/CHECK/enums/unique indexes,pg_get_viewdef,pg_stat_statements);sqlmeta.py(sqlglot, graceful degradation) extracts join pairs/aggregates/GROUP BY/WHERE; inference modules emitFindings with evidence + confidence: join inference without FKs (naming + value-overlap probe + log co-occurrence; failed probe kills the candidate), dictionaries (enum/CHECK/lookup-table labels/most_common_vals — note pg_statsn_distinctis negative when it scales with rows), view→metric extraction, recurring log aggregates, dead tables, tenant scoping (call-weighted log confirmation required), PII (name + sampled value shape), fan-out warnings (1:N parent-measure double-count). The LLM pass (app/llm/agents/semantic_annotator.py) only names/describes — output merges onto naming fields, never structure; runs fine without a provider. Output is hard-capped per kind (Thresholds) — review fatigue kills draft tools. - Staging, not drafts (
CompilationRun/CompilationFinding,app/services/compilation_service.py): findings never touch semantic tables until accepted (draft metrics/glossary feed the context builder today). Accept dispatches per kind through existing creation paths (embed + lineage), landing asstatus='draft'; policies (PII masking,dead tables, row filters) are created disabled; fan-out guidance becomes a knowledge doc (so the prompt assembler picks it up via RAG). Runs as a background job (semantic_compilation) with progress (compilation_progress.py). - Rematerialization:
introspect_and_cachewipes cached tables (cascading to inferred relationships + dictionary entries), so accepted findings are name-keyed andrematerialize_acceptedre-creates them after every introspect.cached_relationshipsgainedorigin(fk|inferred),confidence,cardinality,evidence. - Endpoints:
/connections/{id}/compilation/runs(+/runs/{rid}),/compilation/findings(+/{fid}/accept,/{fid}/dismiss,/bulk). Frontend:pages/CompilerPage.tsx(run button, progress, findings grouped by kind with evidence + confidence, bulk accept/dismiss). - Eval:
python backend/scripts/eval_compiler_ifrs9.pyscores recovery of the IFRS 9 seed metadata with FKs hidden (ignore_declared_fks). Baseline: relationships 5/5 @ 100% precision, dictionary 79%/89%, glossary table-coverage 10/10; metrics need views/logs (sampledb has neither — expected 0). - Heads-up:
pg_statsis empty until ANALYZE;pg_stat_statementsneeds the extension + read rights (pg_read_all_stats). Every collector degrades to empty and the run recordssources_availableso the UI explains reduced confidence. Collectors are Postgres-only for now — other connectors compile catalog-only.
Production deployment artifacts under deploy/ (+ root prod compose), separate from the dev docker-compose.yml / Dockerfiles (which stay untouched for local work). The whole Packaging & deployability parallel track from planfull.md is complete: hardened images, prod compose, Helm chart, Terraform for AWS + GCP + Azure, CI/CD (build/push/deploy), and ops (backup/restore, DR runbook, config reference). The only deferred item is the SaaS control plane (provisioning/billing/fleet upgrades), which is additive and build-on-demand. Overview: deploy/README.md.
- Hardened images:
backend/Dockerfile.prod(multi-stage: builder venv → slim runtime, non-root uid 1001,curlhealthcheck on/api/v1/health/live,uvicorn --workers ${UVICORN_WORKERS:-4}, prod extras only — no[dev]) andfrontend/Dockerfile.prod(Vite build →nginxinc/nginx-unprivileged:1.27-alpine, non-root uid 101, listens 8080)..dockerignorein both dirs. - Edge:
frontend/nginx.confserves the SPA bundle (with client-route fallback) and reverse-proxies/api,/mcp(buffering off for SSE), and health to the backend same-origin. Uses Docker's embedded resolver (127.0.0.11) + aset $backendvariableproxy_passso the edge boots even while the backend is starting (a staticupstreamwould make nginx refuse to start). Internal/healthzfor the container healthcheck. TLS terminates here (mount certs + add a 443 block) or upstream at a LB. - Same-origin build:
frontend/src/api/client.tsuses?? 'http://localhost:8000'(not||) so the prod build withVITE_API_URL=""calls the API at relative/api/v1; unset (dev) still falls back to the local backend. - Prod stack:
docker-compose.prod.yml—app-db(pgvector, no host port),redis(cache + arq), one-shotmigrate(alembic upgrade head, gated byservice_completed_successfullyso backend replicas never race),backend(uvicorn,JOB_BACKEND=arq),worker(arq app.jobs.worker.WorkerSettings),frontend(edge, the only published port). Run:cp .env.prod.example .env.prod→ edit →docker compose -f docker-compose.prod.yml --env-file .env.prod up -d --build. - Config:
.env.prod.exampleis the prod-tuned template (CHANGE_ME secrets,DISABLE_AUTH=false,AUTH_COOKIE_SECURE=true,LOG_FORMAT=json,AUTO_SETUP_SAMPLE_DB=false)..env.prodis gitignored. - Helm chart (
deploy/helm/querywise/, EKS/GKE/AKS): backend Deployment (uvicorn) + HPA + PDB, dedicated arqworkerDeployment, frontend Deployment + PDB, two Services, ingress (path-based:/api+/mcp→backend,/→frontend SPA — same-origin), ServiceAccount (IRSA/Workload-Identity annotations). Managed Postgres+pgvector and Redis are expected out-of-cluster (supply DSNs via the release Secret). Config split: non-secret env → ConfigMap, secrets → chart-created Secret orsecrets.existingSecret(external-secrets/sealed-secrets seam). Migration:alembic upgrade headruns as apre-install/pre-upgradehook Job (weight-5); the ConfigMap+Secret are also hooks (weight-10) so they exist first, and the Job gates new backend pods so replicas never race. Validate:helm lint+helm template ... | kubeconform -strict(both pass).values-production.example.yamlis a realistic override;deploy/README.mdhas the install flow. - Terraform (
deploy/terraform/{aws,gcp,azure}/): each provisions the data plane + secrets the chart consumes, in the customer's own account/VPC, with the same shape — managed Postgres 16 (pgvector via app migrations) + managed Redis (cache + arq) + a secret store holding the assembled DSNs+keys (keys map 1:1 to backend env → external-secretsdataFrominto thequerywise-secretsk8s Secret) + object storage (exports/backups) + optional network + an identity/policy for external-secrets to read the secret. DB password + JWT secret auto-generate if unset;ENCRYPTION_KEYis required (Fernet). Compute (EKS/GKE/AKS) is deliberately out of scope — BYO or the upstream cluster module, kept in a separate state so cluster rebuilds never risk the DB.- AWS: RDS (Multi-AZ, gp3,
rds.force_ssl) + ElastiCache + Secrets Manager + S3; IAM policy for the external-secrets IRSA role. - GCP: Cloud SQL (private IP via PSA peering) + Memorystore + Secret Manager + GCS; a service account with
secretAccessorfor Workload Identity. - Azure: Postgres flexible server (VNet-integrated,
azure.extensions=VECTORallow-list) + Cache for Redis + Key Vault + Blob; a user-assigned managed identity with Key Vault read for Workload Identity. - All three pass
tofu/terraform validate+fmt.*.tfvarsgitignored; lockfiles committed.
- AWS: RDS (Multi-AZ, gp3,
- CI/CD (
.github/workflows/):ci.yml(existing — backend tests gating + advisory lint/type, frontend lint/build) is unchanged.deploy-validate.ymlruns on PRs touchingdeploy/**—helm lint+helm template | kubeconform -strict, andterraform fmt -check/validateacross aws/gcp/azure (matrix).release.ymlbuilds+pushes both images to GHCR (querywise-{backend,frontend}, tagged SHA/branch/semver/latest, gha cache) then deploys via the.github/actions/helm-deploycomposite action: push tomain→ staging, tagv*→ production (gate with environment required-reviewers). Deploys pin the release to the commit SHA with--wait --atomic(auto-rollback) and inject only image coords; per-env overlayvalues-<environment>.yaml(committed, non-secret) is applied if present. Each environment needs aKUBE_CONFIGsecret (base64 kubeconfig); clusters run external-secrets to syncquerywise-secrets. Lint withactionlint. - Ops (
deploy/ops/):backup.sh(pg_dumpcustom format → AES-256/openssl PBKDF2 →querywise-<ts>.dump.enc, optional S3/GCS upload, local retention prune) +restore.sh(decrypt →pg_restore --clean --if-exists, guarded byRESTORE_CONFIRM=yes); both strip the+asyncpgsuffix fromDATABASE_URL, shellcheck-clean.backup-cronjob.example.yamlschedules backups in-cluster (postgres:16 image, script via ConfigMap,BACKUP_PASSPHRASE+DATABASE_URLfromquerywise-secrets).RUNBOOK.mdcovers backup/restore, managed-DB PITR, full-region DR rebuild, the Alembic upgrade path (migrations only run via the Helm/compose migrate hook), and quarterly credential rotation —ENCRYPTION_KEYmust not be blind-rotated (it Fernet-encrypts stored connection strings; re-encrypt each connection before swapping).config-reference.mdis the production-focused settings catalogue (the full list is in the env-vars table above /.env.example).