Mark browser behavior in YAML, replay it, and hand the trace to any coding agent.
Cairntrace is a local-first behavioral browser-spec layer for coding agents.
Specs define intent + outcomes as the behavior contract and steps as
repairable hints for reaching that state. The same spec can run from the CLI,
through the MCP server, or later be exported to Playwright.
Cairntrace is agent-neutral: there are no Claude, Codex, Cursor, or OpenCode branches in core. The stable interface is the CLI, MCP tools, and run artifact format.
- Give agents a real browser acceptance check while they build a feature.
- Replace manual "click through this workflow" smoke tests with YAML specs.
- Capture DOM snapshots, screenshots, console, network, and outcome evidence into one agent-readable artifact pack.
- Heal common locator drift without changing the behavior contract.
- Start with agent-browser for agent-in-session work and switch to Playwright when you need Playwright-native traces or exported tests.
Cairntrace is not published to npm or GitHub Packages. The supported
install path is cloning this repository and running it from source with Bun;
there is no build or compile step. Pin the latest release or use main.
- Bun >=1.3.0
- A browser backend:
  - agent-browser on $PATH for the default backend
  - or Playwright Chromium for --backend playwright
Check Bun first:
bun --version
git clone https://github.com/abdul-hamid-achik/cairntrace
cd cairntrace
bun install
To pin the newest release tag instead of tracking main:
git checkout "$(git tag --sort=-v:refname | head -1)"
Updating later is git pull (or git fetch and re-run the checkout above)
followed by bun install; there is nothing to rebuild.
agent-browser is the default backend and the recommended path for
agent-in-session runs:
brew install vercel-labs/agent-browser/agent-browser
agent-browser --version
Playwright is optional. Install its Chromium browser only if you plan to run
with --backend playwright, inspect Playwright traces, or export specs to
@playwright/test:
bunx playwright install chromium
On CI, the Playwright backend launches Chromium with --no-sandbox and
--disable-dev-shm-usage when CI is truthy. Set
CAIRN_PLAYWRIGHT_LAUNCH_ARGS to override the launch args for a runner.
./bin/cairn doctor
./bin/cairn is a Bun shebang launcher for development, so there is no compile
step for local use.
You can always run Cairntrace from this repo with ./bin/cairn. To use
cairn from any directory, symlink the launcher into a directory already on
your $PATH:
ln -sf "$PWD/bin/cairn" /usr/local/bin/cairn
cairn doctor
If cairn doctor reports bun or agent-browser missing, confirm those
commands work in the same shell and that their install directories are on
$PATH.
Start the tiny demo app in one terminal:
bun examples/demo-app/server.ts
Run a real browser spec in another terminal:
./bin/cairn run examples/flows/01-dashboard-nav.yml
Then inspect the agent handoff summary:
./bin/cairn context latest
Useful variants:
./bin/cairn run examples/flows/01-dashboard-nav.yml --backend playwright
./bin/cairn run examples/flows/01-dashboard-nav.yml --mock
./bin/cairn run examples/flows/01-dashboard-nav.yml examples/flows/02-row-count.yml --parallel 2 --json
./bin/cairn run examples/flows/01-dashboard-nav.yml examples/flows/02-row-count.yml --junit ./.cairntrace/junit.xml --json
./bin/cairn snapshot /dashboard.html --config examples/cairntrace.config.yml --json
./bin/cairn spec heal examples/flows/06-drifted-link.yml
See examples/README.md for the full demo walkthrough,
including the intentionally failing spec, the heal demo, config-backed specs,
downloads, transforms, and xlsx verification.
This is the shape of a Cairntrace spec:
version: 1
name: dashboard_nav
intent: |
  A user can open the demo dashboard from the home page.
outcomes:
  - id: url_is_dashboard
    description: browser lands on the dashboard page
    verify:
      url: { endsWith: /dashboard.html }
  - id: dashboard_heading_visible
    description: dashboard heading is visible
    verify:
      text: { contains: "Inventory Dashboard" }
  - id: no_console_errors
    description: page has no console errors
    verify:
      console: { errorsMax: 0 }
steps:
  - open: /
  - click: { by: role, role: link, name: Open dashboard }
  - wait: { text: "Inventory Dashboard" }
The example above matches the first demo flow. Run that checked-in spec:
./bin/cairn run examples/flows/01-dashboard-nav.yml --cold-start --json
For your own specs, validate and stamp the behavior contract:
./bin/cairn spec verify flows/dashboard_nav.yml --json
./bin/cairn spec verify flows/dashboard_nav.yml --stamp
./bin/cairn run flows/dashboard_nav.yml --cold-start --stamp-if-green
intent + outcomes are the contract. steps are hints. cairn spec heal
can patch drifted steps, but the contract hash prevents accidental changes to
what the spec asserts.
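The split between contract and hints can be illustrated with a minimal sketch that fingerprints only intent and outcomes, so a healed steps list leaves the fingerprint unchanged. The names `SpecLike`, `stableStringify`, and `contractFingerprint` are hypothetical, and the FNV-1a hash is a dependency-free stand-in; this README does not say how Cairntrace actually computes contractHash.

```typescript
// Hypothetical shapes for illustration; not Cairntrace's real types.
type Outcome = { id: string; description: string; verify: Record<string, unknown> };
type SpecLike = { intent: string; outcomes: Outcome[]; steps: unknown[] };

// Serialize with sorted object keys so key order does not affect the result.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj).sort();
    return `{${keys.map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`).join(",")}}`;
  }
  return JSON.stringify(value);
}

// 32-bit FNV-1a, used only as a stand-in for a real digest (assumption).
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Only intent and outcomes feed the fingerprint; steps are deliberately excluded.
function contractFingerprint(spec: SpecLike): string {
  return fnv1a(stableStringify({ intent: spec.intent.trim(), outcomes: spec.outcomes }));
}
```

Under this sketch, rewriting a drifted locator inside steps yields the same fingerprint, while editing intent or any outcome yields a different one.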
Cold-start contract
Finished specs must replay from a fresh browser session. Use one of:
- imported login actions: imports: [actions/login_admin.yml] plus steps: [{ use: login_admin }]
- checkpoint restore: session: { resume: <checkpoint-name> }
- deterministic setup: preconditions.commands
Run cairn run <spec> --cold-start --json before declaring a spec done.
Steps
Current step keys:
open, click, hover, fill, upload, download, transform,
request, wait, press, scroll, snapshot, use, batch, eval.
Interactive steps use locators with by: role, by: label, by: text, or
by: selector. Prefer role and label locators when possible; they are clearer
for humans and easier to heal.
Semantic locators are strict: they match accessible names (whole-name,
case-insensitive, visible elements only), scroll the target into view before
acting, fail at the step with candidate diagnostics when nothing matches, and
error on ambiguity. Disambiguate with exact: true (case-sensitive),
nth: <index> (0-based), or a more specific name.
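The strict matching rules above can be sketched as a small resolver over a list of candidate elements. The `Candidate` and `RoleLocator` types and the `resolveRoleLocator` function are hypothetical and only model the documented rules (whole-name, case-insensitive by default, visible only, error on ambiguity); the real backends work against a live accessibility tree.

```typescript
// Hypothetical candidate shape; real backends expose richer element data.
type Candidate = { role: string; name: string; visible: boolean };
type RoleLocator = { role: string; name: string; exact?: boolean; nth?: number };

function resolveRoleLocator(candidates: Candidate[], locator: RoleLocator): Candidate {
  const visibleWithRole = candidates.filter((c) => c.visible && c.role === locator.role);
  const matches = visibleWithRole.filter((c) =>
    // Whole-name comparison: case-sensitive with exact, case-insensitive otherwise.
    locator.exact ? c.name === locator.name : c.name.toLowerCase() === locator.name.toLowerCase(),
  );
  if (matches.length === 0) {
    // Fail at the step and list what was available, as candidate diagnostics.
    const seen = visibleWithRole.map((c) => c.name).join(", ");
    throw new Error(`no match for ${locator.role} "${locator.name}"; candidates: ${seen}`);
  }
  if (locator.nth === undefined && matches.length > 1) {
    throw new Error(`ambiguous: ${matches.length} elements match "${locator.name}"`);
  }
  const picked = matches[locator.nth ?? 0]; // nth is 0-based
  if (!picked) throw new Error(`nth: ${locator.nth} is out of range (${matches.length} matches)`);
  return picked;
}
```

With two visible links named "Open dashboard" and "open dashboard", the default lookup errors on ambiguity, while exact: true or nth: 1 each select exactly one element.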
open also takes an object form to wait out SPA hydration:
- open: { path: /admin, waitUntil: networkidle, timeoutMs: 45000 }
request makes an authenticated API call with the browser session's cookies
and captures the response for later steps. On the Playwright backend, request
steps run out of page through a browser-context cookie transport
(APIRequestContext when safe; an isolated Bun cookie bridge with a
parent-enforced timeout under Bun), so they share the browser context's cookie
jar and apply a real timeout. Backends without a native request primitive use a
bounded page-fetch fallback. Relative request URLs resolve against config
baseUrl when present, so request-first setup actions can run before any
open when baseUrl is configured:
- request:
    method: POST
    url: /api/qr-token
    timeoutMs: 15000 # default: 30000
    expectStatus: 200
    assign: qr
- fill: { by: label, name: Scanner code, value: "${requests.qr.body.token}" }
Request-step calls are mirrored into run network evidence, so network and
noFailedRequests outcomes can assert on API calls made by the spec itself.
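To make the ${requests.qr.body.token} reference in the request example concrete, here is a minimal sketch of dotted-path placeholder substitution against a captured response. The `RunContext` shape and the `interpolate` function are assumptions for illustration, not Cairntrace's actual resolver.

```typescript
// Hypothetical context holding responses captured by request steps via assign.
type RunContext = { requests: Record<string, { status: number; body: unknown }> };

// Replace each ${a.b.c} placeholder by walking the dotted path through the context.
function interpolate(template: string, context: RunContext): string {
  return template.replace(/\$\{([^}]+)\}/g, (_match, path: string) => {
    let current: unknown = context;
    for (const key of path.trim().split(".")) {
      if (current === null || typeof current !== "object" || !(key in current)) {
        throw new Error(`unresolved placeholder: \${${path}}`);
      }
      current = (current as Record<string, unknown>)[key];
    }
    return String(current);
  });
}
```

Given a context where the qr request returned a body with token "abc123", the fill value from the example resolves to "abc123"; a reference to a request that never ran throws instead of silently producing an empty string.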
batch runs a chain of selector interactions in one backend invocation
(agent-browser batch --bail), so transient UI state survives — e.g. a hover
that reveals a popover stays open long enough to click the button inside it.
Sub-steps are selector-only (click, hover, fill, upload, press,
scroll, wait); the first failing sub-step fails the step:
- batch:
    - hover: { by: selector, selector: "#subcontractor-table" }
    - click:
        by: selector
        selector: '.table-header-hover-actions button[aria-label="Upload data"]'
Verifiers
Outcome verifier keys:
text, notText, url, network, noFailedRequests, console, count,
xlsx, file, httpJson, script.
Use typed verifiers for normal UI, URL, network, console, count, workbook,
on-disk checks (file polls a glob, e.g. a local email driver's capture
files), and backend JSON state (httpJson fetches with browser cookies and
asserts a simple JSON path). Use script when the assertion is
product-specific or needs browser or Node code.
Scope text / notText checks with nested region:
verify:
  text:
    contains: dead
    region: '[data-testid="objective-ticker"]'
When a step fails before producing an artifact, outcomes that reference the
missing ${artifacts.<name>.…} / ${requests.<name>.…} report skipped
("blocked") instead of a misleading missing-file failure; fix the failed
step first.
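The blocked-outcome rule can be sketched as a check that scans an outcome's verifier text for artifact or request references and compares them with what the run actually produced. The function names, the `produced` set, and the "pending" status are hypothetical; only the skipped ("blocked") result comes from the text above.

```typescript
type OutcomeStatus = "pending" | "skipped";

// Collect qualified names such as "artifacts.report" or "requests.qr" from placeholders.
function referencedNames(verifyText: string): string[] {
  const names: string[] = [];
  const pattern = /\$\{((?:artifacts|requests)\.[A-Za-z0-9_-]+)\./g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(verifyText)) !== null) {
    const name = match[1];
    if (name) names.push(name);
  }
  return names;
}

// An outcome is skipped ("blocked") when anything it references was never produced.
function classifyOutcome(
  verifyText: string,
  produced: Set<string>,
): { status: OutcomeStatus; blockedBy: string[] } {
  const missing = referencedNames(verifyText).filter((name) => !produced.has(name));
  return missing.length > 0
    ? { status: "skipped", blockedBy: missing }
    : { status: "pending", blockedBy: [] };
}
```

An outcome that reads ${artifacts.report.path} after the download step failed is reported as skipped with blockedBy listing artifacts.report, which points back at the step to fix.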
Timeouts and interrupts
Cairn enforces a hard deadline on every browser-backend invocation (60s
default; a step's own timeoutMs plus a 5s grace period when set). A hung
browser command is killed and the step fails with a normal timeout error.
Playwright wait and browser evaluate paths also have a Cairntrace-side
deadline, defaulting to 30000ms unless the step supplies timeoutMs. For real
Chromium runs, an external watchdog process kills the browser at the deadline,
so navigation churn cannot starve the in-process timer and leave the run
waiting on Playwright forever.
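The deadline arithmetic described above can be sketched as follows. The constant and function names are hypothetical, and `withDeadline` only shows an in-process race; per the text, real runs also kill the hung browser command or rely on an external watchdog, which this sketch does not do.

```typescript
const DEFAULT_BACKEND_DEADLINE_MS = 60_000; // hard deadline when a step sets no timeoutMs
const GRACE_MS = 5_000; // grace period added on top of a step's own timeoutMs
const DEFAULT_WAIT_DEADLINE_MS = 30_000; // Playwright wait / evaluate default

type StepTiming = { timeoutMs?: number };

// Deadline for one browser-backend invocation.
function backendDeadlineMs(step: StepTiming): number {
  return step.timeoutMs === undefined ? DEFAULT_BACKEND_DEADLINE_MS : step.timeoutMs + GRACE_MS;
}

// Cairntrace-side deadline for Playwright wait and evaluate paths.
function waitDeadlineMs(step: StepTiming): number {
  return step.timeoutMs ?? DEFAULT_WAIT_DEADLINE_MS;
}

// Race an operation against a timer so a hung call rejects instead of waiting forever.
function withDeadline<T>(operation: Promise<T>, deadlineMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${deadlineMs}ms`)), deadlineMs);
    operation.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}
```

For a step with timeoutMs: 15000, the backend invocation gets 20000ms in this sketch, while the same step's Playwright wait path gets 15000ms.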
Ctrl-C / SIGTERM during a run tears down the run's own agent-browser session
(daemon and Chrome) before exiting with the conventional 130/143 exit code.
Config
cairntrace.config.yml can provide baseUrl, environment vars, artifact root,
project settings, and the services lifecycle block. Validate it before use:
./bin/cairn config validate --config cairntrace.config.yml --json
The services block lets cairn run own the full multi-service environment
lifecycle (docker, conditional data seeding, tmux session management, and
teardown), all config-driven, started once before the spec pool and stopped
after the last spec. Skip it with --no-services.
# cairntrace.config.yml
version: 1
defaultEnvironment: local
environments:
  local:
    baseUrl: http://localhost:8080
secrets:
  provider: tvault
  required: [MONGO_PASSWORD, ES_PASSWORD]
  tvault: { project: myapp }
services:
  docker:
    command: "docker compose up -d"
    reuseExisting: true
    readinessCheck: "curl -sf http://localhost:27017"
    healthcheck:
      command: "curl -sf http://localhost:9200/_cluster/health | grep -q green"
      intervalSeconds: 15
      retries: 5
  seed:
    command: "yarn demo-import --mongoSourceUri mongodb://admin:${MONGO_PASSWORD}@host/db"
    ttlSeconds: 21600
    freshnessCheck: "mongosh --quiet --eval 'db.count()' mongodb://localhost:27017/db"
  tmux:
    session: myapp
    reuseExisting: true
    options:
      - { key: mouse, value: "on" }
    env:
      NODE_ENV: development
    windows:
      - name: web
        cwd: web-app
        command: "yarn serve"
        readyOn: { url: http://localhost:8080 }
        healthcheck:
          command: "curl -sf http://localhost:8080/healthz"
          intervalSeconds: 20
          retries: 3
      - name: api
        cwd: web-api
        command: "yarn dev-watch"
        env: { PORT: "3001" }
        readyOn: { text: "listening on" }
  stash:
    enabled: true
    autoStash: always
    capture: [tmux, docker, seed]
    tags: [services, myapp]
  teardown:
    - "tmux kill-session -t myapp"
    - "docker compose down"
Seed freshness is tracked at ~/.cairntrace/services/<project>.seed.json
with a three-layer check (fingerprint + TTL + optional data-level command).
The seed only re-runs when the command changed, the TTL expired, or the
freshness check failed. Placeholders such as ${vars.connectionPath} resolve
before spec validation, so they can appear in required fields. Vars merge as
config environment vars < top-level spec vars: < repeatable CLI
--var key=value. Built-ins ${worker.index} and ${run.token} can derive
isolated users or tenants for realtime/stateful backends.
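The three-layer seed check described above (fingerprint, TTL, optional data-level command) can be sketched as a pure decision function. The `SeedRecord` shape, the `fingerprintOf` callback, and the treatment of a missing record as stale are assumptions; the layout of <project>.seed.json is not documented here.

```typescript
// Hypothetical record shape; the real seed.json layout may differ.
type SeedRecord = { fingerprint: string; seededAtMs: number };
type SeedConfig = { command: string; ttlSeconds: number };

function seedDecision(
  config: SeedConfig,
  record: SeedRecord | null,
  fingerprintOf: (command: string) => string,
  nowMs: number,
  freshnessCheckPassed: boolean | null, // null when no freshnessCheck is configured
): { reseed: boolean; reason: string } {
  // Assumption: with no previous record there is nothing to trust, so seed.
  if (record === null) return { reseed: true, reason: "no previous seed recorded" };
  // Layer 1: the seed command changed since the last run.
  if (record.fingerprint !== fingerprintOf(config.command)) {
    return { reseed: true, reason: "seed command changed" };
  }
  // Layer 2: the TTL expired.
  if (nowMs - record.seededAtMs > config.ttlSeconds * 1000) {
    return { reseed: true, reason: "TTL expired" };
  }
  // Layer 3: the optional data-level command reported stale data.
  if (freshnessCheckPassed === false) return { reseed: true, reason: "freshness check failed" };
  return { reseed: false, reason: "seed is fresh" };
}
```

With ttlSeconds: 21600 as in the example config, an unchanged command seeded five hours ago is left alone in this sketch, and the same seed is re-run once six hours have passed.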
version: 1
defaultEnvironment: local
retention:
  keepRuns: 20 # newest N runs per spec; pruned after every run
report:
  theme: cairn # cairn | graphite | midnight | contrast
  colors:
    accent: "#0f766e"
    surface: "#fbfdf9"
environments:
  local:
    baseUrl: http://localhost:${env.APP_PORT} # ${env.X} works in config text
    viewport: { width: 1280, height: 800 }
    vars:
      dashboardPath: /dashboard.html
      testUser: player-${worker.index}-${run.token}
Specs can also set a top-level viewport: { width, height }, which wins over
the environment's.
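The variable precedence stated earlier (config environment vars < top-level spec vars: < repeatable CLI --var key=value) can be sketched as a layered merge. `parseCliVars` and `mergeVars` are hypothetical helper names, and splitting on the first "=" is an assumption about how --var is parsed.

```typescript
type Vars = Record<string, string>;

// Parse repeatable --var key=value flags; only the first "=" separates key from value,
// so values such as URLs with query strings survive intact.
function parseCliVars(flags: string[]): Vars {
  const vars: Vars = {};
  for (const flag of flags) {
    const eq = flag.indexOf("=");
    if (eq <= 0) throw new Error(`expected key=value, got "${flag}"`);
    vars[flag.slice(0, eq)] = flag.slice(eq + 1);
  }
  return vars;
}

// Later sources win: config environment vars < spec vars < CLI --var.
function mergeVars(envVars: Vars, specVars: Vars, cliVars: Vars): Vars {
  return { ...envVars, ...specVars, ...cliVars };
}
```

A key defined in all three places takes its CLI value, a key defined in config and spec takes the spec value, and a key defined only in the config environment passes through unchanged.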
Per-environment services & secrets. The services and secrets blocks
can be overridden per-environment inside environments.<name>. This lets you
run the full local stack (docker + seed + tmux) for local, but skip all
services for dev or test where the app is already deployed remotely:
environments:
  local:
    baseUrl: http://localhost:8080
  dev:
    baseUrl: https://dev.example.com
    services: false # no docker/seed/tmux; app is remote
  test:
    baseUrl: https://test.example.com
    services: false
    secrets: # different tvault project for test env
      provider: tvault
      tvault: { project: test-project }
When services: false, cairn run --env dev skips the entire lifecycle; there is
no need for --no-services. A partial services: block deep-merges over the
top-level one (e.g. override just the seed command, keep docker and tmux). An
env-level secrets: block replaces the top-level one entirely.
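The override rules above (false skips, a partial services block deep-merges, secrets replaces) can be sketched like this. `deepMerge` and `resolveEnv` are hypothetical names, and replacing arrays wholesale during the merge is an assumption the text does not confirm.

```typescript
type Plain = { [key: string]: unknown };

const isPlain = (v: unknown): v is Plain =>
  typeof v === "object" && v !== null && !Array.isArray(v);

// Recursively merge override into base; arrays and scalars are replaced (assumption).
function deepMerge(base: Plain, override: Plain): Plain {
  const out: Plain = { ...base };
  for (const key of Object.keys(override)) {
    const b = out[key];
    const o = override[key];
    out[key] = isPlain(b) && isPlain(o) ? deepMerge(b, o) : o;
  }
  return out;
}

type TopLevel = { services?: Plain; secrets?: Plain };
type EnvOverride = { services?: Plain | false; secrets?: Plain };

function resolveEnv(top: TopLevel, env: EnvOverride): { services: Plain | null; secrets: Plain | null } {
  let services: Plain | null;
  if (env.services === false) services = null; // skip the entire lifecycle
  else if (env.services) services = deepMerge(top.services ?? {}, env.services);
  else services = top.services ?? null;
  // An env-level secrets block replaces the top-level one; it is never merged.
  const secrets = env.secrets ?? top.secrets ?? null;
  return { services, secrets };
}
```

Overriding only seed.command in an environment keeps the top-level docker block and the seed's ttlSeconds, whereas an env-level secrets block drops top-level keys such as required.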
Use the same config for validation and runs:
./bin/cairn spec verify flows/dashboard.yml --config cairntrace.config.yml --json
./bin/cairn run flows/dashboard.yml --config cairntrace.config.yml --cold-start --json
./bin/cairn snapshot /dashboard --config cairntrace.config.yml --json
Override vars per invocation without touching YAML:
./bin/cairn run flows/dashboard.yml --var baseUrl=http://localhost:3123 --var apiBase=http://localhost:3123/api
Artifacts
Every run writes a self-contained directory under ~/.cairntrace/runs/<run-id>/
unless config or flags override the artifact root. The important files are:
run.json | run.yaml | run.md
report.html
report.json
agent_context.md
events.ndjson
spec.resolved.yml
outcomes/<outcome-id>.md
snapshots/
screenshots/
console/
network/
downloads/
transforms/
requests/
evals/
diagnostics/
traces/
videos/
report.html is a self-contained, print-friendly report for sharing or saving
as PDF. It includes summary cards, outcome/step tables, artifact links, and a
theme switcher. report.json exposes the same redacted report model for custom
renderers, including selected theme tokens and the built-in theme catalog.
Configure styling with report.theme and report.colors in
cairntrace.config.yml; Cairntrace does not require a separate report theme
configuration file.
agent_context.md is the compact handoff file for coding agents. Use
./bin/cairn context latest to print it. context and diff resolve
latest/previous inside --artifact-root, config artifactRoot, or the
global default, in that order.
Disk usage is bounded by retention.keepRuns in the config (pruned after
every run) and by cairn clean [--keep N | --all]. Traces follow the
artifacts.capture.trace policy — the on-failure default deletes the trace
zip when the run passes. Videos follow artifacts.capture.video (default
never) — opt in with always or on-failure for audit-grade .webm
recordings. When steps execute too quickly to audit, set
artifacts.video.slowMo (delay in ms between actions) and
artifacts.video.speed (playback speed multiplier 0.25–4; values < 1 slow
down via ffmpeg). The Playwright backend supports video natively; feed the
recording to vidtrace extract for timestamped evidence extraction.
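The retention.keepRuns rule (keep the newest N runs per spec) can be sketched as a pure selection over run metadata. The `RunDir` shape and `runsToPrune` are hypothetical; the sketch only decides which runs to delete and does not touch the filesystem.

```typescript
// Hypothetical run metadata; real run directories carry this in run.json.
type RunDir = { id: string; spec: string; startedAtMs: number };

// Return the runs to delete so that only the newest `keepRuns` per spec remain.
function runsToPrune(runs: RunDir[], keepRuns: number): RunDir[] {
  const bySpec = new Map<string, RunDir[]>();
  for (const run of runs) {
    const group = bySpec.get(run.spec) ?? [];
    group.push(run);
    bySpec.set(run.spec, group);
  }
  const doomed: RunDir[] = [];
  bySpec.forEach((group) => {
    group.sort((a, b) => b.startedAtMs - a.startedAtMs); // newest first
    for (const old of group.slice(keepRuns)) doomed.push(old);
  });
  return doomed;
}
```

With keepRuns set to 2, a spec that has three runs loses only its oldest one, and a spec with a single run is untouched.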
Cairntrace run directories are self-contained, which makes them easy to stash
in fcheap for persistence, sharing, and cross-run search. Requires fcheap on
$PATH.
# Stash the latest run
./bin/cairn stash save latest --tag regression
# List stashes
./bin/cairn stash list --tool cairntrace
# Search across all stashed runs
./bin/cairn stash search "redirected to /error"
# Restore a stash to a directory
./bin/cairn stash restore <stash-id> --to /tmp/run-restore
Auto-stash failed runs with --stash-on-failure:
./bin/cairn run flows/login.yml --stash-on-failure --cold-start
Or enable via config:
# cairntrace.config.yml
stash:
  enabled: true
  autoStash: on-failure # or never (default)
  tags: [regression, audit]
The MCP server exposes cairn_stash_save, cairn_stash_list, and
cairn_stash_search tools that mirror the CLI. All degrade gracefully when
fcheap isn't installed.
When a spec fails, cairn investigate stashes the run and runs fcheap connect
to surface file:line code candidates responsible for the failure using
vecgrep semantic code search.
Requires fcheap and vecgrep on $PATH.
# After a failed run, find the responsible code
./bin/cairn investigate latest --codebase ~/projects/myapp
# With specific search mode and limit
./bin/cairn investigate latest --codebase ~/projects/myapp --mode semantic --limit 5
cairn audit is a convenience wrapper that runs a spec with video, extracts
vidtrace evidence from the recording, and connects it to the codebase, all in
one command:
# Run spec with video, extract evidence, connect to code
./bin/cairn audit flows/login.yml --codebase ~/projects/myapp --speed 0.5
Configure defaults via config:
# cairntrace.config.yml
investigate:
  codebase: ./src # default codebase path
  mode: hybrid # semantic | keyword | hybrid
  limit: 10 # max code matches
  keepStash: false # keep fcheap stash after investigate
The MCP server exposes cairn_investigate and cairn_audit tools that mirror
the CLI. Both degrade gracefully when fcheap/vecgrep/vidtrace aren't installed.
cairn annotate pins run evidence to a code graph symbol using
codemap annotations, building a
knowledge layer of failure points that persists across reindexes. Requires
codemap on $PATH.
# After investigate surfaces src/auth/login.ts:42 as a failure point
./bin/cairn annotate src/auth/login.ts:42 \
--source cairntrace \
--note "login_flow spec fails: redirect to /error instead of /dashboard" \
--run-id latest
# Annotate without a run reference
./bin/cairn annotate handleSubmit --note "flaky on cold start"
# Auto-annotate every run (pass + fail) into codemap with run context
./bin/cairn run flows/login.yml --auto-annotate on-run
cairn secrets manages TinyVault secrets for authenticated specs: checking
projects, listing keys, and exporting secrets without exposing values in the
conversation. Requires tvault on $PATH.
# List vault projects
./bin/cairn secrets list-projects
# List secret keys in a project
./bin/cairn secrets list-keys --project myapp-test
# Export secrets to a .env file (values never printed)
./bin/cairn secrets export-env --project myapp-test --output .env.test
Configure defaults via config:
# cairntrace.config.yml
codemap:
  enabled: true
  path: ~/projects/myapp # default codebase path for annotate
secrets:
  provider: tvault # tvault | none
  project: myapp-test # default vault project
The MCP server exposes cairn_annotate, cairn_secrets_list_projects,
cairn_secrets_list_keys, and cairn_secrets_export_env tools that mirror the
CLI. All degrade gracefully when codemap/tvault aren't installed.
Common commands:
| Command | Purpose |
|---|---|
| cairn run <spec...> | Run one or more specs or directories. Supports --backend, --mock, --parallel, --cold-start, --config, --artifact-root, --var k=v, --junit, and --stamp-if-green. Directory inputs expand *.yml/*.yaml recursively, skipping imported actions/ directories and _*.yml / _*.yaml drafts. |
| cairn clean | Prune old run directories (--keep N per spec, or --all; honors --config and --artifact-root). |
| cairn spec verify <spec> | Lint a spec and optionally stamp contractHash with --stamp. |
| cairn spec heal <spec> | Run a spec and propose locator-drift fixes. Add --apply to write them. |
| cairn snapshot <url> | Open a page and print role and data-testid locator inventory. Relative URLs resolve through config baseUrl. |
| cairn context <run\|latest> | Print the run's agent_context.md; add --path, --config, or --artifact-root. |
| cairn docs [topic] | Return focused docs for overview, authoring, steps, verifiers, downloads, scripts, artifacts, services, stash, investigate, annotate, mcp, or backends. |
| cairn explain | Return the current agent-facing command, step, verifier, and rule surface. |
| cairn diff <runA> <runB> | Compare two runs by outcomes, steps, console, and network; supports --config and --artifact-root. |
| cairn checkpoint list/show/delete | Manage saved browser-state checkpoints. |
| cairn checkpoint capture-from-session <name> | Save state from an existing agent-browser session. |
| cairn login <name> --url <url> | Open a headed login flow and save a checkpoint. |
| cairn export playwright <spec> | Emit an @playwright/test spec from a Cairntrace spec. |
| cairn import playwright <file> | Convert common Playwright steps and assertions into reviewable Cairntrace YAML with TODO comments for unmapped lines. |
| cairn stash save <run-id> | Stash a run directory to the fcheap vault for persistence and search. Supports --tag, --tool, --source. |
| cairn stash list | List stashes, optionally filtered by --tag or --tool. |
| cairn stash info <stash-id> | Show detailed metadata and file list for a stash. |
| cairn stash restore <stash-id> | Restore a stash to a directory (--to <dir>). |
| cairn stash search <query> | Search across all stashed runs. Supports --mode keyword\|semantic\|hybrid and --limit. |
| cairn investigate <run-id> | Stash a run and run fcheap connect to find code responsible for failures. Supports --codebase, --mode, --limit, --keep-stash. |
| cairn audit <spec> | Run a spec with video, extract vidtrace evidence, and connect to code. Supports --codebase, --speed, --slow-mo, --mode, --limit. |
| cairn annotate <symbol> | Pin run evidence to a codemap code graph symbol. Supports --source, --note, --data, --run-id, --codebase. |
| cairn secrets list-projects | List TinyVault projects. |
| cairn secrets list-keys | List secret keys in a TinyVault project (--project). |
| cairn secrets export-env | Export TinyVault secrets to a .env file without exposing values (--project, --output). |
| cairn config validate | Validate cairntrace.config.yml structure and cross-field rules. Supports --config, --format json\|yaml\|md. Exit 0 = valid, 4 = invalid. |
| cairn mcp | Start the MCP server on stdio. |
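The directory-expansion rule listed for cairn run can be sketched as a path filter. `isRunnableSpec` and `expandSpecs` are hypothetical names; the sketch skips any directory named actions and sorts the result, both of which are simplifying assumptions.

```typescript
// Decide whether a "/"-separated path, relative to the input directory, is a runnable spec.
function isRunnableSpec(relativePath: string): boolean {
  const segments = relativePath.split("/");
  const file = segments[segments.length - 1] ?? "";
  const dirs = segments.slice(0, -1);
  if (!/\.ya?ml$/.test(file)) return false; // only *.yml / *.yaml
  if (file.charAt(0) === "_") return false; // _*.yml / _*.yaml drafts
  if (dirs.indexOf("actions") !== -1) return false; // actions/ directories (assumed: any depth)
  return true;
}

// Filter a recursive file listing down to the specs a directory input would run.
function expandSpecs(relativePaths: string[]): string[] {
  return relativePaths.filter(isRunnableSpec).sort();
}
```

Given flows/b.yml, flows/actions/login_admin.yml, flows/_draft.yml, and flows/nested/a.yaml, this sketch keeps only flows/b.yml and flows/nested/a.yaml.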
Structured output is available on commands wired with format flags:
./bin/cairn run examples/flows/01-dashboard-nav.yml --json
./bin/cairn snapshot /dashboard.html --config examples/cairntrace.config.yml --json
./bin/cairn import playwright tests/example.spec.ts --json
./bin/cairn spec verify examples/flows/01-dashboard-nav.yml --format yaml
./bin/cairn docs verifiers --json
./bin/cairn diff previous latest --format md
Commands with structured output today: run, doctor, clean, explain,
docs, snapshot, diff, import playwright, spec verify, spec heal,
checkpoint list, and checkpoint show.
Stable exit codes:
| Code | Meaning |
|---|---|
| 0 | success |
| 1 | outcome failure |
| 2 | errored |
| 3 | cold-start gate |
| 4 | lint failure |
| 5 | heal made no progress |
| 6 | contract-hash mismatch |
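For wrappers that branch on the result of a run, the table above can be encoded as a lookup. This is a minimal sketch; `describeExit` is a hypothetical helper, and the 130/143 entries reflect the interrupt behavior described in the timeouts section.

```typescript
// Stable exit codes from the table above.
const EXIT_CODES: Record<number, string> = {
  0: "success",
  1: "outcome failure",
  2: "errored",
  3: "cold-start gate",
  4: "lint failure",
  5: "heal made no progress",
  6: "contract-hash mismatch",
};

function describeExit(code: number): string {
  const known = EXIT_CODES[code];
  if (known !== undefined) return known;
  if (code === 130) return "interrupted (Ctrl-C)"; // conventional 128 + SIGINT
  if (code === 143) return "terminated (SIGTERM)"; // conventional 128 + SIGTERM
  return `unknown exit code ${code}`;
}
```

A CI script could, for example, treat 1 as a product regression and 4 or 6 as a spec problem to fix before rerunning; that split is a suggestion, not documented policy.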
Run the stdio MCP server:
./bin/cairn mcp
Example MCP client config:
{
  "mcpServers": {
    "cairntrace": {
      "command": "cairn",
      "args": ["mcp"]
    }
  }
}
The MCP server exposes these tools:
cairn_explain, cairn_docs, cairn_doctor, cairn_run, cairn_context,
cairn_spec_scaffold, cairn_spec_verify, cairn_spec_heal,
cairn_checkpoint_list, cairn_checkpoint_show, cairn_checkpoint_delete,
cairn_config_validate, cairn_stash_save, cairn_stash_list, cairn_stash_search,
cairn_investigate, cairn_audit, cairn_annotate, cairn_secrets_status.
Agents should call cairn_explain once at session start, then cairn_docs
for the focused topic they need.
spec YAML
  -> parseSpec + zod validation + config substitution + imports
  -> contract-hash check
  -> Runner
       -> BrowserBackend
            -> AgentBrowserAdapter
            -> PlaywrightAdapter
            -> MockBrowserBackend
       -> OutcomeEvaluator
       -> ArtifactWriter
The parser, runner, browser adapters, verifiers, and artifact writer are kept separate so the core stays deterministic and testable.
- Hybrid API + UI flows: request uses the browser session's cookies, resolves relative URLs through config baseUrl when present, captures the response, and later steps splice fields via ${requests.<name>.body.<field>}, e.g. fetch a QR token via API, then fill it into the scanner UI. Playwright runs request steps out of page with browser-context cookie sharing and a 30000ms default timeout; under Bun the cookie bridge runs in a subprocess so a stalled fetch cannot wedge the run.
- Realtime/stateful isolation: use ${worker.index} and ${run.token} in vars: to derive a unique user or tenant per spec run, e.g. testUser: player-${worker.index}-${run.token}.
- Download artifacts: download clicks a locator and saves the file under downloads/, optionally assigning it as ${artifacts.<name>.path}.
- Transform artifacts: transform runs a Node-side script to turn a downloaded file into a new upload fixture under transforms/.
- Workbook assertions: xlsx verifies workbook sheet text and Excel data validation metadata.
- Custom assertions: script runs browser or Node code and returns { ok, evidence }.
- Locator inventory: cairn snapshot <url> --json returns role and data-testid locators before you author or repair steps.
- Suite CI: cairn run flows --junit reports/cairn.xml expands a directory of specs recursively, skips imported actions/ and _-prefixed drafts, and writes JUnit XML for CI dashboards.
- Contract stamping after proof: cairn run <spec-or-dir> --stamp-if-green stamps contractHash only when every requested spec passes.
- Playwright import: cairn import playwright <file> converts common Playwright actions and assertions to Cairntrace YAML, preserving TODO comments for unmapped lines that need human review.
- Playwright export: cairn export playwright <spec> emits a normal @playwright/test file when a Cairntrace spec is stable enough for CI.
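As an illustration of the { ok, evidence } shape that custom script assertions return, here is a minimal sketch of a product-specific check. The `verifyRowTotals` function, its input rows, and the evidence fields are hypothetical; how Cairntrace invokes a script and passes it data is not covered in this README.

```typescript
// Result shape named in the list above; the evidence type is an assumption.
type ScriptResult = { ok: boolean; evidence: Record<string, unknown> };

type Row = { id: string; items: number[]; total: number };

// Hypothetical check: every row's total must equal the sum of its line items.
function verifyRowTotals(rows: Row[]): ScriptResult {
  const mismatched = rows
    .filter((row) => row.items.reduce((sum, n) => sum + n, 0) !== row.total)
    .map((row) => row.id);
  return { ok: mismatched.length === 0, evidence: { checked: rows.length, mismatched } };
}
```

Returning the offending row ids as evidence, rather than only a boolean, gives the agent reading the outcome something specific to act on.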
bun install
bun run typecheck
bun run lint
bun run test
bun run format
bun run verify
Run bun run verify before pushing. If you touched the runner, heal flow, or
browser adapters, also smoke-test against the demo app in examples/.
More contributor guidance lives in AGENTS.md. That file is the canonical instruction set for coding agents working in this repo.
Cairntrace is distributed through git tags and GitHub release pages only; it is
not published to npm or GitHub Packages. The install guide intentionally
doesn't hardcode a version because users can pin the newest tag with
git tag --sort=-v:refname.
The project follows SemVer tags (vX.Y.Z). All v1.x.y releases are
Cairntrace v1, so normal maintenance should add the next patch or minor tag
instead of rewriting old releases. Use patch releases for fixes, docs, and
polish; use minor releases for new non-breaking CLI/schema behavior; reserve
major releases for breaking contracts.
For a release, bump package.json's version, run bun run verify, create an
annotated vX.Y.Z tag, push main and the tag, then create the GitHub release
with gh release create. Do not create a floating latest tag — GitHub keeps
/releases/latest pointed at the newest release automatically.
See SECURITY.md. Short version: Cairntrace specs are trusted code, like Playwright tests or shell scripts. Do not run specs from untrusted sources, and only connect MCP clients you trust.