autonomous

"You sleep. It ships."

You close your laptop at midnight. 47 TODOs in your backlog.
You open it at 8am. 38 of them are done, tested, committed, on a clean branch.
Total cost: $4.20. No meetings required.

That's autonomous.

A self-driving project agent for Claude Code. Drop it into any git repo, run /autonomous, go to sleep.

Quickstart · Skills · Architecture · How It Works · Configuration · Safety · Testing

Install — 10 seconds

Requirements: Claude Code, Git, Python 3.9+

Optional: tmux (visible worker windows), jq (persona generation)

Paste this into Claude Code:

Install autonomous: git clone https://github.com/Sma1lboy/autonomous.git ~/.claude/skills/autonomous-skill && cd ~/.claude/skills/autonomous-skill && ./setup

That's it. Open any git repo and run /autonomous or /quickdo.

Skills

This package ships two public skills:

`/autonomous` — Full multi-sprint orchestration

The complete pipeline: Conductor → Sprint Master → Worker. Runs multiple sprints, transitions between directed work and autonomous exploration, manages sprint branches, evaluates results between sprints.

# Default: 10 sprints
/autonomous

# Quick: 3 sprints
/autonomous 3

# With direction: focus on a specific area
/autonomous 5 build REST API

# Direction only (default 10 sprints)
/autonomous fix all auth bugs
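
The argument forms above suggest a simple rule: a leading integer is the sprint count and the rest is the direction. Here is a minimal sketch of that rule, assuming nothing about how scripts/parse-args.py is actually written. The function name `parse_args` and its return shape are hypothetical.

```python
from typing import Tuple

DEFAULT_SPRINTS = 10  # documented default sprint count


def parse_args(args: str) -> Tuple[int, str]:
    """Split /autonomous arguments into (max_sprints, direction).

    Hypothetical helper: a leading integer token is read as the sprint
    count, and everything after it is treated as the direction.
    """
    tokens = args.split()
    if tokens and tokens[0].isdecimal():
        return int(tokens[0]), " ".join(tokens[1:])
    return DEFAULT_SPRINTS, " ".join(tokens)


print(parse_args(""))                   # (10, '')
print(parse_args("3"))                  # (3, '')
print(parse_args("5 build REST API"))   # (5, 'build REST API')
print(parse_args("fix all auth bugs"))  # (10, 'fix all auth bugs')
```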

`/quickdo` — Fast single-sprint execution

Lightweight mode. Skips the conductor, runs one sprint master directly via blocking claude -p. No tmux, no multi-sprint state, no monitor polling. One direction, one sprint, done.

# Single task
/quickdo add login page with GitHub OAuth

# Quick fix
/quickdo fix the broken unit tests

Best for tasks that fit in a single sprint — a full page, a complete feature stage, a test suite, a refactor.

Standalone (outside Claude Code)

# Direct CLI invocation via loop.py
AUTONOMOUS_DIRECTION="fix auth bugs" python3 scripts/loop.py /path/to/project

Architecture

Three-layer hierarchy with full context isolation between layers:

Conductor (autonomous/SKILL.md — runs in user's Claude Code session)
  │
  ├── Plans sprint directions (directed phase or exploration phase)
  ├── Dispatches sprint masters via claude -p
  ├── Evaluates sprint results, manages phase transitions
  │
  └── Sprint Master (SPRINT.md — separate claude -p session)
        │
        ├── Sense → Direct → Respond → Summarize loop
        ├── Dispatches workers via claude -p
        ├── Answers worker questions via comms.json protocol
        │
        └── Worker (full Claude session with all tools)
              │
              └── Executes the actual work: reads code, edits files,
                  runs tests, commits changes

Each layer runs in its own Claude session — fresh context per sprint, no bleed between layers.

/quickdo flattens this to two layers: it skips the conductor and runs the sprint master directly.
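
Each layer boundary is a separate `claude -p` process, so a blocking dispatch comes down to "run a command, wait, enforce a timeout". The sketch below shows that shape under stated assumptions: `run_blocking` and `sprint_master_cmd` are hypothetical names, the exit code 124 for a timeout is borrowed from coreutils `timeout` purely for illustration, and scripts/dispatch.py may work differently.

```python
import subprocess
import sys
from typing import List, Tuple

CC_TIMEOUT = 900  # seconds, the documented default


def sprint_master_cmd(prompt: str) -> List[str]:
    # `claude -p` runs Claude Code non-interactively with the given prompt.
    return ["claude", "-p", prompt]


def run_blocking(cmd: List[str], timeout: float = CC_TIMEOUT) -> Tuple[int, str]:
    """Run one session to completion and return (exit_code, stdout).

    A timeout is reported as exit code 124 (an illustrative convention).
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return 124, ""
    return proc.returncode, proc.stdout


# Stand-in command so the sketch runs without Claude Code installed.
code, out = run_blocking([sys.executable, "-c", "print('sprint done')"], timeout=30)
print(code, out.strip())
```

Because every session is a fresh process, nothing from one sprint's context can leak into the next; only files on disk carry state forward.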

Backlog: a persistent work queue (.autonomous/backlog.json) that survives across sessions. Workers log out-of-scope discoveries, and the conductor decomposes large missions into deferred items. When exploration runs dry, idle sprints pick from the backlog. Disclosure is progressive: sprint masters see only one-line titles, while the conductor sees full descriptions.
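
Progressive disclosure can be pictured as two views over the same items. The sketch below is only an illustration: the field names (`id`, `title`, `description`) and both helper functions are assumptions, and the real backlog.json schema may differ.

```python
from typing import Dict, List, Optional

# Hypothetical backlog items; the real backlog.json schema may differ.
backlog: List[Dict[str, str]] = [
    {"id": "b1", "title": "Add retry to uploader",
     "description": "Uploads fail on flaky networks; wrap the call in backoff."},
    {"id": "b2", "title": "Document env vars",
     "description": "The README omits two variables used by the CLI."},
]


def titles_for_sprint_master(items: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Sprint-master view: one-line titles only, no descriptions."""
    return [{"id": item["id"], "title": item["title"]} for item in items]


def details_for_conductor(items: List[Dict[str, str]], item_id: str) -> Optional[Dict[str, str]]:
    """Conductor view: the full record for one item, or None if absent."""
    return next((item for item in items if item["id"] == item_id), None)


print(titles_for_sprint_master(backlog))
print(details_for_conductor(backlog, "b2"))
```

Keeping the sprint-master view small keeps its prompt short, which is the point of showing titles only.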

Templates

Worker-task suggestions and boundary blacklists are driven by swappable templates at templates/<name>/template.md. Two ship with the package: gstack (the preselected template; uses /office-hours, /qa, and /investigate, and blocks /ship and similar commands) and default (generic, no toolchain commands).

Select a template per project by writing .autonomous/skill-config.json in your project root:

{ "template": "default" }

The project-level override takes precedence over the skill-root setting at ~/.claude/skills/autonomous-skill/skill-config.json. Unknown template names fall back to the default template. To add a new template, create templates/<name>/template.md with ## Allow and ## Block sections and point the config at it.
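
The lookup order described above (project config, then skill-root config, then the default template) can be sketched as follows. This is a hedged illustration, not the code in scripts/build-sprint-prompt.py; `read_template_name` and `resolve_template` are hypothetical names.

```python
import json
from pathlib import Path
from typing import Optional


def read_template_name(config_path: Path) -> Optional[str]:
    """Return the "template" value from a skill-config.json, or None."""
    try:
        data = json.loads(config_path.read_text())
    except (OSError, ValueError):  # missing file or malformed JSON
        return None
    name = data.get("template") if isinstance(data, dict) else None
    return name if isinstance(name, str) else None


def resolve_template(project_root: Path, skill_root: Path) -> Path:
    """Project config wins over the skill-root config; unknown names use 'default'."""
    name = (
        read_template_name(project_root / ".autonomous" / "skill-config.json")
        or read_template_name(skill_root / "skill-config.json")
        or "default"
    )
    candidate = skill_root / "templates" / name / "template.md"
    if not candidate.is_file():
        # Unknown template name: fall back to the generic template.
        candidate = skill_root / "templates" / "default" / "template.md"
    return candidate
```

A production version would presumably also reject names containing path separators; that check is omitted here to keep the sketch short.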

How It Works

Persona — persona.py reads your git history + project docs to understand your coding style. Writes OWNER.md.
Discovery — The conductor talks to you to understand the mission. If you passed a direction in args, it confirms and moves on.
Session — Creates an auto/session-TIMESTAMP branch and initializes conductor-state.json.
Conductor loop — Plan → Dispatch → Monitor → Evaluate → Repeat:
- Directed phase: breaks your mission into sprint-sized tasks, dispatches one sprint master per task
- Phase transition: leaves the directed phase once the direction is complete, meaning 2 consecutive completion signals with commits, max sprints reached, or 2 zero-commit sprints
- Exploration phase: scans the project across 8 dimensions, picks the weakest, generates improvement sprints
Sprint execution — Each sprint master gets a fresh claude -p session, dispatches a worker, answers questions via comms.json, and writes sprint-summary.json when done.
Merge/discard — Successful sprints merge back to the session branch. Failed sprints are discarded.
Backlog pickup — When exploration dimensions are all solid, the conductor checks the backlog for deferred work items before stopping.
The session ends when all sprints are used up, or when the project feels solid and the backlog is empty.
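
The phase-transition triggers in the conductor loop can be read as a small predicate over sprint history. The sketch below is one interpretation, not the logic in scripts/conductor-state.py: the `SprintResult` fields are hypothetical, and treating the two zero-commit sprints as consecutive is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SprintResult:
    direction_complete: bool  # hypothetical flag: the sprint signalled the mission is done
    commits: int              # number of commits the sprint produced


def should_leave_directed_phase(history: List[SprintResult], max_sprints: int) -> bool:
    """One reading of the three documented transition triggers."""
    if len(history) >= max_sprints:
        return True  # max sprints reached
    last_two = history[-2:]
    if len(last_two) < 2:
        return False
    # Two consecutive completion signals, each backed by commits.
    two_done = all(s.direction_complete and s.commits > 0 for s in last_two)
    # Two sprints in a row that produced nothing (assumed consecutive).
    two_empty = all(s.commits == 0 for s in last_two)
    return two_done or two_empty


history = [SprintResult(False, 3), SprintResult(True, 2), SprintResult(True, 1)]
print(should_leave_directed_phase(history, max_sprints=10))  # True
```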

Exploration Dimensions

When the directed mission is complete, the conductor autonomously explores 8 dimensions:

Dimension	What it audits
`test_coverage`	Untested code paths, missing edge cases
`error_handling`	Missing error messages, unhandled failures
`security`	Hardcoded secrets, injection vulnerabilities, input validation
`code_quality`	Dead code, duplication, overly complex functions
`documentation`	README accuracy, missing docstrings, stale docs
`architecture`	Module boundaries, dependency directions, separation of concerns
`performance`	N+1 queries, blocking I/O, missing caching
`dx`	CLI help text, error messages, setup instructions

Dimensions are scored via fast Python heuristics (explore-scan.py), and the weakest is selected for each exploration sprint.
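
Selecting the weakest dimension is a minimum over the eight scores. The sketch below assumes higher scores mean healthier code and breaks ties by table order; both are assumptions, as is the function name `weakest_dimension`. The scoring heuristics themselves live in explore-scan.py and are not reproduced here.

```python
from typing import Dict

DIMENSIONS = (
    "test_coverage", "error_handling", "security", "code_quality",
    "documentation", "architecture", "performance", "dx",
)


def weakest_dimension(scores: Dict[str, float]) -> str:
    """Return the lowest-scoring dimension; ties go to the earlier table row."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    return min(DIMENSIONS, key=lambda d: scores[d])


# Made-up scores for illustration only.
example = {d: 7.0 for d in DIMENSIONS}
example["security"] = 3.5
print(weakest_dimension(example))  # security
```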

Comms Protocol

Workers can't use AskUserQuestion in subagent context. Instead, they write questions to .autonomous/comms.json:

{"status": "waiting", "questions": [{"question": "...", "options": [...]}], "rec": "A"}

The sprint master polls, decides using product intuition (or OWNER.md guidance), and writes back:

{"status": "answered", "answers": ["A"]}

Valid statuses: idle, waiting, answered, done.
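
The exchange above is plain file-based IPC: one side writes a JSON document, the other polls and overwrites it. Here is a minimal sketch of the master side, assuming the message shapes shown above. Always accepting the worker's `rec` is a simplification for illustration (the sprint master decides using product intuition or OWNER.md), and the function names are hypothetical.

```python
import json
from pathlib import Path

VALID_STATUSES = {"idle", "waiting", "answered", "done"}


def read_comms(path: Path) -> dict:
    """Load comms.json and reject statuses outside the documented set."""
    data = json.loads(path.read_text())
    if data.get("status") not in VALID_STATUSES:
        raise ValueError(f"invalid status: {data.get('status')!r}")
    return data


def answer_with_recommendation(path: Path) -> bool:
    """If a worker is waiting, answer each question with its "rec" option.

    Returns True when an answer was written, False when nothing was pending.
    """
    data = read_comms(path)
    if data["status"] != "waiting":
        return False
    answers = [data["rec"]] * len(data["questions"])
    path.write_text(json.dumps({"status": "answered", "answers": answers}))
    return True
```

A polling loop would call `answer_with_recommendation` on an interval until the status reaches `done`.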

Worker safety hook (opt-in)

Set AUTONOMOUS_WORKER_CAREFUL=1 to install a PreToolUse hook on every dispatched worker that blocks catastrophic Bash commands:

AUTONOMOUS_WORKER_CAREFUL=1 /autonomous 5 build REST API

Blocks: rm -rf /, rm -rf $HOME, rm -rf /Users|/home, mkfs, dd of=/dev/sd*, fork bombs, device redirects (>, >>, >|, tee, cp to /dev/*), shutdown/reboot, git push --force (all variants), DROP TABLE/DATABASE/SCHEMA, TRUNCATE TABLE. Also catches interpreter wrappers like python3 -c 'os.system("rm -rf /")' and chaining bypasses like echo ok; rm -rf /.

Configured per sprint via claude --settings <file>, so no global settings change. The hook blocks a command by exiting with code 2 and writing a message to stderr; the worker reads "BLOCKED: ..." and adapts.
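
A PreToolUse hook receives the pending tool call as JSON on stdin and can veto it by exiting with code 2. The sketch below shows the core of such a check. The four patterns are illustrative and far narrower than the shipped block list, and `check_command` and `run_hook` are hypothetical names.

```python
import json
import re
from typing import Optional, Tuple

# Illustrative patterns only; the shipped hook covers many more cases.
BLOCKED = [
    (re.compile(r"\brm\s+-(?:rf|fr)\s+/(?:\s|;|$)"), "rm -rf /"),
    (re.compile(r"\bgit\s+push\b.*(?:--force\b|\s-f\b)"), "git push --force"),
    (re.compile(r"\bmkfs(?:\.\w+)?\b"), "mkfs"),
    (re.compile(r"\bDROP\s+(?:TABLE|DATABASE|SCHEMA)\b", re.IGNORECASE), "DROP statement"),
]


def check_command(command: str) -> Optional[str]:
    """Return the name of the first matching rule, or None if the command is allowed."""
    for pattern, label in BLOCKED:
        if pattern.search(command):
            return label
    return None


def run_hook(raw_event: str) -> Tuple[int, str]:
    """Return (exit_code, stderr_message) for one PreToolUse event."""
    event = json.loads(raw_event)
    if event.get("tool_name") != "Bash":
        return 0, ""
    reason = check_command(event.get("tool_input", {}).get("command", ""))
    if reason:
        return 2, f"BLOCKED: {reason}"
    return 0, ""


# A hook script would call run_hook(sys.stdin.read()), print the message
# to stderr, and exit with the returned code.
event = {"tool_name": "Bash", "tool_input": {"command": "echo ok; rm -rf /"}}
print(run_hook(json.dumps(event)))  # (2, 'BLOCKED: rm -rf /')
```

Using `search` rather than `match` is what lets the check see through command chaining such as `echo ok; rm -rf /`.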

Checkpoints

Take a human-readable snapshot of the current session anytime:

python3 ~/.claude/skills/autonomous-skill/scripts/checkpoint.py save .
python3 ~/.claude/skills/autonomous-skill/scripts/checkpoint.py save . --title "pre-refactor"
python3 ~/.claude/skills/autonomous-skill/scripts/checkpoint.py list .
python3 ~/.claude/skills/autonomous-skill/scripts/checkpoint.py latest .

Each checkpoint is a markdown file at .autonomous/checkpoints/<ts>-<slug>.md capturing mission, phase, sprint history, backlog summary, exploration dimension scores, git state, and resume guidance. History is retained — old checkpoints stay until manually deleted.

Useful for context switching ("where was I yesterday?"), sharing session state with a teammate, or reviewing sprint output before resuming.

Sprint worktrees (opt-in)

Set AUTONOMOUS_SPRINT_WORKTREES=1 to run each sprint in its own git worktree under .worktrees/sprint-N/:

AUTONOMOUS_SPRINT_WORKTREES=1 /autonomous 5 build REST API

Main tree stays on the session branch the whole time — no git checkout -b churn
Each sprint works on its own branch in its own directory, fully file-isolated
.autonomous/ is symlinked from each worktree back to the main tree, so comms, state, summaries, and backlog all go through one source of truth
.worktrees/ is auto-added to .gitignore on first sprint
On sprint completion, the merge runs first (with --keep-branch), then the worktree is removed, then the branch is deleted. If the merge conflicts, the worktree and branch are preserved for forensic inspection.
If .worktrees/ or .autonomous/ already exists as a symlink, the sprint is refused (repo-escape prevention).

V1 is serial-only — one sprint at a time. Parallel sprint dispatch is deferred to a future PR.
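
The worktree lifecycle maps onto a handful of standard git commands. The sketch below only plans those commands and applies the symlink refusal rule; it does not run git. The function names and the branch name passed in are hypothetical, and the real scripts may order or guard these steps differently.

```python
from pathlib import Path
from typing import List


def refuse_symlinked_dirs(repo: Path) -> None:
    """Refuse to proceed if .worktrees/ or .autonomous/ is itself a symlink."""
    for name in (".worktrees", ".autonomous"):
        if (repo / name).is_symlink():
            raise RuntimeError(f"{name} is a symlink; refusing (possible repo escape)")


def plan_sprint_setup(repo: Path, sprint: int, branch: str) -> List[List[str]]:
    """Commands that create the sprint branch in its own worktree directory."""
    refuse_symlinked_dirs(repo)
    tree = repo / ".worktrees" / f"sprint-{sprint}"
    # After this step, .autonomous/ would be symlinked from the worktree back
    # to the main tree, e.g. os.symlink(repo / ".autonomous", tree / ".autonomous").
    return [["git", "-C", str(repo), "worktree", "add", "-b", branch, str(tree)]]


def plan_sprint_cleanup(repo: Path, sprint: int, branch: str) -> List[List[str]]:
    """Commands run after a successful merge: remove the worktree, then the branch."""
    tree = repo / ".worktrees" / f"sprint-{sprint}"
    return [
        ["git", "-C", str(repo), "worktree", "remove", str(tree)],
        ["git", "-C", str(repo), "branch", "-d", branch],
    ]
```

Returning argument lists instead of running them keeps the plan easy to test and to log before anything touches the repository.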

Configuration

Variable	Default	Description
`MAX_SPRINTS` (via args)	`10`	Max conductor sprints
`MAX_ITERATIONS`	`50`	Max iterations for loop.py standalone mode
`CC_TIMEOUT`	`900`	Timeout per Claude Code (CC) invocation, in seconds
`AUTONOMOUS_DIRECTION`	(none)	Session focus (e.g., "fix auth bugs")
`MAX_COST_USD`	(none)	Stop when total cost exceeds this
`DISPATCH_MODE`	(auto)	`blocking` (no tmux), `headless` (background), or auto (tmux if available)
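
The environment variables in the table can be gathered into one typed object with the documented defaults. This is a sketch of how a caller might read them, not how the scripts do it. The `Config` dataclass and `load_config` are hypothetical, and `MAX_SPRINTS` is left out because it comes from the command arguments rather than the environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

DISPATCH_MODES = ("auto", "blocking", "headless")


@dataclass
class Config:
    max_iterations: int = 50            # MAX_ITERATIONS
    cc_timeout: int = 900               # CC_TIMEOUT, seconds
    direction: Optional[str] = None     # AUTONOMOUS_DIRECTION
    max_cost_usd: Optional[float] = None  # MAX_COST_USD
    dispatch_mode: str = "auto"         # DISPATCH_MODE


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Build a Config from environment variables, keeping defaults for unset ones."""
    cfg = Config()
    if env.get("MAX_ITERATIONS"):
        cfg.max_iterations = int(env["MAX_ITERATIONS"])
    if env.get("CC_TIMEOUT"):
        cfg.cc_timeout = int(env["CC_TIMEOUT"])
    if env.get("AUTONOMOUS_DIRECTION"):
        cfg.direction = env["AUTONOMOUS_DIRECTION"]
    if env.get("MAX_COST_USD"):
        cfg.max_cost_usd = float(env["MAX_COST_USD"])
    mode = env.get("DISPATCH_MODE") or "auto"
    if mode not in DISPATCH_MODES:
        raise ValueError(f"DISPATCH_MODE must be one of {DISPATCH_MODES}, got {mode!r}")
    cfg.dispatch_mode = mode
    return cfg


print(load_config({"CC_TIMEOUT": "60", "MAX_COST_USD": "5"}))
```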

Project Structure

autonomous-skill/
├── autonomous/SKILL.md               # /autonomous — multi-sprint conductor
├── quickdo/SKILL.md                  # /quickdo — fast single-sprint mode
├── SPRINT.md                         # Sprint master: per-sprint execution (inlined into prompt)
├── CLAUDE.md                         # Project instructions for Claude
├── OWNER.md.template                 # Persona template for manual config
├── skill-config.json                 # Default template selector (per-project override at .autonomous/)
├── templates/
│   ├── gstack/template.md            # Allow/Block sections for gstack toolchain
│   └── default/template.md           # Generic fallback, no toolchain assumptions
├── scripts/
│   ├── startup.py                    # SCRIPT_DIR resolution + project context (shared)
│   ├── parse-args.py                 # Parse ARGS → _MAX_SPRINTS + _DIRECTION
│   ├── session-init.py               # Create session branch, init state + backlog
│   ├── build-sprint-prompt.py        # Inline SPRINT.md + params → sprint-prompt.md
│   ├── dispatch.py                   # Blocking / tmux / headless session dispatch
│   ├── monitor-sprint.py             # Poll for sprint-summary.json
│   ├── monitor-worker.py             # Poll comms.json + tmux/process liveness
│   ├── evaluate-sprint.py            # Read summary JSON, update conductor state
│   ├── merge-sprint.py               # Merge or discard sprint branch
│   ├── write-summary.py              # Generate sprint-summary.json
│   ├── conductor-state.py            # State management (atomic writes, PID lock)
│   ├── explore-scan.py               # 8-dimension project scanner
│   ├── backlog.py                    # Cross-session persistent backlog
│   ├── persona.py                    # OWNER.md auto-generation
│   ├── loop.py                       # Standalone launcher (outside CC)
│   ├── master-poll.py                # Manual master polling for comms.json
│   └── master-watch.py               # Dual-channel monitor (comms + JSONL)
├── tests/
│   ├── test_helpers.sh               # Shared test framework
│   ├── test_conductor.sh             # 99 tests: state, phase transitions, exploration
│   ├── test_comms.sh                 # 34 tests: comms.json protocol
│   ├── test_persona.sh               # 20 tests: OWNER.md generation
│   ├── test_explore_scan.sh          # 45 tests: dimension scoring heuristics
│   ├── test_loop.sh                  # 20 tests: standalone launcher
│   ├── test_backlog.sh               # 76 tests: CRUD, progressive disclosure
│   ├── test_build_sprint_prompt.sh   # 25 tests: template resolution, allow/block injection
│   ├── test_eval_output.sh           # 35 tests: eval safety, tmux cleanup
│   └── claude                        # Mock CC binary for testing
├── .claude/skills/                   # Internal dev/test skills
│   ├── smoke-test/SKILL.md           # E2E pipeline smoke test
│   ├── test-worker/SKILL.md          # Spawns worker + auto-answering master
│   ├── capture-worker/SKILL.md       # Capture worker JSONL for inspection
│   ├── diff-sessions/SKILL.md        # Compare two sessions side-by-side
│   ├── clean-sandbox/SKILL.md        # Reset test sandbox
│   └── clean-gstack/SKILL.md         # Delete gstack design doc archives
└── README.md

Generated at runtime (gitignored):

OWNER.md — your persona, auto-generated from git + docs
.autonomous/conductor-state.json — multi-sprint state machine
.autonomous/comms.json — worker↔master IPC
.autonomous/sprint-summary.json — per-sprint results

Safety

Guard	How
Branch isolation	All work on `auto/session-` or `auto/quickdo-` branches. Never touches main.
Per-sprint branches	Each sprint works on its own branch; merged on success, discarded on failure.
Timeout	Each CC invocation capped at 15 min (configurable via `CC_TIMEOUT`).
Cost budget	`MAX_COST_USD` env var stops the session when exceeded.
Excluded workflows	Configured per template (see `templates/<name>/template.md` `## Block` section).
Graceful shutdown	SIGINT + sentinel file for clean exit across all layers.
3-strike rule	Same approach fails 3 times → stop and report.
Atomic state	Conductor state uses tmp+mv writes, PID lock for concurrency safety.
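
The tmp+mv pattern in the last row is worth spelling out, since it is what keeps a crash from leaving a half-written state file. The sketch below shows the general technique using the standard library; `write_state_atomic` is a hypothetical name, the PID lock is not covered, and scripts/conductor-state.py may differ in detail.

```python
import json
import os
import tempfile
from pathlib import Path


def write_state_atomic(path: Path, state: dict) -> None:
    """Write JSON to a temp file in the same directory, then rename it over the target."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes are on disk before the rename
        os.replace(tmp, path)     # atomic on POSIX when both paths share a filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Readers see either the old file or the new one, never a partial write, because the rename is a single filesystem operation.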

Testing

329 tests across the 7 suites below, all pure bash (an eighth suite, tests/test_build_sprint_prompt.sh, adds 25 more):

bash tests/test_conductor.sh    # 99 tests
bash tests/test_comms.sh        # 34 tests
bash tests/test_persona.sh      # 20 tests
bash tests/test_explore_scan.sh # 45 tests
bash tests/test_loop.sh         # 20 tests
bash tests/test_backlog.sh      # 76 tests
bash tests/test_eval_output.sh  # 35 tests
python3 -m compileall scripts   # quick syntax check for Python helpers

The test harness uses tests/claude (a mock CC binary) controlled by env vars:

Variable	Effect
`MOCK_CLAUDE_COST`	Reported cost per invocation
`MOCK_CLAUDE_COMMIT=1`	Make a git commit during the mock run
`MOCK_CLAUDE_DELAY`	Sleep N seconds (for timeout tests)
`MOCK_CLAUDE_EXIT`	Exit code to return

Reviewing & Merging

# See what the agent did
git log main..auto/session-TIMESTAMP --oneline

# Detailed diff
git diff main..auto/session-TIMESTAMP --stat

# Merge if satisfied
git checkout main && git merge auto/session-TIMESTAMP

# Or cherry-pick specific commits
git cherry-pick COMMIT_HASH

License

MIT
