diff --git a/README.md b/README.md
Lines changed: 5 additions & 0 deletions
diff --git a/README.zh-CN.md b/README.zh-CN.md
Lines changed: 4 additions & 0 deletions
diff --git a/docs/GUIDE.md b/docs/GUIDE.md
Lines changed: 6 additions & 0 deletions
diff --git a/docs/GUIDE.zh-CN.md b/docs/GUIDE.zh-CN.md
Lines changed: 6 additions & 0 deletions
diff --git a/docs/SPEC.md b/docs/SPEC.md
Lines changed: 30 additions & 8 deletions
diff --git a/docs/TOOL_CONTRACT.md b/docs/TOOL_CONTRACT.md
Lines changed: 64 additions & 0 deletions
diff --git a/docs/TOOL_CONTRACT.zh-CN.md b/docs/TOOL_CONTRACT.zh-CN.md
Lines changed: 58 additions & 0 deletions
@@ -60,6 +60,9 @@
   two models together (executor + planner) in separate, cache-stable sessions.
 - **Plugin-driven.** External tools run as subprocesses over stdio JSON-RPC
   (MCP-compatible). Built-in tools self-register at compile time.
+- **Cache-aware context maintenance.** Startup injects a small stable environment
+  summary, stale tool output is snipped/pruned before summary compaction, and the
+  built-in tool schema contract is documented for regression review.
 - **Zero-friction distribution.** `CGO_ENABLED=0` single binary; cross-compile
   to six targets with one command. The only dependency is a TOML parser.
 
@@ -134,6 +137,8 @@ commands, `@` references, and two-model setup are all in the
   from the desktop app, then use approvals, YOLO, and commands from IM.
 - **[Spec](./docs/SPEC.md)** — engineering contract: architecture, registries,
   data types, and roadmap.
+- **[Tool contract](./docs/TOOL_CONTRACT.md)** — provider-visible built-in tool
+  names, read-only flags, and schema snapshot guard.
 - **[Migrating from 0.x](./docs/MIGRATING.md)** — moving from the legacy
   TypeScript releases to the 1.0 Go rewrite.
 - **[Checkpoints & rewind](./docs/CHECKPOINTS.md)** — the snapshot-based edit
 
@@ -57,6 +57,8 @@
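   endpoint is just one config entry. Optionally run two models together (executor + planner), each in its own independent, cache-stable session.
 - **Plugin-driven.** External tools run as subprocesses and communicate over stdio JSON-RPC (MCP-compatible);
   built-in tools self-register at compile time.
+- **Cache-friendly context maintenance.** Startup injects a stable environment summary; stale tool output is snipped/pruned first,
+  and only then goes through summary compaction; the built-in tool schema contract is documented and guarded by regression tests.
 - **Zero-friction distribution.** `CGO_ENABLED=0` single binary; one command cross-compiles to six target platforms.
   The only dependency is a TOML parser library.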
 
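@@ -124,6 +126,8 @@ a runtime fallback for the provider key, but still serves, within the current workspace, as
 - **[Bot guide](./docs/BOT_GUIDE.zh-CN.md)**: connect Feishu, Lark, and WeChat
   bots from the desktop app, plus approvals, YOLO, and command interaction from IM.
 - **[Spec](./docs/SPEC.md)**: engineering contract covering architecture, registries, data types, and roadmap.
+- **[Tool contract](./docs/TOOL_CONTRACT.zh-CN.md)**: provider-visible built-in tool names,
+  read-only flags, and schema snapshot guard.
 - **[Migrating from 0.x](./docs/MIGRATING.md)**: moving from the legacy TypeScript releases to the 1.0 Go rewrite.
 - **[Checkpoints & rewind](./docs/CHECKPOINTS.md)**: the snapshot-based edit safety net
   (Esc-Esc / `/rewind`).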
 
@@ -63,6 +63,7 @@ reasoning_language = "auto"      # visible reasoning text: auto|zh|en
 # subagent_models = { review = "deepseek-pro", security_review = "deepseek-pro" }
 auto_plan = "off"                  # user-level only; off|on; off keeps plan mode manual
 # auto_plan_classifier = "deepseek-flash"   # optional; only borderline tasks call it
+tool_result_snip_ratio = 0.6       # shorten stale tool output before summary compaction
 
 [[providers]]
 name        = "deepseek-flash"
@@ -77,6 +78,11 @@ enabled = []   # omit/empty = all built-ins
 bash_timeout_seconds = 120   # foreground safety cap; set 0 for no tool-local cap
 mcp_call_timeout_seconds = 300   # default MCP call safety cap; per-plugin/tool overrides may raise it
 
+[environment]
+enabled = true   # inject a stable startup summary of OS, shell, and common tools
+# [environment.tools]
+# go = "/opt/homebrew/bin/go"   # optional explicit trusted path; workspace-local paths are not auto-executed
+
 [skills]
 # paths = ["~/my-skills", "../shared/skills"]   # extra custom skill roots
 # excluded_paths = ["~/.agents/skills"]         # hide convention roots without deleting folders
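
The `tool_result_snip_ratio` comment above says stale tool output is shortened before summary compaction. A minimal sketch of what such shortening could look like, assuming a line-based head/tail cut; the function name `SnipToolResult`, its parameters, and the marker text are hypothetical and not taken from the Reasonix source:

```go
package main

import (
	"fmt"
	"strings"
)

// SnipToolResult keeps the first `head` and last `tail` lines of a tool output
// and replaces everything in between with a single marker line.
// Hypothetical helper: the marker wording is illustrative only.
func SnipToolResult(output string, head, tail int) string {
	if head < 0 {
		head = 0
	}
	if tail < 0 {
		tail = 0
	}
	lines := strings.Split(output, "\n")
	if len(lines) <= head+tail {
		// Short outputs are returned unchanged.
		return output
	}
	omitted := len(lines) - head - tail
	kept := make([]string, 0, head+tail+1)
	kept = append(kept, lines[:head]...)
	kept = append(kept, fmt.Sprintf("[... %d lines snipped ...]", omitted))
	kept = append(kept, lines[len(lines)-tail:]...)
	return strings.Join(kept, "\n")
}

func main() {
	out := "line 1\nline 2\nline 3\nline 4\nline 5\nline 6"
	fmt.Println(SnipToolResult(out, 2, 1))
}
```

The same input always yields the same shortened text, which is the property a cache-stable prefix would need from such a step.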
 
@@ -55,6 +55,7 @@ reasoning_language = "auto"      # visible reasoning text: auto|zh|en
 # subagent_models = { review = "deepseek-pro", security_review = "deepseek-pro" }
 auto_plan = "off"                  # user-level only; off|on; off keeps plan mode manual
 # auto_plan_classifier = "deepseek-flash"   # optional; only borderline tasks call it
+tool_result_snip_ratio = 0.6       # shorten stale tool output before summary compaction
 
 [[providers]]
 name        = "deepseek-flash"
@@ -69,6 +70,11 @@ enabled = []   # omit/empty = all built-ins
 bash_timeout_seconds = 120   # foreground safety cap; set 0 for no tool-layer timeout
 mcp_call_timeout_seconds = 300   # default MCP call safety cap; plugin/tool overrides are possible
 
+[environment]
+enabled = true   # inject a stable startup summary of OS, shell, and common tools into the prompt
+# [environment.tools]
+# go = "/opt/homebrew/bin/go"   # optional: explicit trusted path; workspace-local paths are not auto-executed at startup
+
 [skills]
 # paths = ["~/my-skills", "../shared/skills"]   # extra custom skill directories
 # excluded_paths = ["~/.agents/skills"]         # hide convention sources without deleting directories
 
@@ -110,6 +110,9 @@ type Tool interface {
   (`tool.RegisterBuiltin(t)`); `tool.Builtins()` lists them.
 - A runtime `*Registry` is assembled per run: enabled built-ins (filtered by
   config) **plus** plugin-provided tools. The agent only sees the `*Registry`.
+- Tool schemas are canonicalized on registry insertion. The built-in contract is
+  documented in [`TOOL_CONTRACT.md`](TOOL_CONTRACT.md) and backed by tests that
+  compare the documented surface against the same canonical schema path.
 - `Execute` parses raw JSON args itself. Errors are returned, not fatal — the
   agent feeds them back so the model can self-correct.
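
The canonicalization mentioned in the bullet above is not spelled out in this hunk. One common way to obtain a stable JSON form, sketched here as an assumption rather than as the Reasonix implementation, is to decode the schema and re-encode it, relying on `encoding/json` emitting map keys in sorted order. `CanonicalizeSchema` is a hypothetical name:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CanonicalizeSchema decodes a JSON document and re-encodes it.
// encoding/json writes map keys in sorted order and drops insignificant
// whitespace, so semantically equal schemas produce identical bytes.
// Hypothetical helper: numbers pass through float64, which is fine for
// typical schema values but not for very large integers.
func CanonicalizeSchema(raw []byte) ([]byte, error) {
	var v any
	if err := json.Unmarshal(raw, &v); err != nil {
		return nil, err
	}
	return json.Marshal(v)
}

func main() {
	a := []byte(`{"type": "object", "properties": {"path": {"type": "string"}}}`)
	b := []byte(`{"properties":{"path":{"type":"string"}},"type":"object"}`)

	ca, errA := CanonicalizeSchema(a)
	cb, errB := CanonicalizeSchema(b)
	if errA != nil || errB != nil {
		fmt.Println("invalid schema JSON")
		return
	}
	fmt.Println(string(ca) == string(cb))
}
```

Comparing canonical bytes is what lets a documentation test and the runtime registry agree on a single schema representation.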
 
@@ -182,14 +185,25 @@ prefix cache-stable:
 Long tasks eventually fill the model's context window. Reasonix manages this with
 **low-frequency compaction** that respects the cache-first design:
 
-- Each provider declares its `context_window` (tokens). When a turn's reported
-  `prompt_tokens` reach `compactRatio` (default `0.8`) of that window, the
-  executor compacts **once** before the next turn.
-- Compaction folds only the assistant/tool work. Every **user turn** small
-  enough to be a brief and every **prior digest** is kept verbatim; the foldable
-  remainder is summarized — using the executor's own provider, no tools — in
-  place. The boundary is aligned backward off any tool result so the recent tail
-  never begins with an orphan tool message whose `tool_calls` were summarized away.
+- Each provider declares its `context_window` (tokens). Context maintenance is
+  tiered: below `agent.tool_result_snip_ratio` (default `0.6`) the session is
+  left untouched apart from the soft notice; at the snip ratio, stale tool
+  results before the recent tail are archived and shortened with deterministic
+  head/tail markers; at `agent.compact_ratio` (default `0.8`) stale tool results
+  are archived and pruned to short placeholders before any summary call; only if
+  pruning still leaves the prompt above the threshold does summary compaction
+  run. At `agent.compact_force_ratio` (default `0.9`), the existing forced fold
+  may proceed even when the fold economics would normally skip it.
+- Tool-result snip/prune never removes messages, so assistant `tool_calls` and
+  tool results stay paired. `KeepErrors` preserves error/blocked tool outputs,
+  and the recent tail is not rewritten. Snipped results can later be upgraded to
+  pruned placeholders; already-pruned results are left alone.
+- When summary compaction runs, it folds only the assistant/tool work. Every
+  **user turn** small enough to be a brief and every **prior digest** is kept
+  verbatim; the foldable remainder is summarized — using the executor's own
+  provider, no tools — in place. The boundary is aligned backward off any tool
+  result so the recent tail never begins with an orphan tool message whose
+  `tool_calls` were summarized away.
 - The dropped originals are archived under the user config dir
   (`reasonix/archive/<timestamp>.jsonl`; see §5 for its per-OS location), one
   message per line, so the full history stays traceable.
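
The tiers described in this hunk can be read as a threshold ladder over prompt usage. The sketch below is one reading of that text, not the project's code: the type and function names (`Action`, `Ratios`, `ChooseAction`) are hypothetical, and only the three default ratios come from the spec.

```go
package main

import "fmt"

// Action names the maintenance tier chosen for a turn (hypothetical type).
type Action string

const (
	ActionNone  Action = "none"       // below the snip ratio: leave the session alone
	ActionSnip  Action = "snip"       // shorten stale tool results with head/tail markers
	ActionPrune Action = "prune"      // prune to placeholders; summarize only if still over
	ActionForce Action = "force-fold" // forced fold may proceed regardless of fold economics
)

// Ratios mirrors the three documented thresholds.
type Ratios struct {
	Snip, Compact, Force float64
}

// DefaultRatios returns the defaults quoted in the spec text.
func DefaultRatios() Ratios {
	return Ratios{Snip: 0.6, Compact: 0.8, Force: 0.9}
}

// ChooseAction maps prompt usage (promptTokens / contextWindow) to a tier,
// checking the highest threshold first.
func ChooseAction(promptTokens, contextWindow int, r Ratios) Action {
	if contextWindow <= 0 {
		return ActionNone
	}
	usage := float64(promptTokens) / float64(contextWindow)
	switch {
	case usage >= r.Force:
		return ActionForce
	case usage >= r.Compact:
		return ActionPrune
	case usage >= r.Snip:
		return ActionSnip
	default:
		return ActionNone
	}
}

func main() {
	r := DefaultRatios()
	for _, tokens := range []int{500000, 700000, 850000, 950000} {
		fmt.Println(tokens, ChooseAction(tokens, 1000000, r))
	}
}
```

Checking the highest ratio first keeps the tiers mutually exclusive, so each turn resolves to exactly one action.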
@@ -516,6 +530,14 @@ context_window = 1000000   # tokens; harness compacts older history near this li
 
 # A single-model entry still works for custom OpenAI-compatible endpoints.
 
+[environment]
+enabled = true   # inject a stable startup summary of OS, shell, and common tool versions
+
+# Optional trusted executable paths shown to the model when PATH probing is not enough.
+# Workspace-local paths are listed but not auto-executed during startup probing.
+# [environment.tools]
+# go = "/opt/homebrew/bin/go"
+
 [tools]
 enabled = []   # omit/empty = all built-ins
 bash_timeout_seconds = 120   # foreground safety cap; set 0 for no tool-local cap
 
@@ -0,0 +1,64 @@
+# Tool Contract
+
+<a href="./TOOL_CONTRACT.zh-CN.md">简体中文</a>
+
+This document records the provider-visible contract for Reasonix compile-time built-in tools. It is generated from the same canonical schema path used by the runtime registry.
+
+| Tool | Read-only | Description |
+| --- | --- | --- |
+| `bash` | false | Execute a command in the shell and return combined stdout/stderr. Use for builds, tests, git, package managers, etc. To search/read/list/edit/move files, prefer the dedicated tools (grep, read_file, ls, glob, edit_file, move_file) over shell grep/cat/ls/find/sed/mv/Move-Item - they behave identically on every OS. For symbol search or architecture questions, prefer LSP/read tools and targeted grep before shell commands. |
+| `bash_output` | true | Read new output from a background job started with bash(run_in_background=true) or task(run_in_background=true). Returns the output produced since the last bash_output call for that job, plus its status (running/done/failed/killed). Does not block. |
+| `code_index` | true | Lightweight built-in code symbol index. Prefer lsp_* for language semantics and installed code graph MCP tools for call graph, impact, and architecture relationships; use this as the local fallback for file outlines and symbol definition candidates, then verify with read_file or grep. |
+| `complete_step` | true | Record the evidence-backed completion of ONE step of an approved plan. Call it as you finish each step instead of silently moving on: it signs the step off with PROOF it is done - the verification you ran (command + result), the diff/files you changed, or a manual check. A completion with no evidence is REJECTED, so don't claim a step is done until you can show why. The host advances the task list for you when you sign off - it marks this step completed and moves the next to in_progress, so you don't need a separate todo_write to mark completions. Fields: `step` (which step - its title or number, matching the task list), `result` (what is now true/changed), `evidence` (>=1 item, each with `kind` = verification\|diff\|files\|manual and a `summary`, plus optional `command`/`paths`), and optional `notes`. |
+| `delete_range` | false | Delete a contiguous text range from a file using exact start/end text anchors. Each anchor must match exactly one line. Returns unified diff on success. Use for large deletions - smaller changes should use edit_file. |
+| `delete_symbol` | false | Delete a named symbol (function, method, type, interface, const, var) from a Go source file using AST parsing. For non-Go files, use delete_range with manual anchors. |
+| `edit_file` | false | Replace an exact string in a file with another. old_string must occur exactly once; add surrounding context to disambiguate. Use for targeted edits instead of rewriting the whole file. |
+| `glob` | true | Find files matching a glob pattern (e.g. "*.go", "internal/*/*.go", "**/*.test.ts"). Supports shell metacharacters * ? [] and the recursive ** pattern. |
+| `grep` | true | Search for a regular expression in a file, or recursively under a directory (skips hidden files and files matched by .gitignore). Returns matching lines as path:line:text, capped at 200 matches. |
+| `kill_shell` | false | Terminate a running background job (bash or task) started with run_in_background. A no-op if the job has already finished or the id is unknown. |
+| `ls` | true | List the entries of a directory. Directories are shown with a trailing slash; files show their byte size. Set recursive=true to list all nested files depth-first (skips .git/node_modules). |
+| `move_file` | false | Move or rename a file from source_path to destination_path. Creates the destination parent directory as needed. Use instead of shell mv, Move-Item, or ren for file moves so workspace confinement and file-edit permissions apply. |
+| `multi_edit` | false | Apply a list of edits to a single file atomically: each edit runs against the result of the previous one, all in memory; the file is rewritten only if every edit succeeds. Cheaper and safer than chaining edit_file calls - a failure in step 3 leaves the file untouched instead of half-edited. |
+| `notebook_edit` | false | Edit one cell of a Jupyter notebook (.ipynb). Target a cell by 0-based cell_number (or cell_id). edit_mode: "replace" (default) swaps the cell's source; "insert" adds a new cell after cell_number (use -1 to prepend at the top), taking cell_type and new_source; "delete" removes the cell. cell_type is "code" or "markdown" (required for insert). Editing a code cell clears its outputs. Prefer this over edit_file for notebooks - it keeps the JSON valid. |
+| `read_file` | true | Read a text file with optional line offset/limit. Output prefixes each line with its 1-based number so subsequent edit_file calls can target exact lines. Use `offset` and `limit` to page through large files; the tool reports total length and pagination hints in a trailer. |
+| `todo_write` | true | Record and update a structured task list for the current work. Send the COMPLETE list every call - it replaces the previous one. Use it to plan multi-step work and show progress: keep exactly one item in_progress at a time, and flip an item to completed the moment it's done (don't batch completions). Skip it for trivial single-step tasks. |
+| `wait` | true | Block until background jobs finish, then return each job's status and final output/answer. Use to collect the result of a task(run_in_background) or bash(run_in_background) before continuing. Omit job_ids to wait for every running job. |
+| `web_fetch` | true | Fetch a URL over HTTPS/HTTP and return its text content. HTML pages are reduced to readable text; JSON / plain text / markdown bodies come back verbatim. Use to read documentation pages, API responses, or source files hosted somewhere the local filesystem can't reach. |
+| `write_file` | false | Write content to a file at the given path (overwriting existing content). Creates parent directories as needed. |
+
+## Schema Snapshot
+
+The exact canonical schemas are intentionally tested in code rather than copied by hand here. Run:
+
+```bash
+go test ./internal/tool -run TestBuiltinToolContractDocumentation
+```
+
+The test checks that every registered built-in tool has a documented name, read-only flag, description row, and canonical schema generated by `tool.BuiltinContractEntries`.
+
+## Default Full Boot Surface
+
+In a default full-token boot, Reasonix sends the built-in tools above plus the
+session, memory, skill, subagent, LSP, install, and slash-command tools below:
+
+`ask`, `explore`, `forget`, `history`, `install_skill`, `install_source`,
+`list_sessions`, `lsp_definition`, `lsp_diagnostics`, `lsp_hover`,
+`lsp_references`, `memory`, `parallel_tasks`, `read_only_skill`,
+`read_only_task`, `read_session`, `read_skill`, `remember`, `research`,
+`review`, `run_skill`, `security_review`, `slash_command`, `task`.
+
+`internal/boot.TestBootToolContractMatchesProviderVisibleSurface` verifies the
+actual boot registry contract against the provider request, including read-only
+flags and canonical schemas.
+
+## Token Economy Boot Surface
+
+In token economy mode, Reasonix starts with the core coding/session/memory tools
+and the connector used to enable optional sources on demand:
+
+`ask`, `connect_tool_source`, `forget`, `history`, `list_sessions`, `memory`,
+`read_session`, `remember`, `slash_command`.
+
+Core built-in tools such as `bash`, `read_file`, `grep`, file writers, job tools,
+and `todo_write` remain available in economy mode and are listed in the built-in
+table above.
@@ -0,0 +1,58 @@
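+# Tool Contract
+
+<a href="./TOOL_CONTRACT.md">English</a>
+
+This document records the provider-visible contract for Reasonix compile-time built-in tools. The runtime registry uses the same canonical schema path; tests verify that the tool names, read-only flags, and schema snapshot listed here do not drift.
+
+| Tool | Read-only | Description |
+| --- | --- | --- |
+| `bash` | false | Execute a shell command and return stdout/stderr. Use it for builds, tests, git, package managers, etc.; prefer the dedicated tools for reading, writing, and finding files. |
+| `bash_output` | true | Read the new output and status of a background `bash` or `task` job since the last read. |
+| `code_index` | true | Lightweight built-in code symbol index; prefer `lsp_*` or a code graph MCP, and use this as the fallback when they are missing. |
+| `complete_step` | true | Record, with evidence, the completion of one step of an approved plan. |
+| `delete_range` | false | Delete a contiguous range from a file using exact start/end text anchors. |
+| `delete_symbol` | false | Delete a named symbol from a Go source file using the Go AST. |
+| `edit_file` | false | Replace a unique exact string in a file with another string. |
+| `glob` | true | Find files matching a glob pattern. |
+| `grep` | true | Search text by regular expression in a file or under a directory. |
+| `kill_shell` | false | Terminate a background `bash` or `task` job. |
+| `ls` | true | List directory entries, optionally recursively. |
+| `move_file` | false | Move or rename a file. |
+| `multi_edit` | false | Apply multiple edits to a single file atomically. |
+| `notebook_edit` | false | Edit a single cell of a Jupyter notebook. |
+| `read_file` | true | Read a text file in a pageable, line-numbered format. |
+| `todo_write` | true | Record and replace the structured task list for the current work. |
+| `wait` | true | Wait for background jobs to finish and return their final output. |
+| `web_fetch` | true | Fetch the text content of a URL over HTTP/HTTPS. |
+| `write_file` | false | Write file content, creating parent directories as needed. |
+
+## Schema Snapshot
+
+The full canonical schema is not hand-written in this document, so that the docs and the code cannot drift apart through manual edits. Run:
+
+```bash
+go test ./internal/tool -run TestBuiltinToolContractDocumentation
+```
+
+The test uses `tool.BuiltinContractEntries` to check that every built-in tool has a documentation row, a read-only flag, a non-empty description, and a canonical JSON schema.
+
+## Default Full Boot Surface
+
+A default full-token boot sends the built-in tools above, plus the session, memory, skill, subagent, LSP, install, and slash-command tools:
+
+`ask`, `explore`, `forget`, `history`, `install_skill`, `install_source`,
+`list_sessions`, `lsp_definition`, `lsp_diagnostics`, `lsp_hover`,
+`lsp_references`, `memory`, `parallel_tasks`, `read_only_skill`,
+`read_only_task`, `read_session`, `read_skill`, `remember`, `research`,
+`review`, `run_skill`, `security_review`, `slash_command`, `task`.
+
+`internal/boot.TestBootToolContractMatchesProviderVisibleSurface` verifies that the actual boot registry contract matches the provider request, including read-only flags and canonical schemas.
+
+## Token Economy Boot Surface
+
+Token economy mode starts with the core coding, session, and memory tools, plus the connector used to enable optional sources on demand:
+
+`ask`, `connect_tool_source`, `forget`, `history`, `list_sessions`, `memory`,
+`read_session`, `remember`, `slash_command`.
+
+Core built-in tools such as `bash`, `read_file`, `grep`, file writers, background job tools, and `todo_write` remain available in economy mode; see the built-in tool table above.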