docs: update the recommended list of CLI-focused coding agents and add the evaluation method and scores

roflsunriz · roflsunriz · commit 73779fae9a7b · 2026-04-29T15:01:05.000+09:00
diff --git a/docs/CODING-AGENTS-en-US.md b/docs/CODING-AGENTS-en-US.md
@@ -13,41 +13,113 @@ Quick summary
 ## Top recommendations (CLI perspective)
 
 1. OpenCode — Broad multi-provider support and rich extension surface. Easy to integrate local models and custom providers; strong CLI primitives for sessions and webfetch.
-2. Codex CLI — OpenAI's official CLI with plugin/marketplace momentum and Ollama/local integrations suitable for commercial deployments.
-3. Aider — Lightweight, low-friction CLI-first assistant for everyday edits and pair-programming in the terminal.
-4. Crush — Polished TUI and multi-model/LSP support for users who spend most of their time in a terminal.
-5. Goose / Qwen Code — Strong MCP/extension ecosystems and multi-provider capabilities for building complex CLI workflows.
+2. Goose — Strong MCP/extension ecosystem, skills marketplace, session persistence, and automation-oriented workflow design.
+3. Codex CLI — OpenAI's official CLI with plugin/marketplace momentum and Ollama/local integrations suitable for commercial deployments.
+4. Qwen Code — Fast-moving multi-provider agent with MCP, extensions, skills, WebSearch/WebFetch, and checkpointing.
+5. Crush — Polished TUI and multi-model/LSP support for users who spend most of their time in a terminal.
+6. Plandex — Strong long-running planning, branching, and context-management workflow for larger implementation plans.
+7. Aider — Lightweight, low-friction CLI-first assistant for everyday edits and pair-programming in the terminal.
+
+## Evaluation method
+
+The comparison scope is intentionally narrow: publicly available coding agents that can be used from the CLI and can switch models or providers at the API/configuration level.
+
+Scores weigh agent-platform capabilities, not just editing quality. Tools with MCP, plugins, skills, marketplace/distribution, session management, auto-compaction, web search/fetch, and flexible provider support score higher than editor-only tools.
+
+Feature cells use `C/M/S = Completeness / Maturity / Stability`, each on a 0-5 scale.
+
+- Completeness: how fully the feature exists as a user-facing product capability.
+- Maturity: documentation quality, release history, implementation age, and consistency of ongoing updates.
+- Stability: patterns in recent issues, known rough edges, and operational predictability.
+
+## Overall score
+
+| Rank | Tool | Score | Subscription bonus | Best fit |
+|---:|---|---:|---:|---|
+| 1 | OpenCode | 88 | +5 | Best overall CLI agent platform with broad provider, MCP, skills, plugin, web, and session coverage |
+| 2 | Goose | 86 | +5 | Best for MCP/extension-driven automation workflows |
+| 3 | Codex CLI | 84 | +5 | Best official/commercial ecosystem choice for OpenAI-oriented teams |
+| 4 | Qwen Code | 79 | +4 | Fast-moving multi-provider agent with strong extension potential |
+| 5 | Crush | 77 | +5 | Best terminal UX/TUI choice with multi-model and LSP support |
+| 6 | Plandex | 74 | +0 | Strongest long-running planning and context-management workflow |
+| 7 | Aider | 71 | +3 | Most practical lightweight terminal editing assistant |
+
+```mermaid
+xychart-beta
+    title "Overall CLI Coding Agent Score"
+    x-axis [OpenCode, Goose, "Codex CLI", "Qwen Code", Crush, Plandex, Aider]
+    y-axis "score" 0 --> 100
+    bar [88, 86, 84, 79, 77, 74, 71]
+```
+
+## Feature comparison
+
+| Tool | MCP | Plugin | Skills | Marketplace | Resume | Sessions |
+|---|---|---|---|---|---|---|
+| OpenCode | 5/4/4 | 5/4/4 | 5/4/4 | 1/1/3 | 5/4/3 | 5/4/3 |
+| Goose | 5/4/4 | 5/4/4 | 5/4/4 | 5/4/4 | 5/4/4 | 5/4/4 |
+| Codex CLI | 5/4/4 | 5/4/3 | 5/4/3 | 5/4/3 | 5/4/3 | 5/4/3 |
+| Qwen Code | 5/3/3 | 5/3/3 | 5/3/4 | 4/3/3 | 4/3/3 | 2/2/3 |
+| Crush | 5/4/3 | 0/0/0 | 4/4/4 | 0/0/0 | 4/4/3 | 4/4/3 |
+| Plandex | 0/0/0 | 0/0/0 | 0/0/0 | 0/0/0 | 4/5/4 | 5/5/4 |
+| Aider | 0/0/0 | 0/0/0 | 0/0/0 | 0/0/0 | 4/5/4 | 2/4/4 |
+
+| Tool | Auto-compaction | Tools | WebSearch | WebFetch | Read | Edit | Subscription usable |
+|---|---|---|---|---|---|---|---|
+| OpenCode | 5/4/4 | 5/4/4 | 5/4/4 | 5/4/4 | 5/4/4 | 5/4/4 | Yes |
+| Goose | 5/4/4 | 5/4/4 | 4/4/3 | 4/4/4 | 5/4/4 | 5/4/4 | Yes |
+| Codex CLI | 5/4/3 | 5/4/3 | 4/4/3 | 3/3/3 | 5/4/3 | 5/4/3 | Yes |
+| Qwen Code | 2/2/3 | 5/3/3 | 5/3/3 | 4/3/3 | 5/3/3 | 5/3/3 | Yes |
+| Crush | 4/3/3 | 4/4/3 | 1/1/2 | 0/0/0 | 4/4/4 | 4/4/4 | Yes |
+| Plandex | 5/5/4 | 3/4/4 | 0/0/0 | 3/4/4 | 5/5/4 | 4/5/4 | Unconfirmed |
+| Aider | 5/5/4 | 4/5/4 | 0/0/0 | 4/5/4 | 5/5/4 | 5/5/4 | Yes |
 
 ## Per-tool CLI notes
 
 OpenCode
 
 - CLI-first extensibility: supports 75+ providers, plugins/skills, native websearch/webfetch.
 - Full session features (continue/fork/share) and auto-compaction policies.
-- Caveat: compaction, model discovery, and some TUI areas still show rough edges — plan operational monitoring.
+- Best when you want one terminal-first agent platform that can cover provider switching, MCP, skills, web tools, and long-running sessions.
+- Caveat: marketplace-style distribution is less mature than Goose or Codex CLI, and compaction/model discovery/TUI issues still need operational monitoring.
+
+Goose
+
+- Strong MCP-centered architecture with extension directory, skills marketplace, session persistence, memory-oriented features, and auto-compaction.
+- Best when you want to build repeatable automation workflows rather than only edit code interactively.
+- Caveat: some provider bridge modes may not expose the full Goose extension ecosystem, and the project is still changing quickly.
 
 Codex CLI
 
 - Official OpenAI CLI. `config.toml` and Ollama integration enable custom provider/local model workflows.
 - Growing plugin/marketplace ecosystem and favorable for commercial/enterprise use.
-- Caveat: verify history visibility and compact behavior after provider switches.
+- Best when your team already uses ChatGPT/Codex subscriptions or needs an official commercial ecosystem.
+- Caveat: custom provider and local-model switching still have rough edges, especially around history visibility, model switchers, and compact behavior.
+
+Qwen Code
+
+- Fast-growing multi-provider CLI with `modelProviders`, MCP, extensions, skills, WebSearch/WebFetch, checkpointing, and read/edit tools.
+- Best when you want a broad modern agent surface and can tolerate a younger implementation.
+- Caveat: recent issues around auth, provider display, and MCP/provider connections mean environment-level validation is important.
 
 Aider
 
 - Practical CLI modes (`/ask`, `/architect`, `/web`, `--read`) and strong Git/diff editing flow compatibility.
 - Low adoption cost and predictable behavior for everyday edits.
-- Caveat: limited MCP/plugins/marketplace extensibility compared to OpenCode or Codex CLI.
+- Best when you mainly want fast, predictable terminal-based edits rather than a full agent platform.
+- Caveat: limited MCP/plugins/skills/marketplace extensibility compared to OpenCode, Goose, Codex CLI, or Qwen Code.
 
 Crush
 
 - Polished TUI from Charmbracelet, with multi-model and LSP support for terminal-first power users.
-- Caveat: websearch/webfetch and marketplace features appear limited.
+- Agent skills, LSP integration, and session-based context make it a strong terminal UX option.
+- Caveat: plugin, marketplace, and native websearch/webfetch capabilities are limited compared to the platform-style agents.
 
-Goose / Qwen Code
+Plandex
 
-- Goose: strong MCP/extension directory/skills marketplace and session persistence; can bridge CLI providers and existing subscriptions.
-- Qwen Code: multi-provider `modelProviders` switching, MCP, Skills, WebSearch/WebFetch, and checkpointing rapidly expanding.
-- Caveat: recent issues around auth and provider connections — validate stability in your environment.
+- Strong long-running planning model with plans, branches, model packs, role-based models, and context-window management.
+- Best when the work is a long multi-step implementation plan rather than ad-hoc pair editing.
+- Caveat: MCP, plugin, skills, marketplace, and native websearch coverage is weak under this evaluation model.
 
 ## How to choose (CLI-focused)
 
diff --git a/docs/CODING-AGENTS-ja-JP.md b/docs/CODING-AGENTS-ja-JP.md
@@ -13,41 +13,113 @@
 ## 推奨トップ（CLI観点）
 
 1. OpenCode — 多プロバイダ対応と豊富な拡張が魅力。ローカルモデルやカスタム provider を組み込みやすく、セッション管理や webfetch 等の CLI 機能が充実しています。
-2. Codex CLI — OpenAI 系の公式 CLI。plugin / marketplace エコシステムと Ollama 等のローカル統合が強みで、商用環境に向きます。
-3. Aider — 学習コストが小さく安定して使える軽量 CLI ツール。日常の編集やペアプロ向けに最短距離で使えます。
-4. Crush — Charmbracelet 流の優れた TUI を持ち、端末での操作感を重視するユーザーに合います。
-5. Goose / Qwen Code — MCP・Extensions・Skills を重視する場合の有力候補。multi-provider 機能で複数サブスクやプロバイダーをまたいだ運用に向きます。
+2. Goose — MCP / extension ecosystem、skills marketplace、session persistence、自動化ワークフローの設計が強みです。
+3. Codex CLI — OpenAI 系の公式 CLI。plugin / marketplace エコシステムと Ollama 等のローカル統合が強みで、商用環境に向きます。
+4. Qwen Code — MCP、Extensions、Skills、WebSearch/WebFetch、checkpointing を備えた成長の速い multi-provider agent です。
+5. Crush — Charmbracelet 流の優れた TUI を持ち、端末での操作感を重視するユーザーに合います。
+6. Plandex — 長時間の計画実行、branch、context 管理を重視する大きめの実装計画に向きます。
+7. Aider — 学習コストが小さく安定して使える軽量 CLI ツール。日常の編集やペアプロ向けに最短距離で使えます。
+
+## 評価方法
+
+比較対象は「公開配布されている」「CLI から使える」「API または設定レベルでモデルや provider を差し替えられる」Coding Agent に絞っています。
+
+採点では、単純なコード編集品質だけでなく、エージェント基盤としての広さを重視しました。具体的には MCP、plugin、skills、marketplace/distribution、session 管理、auto-compaction、WebSearch/WebFetch、provider 切替の柔軟性が強いツールほど高得点になります。
+
+機能比較のセルは `C/M/S = 完成度 / 成熟度 / 安定度` で、各 0〜5 点です。
+
+- 完成度: その機能が製品機能としてどこまで揃っているか。
+- 成熟度: ドキュメント、release 運用、実装年齢、継続更新の強さ。
+- 安定度: recent issue、既知バグ、実運用時の予測可能性。
+
+## 総合スコア
+
+| 順位 | ツール | スコア | サブスク加点 | 向いている用途 |
+|---:|---|---:|---:|---|
+| 1 | OpenCode | 88 | +5 | provider、MCP、skills、plugin、Web、session を広く備えた総合型 CLI agent |
+| 2 | Goose | 86 | +5 | MCP / extension を中心に自動化ワークフローを組みたい場合 |
+| 3 | Codex CLI | 84 | +5 | OpenAI 系の公式・商用エコシステムを重視するチーム |
+| 4 | Qwen Code | 79 | +4 | 拡張性が高く成長の速い multi-provider agent を試したい場合 |
+| 5 | Crush | 77 | +5 | terminal UX / TUI と multi-model / LSP を重視する場合 |
+| 6 | Plandex | 74 | +0 | 長時間の計画実行と context 管理を重視する場合 |
+| 7 | Aider | 71 | +3 | 軽量で実用的な端末ベースの編集支援が欲しい場合 |
+
+```mermaid
+xychart-beta
+    title "CLI Coding Agent 総合スコア"
+    x-axis [OpenCode, Goose, "Codex CLI", "Qwen Code", Crush, Plandex, Aider]
+    y-axis "score" 0 --> 100
+    bar [88, 86, 84, 79, 77, 74, 71]
+```
+
+## 機能比較
+
+| ツール | MCP | Plugin | Skills | Marketplace | Resume | Sessions |
+|---|---|---|---|---|---|---|
+| OpenCode | 5/4/4 | 5/4/4 | 5/4/4 | 1/1/3 | 5/4/3 | 5/4/3 |
+| Goose | 5/4/4 | 5/4/4 | 5/4/4 | 5/4/4 | 5/4/4 | 5/4/4 |
+| Codex CLI | 5/4/4 | 5/4/3 | 5/4/3 | 5/4/3 | 5/4/3 | 5/4/3 |
+| Qwen Code | 5/3/3 | 5/3/3 | 5/3/4 | 4/3/3 | 4/3/3 | 2/2/3 |
+| Crush | 5/4/3 | 0/0/0 | 4/4/4 | 0/0/0 | 4/4/3 | 4/4/3 |
+| Plandex | 0/0/0 | 0/0/0 | 0/0/0 | 0/0/0 | 4/5/4 | 5/5/4 |
+| Aider | 0/0/0 | 0/0/0 | 0/0/0 | 0/0/0 | 4/5/4 | 2/4/4 |
+
+| ツール | Auto-compaction | Tools | WebSearch | WebFetch | Read | Edit | サブスク利用 |
+|---|---|---|---|---|---|---|---|
+| OpenCode | 5/4/4 | 5/4/4 | 5/4/4 | 5/4/4 | 5/4/4 | 5/4/4 | あり |
+| Goose | 5/4/4 | 5/4/4 | 4/4/3 | 4/4/4 | 5/4/4 | 5/4/4 | あり |
+| Codex CLI | 5/4/3 | 5/4/3 | 4/4/3 | 3/3/3 | 5/4/3 | 5/4/3 | あり |
+| Qwen Code | 2/2/3 | 5/3/3 | 5/3/3 | 4/3/3 | 5/3/3 | 5/3/3 | あり |
+| Crush | 4/3/3 | 4/4/3 | 1/1/2 | 0/0/0 | 4/4/4 | 4/4/4 | あり |
+| Plandex | 5/5/4 | 3/4/4 | 0/0/0 | 3/4/4 | 5/5/4 | 4/5/4 | 未確認 |
+| Aider | 5/5/4 | 4/5/4 | 0/0/0 | 4/5/4 | 5/5/4 | 5/5/4 | あり |
 
 ## 各ツールのCLI観点ポイント
 
 OpenCode
 
 - CLI-first な拡張性: 75+ provider、plugins/skills、websearch/webfetch をネイティブでサポート。
 - セッションの継続・フォーク・共有、auto-compaction などエージェント基盤が充実。
-- 注意: compaction や model discovery、TUI の一部に粗さがあるため、運用ポリシーと監視が必要。
+- provider 切替、MCP、skills、Web Tools、長いセッションを1本で広く扱いたい場合に最も向く。
+- 注意: Goose や Codex CLI と比べると marketplace 的な配布面はまだ薄く、compaction / model discovery / TUI 周辺は運用監視が必要。
+
+Goose
+
+- MCP を中核にした設計が強く、extension directory、skills marketplace、session persistence、memory 系機能、auto-compaction が揃う。
+- 単なる対話編集ではなく、再利用できる自動化ワークフローを CLI で組みたい場合に向く。
+- 注意: CLI provider bridge の使い方によっては Goose 側の extension ecosystem をフルに使えない場合があり、プロジェクト自体の変化も速い。
 
 Codex CLI
 
 - OpenAI 公式の CLI。`config.toml` や Ollama 統合でカスタム provider / ローカルモデルの運用が可能。
 - plugin / marketplace の成長が速く、商用や組織利用での互換性が取りやすい。
-- 注意: provider 切替後の履歴表示や compact 周りの挙動は事前検証を推奨。
+- ChatGPT / Codex 系サブスクリプションを既に使っているチームや、公式・商用エコシステムを重視する場合に向く。
+- 注意: custom provider / local model 切替は使える一方、model switcher、履歴表示、compact 周辺にはまだ粗さが残る。
+
+Qwen Code
+
+- `modelProviders` による multi-provider 切替、MCP、Extensions、Skills、WebSearch/WebFetch、checkpointing、read/edit tool が急速に揃っている。
+- 現代的な agent 機能を広く試したいが、上位勢より若い実装でも許容できる場合に向く。
+- 注意: auth、provider 表示、MCP / provider connection 周辺の recent issues があるため、導入前の環境検証が重要。
 
 Aider
 
 - `/ask`、`/architect`、`/web`、`--read` 等、端末操作で使いやすいモードが揃う。
 - Git/diff ベースの編集に馴染むため、既存の開発フローに入りやすい。
-- 注意: MCP・プラグイン・Marketplaceなど拡張機構は弱め。
+- agent platform というより、素早く安定した端末ベース編集支援が欲しい場合に強い。
+- 注意: MCP・plugin・skills・marketplace は OpenCode / Goose / Codex CLI / Qwen Code と比べて弱い。
 
 Crush
 
 - 優れた TUI と multi-model/LSP 対応が特徴。端末上で快適な操作感を提供。
-- 注意: websearch/webfetch や marketplace は限定的に見える。
+- agent skills、LSP 統合、session-based context により、terminal UX 重視の選択肢として強い。
+- 注意: plugin、marketplace、native websearch/webfetch は platform 型 agent と比べて限定的。
 
-Goose / Qwen Code
+Plandex
 
-- Goose: MCP/extension directory/skills marketplace が強く、session persistence や provider ブリッジが充実。
-- Qwen Code: `modelProviders` による multi-provider 切替、WebSearch/WebFetch、checkpointing などの CLI 機能を拡充中。
-- 注意: auth・provider 接続周りに recent issues があるため運用テストは必須。
+- plans、branches、model packs、role-based models、context-window management など、長時間の計画実行に強い。
+- ad-hoc なペア編集より、複数ステップの長い実装計画を管理したい場合に向く。
+- 注意: MCP・plugin・skills・marketplace・native websearch は今回の評価軸では弱い。
 
 ## 選び方（CLI向け）