Small, reviewable, validation-gated agent skills for Codex-style project work.
ABVX Agent Skills is a small, auditable skillpack for coding agents that helps them write smaller diffs, debug from evidence, compact noisy shell output, and verify work before saying done.
These are not prompt dumps. They are compact SKILL.md workflows with clear triggers, attribution, risk notes, and validation. They are portable, versioned agent capabilities meant to be previewed, inspected, and loaded on demand through the Agent Skills progressive-disclosure model.
Preview before installing:
gh skill preview markoblogo/abvx-agent-skills minimal-diff-builder
Install one skill:
gh skill install markoblogo/abvx-agent-skills minimal-diff-builder --agent codex --scope user
Then ask your coding agent:
Use minimal-diff-builder. Implement the smallest correct fix for this issue.
The newer bet in this pack is LoopOps: useful skills should not compete with stronger base models by restating generic advice. They should capture repo-specific context, tool adapters, verification gates, and supervisor contracts that can promote repeated work into scripts, workflows, and cost-bounded agent loops.
This repository assumes that many public AI skills are net-negative. The bar here is not novelty or stars. The bar is whether a skill adds usable structure without degrading behavior.
Video context: I scraped AI skills from GitHub and tested whether they actually help models
Browse the searchable catalog at lab.abvx.xyz/tools/abvx-agent-skills/. The page is powered by the generated catalog data in docs/catalog.json, so the repository remains the source of truth while the published catalog lives on ABVX Lab.
If you want a scan-friendly text catalog for browsing or indexing, use CATALOG.md.
| Job | Install | Use when |
|---|---|---|
| Write smaller patches | `minimal-diff-builder` | The agent keeps refactoring too much, widening blast radius, or adding abstractions you did not ask for. |
| Debug from evidence | `diagnose` | The agent keeps guessing fixes without reproducing the failure and verifying the result. |
| Save tokens in shell-heavy work | `rtk-assisted-shell`, `shell-output-compaction`, `token-efficient-execution` | Logs, diffs, tests, and command output are burning context and hiding the real signal. |
| Verify frontend work | `browser-verification`, `design-critique-polish` | The agent says "done" without checking real browser behavior, layout, states, or console errors. |
LoopOps is the framework layer in this repo: it decides when a repeated prompt should remain a prompt and when it should become a checklist, skill, script, or bounded loop.
See:
- `docs/loopops-guide.md`
- `loopops-protocol`
- `dynamic-workflow-packets`
- `skillopt-evolve-skills`
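
As a rough illustration of the kind of promotion decision LoopOps describes, the sketch below maps a few properties of a repeated task to one of the forms named above. The `RepeatedTask` fields, the repetition threshold of 3, and the ordering of the rules are all assumptions made for this example; they are not taken from `docs/loopops-guide.md` or the `loopops-protocol` skill.

```python
from dataclasses import dataclass


@dataclass
class RepeatedTask:
    """Hypothetical description of a piece of repeated agent work."""

    times_repeated: int      # how often the same prompt has been reused
    deterministic: bool      # the same inputs always need the same steps
    needs_judgment: bool     # an agent must still make decisions mid-run
    has_verification: bool   # a command or check can confirm success


def promote(task: RepeatedTask) -> str:
    """Pick a form for repeated work. Thresholds are illustrative only."""
    if task.times_repeated < 3:
        return "prompt"        # not repeated enough to justify more structure
    if task.deterministic and not task.needs_judgment:
        return "script"        # no decisions left, so plain automation is enough
    if task.needs_judgment and task.has_verification:
        return "bounded-loop"  # judgment plus a gate that can stop the loop
    if task.needs_judgment:
        return "skill"         # judgment without a machine-checkable gate
    return "checklist"
```

For example, `promote(RepeatedTask(5, False, True, True))` lands on the bounded loop, while the same task without a verification command falls back to a skill.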
- Need to save tokens? Start with `rtk-assisted-shell`, `shell-output-compaction`, `token-efficient-execution`, and `lean-context-layout`. Add `compaction-survival` if your sessions run long enough to forget their own state.
- Need to debug a repo? Start with `diagnose`, `repo-debugging-ledger`, and `graph-guided-code-reading`.
- Need the smallest correct implementation path? Start with `minimal-diff-builder`, then add `delivery-preflight-gate` when the task is long or risky enough that baseline verification matters.
- Need to cut bloat from an existing diff or repo slice? Start with `overengineering-review`, and switch to `minimal-diff-builder` when you want the cuts implemented as the smallest correct patch.
- Need to build frontend? Start with `frontend-product-builder`, `designmd-brand-kit`, and `browser-verification`.
- Need a small Lottie or SVG-driven motion asset? Start with `lottie-motion-builder`, then pair with `frontend-product-builder` when the animation needs to land inside a real UI surface.
- Need a standalone HTML artifact? Start with `html-diagram-artifact` for SVG-first architecture explainers, or `html-brief-artifact` for plans, summaries, reports, and research notes.
- Need stronger UI taste or design setup? Start with `design-register-bootstrap`, `frontend-taste-layer`, and `design-critique-polish`.
- Need long-session continuity? Start with `handoff`, `compaction-survival`, and `token-usage-audit`.
- Need to onboard a new repo? Start with `project-context-bootstrap` and follow with `durable-context-maintenance`.
- Need discovery or product shaping? Start with `rapid-grilling`, `doc-grounded-grilling`, and `spec-to-prd`.
- Need to turn plans into execution? Start with `plan-to-issues`, `repo-issue-triage`, and `test-driven-execution`.
- Need safer long delivery runs? Start with `delivery-preflight-gate`, `phase-spec-execution`, `recovery-loop-3strike`, and `delivery-baseline-audit`.
- Need a full multi-track workflow? Start with `dynamic-workflow-packets`.
- Need to turn repeated prompts into loops? Start with `loopops-protocol`, then use `skillopt-evolve-skills` to capture durable lessons.
- Need to build reusable assistant packs? Start with `role-skill-pack-design`, `workflow-policy-layering`, `brief-first-execution`, and `private-vs-publishable-skill-audit`.
These skills are grouped by the job they do. The token-economy layer is intentionally visible first: for many teams, the easiest win is not “a smarter prompt”, but less wasted context.
| Skill | What It Does |
|---|---|
| `rtk-assisted-shell` | Routes noisy shell workflows through RTK-style filtering. On shell-heavy tasks this can cut command-output tokens dramatically, often in the same range as RTK's reported 60-90% savings on common dev commands. |
| `shell-output-compaction` | Shrinks logs, diffs, and repo search output into counts, slices, and error-first excerpts. Usually the fastest way to turn multi-screen stdout into a small, usable artifact. |
| `graph-guided-code-reading` | Replaces broad repo reading with entrypoints, symbols, dependencies, and blast radius. On large codebases this can turn “read everything” into a much smaller focus set. |
| `token-efficient-execution` | Cuts waste from repeated reads, broad rewrites, and low-value narration. Best for long coding sessions where the loop, not the final answer, is burning the budget. |
| `token-frugal-mode` | Compresses final answers without dropping the decisive technical signal. Useful when the session is tight and you want shorter replies without caveman-style degradation. |
| `lean-context-layout` | Shrinks always-loaded agent docs into a compact startup core and pushes the rest on demand. Best for bloated AGENTS.md, CLAUDE.md, and repo runbooks. |
| `compaction-survival` | Preserves the high-value working state before long sessions collapse into compaction. Saves the turns you would otherwise spend reconstructing “what were we doing?”. |
| `token-usage-audit` | Diagnoses where the budget is really going: startup bloat, shell noise, repeated reads, oversized summaries, or compaction loss. Use this before over-optimizing the wrong layer. |
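
The "counts, slices, and error-first excerpts" idea behind `shell-output-compaction` can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual procedure: the function name `compact_output`, the `ERROR_PATTERN` keywords, and the default limits are assumptions chosen for the example.

```python
import re
from collections import Counter

# Hypothetical marker pattern; real projects will need their own rules.
ERROR_PATTERN = re.compile(r"\b(error|failed|exception|traceback)\b", re.IGNORECASE)


def compact_output(text: str, max_errors: int = 5, tail: int = 3) -> str:
    """Reduce raw command output to counts, error-first excerpts, and a tail slice."""
    lines = text.splitlines()
    error_lines = [line for line in lines if ERROR_PATTERN.search(line)]
    # Count identical error lines so repeats cost one line instead of many.
    counts = Counter(error_lines)

    summary = [f"lines={len(lines)} error_lines={len(error_lines)}"]
    for line, n in counts.most_common(max_errors):
        summary.append(f"[x{n}] {line.strip()}")
    if tail and lines:
        # Keep the end of the output, where exit summaries usually appear.
        summary.append(f"--- last {min(tail, len(lines))} lines ---")
        summary.extend(lines[-tail:])
    return "\n".join(summary)
```

Feeding it the captured stdout of a test run returns a header line with the counts, the most frequent error lines with their repeat counts, and the last few lines of output.
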
| Skill | What It Does |
|---|---|
| `diagnose` | Runs a disciplined debugging loop around one reproducible signal, ranked hypotheses, and narrow verification. |
| `repo-debugging-ledger` | Keeps a checked-location ledger so debugging does not keep reopening the same code and repeating the same dead ends. |
| `complexity-optimizer` | Finds safe complexity and performance simplifications without turning the codebase into a refactor festival. |
| `minimal-diff-builder` | Builds the smallest correct implementation path using a YAGNI, stdlib-first, native-first, minimal-diff ladder with explicit safety exceptions. |
| `overengineering-review` | Reviews code specifically for needless abstractions, replaceable dependencies, dead flexibility, and wrappers over stdlib or platform behavior. |
| `architecture-deepening-review` | Reviews deeper module seams, coupling, change surfaces, and testability, not just top-level architecture slogans. |
| `test-driven-execution` | Builds features and fixes through one-behavior-at-a-time red-green-refactor loops instead of broad speculative implementation. |
| `system-zoom-out` | Pulls a local code area back into its wider system map so you can reason about callers, modules, boundaries, and blast radius. |
| `agents-best-practices` | Hardens agent harnesses around permissions, context shape, safety, and evaluation discipline. |
| `skillopt-evolve-skills` | Improves agent instructions and skills from real task evidence rather than from theory alone. |
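
To make the checked-location ledger from `repo-debugging-ledger` concrete, here is one possible in-memory shape for it. The `DebugLedger` class, its method names, and the `path:symbol` location strings are hypothetical; the skill itself may record the ledger quite differently, for example as a Markdown file.

```python
from dataclasses import dataclass, field


@dataclass
class DebugLedger:
    """Hypothetical ledger of locations already checked during a debugging session."""

    entries: dict[str, str] = field(default_factory=dict)

    def record(self, location: str, finding: str) -> None:
        # One finding per location; a later check overwrites the earlier note.
        self.entries[location] = finding

    def already_checked(self, location: str) -> bool:
        return location in self.entries

    def remaining(self, candidates: list[str]) -> list[str]:
        # Preserve candidate order while skipping locations already ruled on.
        return [c for c in candidates if c not in self.entries]
```

Before opening a file again, the agent would call `remaining(...)` on its candidate list and only read what is left.
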
| Skill | What It Does |
|---|---|
| `design-register-bootstrap` | Establishes compact design context before implementation: brand vs product register, audience, anti-references, color strategy, and PRODUCT.md / DESIGN.md direction. |
| `frontend-taste-layer` | Adds a stronger anti-slop design layer to frontend work so outputs stop looking templated, generic, or visually under-committed. |
| `design-critique-polish` | Runs a focused critique-and-polish pass to rank frontend issues, identify ship blockers, and tighten hierarchy, typography, color, and states. |
| `frontend-product-builder` | Builds usable frontends, landing pages, pitch pages, dashboards, and prototypes with a product-first interaction model. |
| `lottie-motion-builder` | Builds small production-ready Lottie assets from SVGs, logos, loaders, and UI states with a local preview harness and output verification. |
| `designmd-brand-kit` | Turns a website or brand surface into an agent-usable design system: structure, identity, and reusable UI cues. |
| `browser-verification` | Verifies real browser rendering, responsive layout, and interaction behavior instead of trusting static code inspection. |
| `web-quality-audit` | Audits accessibility, performance, UX, privacy, and browser security as one practical web quality pass. |
| `prototype-lab` | Rapid throwaway builds for testing interaction, logic, and product direction before committing to heavier implementation. |
| Skill | What It Does |
|---|---|
| `html-diagram-artifact` | Creates standalone HTML/SVG diagrams for architecture, request paths, component relationships, and system explainers with minimal prose and browser-verifiable dark mode. |
| `html-brief-artifact` | Creates standalone HTML briefs for plans, status updates, PR summaries, incident notes, and research explainers without drifting into a full frontend build. |
For design-heavy repos, pair this section with design-register-bootstrap from the frontend section.
| Skill | What It Does |
|---|---|
| `project-context-bootstrap` | Detects the stack, asks the right project questions, and turns a weakly documented repo into a compact, agent-usable context surface. |
| `durable-context-maintenance` | Keeps repo-local context current after architecture, workflow, and test-flow changes so agents stop rediscovering the same facts. |
| Skill | What It Does |
|---|---|
| `rapid-grilling` | Quickly sharpens vague ideas through one-question-at-a-time alignment before heavier planning starts. |
| `doc-grounded-grilling` | Stress-tests a plan against repo docs, ADRs, design assets, and domain language so discovery stays grounded in reality. |
| `spec-to-prd` | Turns clarified context into a durable PRD for product, client, and internal roadmap work. |
| `plan-to-issues` | Breaks PRDs and plans into thin end-to-end slices that agents or humans can actually pick up. |
| `repo-issue-triage` | Moves bugs and enhancements through a compact state machine so backlog items become actionable instead of vague. |
| Skill | What It Does |
|---|---|
| `evidence-ledger-research` | Keeps claims, sources, calculations, and open questions in a disciplined evidence ledger. |
| `loopops-protocol` | Chooses when repeated agent work should stay a prompt or be promoted into a skill, checklist, script, workflow, or cost-bounded loop. |
| `book-to-skill` | Converts books, papers, and long documents into reusable, progressive-disclosure agent skills. |
| `role-skill-pack-design` | Designs compact role/workflow skill packs with base layers, difference layers, boundaries, and rollout order. |
| `workflow-policy-layering` | Separates workflow from authority, escalation, forbidden actions, and validation so assistant specs stop contradicting themselves. |
| `brief-first-execution` | Starts non-trivial work with one live brief for scope, non-goals, risks, verification, and done criteria. |
| `private-vs-publishable-skill-audit` | Audits private skill packs before publication and extracts only the reusable layer. |
| Skill | What It Does |
|---|---|
| `dynamic-workflow-packets` | Orchestrates large coding, research, audit, or client-search tracks without losing verification and risk gates. |
| `handoff` | Produces compact continuation briefs for long-running work, agent resumes, and human handoffs. |
| Skill | What It Does |
|---|---|
| `delivery-preflight-gate` | Runs the minimum useful baseline checks before a long implementation loop starts, so pre-existing breakage does not poison later verification. |
| `phase-spec-execution` | Breaks larger delivery into explicit phases with acceptance criteria, verification commands, and lightweight state updates. |
| `recovery-loop-3strike` | Bounds execution failure handling to one evidence-bearing retry, one focused fix-spec, and then an honest blocker handoff. |
| `delivery-baseline-audit` | Re-checks declared deliverables and final verification against the starting baseline and full working tree before calling the task complete. |
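
The bounded failure handling described for `recovery-loop-3strike` can be read as a fixed sequence rather than an open-ended retry loop. The sketch below is one interpretation under stated assumptions: the function name `run_with_recovery`, the `Step` callable shape, and the exact mapping of "retry, fix-spec, handoff" onto three stages are illustrative and not taken from the skill's `SKILL.md`.

```python
from typing import Callable

# A step receives the evidence gathered so far and returns (succeeded, note).
Step = Callable[[list[str]], tuple[bool, str]]


def run_with_recovery(attempt: Step, apply_fix_spec: Step) -> dict:
    """Hypothetical bounded recovery: attempt, one retry, one fix-spec, then stop."""
    evidence: list[str] = []

    ok, note = attempt(evidence)
    if ok:
        return {"status": "done", "evidence": evidence}
    evidence.append(note)

    # Single retry, allowed only because it now carries the failure evidence.
    ok, note = attempt(evidence)
    if ok:
        return {"status": "done", "evidence": evidence}
    evidence.append(note)

    # Single focused fix-spec, scoped by everything learned so far.
    ok, note = apply_fix_spec(evidence)
    if ok:
        return {"status": "done", "evidence": evidence}
    evidence.append(note)

    # No further attempts: hand off with the evidence instead of looping.
    return {"status": "blocked", "evidence": evidence}
```

The design point is that the function has no loop at all, so the number of attempts cannot grow, and a `blocked` result always carries the notes a human needs to pick the work up.
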
| Skill | What It Does |
|---|---|
| `spreadsheet-workbook-forensics` | Repairs and edits spreadsheets where workbook structure, formulas, and cell-level verification matter. |
Fastest path for most users:
pip install abvx-agent-skills
abvx-skills install
Install with GitHub CLI agent-skills support:
gh skill install markoblogo/abvx-agent-skills minimal-diff-builder
Target a specific host or scope when needed:
gh skill install markoblogo/abvx-agent-skills minimal-diff-builder --agent codex --scope user
gh skill install markoblogo/abvx-agent-skills diagnose --agent cursor --scope project
gh skill is currently a GitHub CLI preview feature. Use GitHub CLI v2.90.0+. The command set and flags are documented in the official gh skill manual and the GitHub changelog announcement for GitHub CLI agent skills.
Published package pages:
- PyPI: https://pypi.org/project/abvx-agent-skills/
- TestPyPI: https://test.pypi.org/project/abvx-agent-skills/
Current distribution channels:
- PyPI: published
- TestPyPI: published
- Homebrew tap: published at https://github.com/markoblogo/homebrew-tap
- conda-forge: staged-recipes submission currently open at conda-forge/staged-recipes#33719
- homebrew-core: not accepted for now under the Homebrew core notability policy; use the tap instead
Install one skill into Codex:
git clone https://github.com/markoblogo/abvx-agent-skills
cp -R abvx-agent-skills/skills/dynamic-workflow-packets ~/.codex/skills/
Install all skills:
git clone https://github.com/markoblogo/abvx-agent-skills
cp -R abvx-agent-skills/skills/* ~/.codex/skills/
Start a new agent session after installation so the skill descriptions are discovered.
Install one packaged skill into Codex:
abvx-skills install dynamic-workflow-packets
Install to a custom destination:
abvx-skills install --destination ./tmp-skills
Install via Homebrew tap:
brew tap markoblogo/tap
brew install abvx-agent-skills
homebrew-core is not the current install path for this project. The upstream submission was closed under the repository's notability policy, so the maintained Homebrew channel is the ABVX tap.
Smoke-test the published package from PyPI:
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install abvx-agent-skills
abvx-skills list
abvx-skills validate
Before installing a skill, inspect it:
gh skill preview markoblogo/abvx-agent-skills minimal-diff-builder
Validate local or packaged skills:
abvx-skills validate
gh skill publish --dry-run
Run the static security audit:
abvx-skills audit-security ./skills --no-llm
This repository is intentionally optimized for inspection before trust: compact skill files, reviewable metadata, structural validation, and a publish dry-run that catches naming and metadata drift before release.
- Solo dev in Codex / Cursor / Claude Code / Gemini CLI: use docs/solo-dev-quickstart.md for a short install path plus a recommended starter stack.
- Team lead standardizing repo work: use docs/team-rollout-playbook.md for the minimum shared-skill rollout and repo hygiene path.
If you are listing the repo in curated skill directories, agent catalogs, or install surfaces, use docs/outreach/submission-kit.md for positioning and docs/outreach/targets.md for target tracking.
For the current first-wave outreach set, use docs/outreach/first-wave-submissions.md.
Each public skill includes:
- `SKILL.md` - executable agent instructions
- `SKILL_CARD.md` - intended use, attribution, risks, evaluation, and version
- `agents/openai.yaml` - Codex UI metadata
The project follows the open Agent Skills shape: SKILL.md plus optional scripts/, references/, and assets/. For Codex compatibility, top-level frontmatter is kept conservative: name, description, license, metadata, and supported fields only.
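
One way to check the "conservative frontmatter" rule is to list the top-level keys of a `SKILL.md` and flag anything outside an allow-list. The sketch below avoids a YAML dependency by scanning lines, which is a simplification. It assumes that only the four fields named above are allowed, even though the repository profile may permit additional supported fields, and the sample frontmatter in the usage note is invented for illustration.

```python
import re

# Assumption: only the four fields named above are treated as allowed here;
# the real profile may permit additional supported fields.
ALLOWED_KEYS = {"name", "description", "license", "metadata"}
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w-]*):")


def frontmatter_keys(skill_md: str) -> list[str]:
    """Return top-level frontmatter keys from SKILL.md text (no YAML parser)."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter ends the frontmatter
            break
        match = TOP_LEVEL_KEY.match(line)  # indented (nested) keys do not match
        if match:
            keys.append(match.group(1))
    return keys


def unexpected_keys(skill_md: str) -> list[str]:
    """Return frontmatter keys that fall outside the allow-list, in file order."""
    return [key for key in frontmatter_keys(skill_md) if key not in ALLOWED_KEYS]
```

Given a file whose frontmatter contains `name`, `description`, `license`, `metadata`, and a hypothetical extra `tools` key, `unexpected_keys` returns only `tools`; keys nested under `metadata` and anything in the body are ignored.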
The HTML artifact skills intentionally keep their deliverables single-file and dependency-light. Use them for explainers and briefs, not as substitutes for production frontend implementation.
Use this repo when a workflow has repeated often enough that it deserves a sharper portable behavior layer, not when you just have a long prompt.
Contribution path:
- Submit your own skill: draft it against docs/abvx-skillpack-profile.md, mirror the shape of an existing skill, and open a PR with the smallest useful slice.
- Request a missing skill: open a Skill Request when the repeated workflow is real but the right skill does not exist yet.
- Autopsy a broken skill: open a Skill Autopsy when an internal or external skill added noise, abstractions, or fake process and should be reduced into something stronger.
Good submissions usually have:
- a narrow trigger, not a vague domain
- one clear behavior change
- explicit anti-patterns or stop conditions
- honest verification instead of broad motivational prose
Use docs/solo-dev-quickstart.md and docs/team-rollout-playbook.md as examples of opinionated packaging aimed at real adoption paths rather than generic documentation.
python scripts/validate.py
Or validate the packaged skills through the CLI:
abvx-skills validate
Run a static security audit with SkillSpector:
pip install git+https://github.com/NVIDIA/SkillSpector.git
abvx-skills audit-security ./skills --no-llm
Evaluate reports against the repo policy and baseline:
python scripts/evaluate_skillspector.py \
  --reports-dir artifacts/skillspector \
  --policy .abvx/skillspector-policy.yaml \
  --baseline .abvx/skillspector-baseline.json
Validate a local skills directory:
abvx-skills validate ~/.codex/skills
Structural validation and security audit are separate gates. The validator checks required files, frontmatter, directory/name alignment, TODO placeholders, cards, UI metadata, and basic secret patterns.
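
For readers who want a feel for what a structural gate of this kind involves, the sketch below covers three of the listed checks: required files, directory/name alignment, and TODO placeholders. It is not the code in `scripts/validate.py`; the function names, the message strings, and the line-based `name:` match are assumptions made for this example, and the secret-pattern and card checks are left out.

```python
import re
from pathlib import Path

# File names follow the skill layout described in this README.
REQUIRED_FILES = ("SKILL.md", "SKILL_CARD.md", "agents/openai.yaml")
# Simplification: matches the first unquoted `name:` line anywhere in SKILL.md.
NAME_LINE = re.compile(r"^name:\s*(\S+)\s*$", re.MULTILINE)


def load_files(skill_dir: Path) -> dict[str, str]:
    """Read every file under a skill directory into {relative_path: text}."""
    return {
        path.relative_to(skill_dir).as_posix(): path.read_text(encoding="utf-8", errors="replace")
        for path in skill_dir.rglob("*")
        if path.is_file()
    }


def check_skill(dir_name: str, files: dict[str, str]) -> list[str]:
    """Return the problems found for one skill; an empty list means it passed."""
    problems = [f"missing {rel}" for rel in REQUIRED_FILES if rel not in files]
    skill_md = files.get("SKILL.md")
    if skill_md is not None:
        match = NAME_LINE.search(skill_md)
        if match is None:
            problems.append("frontmatter has no name")
        elif match.group(1) != dir_name:
            problems.append(f"name {match.group(1)!r} does not match directory {dir_name!r}")
        if "TODO" in skill_md:
            problems.append("TODO placeholder in SKILL.md")
    return problems
```

Keeping `check_skill` free of file access makes it easy to test with plain dictionaries; on disk it would be called as `check_skill(d.name, load_files(d))` for each directory `d` under `skills/`.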
Benchmark scaffolding now lives under benchmarks/. It documents how to measure skill impact without publishing fake precision. Until the repo has stable reproducible runs across tasks and models, benchmark numbers should be treated as pending evidence rather than marketing copy.
Build and check the package locally:
python -m pip install --upgrade build twine
python -m build
python -m twine check dist/*
Publish flow:
- Run the `publish` GitHub Actions workflow with `repository=testpypi` for a dry run against TestPyPI.
- Create a GitHub release, or run the same workflow with `repository=pypi`, to publish to PyPI.
- Configure trusted publishing for both `pypi` and `testpypi` environments in the package index before the first release.
- Keep the released version aligned with `pyproject.toml` and the skill inventory documented above.
- Keep always-loaded context small.
- Prefer procedural rules over vague advice.
- Make skills easy to audit in diffs.
- Attribute upstream inspiration.
- Pair useful automation with risk gates and verification.
See docs/abvx-skillpack-profile.md for the repository standard.
Several skills are inspired by public work from the broader agent tooling ecosystem. See ATTRIBUTION.md.
MIT. See LICENSE.

