Releases · Harperbot/metal-guard

19 May 14:33

github-actions

v1.1.0

ac42b83

v1.1.0 Latest


Added

- First-run onboarding wizard: metal-guard with no arguments scans for
  recent kernel panics, explains them in plain language, and offers to
  install the shell guard.
- metal-guard diagnose: panic scan and explanation only.
- metal-guard guard install|uninstall|status: reversible shell guard
  that routes interactive-shell python through mlx-safe-python.

Changed

- metal-guard with no arguments now runs onboarding instead of the
  status snapshot. Use metal-guard status for the snapshot.
- Panic scanner (parse_panic_reports) now also scans
  /var/db/PanicReporter and ~/Library/Logs/DiagnosticReports, and
  reads modern .ips kernel-panic reports in addition to .panic.
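The directory scan described above can be sketched roughly as follows. This is a minimal illustration, not the project's parse_panic_reports: the function name find_panic_reports and the newest-first ordering are assumptions, and the sketch only matches file suffixes without checking that an .ips file is actually a kernel-panic report.

```python
from pathlib import Path

# Suffixes named in the release note; everything else here is illustrative.
REPORT_SUFFIXES = {".panic", ".ips"}


def find_panic_reports(directories):
    """Return files with a known report suffix, newest first (hypothetical helper)."""
    found = []
    for directory in directories:
        root = Path(directory).expanduser()
        try:
            entries = list(root.iterdir())
        except OSError:
            # Missing, unreadable, or not a directory: skip it quietly.
            continue
        for entry in entries:
            if entry.is_file() and entry.suffix in REPORT_SUFFIXES:
                found.append(entry)
    return sorted(found, key=lambda path: path.stat().st_mtime, reverse=True)
```

On a Mac this might be called as find_panic_reports(["/var/db/PanicReporter", "~/Library/Logs/DiagnosticReports"]); a real scanner would still need to open each .ips file to confirm it describes a kernel panic.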

Removed

- Third-party runtime dependencies. metal-guard's core is now pure
  stdlib: requests (Ollama HTTP fallback) and psutil (memory
  detection) are replaced with urllib and vm_stat, so
  pip install metal-guard pulls in no transitive dependencies.
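For readers curious what a psutil-free memory probe can look like, here is a minimal sketch that parses the text printed by the macOS vm_stat command. The function names are hypothetical and this is not metal-guard's implementation; it reports free pages only, and a real detector would probably weigh other counters (inactive, speculative) as well.

```python
import re
import subprocess


def parse_vm_stat(text):
    """Parse vm_stat output into (page_size_bytes, {counter_name: pages})."""
    match = re.search(r"page size of (\d+) bytes", text)
    # Assumed fallback of 4096 bytes if the header line is missing.
    page_size = int(match.group(1)) if match else 4096
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip().rstrip(".")
        if sep and value.isdigit():
            counters[name.strip()] = int(value)
    return page_size, counters


def free_bytes(text):
    """Bytes in free pages according to the given vm_stat output."""
    page_size, counters = parse_vm_stat(text)
    return page_size * counters.get("Pages free", 0)


def read_vm_stat():
    """Run vm_stat (macOS only) and return its stdout."""
    return subprocess.run(
        ["vm_stat"], capture_output=True, text=True, check=True
    ).stdout
```

On macOS, free_bytes(read_vm_stat()) gives a rough free-memory figure using only the standard library.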


19 May 02:17

Harperbot

v1.0.0

9c325a5

v1.0.0 — first stable release

First stable release.

v1.0.0 consolidates the complete L1–L13 defence stack and the KNOWN_PANIC_MODELS registry into a stable public API.

- Subprocess safety wire: pre-spawn gate stack (memory load gate, crash-burst recovery lockout), in-flight child-death detection (≤200 ms via select on the process sentinel), a validated done-frame protocol, and prefill-exposure tracking with advisory worker rotation.
- SpawnRefused spawn gate: hard-blocks runner construction for panic-tier models; override with METALGUARD_LOCAL_PANIC_MODEL_BLOCK_DISABLED=1.
- Engine fallback chain: metal_guard.fallback_chain.call_with_fallback() walks a configured tier list (mlx / ollama), falling forward on any failure.
- Observer mode: with METALGUARD_MODE=observer (subject to the mlx / mlx-lm version gate), a lock conflict in acquire_mlx_lock logs a warning and returns an advisory instead of raising.
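The spawn gate in the list above can be pictured with a small sketch. Only the exception name, the "panic" tier, and the environment variable come from the release notes; the function check_spawn_allowed, its parameters, and the warning category are assumptions made for illustration.

```python
import os
import warnings

OVERRIDE_ENV = "METALGUARD_LOCAL_PANIC_MODEL_BLOCK_DISABLED"


class SpawnRefused(RuntimeError):
    """Stand-in for the exception named in the release notes."""


def check_spawn_allowed(model_id, advisory_tier, environ=os.environ):
    """Hypothetical gate: hard-block panic-tier models unless overridden."""
    if advisory_tier is None:
        return  # no advisory recorded for this model
    message = f"{model_id}: local panic advisory, tier={advisory_tier}"
    if advisory_tier == "panic" and environ.get(OVERRIDE_ENV) != "1":
        raise SpawnRefused(message)
    # Lower tiers, or a panic tier with the override set, only warn.
    warnings.warn(message, RuntimeWarning, stacklevel=2)
```

Passing the environment in as a parameter keeps the gate easy to test; a runner constructor would call it before starting any child process.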

No API removals relative to v0.24.0 — this release is the stable cut of the surface delivered across the W3–W14 work.

Full per-release history and the incident behind each layer: see CHANGELOG.md.


19 May 02:20

Harperbot

v0.24.0

c8614fd

v0.24.0 — Drift-audit wave W14 — de-Harper cleanup sweep.

[0.24.0] — 2026-05-19

Drift-audit wave W14 — de-Harper cleanup sweep.

Changed

- The G1 panic-cooldown gate in authorize_mlx_spawn now uses the
  package's own panic_gate.evaluate_panic_cooldown() instead of an
  optional external integration, so G1 is now an always-on gate.
- metal_guard_cli.py is now English-only and writes its breadcrumb
  fallback to ~/.cache/metal-guard/.

Removed

A stale internal PR working-draft file.

Notes

Whole-repo pre-publish audit: removed every remaining hardcoded
private path, private-module reference, internal project name, and
Traditional-Chinese comment. Package metadata still attributes the
project to its publisher — that is intentional.


19 May 02:20

Harperbot

v0.23.0

5df5547

v0.23.0 — Drift-audit wave W13 — final special reconcile.

[0.23.0] — 2026-05-19

Drift-audit wave W13 — final special reconcile.

Added

- MetalGuard.dynamic_headroom_gb(): a load-adaptive VRAM-headroom
  estimate (base headroom plus a per-concurrent-thread term and a
  KV-growth bonus).
- SpawnRefused exception + the local-panic-advisory spawn gate in
  MLXSubprocessRunner.__init__. When a model's advisory tier is
  panic, construction is hard-blocked with SpawnRefused; the
  METALGUARD_LOCAL_PANIC_MODEL_BLOCK_DISABLED=1 env var downgrades
  the block to a warning. Lower advisory tiers warn only. (This
  completes the gate deferred from wave W11.) SpawnRefused is
  exported from the top-level facade.
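The headroom estimate mentioned in the first item is described as a sum of three terms. A minimal sketch of that shape follows; the function name estimate_headroom_gb, its parameters, and the default coefficients are all invented for illustration and are not the values used by MetalGuard.dynamic_headroom_gb().

```python
def estimate_headroom_gb(concurrent_threads, kv_growth_gb,
                         base_gb=2.0, per_thread_gb=0.5):
    """Hypothetical headroom: base + per-thread term + KV-growth bonus."""
    thread_term = per_thread_gb * max(concurrent_threads, 0)
    kv_bonus = max(kv_growth_gb, 0.0)
    return base_gb + thread_term + kv_bonus
```

With these assumed defaults, four concurrent threads and 1.5 GB of expected KV growth give 2.0 + 2.0 + 1.5 = 5.5 GB of headroom.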

Notes

- Verify-only findings (no code change): the general panic_cooldown
  API is already fully present in panic_gate; the panic_cooldown
  ack-reason hardening is Harper-deployment-specific and intentionally
  not ported. system_audit GPU-family detection is equivalent on
  both sides; the public-only bench_scoped_load is reverse drift,
  out of this project's scope.
- W13 is the final reconcile wave. Phase Final (the v1.0.0 release)
  is the only remaining step.


19 May 02:20

Harperbot

v0.22.0

cce5b1c

v0.22.0 — Drift-audit wave W12 — top-level dispatch reconcile.

[0.22.0] — 2026-05-19

Drift-audit wave W12 — top-level dispatch reconcile.

Added

- metal_guard/fallback_chain.py: tier-by-tier dispatch over the
  process-local backends (mlx, ollama). On any tier failure,
  lockout, or unreachable daemon the chain falls forward to the next
  tier; when all tiers are exhausted it returns an empty result with
  all_local_failed telemetry instead of raising. Includes the
  workload_chains.json loader with a sha256 tamper watchdog and the
  L1/L2 layer guard (cloud backends are refused at construction time).
- metal_guard/inference.py: the in-process MLX dispatch path
  (call_model, bench_scoped_load, pre_inference_guard,
  safe_generate, encode_image), plus the prefill / load-gate /
  process-lock pre-flight checks.
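The fall-forward behaviour described for fallback_chain can be illustrated with a short sketch. ChainResult, run_chain, and the (name, callable) tier format are hypothetical and do not reflect the real call_with_fallback signature; only the "try each tier, never raise, flag all_local_failed" behaviour is taken from the notes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Sequence, Tuple


@dataclass
class ChainResult:
    text: str = ""
    tier: Optional[str] = None          # name of the tier that answered
    all_local_failed: bool = False      # True when every tier failed
    errors: Dict[str, str] = field(default_factory=dict)


def run_chain(tiers: Sequence[Tuple[str, Callable[[str], str]]],
              prompt: str) -> ChainResult:
    """Try each tier in order; fall forward on any exception, never raise."""
    errors: Dict[str, str] = {}
    for name, backend in tiers:
        try:
            return ChainResult(text=backend(prompt), tier=name, errors=errors)
        except Exception as exc:  # any failure moves on to the next tier
            errors[name] = repr(exc)
    return ChainResult(all_local_failed=True, errors=errors)
```

A tier whose backend raises NotImplementedError is simply skipped, which is how an unwired stub tier would behave in a chain like this.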

Notes

- Both modules are accessible as submodules
  (metal_guard.fallback_chain / metal_guard.inference); the
  top-level facade __all__ is unchanged. Re-exporting the dispatch
  layer through the facade is a deferred curation decision.
- fallback_chain's mlx-tier dispatch is an unwired stub
  (NotImplementedError), ported as-is from the source, so an mlx
  tier in a chain falls through to the next tier.
- inference.py is ported as a single module; splitting a backend
  submodule out is a future refactor, not part of this reconcile.


19 May 02:20

Harperbot

v0.21.0

43677fd

v0.21.0 — Drift-audit wave W11 — subprocess_runner reconcile.

[0.21.0] — 2026-05-19

Drift-audit wave W11 — subprocess_runner reconcile.

Added

- subprocess_runner.py reconciled against the upstream source. The
  public runner gains the full Wave A safety wire and the prefill /
  rotation wire:
  - load_gate + recovery_lockout pre-spawn gates in
    MLXSubprocessRunner.__init__.
  - child-death detection in generate(): a select.select
    sigchld arm (≤200 ms) plus the pipe-EOF arm, each emitting a
    subprocess_early_death event and triggering recovery_lockout.
  - done-frame protocol dispatch: the parent validates
    protocol_version frames via done_frame.validate_frame; the
    worker emits the canonical done frame.
  - prefill-exposure / worker-rotation wire:
    _record_prefill_and_check_rotation, the should_rotate /
    rotation_verdict poll surface, and auto-rotation in
    call_model_isolated with auto_rotation_* events.
  - CadenceGuard + subprocess_inference_guard (B1) +
    gemma-4 generation flush in the worker.
  - resource_tracker cold-restart warning, CircuitBreaker
    cooldown gate, peak-memory capture, and VLM base64-image handling.
- Wire kill-switch environment variables for emergency rollback:
  MLX_LOAD_GATE_DISABLED, MLX_RECOVERY_LOCKOUT_GATE_DISABLED,
  MLX_CHILD_DEATH_DETECT_DISABLED, MLX_SENTINEL_DETECT_DISABLED,
  MLX_DONE_FRAME_WIRE_DISABLED, METALGUARD_PREFILL_WIRE_DISABLED,
  METALGUARD_AUTO_ROTATE_DISABLED.
- New test module tests/test_subprocess_runner_w11.py, the first
  dedicated regression coverage for the runner.
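To give a feel for the child-death detection in the list above, here is a rough POSIX-only sketch of the pipe-EOF idea combined with a 200 ms poll. It is not the runner's code: watch_child, its return values, and the poll loop are assumptions, and the real wire is described as using a separate sentinel arm that this sketch replaces with a plain proc.poll() check.

```python
import os
import select
import subprocess
import sys

POLL_SECONDS = 0.2  # mirrors the 200 ms bound quoted in the notes


def watch_child(proc, max_polls=50):
    """Return ("data", bytes), ("dead", returncode), or ("timeout", None)."""
    fd = proc.stdout.fileno()
    for _ in range(max_polls):
        readable = select.select([fd], [], [], POLL_SECONDS)[0]
        if readable:
            chunk = os.read(fd, 65536)
            if chunk:
                return "data", chunk
            # A zero-byte read means every writer closed the pipe (EOF).
            # Assumes the child exits promptly after closing stdout.
            return "dead", proc.wait(timeout=5)
        if proc.poll() is not None:
            # Child exited without the pipe becoming readable this round.
            return "dead", proc.returncode
    return "timeout", None
```

For example, watching subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(3)"], stdout=subprocess.PIPE) should report the death with exit code 3 instead of blocking on a read that will never complete.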

Notes

- No new public API symbol: the runner's public surface
  (MLXSubprocessRunner / SubprocessCrashError /
  SubprocessTimeoutError / call_model_isolated /
  shutdown_all_workers) is unchanged; the facade __all__ is unchanged.
- Deferred to W13: the SpawnRefused exception and the
  __init__ local-panic-advisory gate depend on the
  legacy panic-model schema → panic_registry reconcile, which is
  W13 scope. They are intentionally omitted from W11.
- The worker's inline prefill-fit guard (prefill.require_prefill_fit)
  is retained: W11 ports drift in, it does not remove an existing
  public defense layer.

19 May 02:20

Harperbot

v0.20.0

d706e40

v0.20.0 — Drift-audit wave W10 — cross-process lock reconcile.

[0.20.0] — 2026-05-18

Drift-audit wave W10 — cross-process lock reconcile.

Added

- Observer mode for acquire_mlx_lock. When METALGUARD_MODE=observer
  (and the mlx ≥ 0.31.2 / mlx-lm ≥ 0.31.3 version gate in process_mode
  passes), a lock conflict logs a warning and returns an advisory dict
  (mode="observer", advisory=True, actual_holder=…) instead of
  raising MLXLockConflict, and does not overwrite the lock file,
  so the existing holder stays visible. Defensive mode (the default)
  is unchanged: a conflict still raises.
- Lock-file info dict now records three diagnostic keys: host
  (socket.gethostname()), python (interpreter version), and
  platform (platform.system()), alongside the existing
  pid/label/started_at/cmdline.
- MLXLockConflict messages now show a human-readable elapsed time
  ("running for 5m 3s") instead of the raw ISO timestamp.
- Granular logging in read_mlx_lock (corrupt / malformed / stale lock
  files) and release_mlx_lock (success, and the not-our-lock path),
  plus a success log in acquire_mlx_lock.
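The observer/defensive split and the elapsed-time message can be sketched together. The advisory dict keys and the "5m 3s" style come from the notes above; format_elapsed, handle_conflict, the hour format, and the holder dict passed in are illustrative assumptions, and the version gate is left out entirely.

```python
import logging


class MLXLockConflict(RuntimeError):
    """Stand-in for the exception named in the release notes."""


def format_elapsed(seconds):
    """Render a duration like '5m 3s' (hour format is an assumption)."""
    seconds = max(int(seconds), 0)
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}h {minutes}m {secs}s"
    if minutes:
        return f"{minutes}m {secs}s"
    return f"{secs}s"


def handle_conflict(holder, elapsed_seconds, mode="defensive"):
    """Observer mode returns an advisory; defensive mode raises."""
    message = (
        f"MLX lock held by pid {holder['pid']} ({holder['label']}), "
        f"running for {format_elapsed(elapsed_seconds)}"
    )
    if mode == "observer":
        logging.warning(message)
        # The lock file is left untouched so the holder stays visible.
        return {"mode": "observer", "advisory": True, "actual_holder": holder}
    raise MLXLockConflict(message)
```

Callers in observer mode would inspect the returned dict and decide for themselves whether to proceed, while defensive callers keep the existing exception-based flow.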

Internal

- New private helpers in mlx_lock.py: _format_elapsed,
  _current_cmdline, _try_unlink (the last also swallows
  PermissionError, not just FileNotFoundError).
- New test module tests/test_mlx_lock_w10.py (22 tests), the first
  dedicated coverage for the cross-process lock, including parity
  regression guards for force override, stale reclaim, and the context
  manager.

Notes

- No new public API symbol: acquire_mlx_lock / read_mlx_lock /
  release_mlx_lock / mlx_exclusive_lock / MLXLockConflict were
  already exported; the facade __all__ is unchanged.
- Zombie detection and force override were already public (carved
  into mlx_lock.py at the Phase 1 package split). W10 verified them
  in place and did not re-port them.


19 May 02:20

Harperbot

v0.19.0

9d59b78

v0.19.0 — Drift-audit wave W9 — version advisory sync.

[0.19.0] — 2026-05-18

Drift-audit wave W9 — version advisory sync.

Added

- Three version advisories synced from the upstream advisory database:
  - mlx-lm#1090 (<0.31.3, info): pre-0.31.3 global generation
    stream; observer-mode ThreadPool parallelization races on Metal.
  - mlx-lm#1256 (>=0.31.3, high): the #1090 thread-local stream
    fix is incomplete; generation on a non-import thread still crashes.
  - transformers#1011 (>=5.5.0.dev0,<5.6, high): transformers
    5.5.x removed ReasoningEffort, breaking Gemma 4 loads via mlx-vlm;
    registered at the transformers package level so it fires regardless
    of the mlx-vlm version.
- check_version_advisories() now reports 16 advisories (was 13).
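The range strings attached to each advisory (for example <0.31.3 or >=5.5.0.dev0,<5.6) suggest a simple specifier match against the installed version. The sketch below shows one simplified way to evaluate such ranges with the standard library only; parse_version and spec_matches are hypothetical names, and the parser deliberately ignores pre-release suffixes such as .dev0, which a PEP 440 aware library like packaging would handle properly.

```python
import operator

_OPS = [
    (">=", operator.ge), ("<=", operator.le), ("==", operator.eq),
    ("<", operator.lt), (">", operator.gt),
]


def parse_version(text):
    """Leading numeric release segments as a tuple, padded to three parts."""
    parts = []
    for piece in text.strip().split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break  # stop at the first non-numeric segment, e.g. 'dev0'
        parts.append(int(digits))
    return tuple(parts + [0] * (3 - len(parts)))


def spec_matches(installed, spec):
    """True if the installed version satisfies every comma-separated clause."""
    have = parse_version(installed)
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in _OPS:
            if clause.startswith(symbol):
                if not compare(have, parse_version(clause[len(symbol):])):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True
```

Under this simplification, 5.5.4 falls inside >=5.5.0.dev0,<5.6 while 5.6.0 does not.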

Notes

- upstream_patches needed no port: install_upstream_defensive_patches
  / _patch_mlx_lm_1128 are already public in metal_guard.version_advisories
  (regression-locked by tests/test_version_advisories_w9.py).
- Two non-safety observability helpers from the upstream module
  (_infer_fixed_in, log_active_advisories_at_startup) remain
  un-synced, deferred as minor forward drift.


19 May 02:20

Harperbot

v0.18.1

9b169c5

v0.18.1 — Drift-audit wave W8 — forensics layer (verify-only)

[0.18.1] — 2026-05-18

Drift-audit wave W8 — forensics layer (verify-only, no functional change).

Notes

- orphan_monitor, postmortem, and postmortem_collect needed no port:
  the Phase 1 package split already carried the orphan-monitor /
  postmortem / status-snapshot layer into metal_guard.forensics in
  de-Harper-ified form, and metal-guard orphan-scan /
  metal-guard postmortem are the canonical CLI entry points
  (superseding the python -m postmortem_collect shim).
  Regression-locked by tests/test_w8_already_public.py.
- No public API change; this is a patch release marking the W8
  reconciliation milestone.


19 May 02:20

Harperbot

v0.18.0

751a60a

v0.18.0 — Drift-audit wave W7 — OOM / prefill layer.

[0.18.0] — 2026-05-18

Drift-audit wave W7 — OOM / prefill layer.

Added

- metal_guard.oom_precheck: high-level pre-call OOM gate wrapping
  the prefill-allocation math with model-dims / GPU-memory auto-fetch
  (precheck_inference, precheck_inference_or_raise, OOMVerdict).
- metal_guard.prefill_exposure_tracker: per-worker cumulative
  prefill-tokens lifetime tracker for the delayed-trigger panic class
  (record_prefill, get_state, forget_worker, all_states,
  WorkerPrefillState).
- metal_guard.worker_rotation: advisory prefill-budget worker
  rotation gate (should_rotate_worker, RotationVerdict,
  REASON_DISABLED / REASON_NO_STATE / REASON_UNDER_BUDGET /
  REASON_PREFILL_BUDGET_EXCEEDED).
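The tracker and rotation gate listed above fit together in a simple way: accumulate prefill tokens per worker, then compare the total against a budget. The sketch below shows that interaction under stated assumptions; the class and function names, the reason strings, and the >= comparison are invented here and only loosely mirror the names in the notes.

```python
from dataclasses import dataclass


@dataclass
class WorkerState:
    worker_id: str
    cumulative_prefill_tokens: int = 0
    calls: int = 0


class PrefillTracker:
    """Hypothetical per-worker lifetime counter of prefill tokens."""

    def __init__(self):
        self._states = {}

    def record(self, worker_id, tokens):
        state = self._states.setdefault(worker_id, WorkerState(worker_id))
        state.cumulative_prefill_tokens += tokens
        state.calls += 1
        return state

    def get(self, worker_id):
        return self._states.get(worker_id)

    def forget(self, worker_id):
        self._states.pop(worker_id, None)


def should_rotate(state, budget_tokens, enabled=True):
    """Advisory verdict as (rotate, reason); the caller decides what to do."""
    if not enabled:
        return False, "disabled"
    if state is None:
        return False, "no_state"
    if state.cumulative_prefill_tokens >= budget_tokens:
        return True, "prefill_budget_exceeded"
    return False, "under_budget"
```

Because the verdict is advisory, a caller would typically finish the current request, retire the worker, call forget on its id, and start a fresh one.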

Changed

Public API surface grows by 14 names; tests/test_phase1_facade.py
surface lock updated 151 -> 165.

Notes

prefill_guard needed no port — its full surface already lives in
metal_guard.prefill since the Phase 1 refactor (regression-locked by
tests/test_w7_already_public.py).

