Commit 751a60a

Harper authored and committed
feat(v0.18.0): W7 — OOM / prefill layer
Phase 3 wave W7 of the metal-guard public reconciliation. New submodules oom_precheck / prefill_exposure_tracker / worker_rotation. prefill_guard confirmed already-public (no port). R1+R2 critic GO. Suite 517 -> 555.
2 parents 9a60ef2 + fd48ff3 commit 751a60a

12 files changed

Lines changed: 1473 additions & 6 deletions

CHANGELOG.md

Lines changed: 29 additions & 0 deletions
@@ -5,6 +5,35 @@ All notable changes to **metal-guard** are documented here.
 The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/)
 and this project adheres to [Semantic Versioning](https://semver.org/).
 
+## [0.18.0] — 2026-05-18
+
+**Drift-audit wave W7** — OOM / prefill layer.
+
+### Added
+
+- **`metal_guard.oom_precheck`** — high-level pre-call OOM gate wrapping
+the prefill-allocation math with model-dims / GPU-memory auto-fetch
+(`precheck_inference`, `precheck_inference_or_raise`, `OOMVerdict`).
+- **`metal_guard.prefill_exposure_tracker`** — per-worker cumulative
+prefill-tokens lifetime tracker for the delayed-trigger panic class
+(`record_prefill`, `get_state`, `forget_worker`, `all_states`,
+`WorkerPrefillState`).
+- **`metal_guard.worker_rotation`** — advisory prefill-budget worker
+rotation gate (`should_rotate_worker`, `RotationVerdict`,
+`REASON_DISABLED` / `REASON_NO_STATE` / `REASON_UNDER_BUDGET` /
+`REASON_PREFILL_BUDGET_EXCEEDED`).
+
+### Changed
+
+- Public API surface grows by 14 names; `tests/test_phase1_facade.py`
+surface lock updated 151 -> 165.
+
+### Notes
+
+- `prefill_guard` needed no port — its full surface already lives in
+`metal_guard.prefill` since the Phase 1 refactor (regression-locked by
+`tests/test_w7_already_public.py`).
+
 ## [0.17.0] — 2026-05-18
 
 **Drift-audit wave W6** — Wave-A safety nets.
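
The two tracker and rotation entries above describe a per-worker running total of prefill tokens feeding an advisory "rotate this worker" decision. The sketch below illustrates that idea only. It does not import metal-guard and does not reproduce its signatures, which this commit excerpt does not show. Every name in it (`WorkerExposure`, `PrefillLedger`, `RotationAdvice`, `advise_rotation`, `budget_tokens`) is hypothetical, and the reason strings only loosely echo the `REASON_*` constants.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class WorkerExposure:
    """Hypothetical per-worker record: lifetime prefill tokens and call count."""
    worker_id: str
    total_prefill_tokens: int = 0
    calls: int = 0


class PrefillLedger:
    """Hypothetical in-memory tracker of cumulative prefill tokens per worker."""

    def __init__(self) -> None:
        self._states: Dict[str, WorkerExposure] = {}

    def record(self, worker_id: str, prefill_tokens: int) -> WorkerExposure:
        # Add one prefill call to the worker's lifetime total.
        if prefill_tokens < 0:
            raise ValueError("prefill_tokens must be non-negative")
        state = self._states.setdefault(worker_id, WorkerExposure(worker_id))
        state.total_prefill_tokens += prefill_tokens
        state.calls += 1
        return state

    def get(self, worker_id: str) -> Optional[WorkerExposure]:
        return self._states.get(worker_id)

    def forget(self, worker_id: str) -> None:
        # Drop state once a worker has been replaced; unknown ids are ignored.
        self._states.pop(worker_id, None)


@dataclass(frozen=True)
class RotationAdvice:
    rotate: bool
    reason: str


def advise_rotation(
    ledger: PrefillLedger,
    worker_id: str,
    budget_tokens: int,
    enabled: bool = True,
) -> RotationAdvice:
    """Advisory only: report whether the worker is past its token budget."""
    if not enabled:
        return RotationAdvice(False, "disabled")
    state = ledger.get(worker_id)
    if state is None:
        return RotationAdvice(False, "no_state")
    if state.total_prefill_tokens > budget_tokens:
        return RotationAdvice(True, "prefill_budget_exceeded")
    return RotationAdvice(False, "under_budget")
```

The gate returns a verdict rather than restarting anything itself, which matches the "advisory" wording in the changelog: the caller decides when a rotation is safe.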

metal_guard/__init__.py

Lines changed: 37 additions & 0 deletions
@@ -222,6 +222,26 @@
     poll_for_early_death,
 )
 from metal_guard.safe_shutdown import ShutdownReport, safe_inference_shutdown
+from metal_guard.oom_precheck import (
+    OOMVerdict,
+    precheck_inference,
+    precheck_inference_or_raise,
+)
+from metal_guard.prefill_exposure_tracker import (
+    WorkerPrefillState,
+    all_states,
+    forget_worker,
+    get_state,
+    record_prefill,
+)
+from metal_guard.worker_rotation import (
+    REASON_DISABLED,
+    REASON_NO_STATE,
+    REASON_PREFILL_BUDGET_EXCEEDED,
+    REASON_UNDER_BUDGET,
+    RotationVerdict,
+    should_rotate_worker,
+)
 
 # ---------------------------------------------------------------------------
 # Private re-exports — underscore-prefixed names that existing tests / callers
@@ -454,4 +474,21 @@
     # safe_shutdown (W6)
     "safe_inference_shutdown",
     "ShutdownReport",
+    # oom_precheck (W7)
+    "OOMVerdict",
+    "precheck_inference",
+    "precheck_inference_or_raise",
+    # prefill_exposure_tracker (W7)
+    "WorkerPrefillState",
+    "record_prefill",
+    "get_state",
+    "forget_worker",
+    "all_states",
+    # worker_rotation (W7)
+    "should_rotate_worker",
+    "RotationVerdict",
+    "REASON_DISABLED",
+    "REASON_NO_STATE",
+    "REASON_UNDER_BUDGET",
+    "REASON_PREFILL_BUDGET_EXCEEDED",
 ]
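
The changelog states that the public surface grows by 14 names and that the surface lock moves from 151 to 165. The sketch below checks that arithmetic against the names added to `__all__` in this hunk and shows one way a surface-lock check could be written. The real `tests/test_phase1_facade.py` is not part of this excerpt, so `check_surface` and `added_names` are hypothetical helpers, not the project's test code.

```python
from typing import Dict, Iterable, List

# The names this commit adds to __all__, grouped by the submodule they come from.
W7_NAMES: Dict[str, List[str]] = {
    "oom_precheck": [
        "OOMVerdict",
        "precheck_inference",
        "precheck_inference_or_raise",
    ],
    "prefill_exposure_tracker": [
        "WorkerPrefillState",
        "record_prefill",
        "get_state",
        "forget_worker",
        "all_states",
    ],
    "worker_rotation": [
        "should_rotate_worker",
        "RotationVerdict",
        "REASON_DISABLED",
        "REASON_NO_STATE",
        "REASON_UNDER_BUDGET",
        "REASON_PREFILL_BUDGET_EXCEEDED",
    ],
}

PREVIOUS_SURFACE = 151  # export count before W7, per the changelog
EXPECTED_SURFACE = 165  # export count after W7, per the changelog


def added_names() -> List[str]:
    """Flatten the per-submodule lists into one list of new exports."""
    return [name for names in W7_NAMES.values() for name in names]


def check_surface(exported: Iterable[str], expected_count: int) -> None:
    """Hypothetical lock: fail on duplicate exports or on count drift."""
    names = list(exported)
    duplicates = sorted({n for n in names if names.count(n) > 1})
    if duplicates:
        raise AssertionError(f"duplicate exports: {duplicates}")
    if len(names) != expected_count:
        raise AssertionError(f"expected {expected_count} exports, got {len(names)}")
```

Against the real package, the equivalent call would presumably pass `metal_guard.__all__` and `EXPECTED_SURFACE`, so that any unplanned addition or removal fails the test until the lock is updated on purpose.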

metal_guard/_constants.py

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@
 import os
 from typing import TypeVar
 
-__version__ = "0.17.0"
+__version__ = "0.18.0"
 
 log = logging.getLogger("metal_guard")
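
The version bump here has to stay in step with the new `## [0.18.0]` heading in `CHANGELOG.md`. The sketch below shows one way to check that automatically, assuming release headings follow the `## [X.Y.Z]` pattern seen in this commit. `latest_changelog_version` is a hypothetical helper, not part of metal-guard.

```python
import re
from typing import Optional

# Matches release headings such as "## [0.18.0] ..." at the start of a line.
# Non-numeric headings (for example "## [Unreleased]") are skipped.
RELEASE_HEADING = re.compile(r"^## \[(\d+\.\d+\.\d+)\]", re.MULTILINE)


def latest_changelog_version(changelog_text: str) -> Optional[str]:
    """Return the first (topmost) release version in the changelog, if any."""
    match = RELEASE_HEADING.search(changelog_text)
    return match.group(1) if match else None


def versions_agree(changelog_text: str, package_version: str) -> bool:
    """True when the topmost changelog release equals the package version."""
    return latest_changelog_version(changelog_text) == package_version
```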
