ai-reliability
Here are 85 public repositories matching this topic...
zer0dex is a local dual-layer memory pattern for AI agents: a compressed, human-readable markdown index plus a vector store queried automatically before each message. Built for cross-project recall and cross-reference where flat memory files or vector-only RAG fall short. Local-first, low-latency. Reference implementation by Hermes Labs.
Updated Jun 13, 2026 - Python
lintlang is a static linter for AI agent configs, tool descriptions, and system prompts that runs zero-LLM quality gating in CI. Catches language-level failures (vague tool descriptions, missing stop conditions, schema gaps) before they reach runtime, with deterministic regex + structural detectors and no model calls.
Updated Jun 2, 2026 - Python
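To make the idea of deterministic, zero-LLM linting concrete, here is a minimal sketch of regex and structural detectors of the kind the lintlang description mentions. It is not lintlang's API or rule set: the `Finding` type, the rule names, the word lists, and the `lint_tool` and `lint_system_prompt` functions are all assumptions invented for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    rule: str
    message: str


# Hypothetical patterns for illustration only; not lintlang's actual rules.
VAGUE_WORDS = re.compile(r"\b(stuff|things|etc|various|somehow)\b", re.IGNORECASE)
STOP_CONDITION = re.compile(
    r"\b(stop|finish|done|at most|until|no more than)\b", re.IGNORECASE
)


def lint_tool(tool: dict) -> list[Finding]:
    """Check one tool definition with regex and structural detectors."""
    findings = []
    description = tool.get("description", "")
    # Structural check: very short descriptions rarely tell the agent enough.
    if len(description.split()) < 5:
        findings.append(Finding("short-description", "description has fewer than 5 words"))
    # Regex check: filler words that usually signal a vague description.
    if VAGUE_WORDS.search(description):
        findings.append(Finding("vague-wording", "description uses vague filler words"))
    # Structural check: a tool without a parameter schema is a schema gap.
    if "parameters" not in tool:
        findings.append(Finding("missing-schema", "tool declares no parameters schema"))
    return findings


def lint_system_prompt(prompt: str) -> list[Finding]:
    """Flag a system prompt that never says when the agent should stop."""
    if STOP_CONDITION.search(prompt):
        return []
    return [Finding("missing-stop-condition", "prompt states no stop condition")]
```

Because every check is a plain regex or dictionary lookup, the same input always yields the same findings, which is what makes this kind of gate suitable for CI. A real linter would need far more careful patterns than the word lists assumed here.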
The open-source MultiAgentOps evaluation and verification harness for business workflows in any industry.
Updated Jun 14, 2026 - Python
Turn failed AI agent runs into replayable regression tests. Catch regressions before you ship.
Updated Jun 4, 2026 - Python
The "Cloudflare for AI Agents". 7-layer security interceptor, real-time observability dashboard, and automated reliability testing for MCP and AI tool chains. Prevent hallucinations, prompt injection, and destructive tool calls.
Updated May 4, 2026 - Python
Production-grade TypeScript AI runtime focused on reliability, governance, and reproducible LLM systems. Multi-provider gateway, agents, RAG, workflows, policy engine, audit trails, and deterministic testing — built for teams shipping AI in production.
Updated Jun 4, 2026 - TypeScript
MCP server for the Ejentum API. 8 cognitive operations across 4 harnesses (reasoning, code, anti-deception, memory) in dynamic and adaptive modes.
Updated Jun 11, 2026 - JavaScript
Open-source AI model evaluation and benchmarking framework for LLMs (OpenAI, Ollama, Claude, Gemini)
Updated Jun 11, 2026 - Python
Architectural standards and best practices for building reliable AI Agents and LLM workflows. Defining the framework for AI Reliability Engineering (AIRE).
Updated Feb 14, 2026 - Dockerfile
Context-compensation scaffold for LLM evaluation prompts. A short language prefix you prepend so the model discloses prior exposure, scores on quoted evidence only, and hedges on thin evidence — for scorers that can see your CLAUDE.md, memory, or session context. Backend-agnostic. Experimental: variance-reduction effect not yet measured.
Updated May 27, 2026 - Python
Sheldon K. Salmon — AI Reliability Architect. Creator of the AION Constitutional Stack and the CERTUS certainty‑engineering methodology. He designed, directed, and red‑teamed VERITAS — applying epistemic scoring, Uncertainty Mass, and permanent STP seals to community crisis data. Code is open source. The judgment is not.
Updated May 16, 2026 - JavaScript
AION Scaffold — Intelligent tree-to-filesystem generator. Built by Sheldon K. Salmon, AI Reliability Architect. Part of the AION Constitutional Stack. Free forever. No tracking.
Updated May 6, 2026 - HTML
quick-gate-js (npm: quick-gate) is a deterministic JS/TS CI quality gate that unifies ESLint, TypeScript, build, and Lighthouse checks into one fail-fast result, with bounded auto-repair and structured escalation evidence for humans or agents. Works with Next.js, React, Vue, Svelte, or any Node project. A gate-and-escalate wrapper, not a dashboard.
Updated Jun 1, 2026 - JavaScript
Benchmark for evaluating advanced reasoning, recursive dependency resolution, and robustness capabilities of large language models in dynamic, noisy, and structurally challenging environments.
Updated May 15, 2026 - Python
Orchestration runtime for AI agent workflows that preserves task-state fidelity, prevents reasoning drift, and reduces wasted computation in long-horizon pipelines.
Updated Mar 19, 2026 - JavaScript
Research archive — eight published papers, Mahdi Ledger, and empirical foundations of the LC-OS governance framework.
Updated May 25, 2026
hermeneutic is an evidence-first drift gate for AI agents. It mines corrections from your AI chat logs (prior response, user correction, repair), classifies the drift, and runs a cheap-to-expensive pre-flight gate on the next response before drift ships. Regex, then structured scoring, then a pressure probe. MIT, zero dependencies, by Hermes Labs.
Updated May 31, 2026 - Python
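The cheap-to-expensive ordering in the hermeneutic description (regex, then structured scoring, then a pressure probe) can be sketched as an escalation chain in which each stage either decides or passes the response on. This is a hypothetical illustration, not hermeneutic's code: the stage functions, the drift pattern, the hedge-word scoring, and the thresholds are assumptions, and the third stage is a stub because the description does not say what the pressure probe does.

```python
import re
from typing import Callable, Optional

# A stage returns True (drift found), False (clean), or None (undecided, escalate).
Stage = Callable[[str], Optional[bool]]

# Hypothetical pattern, as if mined from a user correction such as "stop apologizing".
DRIFT_PATTERN = re.compile(r"\b(i apologi[sz]e|sorry)\b", re.IGNORECASE)
HEDGE_WORDS = {"maybe", "perhaps", "possibly"}


def regex_stage(response: str) -> Optional[bool]:
    """Cheapest check: a direct pattern hit is decisive, anything else escalates."""
    return True if DRIFT_PATTERN.search(response) else None


def scoring_stage(response: str) -> Optional[bool]:
    """Mid-cost check: score hedge-word density (an assumed stand-in metric)."""
    words = [w.strip(".,!?").lower() for w in response.split()]
    if not words:
        return None
    ratio = sum(w in HEDGE_WORDS for w in words) / len(words)
    if ratio > 0.2:
        return True   # clearly over the assumed threshold
    if ratio == 0:
        return False  # clearly clean
    return None       # borderline, escalate to the expensive stage


def probe_stage(response: str) -> Optional[bool]:
    """Stub for the most expensive stage; a real one might call a model."""
    return None


def run_gate(response: str, stages: list[tuple[str, Stage]]) -> tuple[bool, str]:
    """Run stages in order and stop at the first decisive verdict."""
    for name, stage in stages:
        verdict = stage(response)
        if verdict is not None:
            return verdict, name
    return False, "default"  # no stage decided; treat as clean
```

Ordering the stages by cost means most responses are settled by the regex or scoring step, and the expensive stage runs only on borderline cases.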
Enterprise AI system for decision intelligence — transforming research into scalable, context-aware insights at production scale | AditiKhare.com — AI Product Ecosystem
Updated Apr 20, 2026
Span-level hallucination detection for LLM-generated business analysis on Online Retail transaction data.
Updated May 26, 2026 - Python