The unified observability layer for the AI Control Plane.
AITrace provides the unified observability layer for the AI Control Plane, transforming opaque, non-deterministic AI processes into fully interpretable and debuggable execution traces.
Its mission is to create an Elixir-native instrumentation library and a corresponding data model that captures the complete causal chain of an AI agent's reasoning process—from initial prompt to final output, including all thoughts, tool calls, and state changes—enabling a true "Execution Cinema" experience for developers and operators.
Standalone tracing may use the :aitrace, :exporters application config.
Governed callers must pass explicit exporters to AITrace.export/2; ambient
application env must not select trace export sinks for governed evidence.
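The rule above can be pictured with a small sketch. It uses a hypothetical `GovernedExportSketch` module, not AITrace's own export code, and assumes exporters are `{module, opts}` tuples as in the configuration example later in this README. The function names and the error atom are invented for illustration.

```elixir
defmodule GovernedExportSketch do
  @moduledoc "Hypothetical illustration only; not part of AITrace."

  # Standalone path: the sink list may come from ambient application config.
  def standalone_exporters do
    Application.get_env(:aitrace, :exporters, [])
  end

  # Governed path: the caller must name its sinks. An empty or missing list
  # is an error rather than a silent fallback to application config.
  def governed_exporters(exporters) when is_list(exporters) and exporters != [] do
    {:ok, exporters}
  end

  def governed_exporters(_other), do: {:error, :explicit_exporters_required}
end
```

The point of the second clause is that a governed caller fails loudly when it forgets to name a sink, instead of inheriting whatever the process happens to be configured with.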
AITrace is an evidence and observability library, not the authority layer. In the ranked stack it provides trace refs, spans, events, replay bundles, export receipts, and execution-cinema data that other owners can join to their own truth:
products, AppKit, Mezzanine, Citadel, Jido Integration, StackLab
-> AITrace spans/events/export receipts
-> Mezzanine owns durable audit truth
-> Citadel owns authority truth
-> StackLab owns assembled proof joins
This distinction matters. A trace can prove what was observed by an instrumented path. It does not by itself prove that the path was authorized, that a workflow reached durable terminal truth, or that a release claim is closed. Those claims need the owning authority, audit, and proof repos to link AITrace refs into their receipts.
The current package still supports the simple local tracing API shown below, but it now also carries stack-oriented evidence contracts:
- bounded export behavior that redacts raw prompt, provider, webhook, payload, and oversize fields into hash-backed spillover refs
- file-export receipts that can carry release-manifest or evidence-owner refs
- authority-trace classification helpers
- AI platform trace bounds for prompt, guard, replay, eval, cost, and provider identity evidence
- replay contracts and replay engine packages under core/
- single-node proof trace fixtures used by StackLab
- persistence posture documentation for redacted memory/ref-only capture
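As a rough illustration of the bounded export behavior in the list above, the sketch below replaces sensitive or oversize attribute values with SHA-256 backed references. It is not AITrace.ExportBounds itself: the module name, the key list, the 256-byte limit, and the shape of the ref map are all assumptions made for this example.

```elixir
defmodule SpilloverSketch do
  @moduledoc "Hypothetical illustration; not AITrace.ExportBounds."

  # Assumed key names and size limit, chosen only for this example.
  @sensitive_keys [:prompt, :provider_payload, :webhook_body, :payload]
  @max_bytes 256

  # Return a copy of the attribute map with raw or oversize values replaced.
  def bound(attributes) when is_map(attributes) do
    Map.new(attributes, fn {key, value} ->
      if key in @sensitive_keys or oversize?(value) do
        {key, spillover_ref(value)}
      else
        {key, value}
      end
    end)
  end

  defp oversize?(value) when is_binary(value), do: byte_size(value) > @max_bytes
  defp oversize?(_value), do: false

  # Swap the raw value for a digest-backed reference plus its byte size.
  defp spillover_ref(value) do
    bytes = if is_binary(value), do: value, else: :erlang.term_to_binary(value)
    digest = :crypto.hash(:sha256, bytes) |> Base.encode16(case: :lower)
    %{spillover_ref: "sha256:" <> digest, byte_size: byte_size(bytes)}
  end
end
```

A consumer that holds the original value elsewhere can recompute the digest and join on it, while the exported trace never carries the raw bytes.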
Use ambient application config only for standalone tracing. Governed callers must pass explicit exporters and refs at the call site so export sinks are not silently selected by process configuration.
The current Extravaganza cutover returns complete route evidence through the
product command path, including trace refs. Its headless receipts currently
report trace_replay.status: not_emitted; that means the product proof has
receipt and route evidence, but has not exported a replay bundle through
AITrace for that command. When a product claims replay proof, Mezzanine or the
owning product path must emit AITrace events/exports that AITrace can replay.
The Synapse governed-effect lift adds AITrace.GovernedEffectEvidence, a
bounded evidence helper for effect refs, authority refs, dispatch refs, receipt
refs, trace refs, lifecycle entries, and metadata hashes. It supports the
staged-live diagnostic proof by making provider facts data in the receipt path,
not control-flow branches in product code. AITrace remains evidence/replay
infrastructure; authority and durable lifecycle truth remain owned by Citadel
and Mezzanine.
The cross-stack proof command is:
cd /home/home/p/g/n/stack_lab
MIX_ENV=test mix stack_lab.synapse.staged_live.v1 --json

See Generalized Stack Boundary for the current repo boundary and cutover-proof posture.
Maintainers should also read Code Smell Remediation before changing context propagation, export profiles, collector ownership, file export, replay reduction, or runtime identity.
flowchart TD
Operation["Instrumented<br/>operation"] --> Trace["Trace"]
Trace --> Span["Nested<br/>spans"]
Span --> Event["Point<br/>events"]
Event --> Attr["Bounded<br/>attributes"]
Attr --> Export["Explicit<br/>exporter"]
Export --> Receipt["Export<br/>receipt"]
Receipt --> Owner["Evidence<br/>owner"]
flowchart LR
Prompt["Prompt<br/>evidence"] --> Bounds["Trace<br/>bounds"]
Guard["Guard<br/>evidence"] --> Bounds
Replay["Replay<br/>evidence"] --> Bounds
Eval["Eval<br/>evidence"] --> Bounds
Cost["Cost<br/>evidence"] --> Bounds
Provider["Provider<br/>identity"] --> Bounds
Bounds --> Redaction["Redaction<br/>refs"]
Redaction --> StackProof["StackLab<br/>proof joins"]
flowchart TD
API["Trace<br/>API"] --> Context["Trace<br/>context"]
Context --> Collector["Collector"]
Collector --> Spans["Spans"]
Collector --> Events["Events"]
Spans --> Bounds["Export<br/>bounds"]
Events --> Bounds
Bounds --> Exporter["Explicit<br/>exporter"]
Exporter --> Receipt["Export<br/>receipt"]
flowchart LR
TraceRef["Trace<br/>refs"] --> Bundle["Replay<br/>bundle"]
Bundle --> Engine["Replay<br/>engine"]
Engine --> Redaction["Redaction<br/>manifest"]
Engine --> Proof["Proof<br/>join"]
Proof --> StackLab["StackLab"]
Proof --> Authority["Authority<br/>owner"]
Proof --> Audit["Audit<br/>owner"]
Debugging a simple web request is a solved problem. We have structured logs, metrics, and distributed tracing (like OpenTelemetry) that show the path of a request through a series of stateless services.
Debugging an AI agent is fundamentally different. It is like performing forensic analysis on a dream. The challenges are unique:
- Non-Determinism: The same input can produce different outputs and, more importantly, different reasoning paths.
- Deeply Nested Causality: A final answer may be the result of a multi-step chain of thought, where an LLM hallucinates, calls the wrong tool with the wrong arguments, misinterprets the result, and then tries to correct itself.
- Stateful Complexity: Agents are not stateless. Their behavior is conditioned by memory, scratchpads, and the history of the conversation. A simple log line is insufficient to capture the state that led to a decision.
- Polyglot Execution: An agent's "thought" may happen in Elixir, but its "action" (e.g., running a code interpreter) happens in a sandboxed Python environment. Tracing this flow across language boundaries is notoriously difficult.
Logger.info/1 is inadequate. Traditional APM tools provide a high-level view but lack the granular, AI-specific context needed to answer the most important question: "Why did the agent do that?"
AITrace is built on a few simple but powerful concepts, heavily inspired by OpenTelemetry but adapted for AI workflows.
- Trace: The complete, end-to-end record of a single transaction (e.g., one user message to an agent). It is identified by a unique trace_id. A trace is composed of a root Span and many nested Spans and Events.
- Span: A record of a timed operation with a distinct start and end. A span represents a unit of work. Examples: llm_call, tool_execution, prompt_rendering. Spans can be nested to represent a call graph. Each span has a name, start_time, end_time, and a key-value map of attributes.
- Event: A point-in-time annotation within a Span. It represents a notable occurrence that isn't a timed operation. Examples: agent_state_updated, validation_failed, tool_not_found.
- Context: An immutable Elixir struct (%AITrace.Context{}) that carries the trace_id and the current span_id. This context is explicitly passed through the entire call stack of a traced operation, ensuring all telemetry is correctly correlated.
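The concepts above map naturally onto plain structs. The sketch below is a simplified model for orientation only: the `ModelSketch` modules are hypothetical, and any field not named in the list (such as `span_id` and `parent_span_id` on a span, or `timestamp` on an event) is an assumption rather than AITrace's actual schema.

```elixir
defmodule ModelSketch do
  @moduledoc "Hypothetical data model; field names beyond the README are assumptions."

  defmodule Context do
    # Carries the correlation ids through a traced operation.
    @enforce_keys [:trace_id]
    defstruct [:trace_id, :span_id]
  end

  defmodule Event do
    # A point-in-time annotation inside a span.
    defstruct [:name, :timestamp, attributes: %{}]
  end

  defmodule Span do
    # A timed unit of work; parent_span_id encodes nesting.
    defstruct [:span_id, :parent_span_id, :name, :start_time, :end_time,
               attributes: %{}, events: []]
  end

  # Duration in whatever unit the timestamps use.
  def duration(%Span{start_time: s, end_time: e}) when is_integer(s) and is_integer(e) do
    e - s
  end

  # Direct children of a span, which is enough to rebuild the call graph.
  def children(spans, parent_id) do
    Enum.filter(spans, &(&1.parent_span_id == parent_id))
  end
end
```

Storing the parent id on each span keeps the list of spans flat, so the nesting can be reconstructed at read time instead of being maintained during tracing.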
Add aitrace to your mix.exs dependencies:
def deps do
  [
    {:aitrace, "~> 0.1.0"}
  ]
end

defmodule MyApp.Agent do
  require AITrace # Required to use the macros

  def handle_user_message(message, state) do
    # 1. Start a new trace for the entire transaction
    AITrace.trace "agent.handle_message" do
      # 2. Add point-in-time events with rich metadata
      AITrace.add_event("request_received", %{message_length: String.length(message)})

      # 3. Wrap discrete, timed operations in spans
      response = AITrace.span "reasoning_loop" do
        # Add attributes to the current span
        AITrace.with_attributes(%{model: "gpt-4", temperature: 0.7})

        # Perform reasoning
        think_about(message)
      end

      AITrace.add_event("reasoning_complete", %{token_usage: response.tokens})
      {:reply, response.answer, update_state(state)}
    end
  end
end

AITrace.trace "operation_name" do
  # Your code here - context is stored in process dictionary
end

AITrace.span "span_name" do
  # Timed operation - duration is automatically measured
end

AITrace.add_event("event_name", %{key: "value"})
AITrace.add_event("simple_event") # No attributes

AITrace.with_attributes(%{user_id: 42, region: "us-west"})

ctx = AITrace.get_current_context()
IO.inspect(ctx.trace_id)
IO.inspect(ctx.span_id)

trace =
  %AITrace.Trace{trace_id: "trace_123", created_at: 1_712_345_678_000_000, spans: [], metadata: %{}}

AITrace.export(trace)

This one-shot path is intended for integrations that already have a completed AITrace.Trace value and want to run it through the configured exporters without using the collector-backed macros.
Configure exporters in your application config:
# config/config.exs
config :aitrace,
  exporters: [
    {AITrace.Exporter.Console, verbose: true, color: true},
    {AITrace.Exporter.File, directory: "./traces"}
  ]

- AITrace.Exporter.Console - Prints human-readable traces to stdout
  - Options: verbose (show attributes/events), color (ANSI colors)
- AITrace.Exporter.File - Writes JSON traces to files
  - Options: directory (output directory, default: "./traces"), release_manifest_ref, evidence_owner_ref, source_node_ref, node_instance_id, boot_generation, node_role, deployment_ref, cluster_ref, commit_lsn, commit_hlc
  - The file exporter writes an adjacent .evidence.json receipt containing the trace artifact SHA-256, byte count, release-manifest or evidence-owner linkage, and proof posture. Trace data is not authoritative audit, incident, replay, review, or release-manifest proof unless the receipt is anchored by release_manifest_ref or an existing evidence_owner_ref.
  - When source_node_ref is configured, the trace JSON, every exported span, and the adjacent evidence receipt include per-node evidence. If commit_lsn and commit_hlc are also configured, the receipt includes node_order_evidence keyed by trace_id for proof-token joins.
  - Exported metadata and attributes are bounded by AITrace.ExportBounds; raw prompt/provider/webhook/payload-shaped fields and oversize values are replaced with SHA-256 spillover refs instead of being serialized inline.
  - Phase 7 capture posture defaults to a redacted memory/ref-only evidence profile. :off capture disables trace retention without blocking provider effects, and debug tap failure records :failed_non_mutating evidence without mutating trace, span, event, export, or replay-bundle state.
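To make the receipt description above more concrete, here is a minimal sketch of computing the digest, byte count, and anchoring posture for a written artifact. It is not the file exporter's real receipt format: the `ReceiptSketch` module, the map keys, and the posture atoms `:anchored` and `:unanchored_observation` are assumptions for illustration.

```elixir
defmodule ReceiptSketch do
  @moduledoc "Hypothetical receipt builder; not the AITrace.Exporter.File format."

  # Build a receipt map for the exact bytes that were written to disk.
  def build(artifact_bytes, opts \\ []) when is_binary(artifact_bytes) do
    manifest_ref = Keyword.get(opts, :release_manifest_ref)
    owner_ref = Keyword.get(opts, :evidence_owner_ref)

    %{
      sha256: :crypto.hash(:sha256, artifact_bytes) |> Base.encode16(case: :lower),
      byte_count: byte_size(artifact_bytes),
      release_manifest_ref: manifest_ref,
      evidence_owner_ref: owner_ref,
      # Posture names are invented here: without an anchor the trace is
      # only an observation, not authoritative proof.
      proof_posture: if(manifest_ref || owner_ref, do: :anchored, else: :unanchored_observation)
    }
  end
end
```

For example, `ReceiptSketch.build(json, evidence_owner_ref: "owner:example")` would mark the receipt as anchored, where `"owner:example"` is a placeholder ref.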
Implement the AITrace.Exporter behavior:
defmodule MyApp.CustomExporter do
  @behaviour AITrace.Exporter

  @impl true
  def init(opts), do: {:ok, opts}

  @impl true
  def export(trace, state) do
    # Send trace to your backend
    IO.inspect(trace)
    {:ok, state}
  end

  @impl true
  def shutdown(_state), do: :ok
end

See examples/basic_usage.exs for a complete working example:
mix run examples/basic_usage.exs

Output:
Trace: b37b73325dbd626481e0ff3e89de02c8
▸ reasoning (10.84ms) ✓
Attributes: %{model: "gpt-4", temperature: 0.7}
• reasoning_complete
%{thought_count: 3}
▸ tool_execution (5.95ms) ✓
Attributes: %{tool: "web_search"}
▸ response_generation (8.98ms) ✓
Attributes: %{tokens: 150}
- AITrace.Context - Carries trace_id and span_id through the call stack
- AITrace.Span - Timed operations with start/end times, attributes, and events
- AITrace.Event - Point-in-time annotations within spans
- AITrace.Trace - Complete trace containing all spans
- AITrace.Collector - In-memory Agent storing active traces
- AITrace.Application - Supervision tree managing the collector
- Context stored in process dictionary for implicit propagation
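The last item, implicit propagation through the process dictionary, can be sketched in a few lines. This is a simplified stand-in and not the AITrace macros: `SpanTimingSketch`, its dictionary key, and the shape of the returned span record are hypothetical.

```elixir
defmodule SpanTimingSketch do
  @moduledoc "Hypothetical illustration of process-dictionary context and span timing."

  @ctx_key :span_timing_sketch_context

  # Run `fun` as a named span: install a child context, time the body,
  # then restore the parent context even if the body raises.
  def span(name, fun) when is_function(fun, 0) do
    parent = Process.get(@ctx_key)
    span_id = random_hex(8)
    trace_id = if parent, do: parent.trace_id, else: random_hex(16)

    Process.put(@ctx_key, %{trace_id: trace_id, span_id: span_id})
    started = System.monotonic_time(:microsecond)

    try do
      result = fun.()
      duration = System.monotonic_time(:microsecond) - started

      record = %{
        name: name,
        trace_id: trace_id,
        span_id: span_id,
        parent_span_id: parent && parent.span_id,
        duration_us: duration
      }

      {result, record}
    after
      if parent, do: Process.put(@ctx_key, parent), else: Process.delete(@ctx_key)
    end
  end

  # The context visible to code running inside the current span, if any.
  def current_context, do: Process.get(@ctx_key)

  defp random_hex(bytes), do: Base.encode16(:crypto.strong_rand_bytes(bytes), case: :lower)
end
```

Because the process dictionary is per process, a context stored this way is not visible in a spawned process or Task unless it is handed over explicitly.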
AITrace is designed to integrate with other AI infrastructure:
- DSPex - Automatic instrumentation for LLM calls and prompt rendering
- Altar - Tool execution tracing with arguments and results
- Snakepit - Cross-language tracing via gRPC metadata
- Phoenix Channels - Real-time trace streaming to web UIs
- OpenTelemetry - Export to standard observability platforms
✅ Implemented (v0.1.0)
- Core data structures (Context, Span, Event, Trace)
- Trace and span macros with automatic timing
- Event and attribute APIs
- Console exporter (human-readable output)
- File exporter (JSON format)
- Comprehensive test suite (80 tests)
- Working examples
🚧 Planned
- Phoenix Channel exporter for real-time streaming
- OpenTelemetry exporter
- OTP integration helpers (GenServer, Oban)
- Cross-process context propagation
- "Execution Cinema" web UI with waterfall views
- DSPex, Altar, and Snakepit integrations
# Run all tests
mix test
# Run with coverage
mix test --cover
# Run example
mix run examples/basic_usage.exs

MIT - See LICENSE for details.
AITrace is part of the AI Control Plane ecosystem. Contributions welcome!
See docs/persistence.md for tiers, defaults, adapters, unsupported selections, config examples, restart claims, durability claims, debug sidecar behavior, redaction guarantees, migration or preflight behavior, and no-bypass scope when applicable.
Chassis emits bounded spans for deployment, provisioning, mesh, health,
rollback, evolution, model materialization, hardware admission, and tensor
reload events. Attributes must pass AITrace.ExportBounds.profile/0 and carry
refs, digests, summaries, outcomes, and bounded counts instead of raw payloads.
Baseline Chassis span names include deployment accepted, adapter selected, provisioning started/completed, mesh joined, health checked, receipt emitted, and rollback triggered. Evolution and model spans are listed below.
Chassis Evolution span names include:
- chassis.evolution.failure_batch.created
- chassis.evolution.flag.recorded
- chassis.evolution.started
- chassis.evolution.coding_agent.spawned
- chassis.evolution.patch.proposed
- chassis.evolution.candidate.built
- chassis.evolution.trial.provisioned
- chassis.evolution.trial.started
- chassis.evolution.trial.completed
- chassis.evolution.scoring.completed
- chassis.evolution.blocked
- chassis.evolution.converged
- chassis.evolution.promotion.requested
- chassis.evolution.operator_consent.recorded
- chassis.evolution.swap.started
- chassis.evolution.swap.committed
- chassis.evolution.swap.rolled_back
- chassis.evolution.failed
- chassis.evolution.status.read
- chassis.evolution.candidate.read
Model, hardware, and tensor span names include:
- chassis.model.weight.materialization.started
- chassis.model.weight.materialization.completed
- chassis.model.weight.verify.failed
- chassis.hardware.accelerator.validated
- chassis.hardware.accelerator.rejected
- chassis.tensor_patch.reload.started
- chassis.tensor_patch.reload.completed
- chassis.tensor_patch.rollback.completed
Chassis events must not carry raw credentials, raw private transcript bodies,
raw prompt payloads, raw diffs, raw provider payloads, raw model weight bytes,
mutable filesystem state as authority, or unsafe atom values. Use refs,
bounded summaries, digest summaries, receipt refs, trace refs, and explicit
redaction posture fields. The detailed Chassis attribute filter rules are in
../j/jido_brainstorm/nshkrdotcom/docs/20260529/chassis_impl/0532_chassis_evolution_aitrace_and_observability.md.
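A pre-emit check in the spirit of these rules might look like the sketch below. It does not reproduce AITrace.ExportBounds.profile/0 or the detailed filter rules in the linked document: the `ChassisAttrSketch` module, the deny-list of key names, the 512-byte limit, and the choice to reject every atom value other than booleans and nil are all assumptions for illustration.

```elixir
defmodule ChassisAttrSketch do
  @moduledoc "Hypothetical pre-emit check; not the real Chassis attribute filter."

  # Assumed deny-list and size limit, chosen only for this example.
  @forbidden_keys ~w(credentials transcript_body prompt_payload diff provider_payload weight_bytes)
  @max_summary_bytes 512

  # Returns :ok, or {:error, reasons} listing each offending attribute.
  def check(span_name, attributes) when is_binary(span_name) and is_map(attributes) do
    if String.starts_with?(span_name, "chassis.") do
      violations =
        for {key, value} <- attributes, reason = violation(key, value), do: {key, reason}

      if violations == [], do: :ok, else: {:error, Enum.sort(violations)}
    else
      {:error, :not_a_chassis_span}
    end
  end

  # nil means the attribute is acceptable under this sketch's rules.
  defp violation(key, value) do
    cond do
      to_string(key) in @forbidden_keys -> :raw_field
      is_atom(value) and not is_boolean(value) and not is_nil(value) -> :atom_value
      is_binary(value) and byte_size(value) > @max_summary_bytes -> :oversize
      true -> nil
    end
  end
end
```

Returning every violation at once, rather than stopping at the first, makes it easier for an emitting component to fix all offending attributes in one pass.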