Every Noēsis episode produces a set of structured artifacts that capture the complete cognitive trace. These files enable replay, debugging, auditing, and analysis.

Artifact structure

runs/
  <label>/                     # e.g., "demo", "production"
    <episode_id>/              # e.g., "ep_20251108_120000_abc123_def4_s0"
      events.jsonl             # cognitive event timeline with lineage
      summary.json             # metrics and KPIs (insight.metrics)
      state.json               # plan, beliefs, memory, outcomes
      manifest.json            # SHA-256 catalog + optional HMAC
      learn.jsonl              # learning signals (optional)
      prompts.jsonl            # prompt provenance (opt-in, ADR-005)
Episode IDs use ULID format (monotonic, sortable, 48-bit timestamp + 80-bit entropy). Directive and governance IDs use deterministic UUIDv5 for reproducible lineage tracking.
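
For illustration, a deterministic directive ID can be derived with the standard library's uuid5; the namespace below is a placeholder, not the one Noēsis actually uses:

import uuid

# Placeholder namespace for illustration; Noēsis's real namespace UUID may differ.
NOESIS_NS = uuid.NAMESPACE_DNS

# Identical inputs always yield the identical UUIDv5, which is what
# makes directive and governance lineage reproducible across runs.
directive_id = uuid.uuid5(NOESIS_NS, "SafetyPolicy@1.0/plan.steps[0]")
print(directive_id)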

summary.json

The summary captures episode outcomes, metrics, and cross-references.

Schema

{
  "schema_version": "1.2.0",
  "episode_id": "ep_2024_abc123_s0",
  "task": "Draft release notes for v1.2.0",
  "seed": 42,
  "started_at": "2024-01-15T10:30:00Z",
  "duration_sec": 5.12,
  "flags": {
    "intuition": true,
    "mode": "meta",
    "using": "langgraph",
    "direction": {
      "applied": 1,
      "vetoed": 0,
      "policy": "SafetyPolicy@1.0",
      "threshold": 0.75,
      "last_diff": [
        "plan.steps[0].params.limit: null → 100"
      ]
    }
  },
  "ports": {
    "model": "openai:gpt-4o-mini"
  },
  "agents_config_hash": "sha256:9f7d...",
  "answer": {},
  "metrics": {
    "success": 1,
    "plan_count": 2,
    "act_count": 3,
    "reflect_count": 1,
    "veto_count": 0,
    "latencies": {
      "first_action_ms": 150,
      "total_ms": 5000
    }
  },
  "insight": {
    "plan_adherence": 0.95,
    "tool_coverage": 1.0,
    "branching_factor": 2
  },
  "tags": {
    "environment": "staging",
    "team": "platform"
  }
}

Key fields

Field                Description
schema_version       Version of the summary schema
episode_id           Unique episode identifier
task                 The original task/goal
seed                 Seed used for deterministic runs (if provided)
duration_sec         Wall-clock duration of the episode
flags                Configuration flags for the run
ports                Adapter labels bound to the run (e.g., model provider)
agents_config_hash   Hash of adapters + intuition config for reproducibility
metrics              Computed metrics
insight              Advanced insight metrics
tags                 User-provided metadata

Reading summaries

import noesis as ns

episode_id = ns.last()
summary = ns.summary.read(episode_id)

print(f"Task: {summary['task']}")
print(f"Success: {summary['metrics']['success']}")
print(f"Actions: {summary['metrics']['act_count']}")

state.json

The state captures the cognitive context at the end of the episode.

Schema

{
  "version": "1.0",
  "state_schema_version": "1.0.0",
  "episode": {
    "id": "ep_2024_abc123_s0",
    "seed": 0,
    "adapter": "baseline",
    "started_at": "2024-01-15T10:30:00Z",
    "completed_at": "2024-01-15T10:30:05Z"
  },
  "goal": {
    "task": "Draft release notes for v1.2.0",
    "context": {
      "changelog_path": "CHANGELOG.md"
    }
  },
  "beliefs": [
    {
      "statement": "Version 1.2.0 includes 3 new features",
      "confidence": 0.9,
      "provenance": "changelog_analysis"
    }
  ],
  "plan": {
    "steps": [
      {
        "id": "step_1",
        "kind": "detect",
        "description": "Read changelog entries",
        "status": "done",
        "inputs": {"file": "CHANGELOG.md"},
        "outputs": {"entries": 15}
      },
      {
        "id": "step_2",
        "kind": "act",
        "description": "Generate summary",
        "status": "done"
      }
    ]
  },
  "memory": {
    "facts": [
      {
        "key": "changelog_format",
        "value": "keep-a-changelog",
        "timestamp": "2024-01-15T10:30:02Z"
      }
    ],
    "scratchpad": "Found 3 breaking changes to highlight"
  },
  "outcomes": {
    "status": "ok",
    "actions": [
      {
        "tool": "changelog_reader",
        "input": "CHANGELOG.md",
        "output": {"entries": 15},
        "timestamp": "2024-01-15T10:30:01Z"
      }
    ],
    "metrics": {
      "task_score": 0.95
    },
    "artifacts": ["release_notes.md"]
  }
}

Step kinds

Kind      Purpose
detect    Gather information
analyze   Process or categorize
plan      Sub-planning
act       Execute an action
verify    Check results
review    Human review point

Step statuses

Status    Meaning
pending   Not yet started
running   Currently executing
done      Completed successfully
skipped   Intentionally skipped
failed    Failed with an error
vetoed    Blocked by policy
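
Step statuses can be tallied straight from state.json; a quick sketch using the layout shown above:

from collections import Counter

import noesis as ns

state = ns.state.read(ns.last())

# Count plan steps by status, e.g., Counter({'done': 2})
status_counts = Counter(step["status"] for step in state["plan"]["steps"])
print(status_counts)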

Outcome statuses

Status    Meaning
ok        Completed successfully
error     Failed with an error
vetoed    Blocked by policy
aborted   Manually stopped
partial   Partially completed
pending   Still running

Reading state

import noesis as ns

episode_id = ns.last()
state = ns.state.read(episode_id)

print(f"Goal: {state['goal']['task']}")
print(f"Steps: {len(state['plan']['steps'])}")
print(f"Status: {state['outcomes']['status']}")

events.jsonl

The event timeline records every phase transition with timing and lineage.

Format

Events are stored as newline-delimited JSON (JSONL):
{"id": "evt_1", "phase": "observe", "payload": {...}, "metrics": {...}}
{"id": "evt_2", "phase": "interpret", "payload": {...}, "caused_by": "evt_1", "metrics": {...}}
{"id": "evt_3", "phase": "plan", "payload": {...}, "caused_by": "evt_2", "metrics": {...}}

Event structure

{
  "id": "evt_abc123",
  "phase": "plan",
  "agent_id": "meta_planner",
  "payload": {
    "steps": [
      {"kind": "detect", "description": "Gather data"}
    ]
  },
  "metrics": {
    "started_at": "2024-01-15T10:30:00.500Z",
    "completed_at": "2024-01-15T10:30:00.512Z",
    "duration_ms": 12.7
  },
  "caused_by": "evt_xyz789"
}
Event metrics always include started_at, completed_at, and duration_ms, which you can aggregate for per-phase latency.
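
For example, mean latency per phase can be aggregated in a few lines:

from collections import defaultdict

import noesis as ns

durations = defaultdict(list)
for event in ns.events.read(ns.last()):
    durations[event["phase"]].append(event["metrics"]["duration_ms"])

# Mean duration per phase, in milliseconds
for phase, values in sorted(durations.items()):
    print(f"{phase}: {sum(values) / len(values):.1f} ms")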

Event phases

Phase        Purpose
observe      Capture raw input
interpret    Extract signals
plan         Decide actions
direction    Policy directives
governance   Audit decisions
act          Execute actions
reflect      Evaluate outcomes
learn        Propose updates
terminate    Episode end

Reading events

import noesis as ns

episode_id = ns.last()
events = list(ns.events.read(episode_id))

# Filter by phase
plan_events = [e for e in events if e["phase"] == "plan"]
act_events = [e for e in events if e["phase"] == "act"]

# Reconstruct causal chain
for event in events:
    print(f"{event['phase']}: caused_by={event.get('caused_by', 'none')}")
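
To trace lineage end to end, walk the caused_by links backward from the final event; a small sketch, assuming event IDs are unique within an episode:

# Index events by ID, then follow caused_by links from the last event
by_id = {e["id"]: e for e in events}
chain = [events[-1]]
while (parent_id := chain[-1].get("caused_by")) in by_id:
    chain.append(by_id[parent_id])

# Print the chain from root to leaf
for event in reversed(chain):
    print(event["phase"], event["id"])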

CLI access

# All events
noesis events ep_abc123

# Filter by phase
noesis events ep_abc123 --phase act

# As JSON
noesis events ep_abc123 -j | jq '.[] | select(.phase == "plan")'

manifest.json

The manifest provides integrity verification for all artifacts with SHA-256 hashes and optional signatures.

Schema

{
  "schema_version": "manifest/1.0",
  "episode_id": "ep_2024_abc123_s0",
  "created_at": "2024-01-15T10:30:05Z",
  "files": [
    {"name": "summary.json", "sha256": "sha256:abc123...", "size_bytes": 1234, "kind": "summary"},
    {"name": "state.json", "sha256": "sha256:def456...", "size_bytes": 5678, "kind": "state"},
    {"name": "events.jsonl", "sha256": "sha256:ghi789...", "size_bytes": 9012, "kind": "events"}
  ],
  "signature": {
    "alg": "hs256",
    "kid": "ops-key-1",
    "value": "base64sig",
    "ts": "2024-01-15T10:30:05Z"
  }
}

Key fields

Field            Description
schema_version   Manifest schema version (manifest/1.0)
files            Array of file entries, each with name, sha256, size_bytes, and kind (summary, state, events, learn, attachment, custom)
signature        Optional signature block with algorithm, key ID, value, and timestamp
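
The exact bytes covered by the signature are defined by Noēsis; purely as an illustration, an HMAC-SHA256 (hs256) check might look like this, assuming the MAC covers the canonical JSON of the manifest minus the signature block:

import base64
import hashlib
import hmac
import json


def verify_signature(manifest: dict, secret: bytes) -> bool:
    """Illustrative only; confirm the actual signing payload with Noēsis."""
    body = {k: v for k, v in manifest.items() if k != "signature"}
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    provided = base64.b64decode(manifest["signature"]["value"])
    # Constant-time comparison avoids timing side channels
    return hmac.compare_digest(expected, provided)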

Artifact immutability guarantees

All artifacts are written atomically using this pattern:
  1. Write to temporary file
  2. Call fsync() to ensure durability
  3. Atomic rename to final path
This ensures no partial writes survive system crashes.
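
A minimal sketch of the same pattern in Python:

import json
import os


def atomic_write_json(path: str, obj: dict) -> None:
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(obj, f)
        f.flush()
        os.fsync(f.fileno())  # ensure bytes hit disk before the rename
    os.replace(tmp, path)     # atomic rename; readers never see a partial file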

Verifying integrity

import hashlib
import json
from pathlib import Path


def verify_manifest(episode_dir: Path) -> bool:
    """Verify artifact integrity against the manifest."""
    manifest_path = episode_dir / "manifest.json"

    with open(manifest_path) as f:
        manifest = json.load(f)

    # "files" is an array of entries: name, sha256, size_bytes, kind
    for entry in manifest["files"]:
        file_path = episode_dir / entry["name"]

        # Check size
        if file_path.stat().st_size != entry["size_bytes"]:
            return False

        # Check hash (the manifest stores hashes with a "sha256:" prefix)
        with open(file_path, "rb") as f:
            actual_hash = hashlib.sha256(f.read()).hexdigest()

        if f"sha256:{actual_hash}" != entry["sha256"]:
            return False

    return True
JSON serialization uses canonical_dumps() with sorted keys for byte-identical output, ensuring consistent hashes across runs.
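
canonical_dumps() is internal to Noēsis, but a functionally similar encoder using only the standard library would be:

import json


def canonical_dumps(obj) -> str:
    # Sorted keys and fixed separators produce byte-identical output
    # for equal inputs, so SHA-256 hashes stay stable across runs.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))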

learn.jsonl (optional)

Learning signals are stored separately when the learn phase emits proposals.

Schema

{"proposal_id": "prop_1", "type": "threshold", "target": "veto_confidence", "suggested_value": 0.8}
{"proposal_id": "prop_2", "type": "memory", "key": "pattern_detected", "value": "sql_injection_attempt"}

Reading learn signals

import json
from pathlib import Path

import noesis as ns

episode_id = ns.last()
learn_path = Path(f"runs/demo/{episode_id}/learn.jsonl")  # assumes label="demo"

if learn_path.exists():
    with open(learn_path) as f:
        proposals = [json.loads(line) for line in f]
    
    for p in proposals:
        print(f"Proposal: {p['type']} -> {p.get('target', p.get('key'))}")

prompts.jsonl (opt-in)

When prompt provenance is enabled (ADR-005, experimental), all prompts are recorded for debugging and auditing.

Schema

{
  "prompt_id": "pmt_abc123",
  "timestamp": "2024-01-15T10:30:01Z",
  "model": "gpt-4",
  "template": "Generate release notes for:\n{changelog}",
  "variables": {"changelog": "..."},
  "response": "## Release Notes...",
  "tokens": {"input": 150, "output": 200},
  "latency_ms": 1250
}
Experimental feature (ADR-005): Enable prompt logging carefully—prompts may contain sensitive data. Use ns.set(log_prompts=True) only when needed for debugging or compliance.

Use cases

  • Debugging: Trace exactly what prompts were sent to LLMs
  • Compliance: Audit trail for regulated industries
  • Cost analysis: Track token usage per episode (see the sketch below)
  • Prompt optimization: Analyze prompt patterns across runs
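
For cost analysis, per-prompt token counts can be summed directly from the file (mirroring the learn.jsonl pattern above; runs/demo assumes the run used label="demo"):

import json
from pathlib import Path

import noesis as ns

episode_id = ns.last()
prompts_path = Path(f"runs/demo/{episode_id}/prompts.jsonl")

if prompts_path.exists():
    with open(prompts_path) as f:
        prompts = [json.loads(line) for line in f]

    # Total token usage for the episode
    total_in = sum(p["tokens"]["input"] for p in prompts)
    total_out = sum(p["tokens"]["output"] for p in prompts)
    print(f"Tokens: {total_in} in / {total_out} out")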

Storage configuration

Runs directory

import noesis as ns

# Set custom runs directory
ns.set(runs_dir="./my-runs")

# Or via environment
# NOESIS_RUNS_DIR=./my-runs

Labels

Organize episodes with labels:
import noesis as ns

ns.set(label="production")
# Artifacts go to runs/production/ep_.../

ns.set(label="staging")
# Artifacts go to runs/staging/ep_.../

Retention

Control artifact retention:
from noesis.episode import EpisodeIndex

store = EpisodeIndex("./runs/_episodes", ttl_days=14)
store.vacuum()  # Remove episodes older than 14 days

Best practices

  • Back up production artifacts. The runs directory contains valuable audit data; include it in your backup strategy.
  • Redact sensitive data. Tasks and prompts may contain PII. Consider scrubbing before long-term storage (see the sketch below).
  • Use manifests for compliance. The SHA-256 checksums in manifest.json provide tamper evidence for audits.
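
A minimal redaction pass might look like this; the email pattern is illustrative, and real PII scrubbing needs patterns suited to your data:

import json
import re
from pathlib import Path

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact_summary(episode_dir: Path) -> None:
    """Mask email addresses in summary.json before archiving."""
    path = episode_dir / "summary.json"
    summary = json.loads(path.read_text())
    summary["task"] = EMAIL.sub("[REDACTED]", summary["task"])
    path.write_text(json.dumps(summary, indent=2))

Note that rewriting an artifact invalidates its hash in manifest.json, so redact archive copies rather than the originals you intend to verify.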
