The summary.json artifact provides a high-level view of episode outcomes, metrics, and flags. It’s designed for quick access to key information without parsing the full event timeline.
Schema overview
```json
{
  "schema_version": "1.3.0",
  "episode_id": "ep_2024_abc123_s0",
  "task": "Draft release notes for v1.2.0",
  "started_at": "2024-01-15T10:30:00Z",
  "duration_sec": 5.0,
  "adapter_result": "success",
  "outcome": "success_unverified",
  "verification": { ... },
  "flags": { ... },
  "metrics": { ... },
  "insight": { ... },
  "tags": { ... }
}
```
Root fields
- `schema_version`: Version of the summary schema. Currently `"1.3.0"`.
- `episode_id`: Unique episode identifier.
- `task`: The original task or goal.
- `started_at`: ISO 8601 start timestamp.
- `duration_sec`: Total episode duration in seconds.
- `adapter_result`: Adapter execution status: `"success"`, `"error"`, or `"skipped"`.
- `outcome`: Verification outcome: `"success"`, `"goal_not_achieved"`, `"success_unverified"`, or `"error"`.
- `verification`: Verification summary block, with these properties:
  - `provided`: Whether verification assertions were provided.
  - `passed`: Verification pass/fail status (null when unverified or unavailable).
  - `assertions`: Assertion results (empty when not provided).
  - `workspace_diff`: Snapshot diff with added, modified, and deleted lists.
  - `snapshots`: Snapshot paths for pre and post.
  - `policy`: Snapshot policy (ignore list + symlink handling).
  - An error reason when verification fails (e.g., `"workspace_unavailable"`).
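Taken together, `outcome` and the `verification` block tell you how much to trust a run. A minimal sketch of that decision, using a plain dict shaped like the fields above rather than the SDK (the `trust_level` helper and its labels are illustrative, not part of the schema):

```python
# Classify an episode from its root `outcome` and `verification` fields.
# Sample values mirror the schema documentation; the labels are ours.
summary = {
    "adapter_result": "success",
    "outcome": "success_unverified",
    "verification": {"provided": False, "passed": None, "assertions": []},
}

def trust_level(summary):
    """Return a coarse trust label for an episode summary dict."""
    verification = summary.get("verification", {})
    if summary["outcome"] == "success" and verification.get("passed"):
        return "verified"          # assertions were provided and all passed
    if summary["outcome"] == "success_unverified":
        return "unverified"        # adapter succeeded, no assertions to check
    return "failed_or_error"       # goal_not_achieved or error outcomes

print(trust_level(summary))  # -> unverified
```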
Flags
Configuration flags for the episode run.
```json
{
  "flags": {
    "intuition": true,
    "mode": "meta",
    "direction": {
      "applied": 2,
      "vetoed": 0,
      "policy": "SafetyPolicy@1.0",
      "threshold": 0.75,
      "last_diff": ["plan.steps[0].parameters"]
    }
  }
}
```
- `intuition`: Whether intuition was enabled.
- `mode`: Execution mode: `"meta"` or `"minimal"`.
- `direction`: Direction statistics (meta mode only), with these properties:
  - `applied`: Number of directives applied.
  - `vetoed`: Number of directives vetoed.
  - `policy`: Policy that issued directives.
  - `threshold`: Confidence threshold used.
  - `last_diff`: Paths modified by the last directive.
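The `applied` and `vetoed` counters make it easy to track how often the policy blocked a directive. A small sketch over a plain dict shaped like the flags example (the veto-rate calculation is ours, not a schema field):

```python
# Compute a veto rate from flags.direction; `direction` is absent in
# minimal mode, so fall back to an empty dict.
flags = {
    "intuition": True,
    "mode": "meta",
    "direction": {"applied": 2, "vetoed": 0, "threshold": 0.75},
}

direction = flags.get("direction") or {}
total = direction.get("applied", 0) + direction.get("vetoed", 0)
veto_rate = direction.get("vetoed", 0) / total if total else 0.0
print(f"Veto rate: {veto_rate:.0%}")  # -> Veto rate: 0%
```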
Metrics
Core execution metrics.
```json
{
  "metrics": {
    "success": 1,
    "plan_count": 2,
    "act_count": 3,
    "reflect_count": 1,
    "veto_count": 0,
    "learn_proposals": 1,
    "learn_applied": 0,
    "latencies": {
      "first_action_ms": 150,
      "total_ms": 5000,
      "planning_ms": 500,
      "execution_ms": 4000
    }
  }
}
```
- `success`: Success indicator: 1 for success, 0 for failure.
- `plan_count`: Number of planning iterations.
- `act_count`: Number of actions executed.
- `reflect_count`: Number of reflection passes.
- `veto_count`: Number of vetoes.
- `learn_proposals`: Number of learning proposals generated.
- `learn_applied`: Number of learning proposals applied.
- `latencies`: Timing metrics, with these properties:
  - `first_action_ms`: Milliseconds to first action.
  - `total_ms`: Total episode duration in milliseconds.
  - `planning_ms`: Time spent in the planning phase.
  - `execution_ms`: Time spent in the execution phase.
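Since `planning_ms` and `execution_ms` may be omitted (the complete example further down carries only `first_action_ms` and `total_ms`), a breakdown of where time went should default missing phases to zero. A sketch over a plain latencies dict:

```python
# Split total episode time into phases; planning_ms and execution_ms
# are optional, so default them to 0 rather than raising KeyError.
latencies = {
    "first_action_ms": 150,
    "total_ms": 5000,
    "planning_ms": 500,
    "execution_ms": 4000,
}

accounted = latencies.get("planning_ms", 0) + latencies.get("execution_ms", 0)
other_ms = latencies["total_ms"] - accounted  # time outside both phases
print(f"Planning+execution: {accounted}ms, other: {other_ms}ms")
```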
Insight
Advanced insight metrics (meta mode only; may be absent in minimal mode).
```json
{
  "insight": {
    "metrics": {
      "plan_adherence": 0.95,
      "tool_coverage": 1.0,
      "branching_factor": 2,
      "plan_revisions": 1
    }
  }
}
```
- `metrics`: Computed insight metrics, with these properties:
  - `plan_adherence`: How closely execution matched the plan (0-1), rounded to 4 decimal places.
  - `tool_coverage`: Fraction of planned tools that were used (0-1).
  - `branching_factor`: Count of direction events emitted.
  - `plan_revisions`: Number of times the plan was revised.
Tags
User-provided metadata.
```json
{
  "tags": {
    "environment": "staging",
    "team": "platform",
    "priority": "high"
  }
}
```
Key-value pairs of user-provided metadata.
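Because tags are free-form user metadata, any given key may be missing, so lookups should not assume it exists. A sketch filtering a list of summary dicts by tag value (the sample summaries are illustrative):

```python
# Select summaries whose `environment` tag is "staging"; missing tags
# must not raise, hence the chained .get() calls.
summaries = [
    {"episode_id": "ep_1", "tags": {"environment": "staging", "team": "platform"}},
    {"episode_id": "ep_2", "tags": {"environment": "prod"}},
    {"episode_id": "ep_3", "tags": {}},
]

staging = [s for s in summaries if s.get("tags", {}).get("environment") == "staging"]
print([s["episode_id"] for s in staging])  # -> ['ep_1']
```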
Complete example
```json
{
  "schema_version": "1.3.0",
  "episode_id": "ep_2024_abc123_s0",
  "task": "Draft release notes for v1.2.0",
  "started_at": "2024-01-15T10:30:00Z",
  "duration_sec": 5.0,
  "adapter_result": "success",
  "outcome": "success_unverified",
  "verification": {
    "provided": false,
    "passed": null,
    "assertions": [],
    "workspace_diff": null,
    "snapshots": null,
    "policy": {
      "ignore": [".git", "__pycache__", ".venv", ".noesis"],
      "symlinks": "skip"
    }
  },
  "flags": {
    "intuition": true,
    "mode": "meta",
    "direction": {
      "applied": 1,
      "vetoed": 0,
      "policy": "SafetyPolicy@1.0",
      "threshold": 0.75,
      "last_diff": ["plan.steps[0].description"]
    }
  },
  "metrics": {
    "success": 1,
    "plan_count": 2,
    "act_count": 3,
    "reflect_count": 1,
    "veto_count": 0,
    "learn_proposals": 0,
    "learn_applied": 0,
    "latencies": {
      "first_action_ms": 150,
      "total_ms": 5000
    }
  },
  "insight": {
    "metrics": {
      "plan_adherence": 0.95,
      "tool_coverage": 1.0,
      "branching_factor": 2,
      "plan_revisions": 1
    }
  },
  "tags": {
    "environment": "staging",
    "team": "platform"
  }
}
```
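When consuming summaries from other tooling, it can help to check that the root fields documented above are present before digging deeper. A minimal sketch, with the required-field set taken from the schema overview (adjust it if your `schema_version` differs) and an abridged copy of the example as input:

```python
# Sanity-check that a summary dict carries the documented root fields.
REQUIRED = {"schema_version", "episode_id", "task", "started_at",
            "duration_sec", "adapter_result", "outcome", "metrics"}

summary = {
    "schema_version": "1.3.0",
    "episode_id": "ep_2024_abc123_s0",
    "task": "Draft release notes for v1.2.0",
    "started_at": "2024-01-15T10:30:00Z",
    "duration_sec": 5.0,
    "adapter_result": "success",
    "outcome": "success_unverified",
    "metrics": {"success": 1},
}

missing = REQUIRED - summary.keys()
print("OK" if not missing else f"Missing: {sorted(missing)}")  # -> OK
```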
Reading summaries
Python
```python
import noesis as ns

summary = ns.summary.read("ep_abc123")

# Basic info
print(f"Task: {summary['task']}")
print(f"Success: {summary['metrics']['success']}")

# Metrics
metrics = summary['metrics']
print(f"Actions: {metrics['act_count']}")
print(f"Vetoes: {metrics['veto_count']}")
print(f"First action: {metrics['latencies']['first_action_ms']}ms")

# Insight (if available)
insight = summary.get('insight', {}).get('metrics', {})
print(f"Plan adherence: {insight.get('plan_adherence', 'N/A')}")
```
CLI
```bash
# Inspect episode dashboard
noesis view ep_abc123

# Resolve artifact directory from JSON output
EP_DIR=$(noesis view ep_abc123 --json | jq -r '.episode_dir')

# Read summary fields directly
jq '.metrics.success' "$EP_DIR/summary.json"
```
File access
```bash
jq . .noesis/episodes/ep_abc123/summary.json
```
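Since summary.json is plain JSON, the standard library is enough when the SDK is unavailable. A sketch using a temporary file in place of `.noesis/episodes/ep_abc123/summary.json` (the sample content is abridged):

```python
import json
import tempfile
from pathlib import Path

# A throwaway file stands in for .noesis/episodes/<episode_id>/summary.json
# so the example is self-contained.
with tempfile.TemporaryDirectory() as root:
    path = Path(root) / "summary.json"
    path.write_text(json.dumps({"metrics": {"success": 1, "veto_count": 0}}))

    summary = json.loads(path.read_text())
    print(summary["metrics"]["success"])  # -> 1
```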
Use cases
Check success rate
```python
import noesis as ns

episodes = ns.list_runs(limit=100)
successes = sum(
    ns.summary.read(ep['episode_id'])['metrics']['success']
    for ep in episodes
)
rate = successes / len(episodes) if episodes else 0.0  # avoid division by zero
print(f"Success rate: {rate:.2%}")
```
Find vetoed episodes
```python
import noesis as ns

episodes = ns.list_runs(limit=100)
vetoed = [
    ep for ep in episodes
    if ns.summary.read(ep['episode_id'])['metrics']['veto_count'] > 0
]
print(f"Vetoed episodes: {len(vetoed)}")
```
Analyze latencies
```python
import noesis as ns
import statistics

episodes = ns.list_runs(limit=100)
latencies = [
    ns.summary.read(ep['episode_id'])['metrics']['latencies'].get('first_action_ms', 0)
    for ep in episodes
]
print(f"Median first action: {statistics.median(latencies):.0f}ms")
print(f"P95 first action: {statistics.quantiles(latencies, n=20)[-1]:.0f}ms")
```
Next steps
- State schema: State artifact reference.
- Export metrics: Export to observability tools.