clawdbot/extensions/open-prose/skills/prose/state/sqlite.md
2026-01-23 00:49:40 +00:00

role: sqlite-state-management
status: experimental
summary: SQLite-based state management for OpenProse programs. This approach persists execution state to a SQLite database, enabling structured queries, atomic transactions, and flexible schema evolution.
requires: sqlite3 CLI tool in PATH
see-also:
  • ../prose.md: VM execution semantics
  • filesystem.md: File-based state (default, more prescriptive)
  • in-context.md: In-context state (for simple programs)
  • ../primitives/session.md: Session context and compaction guidelines

SQLite State Management (Experimental)

This document describes how the OpenProse VM tracks execution state using a SQLite database. This is an experimental alternative to file-based state (filesystem.md) and in-context state (in-context.md).

Prerequisites

Requires: The sqlite3 command-line tool must be available in your PATH.

| Platform | Installation |
|----------|--------------|
| macOS | Pre-installed |
| Linux | apt install sqlite3 / dnf install sqlite3 / etc. |
| Windows | winget install SQLite.SQLite or download from sqlite.org |

If sqlite3 is not available, the VM will fall back to filesystem state and warn the user.
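
The fallback rule above can be sketched in a few lines. This is only an illustration, written in Python rather than the CLI the VM actually uses; the function name `choose_state_mode` and the warning text are hypothetical, not part of OpenProse.

```python
import shutil

def choose_state_mode(cli_name: str = "sqlite3") -> str:
    """Return 'sqlite' when the CLI is on PATH, otherwise warn and return 'filesystem'."""
    if shutil.which(cli_name) is not None:
        return "sqlite"
    # Hypothetical warning text; the document only says the VM warns the user.
    print(f"warning: {cli_name} not found in PATH; falling back to filesystem state")
    return "filesystem"

mode = choose_state_mode()
```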


Overview

SQLite state provides:

  • Atomic transactions: State changes are ACID-compliant
  • Structured queries: Find specific bindings, filter by status, aggregate results
  • Flexible schema: Add columns and tables as needed
  • Single-file portability: The entire run state is one .db file
  • Concurrent access: SQLite handles file locking automatically (many concurrent readers, one writer at a time)

Key principle: The database is a flexible workspace. The VM and subagents share it as a coordination mechanism, not a rigid contract.


Database Location

The database lives within the standard run directory:

.prose/runs/{YYYYMMDD}-{HHMMSS}-{random6}/
├── state.db          # SQLite database (this file)
├── program.prose     # Copy of running program
└── attachments/      # Large outputs that don't fit in DB (optional)

Run ID format: Same as filesystem state: {YYYYMMDD}-{HHMMSS}-{random6}

Example: .prose/runs/20260116-143052-a7b3c9/state.db
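
As a rough sketch of the naming scheme, the helper below builds a run ID and the matching database path. It assumes `{random6}` means six lowercase hex characters and that the timestamp is local time; the document specifies neither, and the helper names are hypothetical.

```python
import secrets
from datetime import datetime
from pathlib import Path
from typing import Optional

def new_run_id(now: Optional[datetime] = None) -> str:
    """Build {YYYYMMDD}-{HHMMSS}-{random6}; the hex suffix is an assumption."""
    now = now or datetime.now()
    suffix = secrets.token_hex(3)  # 3 random bytes -> 6 hex characters
    return f"{now:%Y%m%d}-{now:%H%M%S}-{suffix}"

def state_db_path(run_id: str, root: Path = Path(".prose")) -> Path:
    """Location of the per-run database inside the standard run directory."""
    return root / "runs" / run_id / "state.db"

run_id = new_run_id(datetime(2026, 1, 16, 14, 30, 52))
db_path = state_db_path(run_id)
```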

Project-Scoped and User-Scoped Agents

Execution-scoped agents (the default) live in the per-run state.db. However, project-scoped agents (persist: project) and user-scoped agents (persist: user) must survive across runs.

For project-scoped agents, use a separate database:

.prose/
├── agents.db                 # Project-scoped agent memory (survives runs)
└── runs/
    └── {id}/
        └── state.db          # Execution-scoped state (dies with run)

For user-scoped agents, use a database in the home directory:

~/.prose/
└── agents.db                 # User-scoped agent memory (survives across projects)

The agents and agent_segments tables for project-scoped agents live in .prose/agents.db, and for user-scoped agents live in ~/.prose/agents.db. The VM initializes these databases on first use and provides the correct path to subagents.
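
The path selection described above could look roughly like this. The function `agent_db_path` and its parameters are hypothetical; only the three locations come from this document.

```python
from pathlib import Path

def agent_db_path(persist, run_dir, project_root=Path("."), home=None):
    """Pick the database that holds an agent's memory for a given persist scope."""
    if persist == "execution":
        return Path(run_dir) / "state.db"
    if persist == "project":
        return Path(project_root) / ".prose" / "agents.db"
    if persist == "user":
        return (Path(home) if home else Path.home()) / ".prose" / "agents.db"
    raise ValueError(f"unsupported persist scope: {persist!r}")

run_dir = Path(".prose/runs/20260116-143052-a7b3c9")
paths = {
    scope: agent_db_path(scope, run_dir, home="/home/alice")  # example home directory
    for scope in ("execution", "project", "user")
}
```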


Responsibility Separation

This section defines who does what. This is the contract between the VM and subagents.

VM Responsibilities

The VM (the orchestrating agent running the .prose program) is responsible for:

| Responsibility | Description |
|----------------|-------------|
| Database creation | Create state.db and initialize core tables at run start |
| Program registration | Store the program source and metadata |
| Execution tracking | Update position, status, and timing as statements execute |
| Subagent spawning | Spawn sessions via Task tool with database path and instructions |
| Parallel coordination | Track branch status, implement join strategies |
| Loop management | Track iteration counts, evaluate conditions |
| Error aggregation | Record failures, manage retry state |
| Context preservation | Maintain sufficient narration in the main conversation thread so execution can be understood and resumed |
| Completion detection | Mark the run as complete when finished |

Critical: The VM must preserve enough context in its own conversation to understand execution state without re-reading the entire database. The database is for coordination and persistence, not a replacement for working memory.

Subagent Responsibilities

Subagents (sessions spawned by the VM) are responsible for:

| Responsibility | Description |
|----------------|-------------|
| Writing own outputs | Insert/update their binding in the bindings table |
| Memory management | For persistent agents: read and update their memory record |
| Segment recording | For persistent agents: append segment history |
| Attachment handling | Write large outputs to attachments/ directory, store path in DB |
| Atomic writes | Use transactions when updating multiple related records |

Critical: Subagents write ONLY to bindings, agents, and agent_segments tables. The VM owns the execution table entirely. Completion signaling happens through the substrate (Task tool return), not database updates.

Critical: Subagents must write their outputs directly to the database. The VM does not write subagent outputs—it only reads them after the subagent completes.

What subagents return to the VM: A confirmation message with the binding location—not the full content:

Root scope:

Binding written: research
Location: .prose/runs/20260116-143052-a7b3c9/state.db (bindings table, name='research', execution_id=NULL)
Summary: AI safety research covering alignment, robustness, and interpretability with 15 citations.

Inside block invocation:

Binding written: result
Location: .prose/runs/20260116-143052-a7b3c9/state.db (bindings table, name='result', execution_id=43)
Execution ID: 43
Summary: Processed chunk into 3 sub-parts for recursive processing.

The VM tracks locations, not values. This keeps the VM's context lean and enables arbitrarily large intermediate values.

Shared Concerns

| Concern | Who Handles |
|---------|-------------|
| Schema evolution | Either (use CREATE TABLE IF NOT EXISTS, ALTER TABLE as needed) |
| Custom tables | Either (prefix with x_ for extensions) |
| Indexing | Either (add indexes for frequently-queried columns) |
| Cleanup | VM (at run end, optionally vacuum) |
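
Because either side may evolve the schema, changes should be safe to repeat. SQLite's ALTER TABLE ... ADD COLUMN has no IF NOT EXISTS form, so one option is to check PRAGMA table_info first. The sketch below uses Python's sqlite3 module on an in-memory database purely for illustration; `add_column_if_missing` is a hypothetical helper.

```python
import sqlite3

def add_column_if_missing(conn, table, column, decl):
    """Add a column only when PRAGMA table_info does not already list it."""
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    return True

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE IF NOT EXISTS bindings (name TEXT, value TEXT)")
# Extension table, prefixed with x_ as the conventions suggest.
conn.execute("CREATE TABLE IF NOT EXISTS x_metrics (execution_id INTEGER, metric_name TEXT, metric_value REAL)")

first = add_column_if_missing(conn, "bindings", "token_count", "INTEGER")
second = add_column_if_missing(conn, "bindings", "token_count", "INTEGER")  # no-op on repeat
```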

Core Schema

The VM initializes these tables. This is a minimum viable schema—extend freely.

-- Run metadata
CREATE TABLE IF NOT EXISTS run (
    id TEXT PRIMARY KEY,
    program_path TEXT,
    program_source TEXT,
    started_at TEXT DEFAULT (datetime('now')),
    updated_at TEXT DEFAULT (datetime('now')),
    status TEXT DEFAULT 'running',  -- running, completed, failed, interrupted
    state_mode TEXT DEFAULT 'sqlite'
);

-- Execution position and history
CREATE TABLE IF NOT EXISTS execution (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    statement_index INTEGER,
    statement_text TEXT,
    status TEXT,  -- pending, executing, completed, failed, skipped
    started_at TEXT,
    completed_at TEXT,
    error_message TEXT,
    parent_id INTEGER REFERENCES execution(id),  -- for nested blocks
    metadata TEXT  -- JSON for construct-specific data (loop iteration, parallel branch, etc.)
);

-- All named values (input, output, let, const)
CREATE TABLE IF NOT EXISTS bindings (
    name TEXT,
    execution_id INTEGER,  -- NULL for root scope, non-null for block invocations
    kind TEXT,  -- input, output, let, const
    value TEXT,
    source_statement TEXT,
    created_at TEXT DEFAULT (datetime('now')),
    updated_at TEXT DEFAULT (datetime('now')),
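    attachment_path TEXT  -- if value is too large, store path to file
);

-- One row per (name, scope); IFNULL maps the root scope's NULL to -1.
-- SQLite does not accept expressions inside PRIMARY KEY, so use a unique index.
CREATE UNIQUE INDEX IF NOT EXISTS idx_bindings_scope
    ON bindings (name, IFNULL(execution_id, -1));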

-- Persistent agent memory
CREATE TABLE IF NOT EXISTS agents (
    name TEXT PRIMARY KEY,
    scope TEXT,  -- execution, project, user, custom
    memory TEXT,
    created_at TEXT DEFAULT (datetime('now')),
    updated_at TEXT DEFAULT (datetime('now'))
);

-- Agent invocation history
CREATE TABLE IF NOT EXISTS agent_segments (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    agent_name TEXT REFERENCES agents(name),
    segment_number INTEGER,
    timestamp TEXT DEFAULT (datetime('now')),
    prompt TEXT,
    summary TEXT,
    UNIQUE(agent_name, segment_number)
);

-- Import registry
CREATE TABLE IF NOT EXISTS imports (
    alias TEXT PRIMARY KEY,
    source_url TEXT,
    fetched_at TEXT,
    inputs_schema TEXT,  -- JSON
    outputs_schema TEXT  -- JSON
);
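
SQLite treats NULLs as distinct in uniqueness checks and accepts expressions in indexes but not in PRIMARY KEY constraints. One way to get "one binding per name per scope", including the root scope, is therefore a unique index on IFNULL(execution_id, -1). The sketch below uses a trimmed-down bindings table in an in-memory database; the index name and the `write_binding` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bindings (
    name TEXT,
    execution_id INTEGER,   -- NULL for root scope
    kind TEXT,
    value TEXT
);
CREATE UNIQUE INDEX idx_bindings_scope
    ON bindings (name, IFNULL(execution_id, -1));
""")

def write_binding(name, execution_id, value):
    """Upsert a binding; the unique index decides which row gets replaced."""
    conn.execute(
        "INSERT OR REPLACE INTO bindings (name, execution_id, kind, value) "
        "VALUES (?, ?, 'let', ?)",
        (name, execution_id, value),
    )

write_binding("result", None, "root v1")
write_binding("result", None, "root v2")  # replaces the root-scope row
write_binding("result", 43, "frame 43")   # separate row for execution 43

rows = conn.execute(
    "SELECT execution_id, value FROM bindings ORDER BY IFNULL(execution_id, -1)"
).fetchall()
```

After these three writes the table holds two rows: the latest root-scope value and the value scoped to execution 43.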

Schema Conventions

  • Timestamps: Use datetime('now'), which returns UTC text in an ISO 8601-style format (YYYY-MM-DD HH:MM:SS)
  • JSON fields: Store structured data as JSON text in metadata, *_schema columns
  • Large values: If a binding value exceeds ~100KB, write to attachments/{name}.md and store path
  • Extension tables: Prefix with x_ (e.g., x_metrics, x_audit_log)
  • Anonymous bindings: Sessions without explicit capture (session "..." without let x =) use auto-generated names: anon_001, anon_002, etc.
  • Import bindings: Prefix with import alias for scoping: research.findings, research.sources
  • Scoped bindings: Use execution_id column—NULL for root scope, non-null for block invocations

Scope Resolution Query

For recursive blocks, bindings are scoped to their execution frame. Resolve variables by walking up the call stack:

-- Find binding 'result' starting from execution_id 43
WITH RECURSIVE scope_chain(id, parent_id, depth) AS (
  -- Start with current execution (depth 0)
  SELECT id, parent_id, 0 FROM execution WHERE id = 43
  UNION ALL
  -- Walk up to parent, one level further from the current frame
  SELECT e.id, e.parent_id, s.depth + 1
  FROM execution e
  JOIN scope_chain s ON e.id = s.parent_id
)
SELECT b.* FROM bindings b
LEFT JOIN scope_chain s ON b.execution_id = s.id
WHERE b.name = 'result'
  AND (s.id IS NOT NULL OR b.execution_id IS NULL)
ORDER BY
  CASE WHEN b.execution_id IS NULL THEN 1 ELSE 0 END,  -- Prefer scoped over root
  s.depth ASC  -- Prefer the nearest (most local) frame
LIMIT 1;

Simpler version if you know the scope chain:

-- Direct lookup: check current scope, then parent, then root
SELECT * FROM bindings
WHERE name = 'result'
  AND (execution_id = 43 OR execution_id = 42 OR execution_id IS NULL)
ORDER BY execution_id DESC NULLS LAST
LIMIT 1;
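
To see scope resolution behave as described, the sketch below builds a three-frame call stack (41 → 42 → 43) in an in-memory database and resolves names from different frames. It uses Python's sqlite3 module for self-containment. The tables are cut down to the columns the query needs, and the `resolve` helper and explicit depth column are illustrative choices, not part of the VM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE execution (id INTEGER PRIMARY KEY, parent_id INTEGER REFERENCES execution(id));
CREATE TABLE bindings (name TEXT, execution_id INTEGER, value TEXT);
INSERT INTO execution (id, parent_id) VALUES (41, NULL), (42, 41), (43, 42);
INSERT INTO bindings VALUES
    ('result', NULL, 'root'),
    ('result', 42,   'parent'),
    ('config', NULL, 'root-config');
""")

RESOLVE = """
WITH RECURSIVE scope_chain(id, parent_id, depth) AS (
  SELECT id, parent_id, 0 FROM execution WHERE id = :frame
  UNION ALL
  SELECT e.id, e.parent_id, s.depth + 1
  FROM execution e JOIN scope_chain s ON e.id = s.parent_id
)
SELECT b.value
FROM bindings b
LEFT JOIN scope_chain s ON b.execution_id = s.id
WHERE b.name = :name AND (s.id IS NOT NULL OR b.execution_id IS NULL)
ORDER BY (b.execution_id IS NULL), s.depth   -- scoped rows first, nearest frame first
LIMIT 1
"""

def resolve(name, frame):
    """Return the value visible from `frame`, or None if the name is unbound."""
    row = conn.execute(RESOLVE, {"name": name, "frame": frame}).fetchone()
    return row[0] if row else None
```

From frame 43, `resolve("result", 43)` finds the binding in parent frame 42 before the root one, while `resolve("result", 41)` ignores frame 42 (it is not an ancestor of 41) and falls back to the root binding.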

Database Interaction

Both VM and subagents interact via the sqlite3 CLI.

From the VM

# Initialize database
sqlite3 .prose/runs/20260116-143052-a7b3c9/state.db "CREATE TABLE IF NOT EXISTS..."

# Update execution position
sqlite3 .prose/runs/20260116-143052-a7b3c9/state.db "
  INSERT INTO execution (statement_index, statement_text, status, started_at)
  VALUES (3, 'session \"Research AI safety\"', 'executing', datetime('now'))
"

# Read a root-scope binding
sqlite3 -json .prose/runs/20260116-143052-a7b3c9/state.db "
  SELECT value FROM bindings WHERE name = 'research' AND execution_id IS NULL
"

# Check parallel branch status
sqlite3 .prose/runs/20260116-143052-a7b3c9/state.db "
  SELECT statement_text, status FROM execution
  WHERE json_extract(metadata, '$.parallel_id') = 'p1'
"

From Subagents

The VM provides the database path and instructions when spawning:

Root scope (outside block invocations):

Your output database is:
  .prose/runs/20260116-143052-a7b3c9/state.db

When complete, write your output:

sqlite3 .prose/runs/20260116-143052-a7b3c9/state.db "
  INSERT OR REPLACE INTO bindings (name, execution_id, kind, value, source_statement, updated_at)
  VALUES (
    'research',
    NULL,  -- root scope
    'let',
    'AI safety research covers alignment, robustness...',
    'let research = session: researcher',
    datetime('now')
  )
"

Inside block invocation (include execution_id):

Execution scope:
  execution_id: 43
  block: process
  depth: 3

Your output database is:
  .prose/runs/20260116-143052-a7b3c9/state.db

When complete, write your output:

sqlite3 .prose/runs/20260116-143052-a7b3c9/state.db "
  INSERT OR REPLACE INTO bindings (name, execution_id, kind, value, source_statement, updated_at)
  VALUES (
    'result',
    43,  -- scoped to this execution
    'let',
    'Processed chunk into 3 sub-parts...',
    'let result = session \"Process chunk\"',
    datetime('now')
  )
"

For persistent agents (execution-scoped):

Your memory is in the database:
  .prose/runs/20260116-143052-a7b3c9/state.db

Read your current state:
  sqlite3 -json .prose/runs/20260116-143052-a7b3c9/state.db "SELECT memory FROM agents WHERE name = 'captain'"

Update when done:
  sqlite3 .prose/runs/20260116-143052-a7b3c9/state.db "UPDATE agents SET memory = '...', updated_at = datetime('now') WHERE name = 'captain'"

Record this segment:
  sqlite3 .prose/runs/20260116-143052-a7b3c9/state.db "INSERT INTO agent_segments (agent_name, segment_number, prompt, summary) VALUES ('captain', 3, '...', '...')"

For project-scoped agents, use .prose/agents.db. For user-scoped agents, use ~/.prose/agents.db.
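
The memory update and the segment record belong together, so a persistent agent would reasonably write both in one transaction. The sketch below illustrates that with Python's sqlite3 module on an in-memory database; `record_segment` and the "next segment number" rule are assumptions, not something this document prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE agents (
    name TEXT PRIMARY KEY, scope TEXT, memory TEXT,
    updated_at TEXT DEFAULT (datetime('now')));
CREATE TABLE agent_segments (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    agent_name TEXT REFERENCES agents(name),
    segment_number INTEGER, prompt TEXT, summary TEXT,
    UNIQUE(agent_name, segment_number));
INSERT INTO agents (name, scope, memory) VALUES ('captain', 'execution', 'initial notes');
""")

def record_segment(agent, new_memory, prompt, summary):
    """Update memory and append a segment atomically; returns the segment number."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute(
            "UPDATE agents SET memory = ?, updated_at = datetime('now') WHERE name = ?",
            (new_memory, agent),
        )
        (next_no,) = conn.execute(
            "SELECT COALESCE(MAX(segment_number), 0) + 1 FROM agent_segments WHERE agent_name = ?",
            (agent,),
        ).fetchone()
        conn.execute(
            "INSERT INTO agent_segments (agent_name, segment_number, prompt, summary) "
            "VALUES (?, ?, ?, ?)",
            (agent, next_no, prompt, summary),
        )
    return next_no

first_segment = record_segment("captain", "notes after review", "Review the plan", "Plan approved")
```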


Context Preservation in Main Thread

This is critical. The database is for persistence and coordination, but the VM must still maintain conversational context.

What the VM Must Narrate

Even with SQLite state, the VM should narrate key events in its conversation:

[Position] Statement 3: let research = session: researcher
   Spawning session, will write to state.db
   [Task tool call]
[Success] Session complete, binding written to DB
[Binding] research = <stored in state.db>

Why Both?

| Purpose | Mechanism |
|---------|-----------|
| Working memory | Conversation narration (what the VM "remembers" without re-querying) |
| Durable state | SQLite database (survives context limits, enables resumption) |
| Subagent coordination | SQLite database (shared access point) |
| Debugging/inspection | SQLite database (queryable history) |

The narration is the VM's "mental model" of execution. The database is the "source of truth" for resumption and inspection.


Parallel Execution

For parallel blocks, the VM uses the metadata JSON field to track branches. Only the VM writes to the execution table.

-- VM marks parallel start
INSERT INTO execution (statement_index, statement_text, status, metadata)
VALUES (5, 'parallel:', 'executing', '{"parallel_id": "p1", "strategy": "all", "branches": ["a", "b", "c"]}');

-- VM creates execution record for each branch
-- (parent_id is the execution.id of the 'parallel:' row, assumed to be 5 here)
INSERT INTO execution (statement_index, statement_text, status, parent_id, metadata)
VALUES (6, 'a = session "Task A"', 'executing', 5, '{"parallel_id": "p1", "branch": "a"}');

-- Subagent writes its output to bindings table (see "From Subagents" section)
-- Task tool signals completion to VM via substrate

-- VM marks branch complete after Task returns
UPDATE execution SET status = 'completed', completed_at = datetime('now')
WHERE json_extract(metadata, '$.parallel_id') = 'p1' AND json_extract(metadata, '$.branch') = 'a';

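-- VM checks if all branches complete (count branch rows only; the parent 'parallel:' row stays 'executing')
SELECT COUNT(*) as pending FROM execution
WHERE json_extract(metadata, '$.parallel_id') = 'p1'
  AND json_extract(metadata, '$.branch') IS NOT NULL
  AND status != 'completed';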
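
A join check for the "all" strategy might be wired up as follows. This is a sketch using Python's sqlite3 module with an in-memory database and a reduced execution table. The helper names are hypothetical, it assumes SQLite's JSON functions are available, and it counts child rows by parent_id so the parent 'parallel:' row is never mistaken for a pending branch.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE execution (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        statement_text TEXT, status TEXT,
        parent_id INTEGER REFERENCES execution(id),
        metadata TEXT)
""")

def start_parallel(parallel_id, branches):
    """Insert the 'parallel:' row plus one child row per branch; return the parent id."""
    meta = {"parallel_id": parallel_id, "strategy": "all", "branches": branches}
    parent_id = conn.execute(
        "INSERT INTO execution (statement_text, status, metadata) "
        "VALUES ('parallel:', 'executing', ?)",
        (json.dumps(meta),),
    ).lastrowid
    for branch in branches:
        conn.execute(
            "INSERT INTO execution (statement_text, status, parent_id, metadata) "
            "VALUES (?, 'executing', ?, ?)",
            (f"{branch} = session", parent_id,
             json.dumps({"parallel_id": parallel_id, "branch": branch})),
        )
    return parent_id

def complete_branch(parent_id, branch):
    """Called by the VM after the Task tool returns for that branch."""
    conn.execute(
        "UPDATE execution SET status = 'completed', completed_at = NULL "
        "WHERE 0", ()
    ) if False else conn.execute(
        "UPDATE execution SET status = 'completed' "
        "WHERE parent_id = ? AND json_extract(metadata, '$.branch') = ?",
        (parent_id, branch),
    )

def pending_branches(parent_id):
    """Number of branch rows under this parallel block that are not yet completed."""
    return conn.execute(
        "SELECT COUNT(*) FROM execution WHERE parent_id = ? AND status != 'completed'",
        (parent_id,),
    ).fetchone()[0]

parent = start_parallel("p1", ["a", "b", "c"])
complete_branch(parent, "a")
remaining = pending_branches(parent)  # branches b and c are still running
```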

Loop Tracking

-- Loop metadata tracks iteration state
INSERT INTO execution (statement_index, statement_text, status, metadata)
VALUES (10, 'loop until **analysis complete** (max: 5):', 'executing',
  '{"loop_id": "l1", "max_iterations": 5, "current_iteration": 0, "condition": "**analysis complete**"}');

-- Update iteration
UPDATE execution
SET metadata = json_set(metadata, '$.current_iteration', 2)
WHERE json_extract(metadata, '$.loop_id') = 'l1';
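
The iteration counter can also be incremented in place rather than set to a literal. The sketch below (Python's sqlite3 module, in-memory database, reduced execution table, JSON functions assumed available) advances the loop and reports whether another iteration is allowed under max_iterations. `advance_loop` is a hypothetical helper, and evaluating the `**analysis complete**` condition itself is left to the VM.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE execution (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "statement_text TEXT, status TEXT, metadata TEXT)"
)
conn.execute(
    "INSERT INTO execution (statement_text, status, metadata) VALUES (?, 'executing', ?)",
    ("loop until **analysis complete** (max: 5):",
     json.dumps({"loop_id": "l1", "max_iterations": 5, "current_iteration": 0})),
)

def advance_loop(loop_id):
    """Increment current_iteration; return (iteration, may_run_again)."""
    conn.execute(
        "UPDATE execution "
        "SET metadata = json_set(metadata, '$.current_iteration', "
        "                        json_extract(metadata, '$.current_iteration') + 1) "
        "WHERE json_extract(metadata, '$.loop_id') = ?",
        (loop_id,),
    )
    current, limit = conn.execute(
        "SELECT json_extract(metadata, '$.current_iteration'), "
        "       json_extract(metadata, '$.max_iterations') "
        "FROM execution WHERE json_extract(metadata, '$.loop_id') = ?",
        (loop_id,),
    ).fetchone()
    return current, current < limit

results = [advance_loop("l1") for _ in range(5)]
```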

Error Handling

-- Record failure
UPDATE execution
SET status = 'failed',
    error_message = 'Connection timeout after 30s',
    completed_at = datetime('now')
WHERE id = 15;

-- Track retry attempts in metadata (COALESCE guards against NULL metadata)
UPDATE execution
SET metadata = json_set(COALESCE(metadata, '{}'), '$.retry_attempt', 2, '$.max_retries', 3)
WHERE id = 15;
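
Both updates can be folded into one statement that records the failure and bumps the retry counter. In the sketch below (Python's sqlite3 module, in-memory database, reduced execution table), `record_failure` is a hypothetical helper. The rule "retry while retry_attempt < max_retries" is an assumption about how the counters are meant to be read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE execution (id INTEGER PRIMARY KEY, status TEXT, "
    "error_message TEXT, completed_at TEXT, metadata TEXT)"
)
conn.execute("INSERT INTO execution (id, status) VALUES (15, 'executing')")  # metadata is NULL

def record_failure(exec_id, message, max_retries=3):
    """Mark the row failed, increment retry_attempt, and say whether a retry is allowed."""
    conn.execute(
        """
        UPDATE execution
        SET status = 'failed',
            error_message = ?,
            completed_at = datetime('now'),
            metadata = json_set(
                COALESCE(metadata, '{}'),
                '$.retry_attempt', COALESCE(json_extract(metadata, '$.retry_attempt'), 0) + 1,
                '$.max_retries', ?)
        WHERE id = ?
        """,
        (message, max_retries, exec_id),
    )
    attempt = conn.execute(
        "SELECT json_extract(metadata, '$.retry_attempt') FROM execution WHERE id = ?",
        (exec_id,),
    ).fetchone()[0]
    return attempt < max_retries

outcomes = [record_failure(15, "Connection timeout after 30s") for _ in range(3)]
```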

Large Outputs

When a binding value is too large for comfortable database storage (>100KB):

  1. Write content to attachments/{binding_name}.md
  2. Store the path in the attachment_path column
  3. Leave value as a summary or null

INSERT INTO bindings (name, kind, value, attachment_path, source_statement)
VALUES (
  'full_report',
  'let',
  'Full analysis report (847KB) - see attachment',
  'attachments/full_report.md',
  'let full_report = session "Generate comprehensive report"'
);
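
The three steps above can be sketched as a single helper that decides between inline storage and an attachment. This uses Python's sqlite3 module and a temporary directory standing in for the run directory. `store_binding`, the exact byte threshold, and the summary wording are assumptions for illustration.

```python
import sqlite3
import tempfile
from pathlib import Path

LIMIT_BYTES = 100 * 1024  # the ~100KB guideline from the text

def store_binding(conn, run_dir, name, value, source_statement):
    """Store small values inline; spill large ones to attachments/{name}.md."""
    size = len(value.encode("utf-8"))
    attachment, stored = None, value
    if size > LIMIT_BYTES:
        rel = Path("attachments") / f"{name}.md"
        (run_dir / "attachments").mkdir(parents=True, exist_ok=True)
        (run_dir / rel).write_text(value, encoding="utf-8")
        attachment = rel.as_posix()
        stored = f"{name} ({size // 1024}KB) - see attachment"
    with conn:  # one transaction per binding write
        conn.execute(
            "INSERT INTO bindings (name, kind, value, attachment_path, source_statement) "
            "VALUES (?, 'let', ?, ?, ?)",
            (name, stored, attachment, source_statement),
        )
    return attachment

run_dir = Path(tempfile.mkdtemp())  # stand-in for .prose/runs/{id}/
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bindings (name TEXT, kind TEXT, value TEXT, "
    "attachment_path TEXT, source_statement TEXT)"
)
small = store_binding(conn, run_dir, "note", "short value", 'let note = session "Take a note"')
big = store_binding(conn, run_dir, "full_report", "x" * 200_000,
                    'let full_report = session "Generate comprehensive report"')
```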

Resuming Execution

To resume an interrupted run:

-- Find current position
SELECT statement_index, statement_text, status
FROM execution
WHERE status = 'executing'
ORDER BY id DESC LIMIT 1;

-- Get all completed bindings
SELECT name, kind, value, attachment_path FROM bindings;

-- Get agent memory states
SELECT name, memory FROM agents;

-- Check parallel block status
SELECT json_extract(metadata, '$.branch') as branch, status
FROM execution
WHERE json_extract(metadata, '$.parallel_id') IS NOT NULL
  AND parent_id = (SELECT id FROM execution WHERE status = 'executing' AND statement_text LIKE 'parallel:%');
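
A resuming VM might gather these queries into one snapshot before rebuilding its narration. The sketch below does so with Python's sqlite3 module against a small in-memory database. The `resume_snapshot` helper, the sample rows, and the shape of the returned dictionary are all illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE execution (id INTEGER PRIMARY KEY AUTOINCREMENT, statement_index INTEGER,
                        statement_text TEXT, status TEXT);
CREATE TABLE bindings (name TEXT, execution_id INTEGER, value TEXT, attachment_path TEXT);
CREATE TABLE agents (name TEXT PRIMARY KEY, memory TEXT);
INSERT INTO execution (statement_index, statement_text, status) VALUES
    (1, 'let research = session: researcher', 'completed'),
    (2, 'let report = session "Write report"', 'executing');
INSERT INTO bindings VALUES ('research', NULL, 'AI safety research...', NULL);
INSERT INTO agents VALUES ('captain', 'notes so far');
""")

def resume_snapshot(conn):
    """Collect position, bindings, and agent memory needed to pick a run back up."""
    position = conn.execute(
        "SELECT statement_index, statement_text FROM execution "
        "WHERE status = 'executing' ORDER BY id DESC LIMIT 1"
    ).fetchone()
    bindings = {
        (name, execution_id): (value, path)
        for name, execution_id, value, path in conn.execute(
            "SELECT name, execution_id, value, attachment_path FROM bindings"
        )
    }
    agents = dict(conn.execute("SELECT name, memory FROM agents"))
    return {"position": position, "bindings": bindings, "agents": agents}

snapshot = resume_snapshot(conn)
```

Keying bindings by (name, execution_id) keeps root-scope and block-scoped values with the same name apart.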

Flexibility Encouragement

Unlike filesystem state, SQLite state is intentionally less prescriptive. The core schema is a starting point. You are encouraged to:

  • Add columns to existing tables as needed
  • Create extension tables (prefix with x_)
  • Store custom metrics (timing, token counts, model info)
  • Build indexes for your query patterns
  • Use JSON functions for semi-structured data

Example extensions:

-- Custom metrics table
CREATE TABLE x_metrics (
    execution_id INTEGER REFERENCES execution(id),
    metric_name TEXT,
    metric_value REAL,
    recorded_at TEXT DEFAULT (datetime('now'))
);

-- Add custom column
ALTER TABLE bindings ADD COLUMN token_count INTEGER;

-- Create index for common query
CREATE INDEX idx_execution_status ON execution(status);

The database is your workspace. Use it.


Comparison with Other Modes

| Aspect | filesystem.md | in-context.md | sqlite.md |
|--------|---------------|---------------|-----------|
| State location | .prose/runs/{id}/ files | Conversation history | .prose/runs/{id}/state.db |
| Queryable | Via file reads | No | Yes (SQL) |
| Atomic updates | No | N/A | Yes (transactions) |
| Schema flexibility | Rigid file structure | N/A | Flexible (add tables/columns) |
| Resumption | Read state.md | Re-read conversation | Query database |
| Complexity ceiling | High | Low (<30 statements) | High |
| Dependency | None | None | sqlite3 CLI |
| Status | Stable | Stable | Experimental |

Summary

SQLite state management:

  1. Uses a single database file per run
  2. Provides clear responsibility separation between VM and subagents
  3. Enables structured queries for state inspection
  4. Supports atomic transactions for reliable updates
  5. Allows flexible schema evolution as needed
  6. Requires the sqlite3 CLI tool
  7. Is experimental—expect changes

The core contract: the VM manages execution flow and spawns subagents; subagents write their own outputs directly to the database. Both maintain the principle that what happens is recorded, and what is recorded can be queried.