
Peter Steinberger 51a9053387 feat: add OpenProse plugin skills

2026-01-23 00:49:40 +00:00


role: sqlite-state-management
status: experimental
summary: SQLite-based state management for OpenProse programs. This approach persists execution state to a SQLite database, enabling structured queries, atomic transactions, and flexible schema evolution.
requires: sqlite3 CLI tool in PATH
see-also:
  ../prose.md: VM execution semantics
  filesystem.md: File-based state (default, more prescriptive)
  in-context.md: In-context state (for simple programs)
  ../primitives/session.md: Session context and compaction guidelines

SQLite State Management (Experimental)

This document describes how the OpenProse VM tracks execution state using a SQLite database. This is an experimental alternative to file-based state (filesystem.md) and in-context state (in-context.md).

Prerequisites

Requires: The sqlite3 command-line tool must be available in your PATH.

Platform	Installation
macOS	Pre-installed
Linux	`apt install sqlite3` / `dnf install sqlite3` / etc.
Windows	`winget install SQLite.SQLite` or download from sqlite.org

If sqlite3 is not available, the VM will fall back to filesystem state and warn the user.
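
The fallback check can be sketched as follows. This is a minimal illustration in Python rather than the VM's actual logic: the function name `choose_state_mode` and the warning text are assumptions, and only the rule itself (no `sqlite3` in PATH means filesystem state plus a warning) comes from this document.

```python
import shutil


def choose_state_mode() -> str:
    """Return 'sqlite' when the sqlite3 CLI is on PATH, else 'filesystem'."""
    if shutil.which("sqlite3") is not None:
        return "sqlite"
    # Hypothetical warning text; the real VM message is not specified here.
    print("warning: sqlite3 not found in PATH; falling back to filesystem state")
    return "filesystem"


mode = choose_state_mode()
```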

Overview

SQLite state provides:

Atomic transactions: State changes are ACID-compliant
Structured queries: Find specific bindings, filter by status, aggregate results
Flexible schema: Add columns and tables as needed
Single-file portability: The entire run state is one .db file
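Concurrent access: SQLite handles locking automatically (many readers, one writer at a time; a writer that finds the database locked fails with SQLITE_BUSY unless a busy timeout is set)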

Key principle: The database is a flexible workspace. The VM and subagents share it as a coordination mechanism, not a rigid contract.

Database Location

The database lives within the standard run directory:

.prose/runs/{YYYYMMDD}-{HHMMSS}-{random}/
├── state.db          # SQLite database (this file)
├── program.prose     # Copy of running program
└── attachments/      # Large outputs that don't fit in DB (optional)

Run ID format: {YYYYMMDD}-{HHMMSS}-{random6}, the same as filesystem state.

Example: .prose/runs/20260116-143052-a7b3c9/state.db
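
A run ID in this format could be generated as in the sketch below. The helper names `new_run_id` and `state_db_path` are hypothetical. The sketch also assumes that `{random6}` means six lowercase hexadecimal characters (as in the example `a7b3c9`) and that the timestamp is UTC; the document specifies neither.

```python
import re
import secrets
from datetime import datetime, timezone
from typing import Optional

# Assumed shape: 8 date digits, 6 time digits, 6 lowercase hex characters.
RUN_ID_PATTERN = re.compile(r"^\d{8}-\d{6}-[0-9a-f]{6}$")


def new_run_id(now: Optional[datetime] = None) -> str:
    """Build a run ID such as 20260116-143052-a7b3c9."""
    now = now or datetime.now(timezone.utc)  # UTC is an assumption
    return f"{now:%Y%m%d}-{now:%H%M%S}-{secrets.token_hex(3)}"


def state_db_path(run_id: str) -> str:
    """Path of the per-run database inside the standard run directory."""
    return f".prose/runs/{run_id}/state.db"


run_id = new_run_id()
db_path = state_db_path(run_id)
```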

Project-Scoped and User-Scoped Agents

Execution-scoped agents (the default) live in the per-run state.db. However, project-scoped agents (persist: project) and user-scoped agents (persist: user) must survive across runs.

For project-scoped agents, use a separate database:

.prose/
├── agents.db                 # Project-scoped agent memory (survives runs)
└── runs/
    └── {id}/
        └── state.db          # Execution-scoped state (dies with run)

For user-scoped agents, use a database in the home directory:

~/.prose/
└── agents.db                 # User-scoped agent memory (survives across projects)

For project-scoped agents, the agents and agent_segments tables live in .prose/agents.db; for user-scoped agents, they live in ~/.prose/agents.db. The VM initializes these databases on first use and provides the correct path to subagents.
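
The mapping from agent scope to database file can be summarized in a few lines. This is a sketch only: `agent_db_path` is a hypothetical helper, and the `custom` scope that appears in the schema is deliberately not handled because this document does not say where such agents live.

```python
from pathlib import Path


def agent_db_path(scope: str, project_root: Path, run_id: str) -> Path:
    """Pick the database that holds an agent's memory for the given scope."""
    if scope == "execution":
        # Default: memory lives in the per-run database and dies with the run.
        return project_root / ".prose" / "runs" / run_id / "state.db"
    if scope == "project":
        return project_root / ".prose" / "agents.db"
    if scope == "user":
        return Path.home() / ".prose" / "agents.db"
    raise ValueError(f"unknown agent scope: {scope!r}")


example = agent_db_path("project", Path("/work/demo"), "20260116-143052-a7b3c9")
```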

Responsibility Separation

This section defines who does what. This is the contract between the VM and subagents.

VM Responsibilities

The VM (the orchestrating agent running the .prose program) is responsible for:

Responsibility	Description
Database creation	Create `state.db` and initialize core tables at run start
Program registration	Store the program source and metadata
Execution tracking	Update position, status, and timing as statements execute
Subagent spawning	Spawn sessions via Task tool with database path and instructions
Parallel coordination	Track branch status, implement join strategies
Loop management	Track iteration counts, evaluate conditions
Error aggregation	Record failures, manage retry state
Context preservation	Maintain sufficient narration in the main conversation thread so execution can be understood and resumed
Completion detection	Mark the run as complete when finished

Critical: The VM must preserve enough context in its own conversation to understand execution state without re-reading the entire database. The database is for coordination and persistence, not a replacement for working memory.

Subagent Responsibilities

Subagents (sessions spawned by the VM) are responsible for:

Responsibility	Description
Writing own outputs	Insert/update their binding in the `bindings` table
Memory management	For persistent agents: read and update their memory record
Segment recording	For persistent agents: append segment history
Attachment handling	Write large outputs to `attachments/` directory, store path in DB
Atomic writes	Use transactions when updating multiple related records

Critical: Subagents write ONLY to bindings, agents, and agent_segments tables. The VM owns the execution table entirely. Completion signaling happens through the substrate (Task tool return), not database updates.

Critical: Subagents must write their outputs directly to the database. The VM does not write subagent outputs—it only reads them after the subagent completes.
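
A subagent's direct write might look like the sketch below, which uses Python's built-in `sqlite3` module instead of the CLI so that values containing quotes need no shell or SQL escaping. The helper `write_binding`, the trimmed-down table, and the unique expression index on `(name, IFNULL(execution_id, -1))` are assumptions of this sketch; the index is what lets `INSERT OR REPLACE` overwrite an earlier root-scope binding, where `execution_id` is NULL.

```python
import sqlite3


def write_binding(conn, name, value, kind="let", execution_id=None,
                  source_statement=None):
    """Insert or overwrite one binding inside a single transaction."""
    with conn:  # commits on success, rolls back on exception
        conn.execute(
            """
            INSERT OR REPLACE INTO bindings
                (name, execution_id, kind, value, source_statement, updated_at)
            VALUES (?, ?, ?, ?, ?, datetime('now'))
            """,
            (name, execution_id, kind, value, source_statement),
        )


# Demo database: a reduced bindings table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bindings (
        name TEXT, execution_id INTEGER, kind TEXT, value TEXT,
        source_statement TEXT, updated_at TEXT
    );
    CREATE UNIQUE INDEX idx_bindings_name_scope
        ON bindings (name, IFNULL(execution_id, -1));
""")
write_binding(conn, "research", "It's a draft with 'quotes'")
write_binding(conn, "research", "Final text replaces the draft")
write_binding(conn, "result", "Scoped value", execution_id=43)
```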

What subagents return to the VM: A confirmation message with the binding location—not the full content:

Root scope:

Binding written: research
Location: .prose/runs/20260116-143052-a7b3c9/state.db (bindings table, name='research', execution_id=NULL)
Summary: AI safety research covering alignment, robustness, and interpretability with 15 citations.

Inside block invocation:

Binding written: result
Location: .prose/runs/20260116-143052-a7b3c9/state.db (bindings table, name='result', execution_id=43)
Execution ID: 43
Summary: Processed chunk into 3 sub-parts for recursive processing.

The VM tracks locations, not values. This keeps the VM's context lean and enables arbitrarily large intermediate values.
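
The confirmation format shown above could be produced by a small helper such as this one. It is a sketch, and `confirmation_message` is a hypothetical name; the field layout simply mirrors the two examples.

```python
from typing import Optional


def confirmation_message(db_path: str, name: str, summary: str,
                         execution_id: Optional[int] = None) -> str:
    """Format the location-only message a subagent returns to the VM."""
    scope = "NULL" if execution_id is None else str(execution_id)
    lines = [
        f"Binding written: {name}",
        f"Location: {db_path} (bindings table, name='{name}', execution_id={scope})",
    ]
    if execution_id is not None:
        # Only block invocations carry an explicit execution ID line.
        lines.append(f"Execution ID: {execution_id}")
    lines.append(f"Summary: {summary}")
    return "\n".join(lines)


message = confirmation_message(
    ".prose/runs/20260116-143052-a7b3c9/state.db",
    "result",
    "Processed chunk into 3 sub-parts for recursive processing.",
    execution_id=43,
)
```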

Shared Concerns

Concern	Who Handles
Schema evolution	Either (use `CREATE TABLE IF NOT EXISTS`, `ALTER TABLE` as needed)
Custom tables	Either (prefix with `x_` for extensions)
Indexing	Either (add indexes for frequently-queried columns)
Cleanup	VM (at run end, optionally vacuum)

Core Schema

The VM initializes these tables. This is a minimum viable schema—extend freely.

-- Run metadata
CREATE TABLE IF NOT EXISTS run (
    id TEXT PRIMARY KEY,
    program_path TEXT,
    program_source TEXT,
    started_at TEXT DEFAULT (datetime('now')),
    updated_at TEXT DEFAULT (datetime('now')),
    status TEXT DEFAULT 'running',  -- running, completed, failed, interrupted
    state_mode TEXT DEFAULT 'sqlite'
);

-- Execution position and history
CREATE TABLE IF NOT EXISTS execution (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    statement_index INTEGER,
    statement_text TEXT,
    status TEXT,  -- pending, executing, completed, failed, skipped
    started_at TEXT,
    completed_at TEXT,
    error_message TEXT,
    parent_id INTEGER REFERENCES execution(id),  -- for nested blocks
    metadata TEXT  -- JSON for construct-specific data (loop iteration, parallel branch, etc.)
);

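-- All named values (input, output, let, const)
CREATE TABLE IF NOT EXISTS bindings (
    name TEXT,
    execution_id INTEGER,  -- NULL for root scope, non-null for block invocations
    kind TEXT,  -- input, output, let, const
    value TEXT,
    source_statement TEXT,
    created_at TEXT DEFAULT (datetime('now')),
    updated_at TEXT DEFAULT (datetime('now')),
    attachment_path TEXT  -- if value is too large, store path to file
);

-- One binding per (name, scope). SQLite prohibits expressions in PRIMARY KEY
-- constraints, so uniqueness is enforced with an expression index instead.
-- IFNULL maps the root scope (NULL) to -1 so that root bindings also conflict.
CREATE UNIQUE INDEX IF NOT EXISTS idx_bindings_name_scope
    ON bindings (name, IFNULL(execution_id, -1));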

-- Persistent agent memory
CREATE TABLE IF NOT EXISTS agents (
    name TEXT PRIMARY KEY,
    scope TEXT,  -- execution, project, user, custom
    memory TEXT,
    created_at TEXT DEFAULT (datetime('now')),
    updated_at TEXT DEFAULT (datetime('now'))
);

-- Agent invocation history
CREATE TABLE IF NOT EXISTS agent_segments (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    agent_name TEXT REFERENCES agents(name),
    segment_number INTEGER,
    timestamp TEXT DEFAULT (datetime('now')),
    prompt TEXT,
    summary TEXT,
    UNIQUE(agent_name, segment_number)
);

-- Import registry
CREATE TABLE IF NOT EXISTS imports (
    alias TEXT PRIMARY KEY,
    source_url TEXT,
    fetched_at TEXT,
    inputs_schema TEXT,  -- JSON
    outputs_schema TEXT  -- JSON
);

Schema Conventions

Timestamps: Use datetime('now'), which yields UTC in the ISO 8601-style form YYYY-MM-DD HH:MM:SS
JSON fields: Store structured data as JSON text in metadata, *_schema columns
Large values: If a binding value exceeds ~100KB, write to attachments/{name}.md and store path
Extension tables: Prefix with x_ (e.g., x_metrics, x_audit_log)
Anonymous bindings: Sessions without explicit capture (session "..." without let x =) use auto-generated names: anon_001, anon_002, etc.
Import bindings: Prefix with import alias for scoping: research.findings, research.sources
Scoped bindings: Use execution_id column—NULL for root scope, non-null for block invocations

Scope Resolution Query

For recursive blocks, bindings are scoped to their execution frame. Resolve variables by walking up the call stack:

-- Find binding 'result' starting from execution_id 43
WITH RECURSIVE scope_chain(id, parent_id, depth) AS (
  -- Start with current execution (depth 0)
  SELECT id, parent_id, 0 FROM execution WHERE id = 43
  UNION ALL
  -- Walk up to parent, counting frames
  SELECT e.id, e.parent_id, s.depth + 1
  FROM execution e
  JOIN scope_chain s ON e.id = s.parent_id
)
SELECT b.* FROM bindings b
LEFT JOIN scope_chain s ON b.execution_id = s.id
WHERE b.name = 'result'
  AND (s.id IS NOT NULL OR b.execution_id IS NULL)
ORDER BY
  CASE WHEN b.execution_id IS NULL THEN 1 ELSE 0 END,  -- Prefer scoped over root
  s.depth ASC  -- Prefer nearer (more local) scope; does not rely on id ordering
LIMIT 1;

Simpler version if you know the scope chain:

-- Direct lookup: check current scope, then parent, then root
SELECT * FROM bindings
WHERE name = 'result'
  AND (execution_id = 43 OR execution_id = 42 OR execution_id IS NULL)
ORDER BY execution_id DESC NULLS LAST
LIMIT 1;
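
To see the resolution order in action, the sketch below runs a scope-chain query against a tiny in-memory database using Python's `sqlite3` module. The function `resolve_binding` and the reduced two-table schema are illustrative assumptions. This variant of the query carries an explicit `depth` counter so that "nearest scope" does not depend on how execution IDs happen to be ordered.

```python
import sqlite3


def resolve_binding(conn, name, execution_id):
    """Return the value of the nearest binding called `name`, or None."""
    row = conn.execute(
        """
        WITH RECURSIVE scope_chain(id, parent_id, depth) AS (
            SELECT id, parent_id, 0 FROM execution WHERE id = :start
            UNION ALL
            SELECT e.id, e.parent_id, s.depth + 1
            FROM execution e JOIN scope_chain s ON e.id = s.parent_id
        )
        SELECT b.value
        FROM bindings b
        LEFT JOIN scope_chain s ON b.execution_id = s.id
        WHERE b.name = :name
          AND (s.id IS NOT NULL OR b.execution_id IS NULL)
        ORDER BY (b.execution_id IS NULL), s.depth
        LIMIT 1
        """,
        {"start": execution_id, "name": name},
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE execution (id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE bindings (name TEXT, execution_id INTEGER, value TEXT);
    -- Call stack: 41 (outermost) -> 42 -> 43 (current frame)
    INSERT INTO execution VALUES (41, NULL), (42, 41), (43, 42);
    INSERT INTO bindings VALUES
        ('result', NULL, 'root value'),
        ('result', 42, 'parent value'),
        ('config', NULL, 'root config');
""")

nearest = resolve_binding(conn, "result", 43)    # found in frame 42
inherited = resolve_binding(conn, "config", 43)  # falls through to root scope
```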

Database Interaction

Both VM and subagents interact via the sqlite3 CLI.

From the VM

# Initialize database
sqlite3 .prose/runs/20260116-143052-a7b3c9/state.db "CREATE TABLE IF NOT EXISTS..."

# Update execution position
sqlite3 .prose/runs/20260116-143052-a7b3c9/state.db "
  INSERT INTO execution (statement_index, statement_text, status, started_at)
  VALUES (3, 'session \"Research AI safety\"', 'executing', datetime('now'))
"

# Read a root-scope binding (the -json flag requires sqlite3 3.33 or newer)
sqlite3 -json .prose/runs/20260116-143052-a7b3c9/state.db "
  SELECT value FROM bindings WHERE name = 'research' AND execution_id IS NULL
"

# Check parallel branch status
sqlite3 .prose/runs/20260116-143052-a7b3c9/state.db "
  SELECT statement_text, status FROM execution
  WHERE json_extract(metadata, '$.parallel_id') = 'p1'
"

From Subagents

The VM provides the database path and instructions when spawning:

Root scope (outside block invocations):

Your output database is:
  .prose/runs/20260116-143052-a7b3c9/state.db

When complete, write your output:

sqlite3 .prose/runs/20260116-143052-a7b3c9/state.db "
  INSERT OR REPLACE INTO bindings (name, execution_id, kind, value, source_statement, updated_at)
  VALUES (
    'research',
    NULL,  -- root scope
    'let',
    'AI safety research covers alignment, robustness...',
    'let research = session: researcher',
    datetime('now')
  )
"

Inside block invocation (include execution_id):

Execution scope:
  execution_id: 43
  block: process
  depth: 3

Your output database is:
  .prose/runs/20260116-143052-a7b3c9/state.db

When complete, write your output:

sqlite3 .prose/runs/20260116-143052-a7b3c9/state.db "
  INSERT OR REPLACE INTO bindings (name, execution_id, kind, value, source_statement, updated_at)
  VALUES (
    'result',
    43,  -- scoped to this execution
    'let',
    'Processed chunk into 3 sub-parts...',
    'let result = session \"Process chunk\"',
    datetime('now')
  )
"

For persistent agents (execution-scoped):

Your memory is in the database:
  .prose/runs/20260116-143052-a7b3c9/state.db

Read your current state:
  sqlite3 -json .prose/runs/20260116-143052-a7b3c9/state.db "SELECT memory FROM agents WHERE name = 'captain'"

Update when done:
  sqlite3 .prose/runs/20260116-143052-a7b3c9/state.db "UPDATE agents SET memory = '...', updated_at = datetime('now') WHERE name = 'captain'"

Record this segment:
  sqlite3 .prose/runs/20260116-143052-a7b3c9/state.db "INSERT INTO agent_segments (agent_name, segment_number, prompt, summary) VALUES ('captain', 3, '...', '...')"

For project-scoped agents, use .prose/agents.db. For user-scoped agents, use ~/.prose/agents.db.
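
Because a persistent agent touches two tables (its memory row and its segment history), the update is a natural candidate for one transaction. The sketch below shows that idea with Python's `sqlite3` module. `record_agent_segment` is a hypothetical helper, and deriving the next segment number as `MAX(segment_number) + 1` is an assumption, since the document's example passes the number in literally.

```python
import sqlite3


def record_agent_segment(conn, agent_name, new_memory, prompt, summary):
    """Update an agent's memory and append a segment atomically."""
    with conn:  # both statements commit together or not at all
        cur = conn.execute(
            "UPDATE agents SET memory = ?, updated_at = datetime('now') "
            "WHERE name = ?",
            (new_memory, agent_name),
        )
        if cur.rowcount == 0:
            raise LookupError(f"agent {agent_name!r} is not registered")
        next_segment = conn.execute(
            "SELECT COALESCE(MAX(segment_number), 0) + 1 "
            "FROM agent_segments WHERE agent_name = ?",
            (agent_name,),
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO agent_segments (agent_name, segment_number, prompt, summary) "
            "VALUES (?, ?, ?, ?)",
            (agent_name, next_segment, prompt, summary),
        )
    return next_segment


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE agents (name TEXT PRIMARY KEY, scope TEXT, memory TEXT,
                         updated_at TEXT);
    CREATE TABLE agent_segments (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        agent_name TEXT REFERENCES agents(name),
        segment_number INTEGER, prompt TEXT, summary TEXT,
        UNIQUE(agent_name, segment_number)
    );
    INSERT INTO agents (name, scope, memory) VALUES ('captain', 'execution', '');
""")
first = record_agent_segment(conn, "captain", "Knows the plan", "Plan it", "Planned")
second = record_agent_segment(conn, "captain", "Plan reviewed", "Review it", "Reviewed")
```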

Context Preservation in Main Thread

This is critical. The database is for persistence and coordination, but the VM must still maintain conversational context.

What the VM Must Narrate

Even with SQLite state, the VM should narrate key events in its conversation:

[Position] Statement 3: let research = session: researcher
   Spawning session, will write to state.db
   [Task tool call]
[Success] Session complete, binding written to DB
[Binding] research = <stored in state.db>

Why Both?

Purpose	Mechanism
Working memory	Conversation narration (what the VM "remembers" without re-querying)
Durable state	SQLite database (survives context limits, enables resumption)
Subagent coordination	SQLite database (shared access point)
Debugging/inspection	SQLite database (queryable history)

The narration is the VM's "mental model" of execution. The database is the "source of truth" for resumption and inspection.

Parallel Execution

For parallel blocks, the VM uses the metadata JSON field to track branches. Only the VM writes to the execution table.

-- VM marks parallel start
INSERT INTO execution (statement_index, statement_text, status, metadata)
VALUES (5, 'parallel:', 'executing', '{"parallel_id": "p1", "strategy": "all", "branches": ["a", "b", "c"]}');

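-- VM creates execution record for each branch
-- (parent_id is the parallel row's execution id, not its statement_index)
INSERT INTO execution (statement_index, statement_text, status, parent_id, metadata)
VALUES (6, 'a = session "Task A"', 'executing',
  (SELECT id FROM execution
   WHERE json_extract(metadata, '$.parallel_id') = 'p1'
     AND json_extract(metadata, '$.branch') IS NULL),
  '{"parallel_id": "p1", "branch": "a"}');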

-- Subagent writes its output to bindings table (see "From Subagents" section)
-- Task tool signals completion to VM via substrate

-- VM marks branch complete after Task returns
UPDATE execution SET status = 'completed', completed_at = datetime('now')
WHERE json_extract(metadata, '$.parallel_id') = 'p1' AND json_extract(metadata, '$.branch') = 'a';

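-- VM checks if all branches complete
-- (the parallel row itself carries the same parallel_id, so count branch rows only)
SELECT COUNT(*) as pending FROM execution
WHERE json_extract(metadata, '$.parallel_id') = 'p1'
  AND json_extract(metadata, '$.branch') IS NOT NULL
  AND status != 'completed';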
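
For the "all" join strategy, the VM's check amounts to asking which branch rows are not yet completed. The sketch below illustrates this with Python's `sqlite3` module and a reduced `execution` table. `pending_branches` is a hypothetical helper, and the query assumes SQLite's JSON functions are available. Because the parent `parallel:` row carries the same `parallel_id`, the query only looks at rows that have a `branch` key.

```python
import json
import sqlite3


def pending_branches(conn, parallel_id):
    """List branch names of a parallel block that have not completed."""
    rows = conn.execute(
        """
        SELECT json_extract(metadata, '$.branch')
        FROM execution
        WHERE json_extract(metadata, '$.parallel_id') = ?
          AND json_extract(metadata, '$.branch') IS NOT NULL
          AND status != 'completed'
        ORDER BY id
        """,
        (parallel_id,),
    ).fetchall()
    return [r[0] for r in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE execution (id INTEGER PRIMARY KEY AUTOINCREMENT, "
             "status TEXT, parent_id INTEGER, metadata TEXT)")
parent = conn.execute(
    "INSERT INTO execution (status, metadata) VALUES ('executing', ?)",
    (json.dumps({"parallel_id": "p1", "strategy": "all"}),),
).lastrowid
for branch, status in [("a", "completed"), ("b", "executing"), ("c", "executing")]:
    conn.execute(
        "INSERT INTO execution (status, parent_id, metadata) VALUES (?, ?, ?)",
        (status, parent, json.dumps({"parallel_id": "p1", "branch": branch})),
    )

still_running = pending_branches(conn, "p1")  # branches b and c
```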

Loop Tracking

-- Loop metadata tracks iteration state
INSERT INTO execution (statement_index, statement_text, status, metadata)
VALUES (10, 'loop until **analysis complete** (max: 5):', 'executing',
  '{"loop_id": "l1", "max_iterations": 5, "current_iteration": 0, "condition": "**analysis complete**"}');

-- Update iteration (the execution table has no updated_at column)
UPDATE execution
SET metadata = json_set(metadata, '$.current_iteration', 2)
WHERE json_extract(metadata, '$.loop_id') = 'l1';
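
The iteration counter and the `max_iterations` bound can be combined into a single "may I run another iteration?" step. The following is a sketch with Python's `sqlite3` module; `next_iteration` is a hypothetical helper and the counting convention (the counter holds the number of iterations started so far) is an assumption. Evaluating the loop's natural-language condition remains the VM's job and is not modeled here.

```python
import json
import sqlite3
from typing import Optional


def next_iteration(conn, loop_id) -> Optional[int]:
    """Advance the loop counter; return the new iteration number,
    or None once max_iterations has been reached."""
    row = conn.execute(
        """
        SELECT id,
               json_extract(metadata, '$.current_iteration'),
               json_extract(metadata, '$.max_iterations')
        FROM execution
        WHERE json_extract(metadata, '$.loop_id') = ?
        """,
        (loop_id,),
    ).fetchone()
    if row is None:
        raise LookupError(f"no loop with id {loop_id!r}")
    exec_id, current, maximum = row
    if current >= maximum:
        return None
    with conn:
        conn.execute(
            "UPDATE execution "
            "SET metadata = json_set(metadata, '$.current_iteration', ?) "
            "WHERE id = ?",
            (current + 1, exec_id),
        )
    return current + 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE execution (id INTEGER PRIMARY KEY AUTOINCREMENT, "
             "status TEXT, metadata TEXT)")
conn.execute(
    "INSERT INTO execution (status, metadata) VALUES ('executing', ?)",
    (json.dumps({"loop_id": "l1", "max_iterations": 2, "current_iteration": 0}),),
)
iterations = [next_iteration(conn, "l1") for _ in range(3)]
```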

Error Handling

-- Record failure
UPDATE execution
SET status = 'failed',
    error_message = 'Connection timeout after 30s',
    completed_at = datetime('now')
WHERE id = 15;

-- Track retry attempts in metadata
-- (COALESCE guards against NULL metadata, for which json_set returns NULL)
UPDATE execution
SET metadata = json_set(COALESCE(metadata, '{}'), '$.retry_attempt', 2, '$.max_retries', 3)
WHERE id = 15;
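
Recording a failure and bumping the retry counter can be done in one statement, as the sketch below shows using Python's `sqlite3` module. `record_failure` is a hypothetical helper, and its retry policy (allow another attempt while `retry_attempt` is below `max_retries`) is an assumption rather than something this document defines.

```python
import json
import sqlite3


def record_failure(conn, execution_id, error_message, max_retries=3):
    """Mark a statement as failed; return True if another attempt is allowed."""
    with conn:
        conn.execute(
            """
            UPDATE execution
            SET status = 'failed',
                error_message = ?,
                completed_at = datetime('now'),
                metadata = json_set(
                    COALESCE(metadata, '{}'),
                    '$.retry_attempt',
                    COALESCE(json_extract(metadata, '$.retry_attempt'), 0) + 1,
                    '$.max_retries', ?)
            WHERE id = ?
            """,
            (error_message, max_retries, execution_id),
        )
    attempt = conn.execute(
        "SELECT json_extract(metadata, '$.retry_attempt') "
        "FROM execution WHERE id = ?",
        (execution_id,),
    ).fetchone()[0]
    return attempt < max_retries


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE execution (id INTEGER PRIMARY KEY, status TEXT, "
             "completed_at TEXT, error_message TEXT, metadata TEXT)")
# metadata starts as NULL to exercise the COALESCE guard
conn.execute("INSERT INTO execution (id, status) VALUES (15, 'executing')")
outcomes = [record_failure(conn, 15, "Connection timeout after 30s")
            for _ in range(3)]
```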

Large Outputs

When a binding value is too large for comfortable database storage (>100KB):

Write content to attachments/{binding_name}.md
Store the path in the attachment_path column
Leave value as a summary or null

INSERT INTO bindings (name, kind, value, attachment_path, source_statement)
VALUES (
  'full_report',
  'let',
  'Full analysis report (847KB) - see attachment',
  'attachments/full_report.md',
  'let full_report = session "Generate comprehensive report"'
);
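
The size check and attachment write described above might be sketched as follows in Python. `store_binding` and `ATTACHMENT_THRESHOLD` are hypothetical names, the threshold is the document's approximate 100KB guideline measured in UTF-8 bytes (an assumption), and the sketch assumes binding names are safe to use as file names.

```python
import sqlite3
import tempfile
from pathlib import Path

ATTACHMENT_THRESHOLD = 100 * 1024  # bytes; approximates the ~100KB guideline


def store_binding(conn, run_dir: Path, name: str, content: str,
                  source_statement: str):
    """Store small values inline; spill large ones to attachments/{name}.md.

    Returns the relative attachment path, or None if stored inline.
    """
    size = len(content.encode("utf-8"))
    if size > ATTACHMENT_THRESHOLD:
        rel_path = f"attachments/{name}.md"
        target = run_dir / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
        value = f"{name} ({size // 1024}KB) - see attachment"
    else:
        rel_path, value = None, content
    with conn:
        conn.execute(
            "INSERT INTO bindings (name, kind, value, attachment_path, source_statement) "
            "VALUES (?, 'let', ?, ?, ?)",
            (name, value, rel_path, source_statement),
        )
    return rel_path


run_dir = Path(tempfile.mkdtemp())  # stands in for .prose/runs/{id}/
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bindings (name TEXT, execution_id INTEGER, kind TEXT, "
             "value TEXT, attachment_path TEXT, source_statement TEXT)")
inline = store_binding(conn, run_dir, "note", "short text",
                       'let note = session "Take a note"')
```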

Resuming Execution

To resume an interrupted run:

-- Find current position
SELECT statement_index, statement_text, status
FROM execution
WHERE status = 'executing'
ORDER BY id DESC LIMIT 1;

-- Get all completed bindings
SELECT name, kind, value, attachment_path FROM bindings;

-- Get agent memory states
SELECT name, memory FROM agents;

-- Check parallel block status
SELECT json_extract(metadata, '$.branch') as branch, status
FROM execution
WHERE json_extract(metadata, '$.parallel_id') IS NOT NULL
  AND parent_id = (SELECT id FROM execution WHERE status = 'executing' AND statement_text LIKE 'parallel:%');
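
A resuming VM could gather these results into one small structure before continuing. The sketch below uses Python's `sqlite3` module; `load_resume_state` is a hypothetical helper operating on a reduced schema. In keeping with the principle that the VM tracks locations rather than values, it loads binding and agent names only.

```python
import sqlite3


def load_resume_state(conn):
    """Collect the latest executing statement plus known binding/agent names."""
    position = conn.execute(
        """
        SELECT statement_index, statement_text
        FROM execution
        WHERE status = 'executing'
        ORDER BY id DESC LIMIT 1
        """
    ).fetchone()
    bindings = [r[0] for r in conn.execute(
        "SELECT name FROM bindings ORDER BY name")]
    agents = [r[0] for r in conn.execute(
        "SELECT name FROM agents ORDER BY name")]
    return {"position": position, "bindings": bindings, "agents": agents}


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE execution (id INTEGER PRIMARY KEY AUTOINCREMENT,
                            statement_index INTEGER, statement_text TEXT,
                            status TEXT);
    CREATE TABLE bindings (name TEXT, execution_id INTEGER, value TEXT);
    CREATE TABLE agents (name TEXT PRIMARY KEY, memory TEXT);
    INSERT INTO execution (statement_index, statement_text, status) VALUES
        (1, 'let topic = input', 'completed'),
        (3, 'let research = session: researcher', 'executing');
    INSERT INTO bindings (name, value) VALUES ('topic', 'AI safety');
    INSERT INTO agents (name, memory) VALUES ('captain', '...');
""")
state = load_resume_state(conn)
```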

Flexibility Encouragement

Unlike filesystem state, SQLite state is intentionally less prescriptive. The core schema is a starting point. You are encouraged to:

Add columns to existing tables as needed
Create extension tables (prefix with x_)
Store custom metrics (timing, token counts, model info)
Build indexes for your query patterns
Use JSON functions for semi-structured data

Example extensions:

-- Custom metrics table
CREATE TABLE IF NOT EXISTS x_metrics (
    execution_id INTEGER REFERENCES execution(id),
    metric_name TEXT,
    metric_value REAL,
    recorded_at TEXT DEFAULT (datetime('now'))
);

-- Add custom column (SQLite has no ADD COLUMN IF NOT EXISTS; this fails if the column already exists)
ALTER TABLE bindings ADD COLUMN token_count INTEGER;

-- Create index for common query
CREATE INDEX IF NOT EXISTS idx_execution_status ON execution(status);

The database is your workspace. Use it.

Comparison with Other Modes

Aspect	filesystem.md	in-context.md	sqlite.md
State location	`.prose/runs/{id}/` files	Conversation history	`.prose/runs/{id}/state.db`
Queryable	Via file reads	No	Yes (SQL)
Atomic updates	No	N/A	Yes (transactions)
Schema flexibility	Rigid file structure	N/A	Flexible (add tables/columns)
Resumption	Read state.md	Re-read conversation	Query database
Complexity ceiling	High	Low (<30 statements)	High
Dependency	None	None	sqlite3 CLI
Status	Stable	Stable	Experimental

Summary

SQLite state management:

Uses a single database file per run
Provides clear responsibility separation between VM and subagents
Enables structured queries for state inspection
Supports atomic transactions for reliable updates
Allows flexible schema evolution as needed
Requires the sqlite3 CLI tool
Is experimental—expect changes

The core contract: the VM manages execution flow and spawns subagents; subagents write their own outputs directly to the database. Both maintain the principle that what happens is recorded, and what is recorded can be queried.
