---
role: postgres-state-management
status: experimental
summary: |
  PostgreSQL-based state management for OpenProse programs. This approach persists
  execution state to a PostgreSQL database, enabling true concurrent writes,
  network access, team collaboration, and high-throughput workloads.
requires: psql CLI tool in PATH, running PostgreSQL server
see-also:
  - ../prose.md: VM execution semantics
  - filesystem.md: File-based state (default, simpler)
  - sqlite.md: SQLite state (queryable, single-file)
  - in-context.md: In-context state (for simple programs)
  - ../primitives/session.md: Session context and compaction guidelines
---

# PostgreSQL State Management (Experimental)

This document describes how the OpenProse VM tracks execution state using a **PostgreSQL database**. This is an experimental alternative to file-based state (`filesystem.md`), SQLite state (`sqlite.md`), and in-context state (`in-context.md`).

## Prerequisites

**Requires:**
1. The `psql` command-line tool must be available in your PATH
2. A running PostgreSQL server (local, Docker, or cloud)

### Installing psql

| Platform | Command | Notes |
|----------|---------|-------|
| macOS (Homebrew) | `brew install libpq && brew link --force libpq` | Client-only; no server |
| macOS (Postgres.app) | Download from https://postgresapp.com | Full install with GUI |
| Debian/Ubuntu | `apt install postgresql-client` | Client-only |
| Fedora/RHEL | `dnf install postgresql` | Client-only |
| Arch Linux | `pacman -S postgresql-libs` | Client-only |
| Windows | `winget install PostgreSQL.PostgreSQL` | Full installer |

After installation, verify:

```bash
psql --version  # Prints a version line such as: psql (PostgreSQL) 16.x
```

If `psql` is not available, the VM will offer to fall back to SQLite state.

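A check like the following could back that fallback decision. This is a minimal sketch: `detect_state_backend` is a hypothetical helper name, not part of OpenProse, and it only tests whether `psql` is on the PATH, not whether a server is reachable.

```bash
# Hypothetical helper: report which state backend is usable on this machine.
# Prints "postgres" when a psql executable is found on PATH, otherwise "sqlite".
detect_state_backend() {
  if command -v psql >/dev/null 2>&1; then
    printf '%s\n' postgres
  else
    printf '%s\n' sqlite
  fi
}
```
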
---

## Overview

PostgreSQL state provides:

- **True concurrent writes**: Row-level locking allows parallel branches to write simultaneously
- **Network access**: Query state from any machine, external tools, or dashboards
- **Team collaboration**: Multiple developers can share run state
- **Rich SQL**: JSONB queries, window functions, CTEs for complex state analysis
- **High throughput**: Handle 1000+ writes/minute, multi-GB outputs
- **Durability**: WAL-based recovery, point-in-time restore

**Key principle:** The database is a flexible, shared workspace. The VM and subagents coordinate through it, and external tools can observe and query execution state in real-time.

---

## Security Warning

**⚠️ Credentials are visible to subagents.** The `OPENPROSE_POSTGRES_URL` connection string is passed to spawned sessions so they can write their outputs. This means:

- Database credentials appear in subagent context and may be logged
- Treat these credentials as **exposed**: never reuse a password that protects anything sensitive
- Use a **dedicated database** for OpenProse, not your production systems
- Create a **limited-privilege user** with access only to the `openprose` schema

**Recommended setup:**
```sql
-- Create dedicated user with minimal privileges (replace 'changeme' with a generated password)
CREATE USER openprose_agent WITH PASSWORD 'changeme';
CREATE SCHEMA openprose AUTHORIZATION openprose_agent;
GRANT ALL ON SCHEMA openprose TO openprose_agent;
-- The user owns only the openprose schema; review default privileges on
-- other schemas (such as public) if the database holds anything else
```

---

## When to Use PostgreSQL State

PostgreSQL state is for **power users** with specific scale or collaboration needs:

| Need | PostgreSQL Helps |
|------|------------------|
| >5 parallel branches writing simultaneously | SQLite serializes writers; PostgreSQL locks per row |
| External dashboards querying state | PostgreSQL is designed for concurrent readers |
| Team collaboration on long workflows | Shared network access; no file sync needed |
| Outputs exceeding 1GB | Bulk ingestion; no single-file bottleneck |
| Mission-critical workflows (hours/days) | Robust durability; point-in-time recovery |

**If none of these apply, use filesystem or SQLite state.** They're simpler and sufficient for most programs.

### Decision Tree

```
Is your program <30 statements with no parallel blocks?
  YES -> Use in-context state (zero friction)
  NO  -> Continue...

Do external tools (dashboards, monitoring, analytics) need to query state?
  YES -> Use PostgreSQL (network access required)
  NO  -> Continue...

Do multiple machines or team members need shared access to the same run?
  YES -> Use PostgreSQL (collaboration)
  NO  -> Continue...

Do you have >5 concurrent parallel branches writing simultaneously?
  YES -> Use PostgreSQL (concurrency)
  NO  -> Continue...

Will outputs exceed 1GB or writes exceed 100/minute?
  YES -> Use PostgreSQL (scale)
  NO  -> Use filesystem (default) or SQLite (if you want SQL queries)
```

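The decision tree can also be written as a small function, which makes the order of the checks explicit. This is an illustrative sketch only: the function name `choose_state_mode` and its positional arguments are assumptions made for this example, and the final branch prints `filesystem` even though SQLite is an equally valid answer there.

```bash
# Hypothetical encoding of the decision tree above.
# Arguments (all names are illustrative):
#   $1 statements        number of statements in the program
#   $2 has_parallel      yes|no  program contains parallel blocks
#   $3 external_readers  yes|no  dashboards or monitoring must query state
#   $4 shared_access     yes|no  several machines or people share one run
#   $5 branches          concurrent parallel branches writing at once
#   $6 output_mb         expected total output size in MB
#   $7 writes_per_min    expected writes per minute
choose_state_mode() {
  if [ "$1" -lt 30 ] && [ "$2" = no ]; then echo in-context; return; fi
  if [ "$3" = yes ] || [ "$4" = yes ]; then echo postgres; return; fi
  if [ "$5" -gt 5 ]; then echo postgres; return; fi
  if [ "$6" -gt 1024 ] || [ "$7" -gt 100 ]; then echo postgres; return; fi
  echo filesystem   # default; SQLite is the alternative if you want SQL queries
}
```

For example, `choose_state_mode 50 yes no no 8 10 20` walks past the first two questions and stops at the concurrency check.
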
### The Concurrency Case

The primary motivation for PostgreSQL is **concurrent writes in parallel execution**:

- SQLite allows only one writer at a time for the whole database file: parallel branches serialize
- PostgreSQL uses row-level locks: parallel branches write simultaneously

If your program has 10 parallel branches completing at once, PostgreSQL accepts their writes concurrently while SQLite queues them one after another. The write phase should be noticeably faster on PostgreSQL; the 5-10x figure given here is an estimate, not a measured benchmark.

---

## Database Setup

### Option 1: Docker (Recommended)

The fastest path to a running PostgreSQL instance:

```bash
# Trust auth disables password checks, so publish the port on localhost only
docker run -d \
  --name prose-pg \
  -e POSTGRES_DB=prose \
  -e POSTGRES_HOST_AUTH_METHOD=trust \
  -p 127.0.0.1:5432:5432 \
  postgres:16
```

Then configure the connection:

```bash
mkdir -p .prose
echo "OPENPROSE_POSTGRES_URL=postgresql://postgres@localhost:5432/prose" > .prose/.env
```

Management commands:

```bash
docker ps | grep prose-pg    # Check if running
docker logs prose-pg         # View logs
docker stop prose-pg         # Stop
docker start prose-pg        # Start again
docker rm -f prose-pg        # Remove completely
```

### Option 2: Local PostgreSQL

For users who prefer native PostgreSQL:

**macOS (Homebrew):**

```bash
brew install postgresql@16
brew services start postgresql@16
createdb myproject
mkdir -p .prose
echo "OPENPROSE_POSTGRES_URL=postgresql://localhost/myproject" >> .prose/.env
```

**Linux (Debian/Ubuntu):**

```bash
sudo apt install postgresql
sudo systemctl start postgresql
# Create a role matching your OS user so the socket connection below is accepted
sudo -u postgres createuser "$USER"
sudo -u postgres createdb -O "$USER" myproject
mkdir -p .prose
echo "OPENPROSE_POSTGRES_URL=postgresql:///myproject" >> .prose/.env
```

### Option 3: Cloud PostgreSQL

For team collaboration or production:

| Provider | Free Tier | Cold Start | Best For |
|----------|-----------|------------|----------|
| **Neon** | 0.5GB, auto-suspend | 1-3s | Development, testing |
| **Supabase** | 500MB, no auto-suspend | None | Projects needing auth/storage |
| **Railway** | $5/mo credit | None | Simple production deploys |

```bash
# Example: Neon
echo "OPENPROSE_POSTGRES_URL=postgresql://user:pass@ep-name.us-east-2.aws.neon.tech/neondb?sslmode=require" >> .prose/.env
```

---

## Database Location

The connection string is stored in `.prose/.env`:

```
your-project/
├── .prose/
│   ├── .env                              # OPENPROSE_POSTGRES_URL=...
│   └── runs/                             # Execution metadata and attachments
│       └── {YYYYMMDD}-{HHMMSS}-{random6}/
│           ├── program.prose             # Copy of running program
│           └── attachments/              # Large outputs (optional)
├── .gitignore                            # Should exclude .prose/.env
└── your-program.prose
```

**Run ID format:** `{YYYYMMDD}-{HHMMSS}-{random6}`

Example: `20260116-143052-a7b3c9`

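One way to produce an ID in this format is sketched below. The helper name `new_run_id` is hypothetical, and the choice of six lowercase hex characters for `{random6}` is an assumption based on the example above; the document does not specify the alphabet.

```bash
# Hypothetical helper: build a run ID of the form {YYYYMMDD}-{HHMMSS}-{random6}.
# Assumption: the random suffix is six lowercase hex characters.
new_run_id() {
  stamp="$(date +%Y%m%d-%H%M%S)"
  # Read 3 random bytes and print them as hex, then strip od's spacing.
  suffix="$(od -An -N3 -tx1 /dev/urandom | tr -d ' \n')"
  printf '%s-%s\n' "$stamp" "$suffix"
}
```
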
### Environment Variable Precedence

The VM checks these sources in order and uses the first one that is set:

1. `OPENPROSE_POSTGRES_URL` in `.prose/.env`
2. `OPENPROSE_POSTGRES_URL` in shell environment
3. `DATABASE_URL` in shell environment (common fallback)

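The lookup order above could be implemented roughly as follows. This is a sketch under stated assumptions: `resolve_postgres_url` is a hypothetical name, and the `.env` file is assumed to hold plain, unquoted `KEY=value` lines.

```bash
# Hypothetical helper: print the connection URL using the precedence above.
# Usage: resolve_postgres_url [ENV_FILE]   (ENV_FILE defaults to .prose/.env)
# Returns non-zero when no source provides a URL.
resolve_postgres_url() {
  env_file="${1:-.prose/.env}"
  if [ -f "$env_file" ]; then
    # 1. OPENPROSE_POSTGRES_URL in the env file (last matching line wins)
    from_file="$(sed -n 's/^OPENPROSE_POSTGRES_URL=//p' "$env_file" | tail -n 1)"
    if [ -n "$from_file" ]; then
      printf '%s\n' "$from_file"
      return 0
    fi
  fi
  if [ -n "${OPENPROSE_POSTGRES_URL:-}" ]; then
    # 2. OPENPROSE_POSTGRES_URL in the shell environment
    printf '%s\n' "$OPENPROSE_POSTGRES_URL"
  elif [ -n "${DATABASE_URL:-}" ]; then
    # 3. DATABASE_URL as the common fallback
    printf '%s\n' "$DATABASE_URL"
  else
    return 1
  fi
}
```
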
### Security: Add to .gitignore

```gitignore
# OpenProse sensitive files
.prose/.env
.prose/runs/
```

---

## Responsibility Separation

This section defines **who does what**. This is the contract between the VM and subagents.

### VM Responsibilities

The VM (the orchestrating agent running the .prose program) is responsible for:

| Responsibility | Description |
|----------------|-------------|
| **Schema initialization** | Create `openprose` schema and tables at run start |
| **Run registration** | Store the program source and metadata |
| **Execution tracking** | Update position, status, and timing as statements execute |
| **Subagent spawning** | Spawn sessions via Task tool with database instructions |
| **Parallel coordination** | Track branch status, implement join strategies |
| **Loop management** | Track iteration counts, evaluate conditions |
| **Error aggregation** | Record failures, manage retry state |
| **Context preservation** | Maintain sufficient narration in the main thread |
| **Completion detection** | Mark the run as complete when finished |

**Critical:** The VM must preserve enough context in its own conversation to understand execution state without re-reading the entire database. The database is for coordination and persistence, not a replacement for working memory.

### Subagent Responsibilities

Subagents (sessions spawned by the VM) are responsible for:

| Responsibility | Description |
|----------------|-------------|
| **Writing own outputs** | Insert/update their binding in the `bindings` table |
| **Memory management** | For persistent agents: read and update their memory record |
| **Segment recording** | For persistent agents: append segment history |
| **Attachment handling** | Write large outputs to `attachments/` directory, store path in DB |
| **Atomic writes** | Use transactions when updating multiple related records |

**Critical:** Subagents write ONLY to `bindings`, `agents`, and `agent_segments` tables. The VM owns the `execution` table entirely. Completion signaling happens through the substrate (Task tool return), not database updates.

**Critical:** Subagents must write their outputs directly to the database. The VM does not write subagent outputs; it only reads them after the subagent completes.

**What subagents return to the VM:** A confirmation message with the binding location, not the full content:

**Root scope:**
```
Binding written: research
Location: openprose.bindings WHERE name='research' AND run_id='20260116-143052-a7b3c9' AND execution_id IS NULL
Summary: AI safety research covering alignment, robustness, and interpretability with 15 citations.
```

**Inside block invocation:**
```
Binding written: result
Location: openprose.bindings WHERE name='result' AND run_id='20260116-143052-a7b3c9' AND execution_id=43
Execution ID: 43
Summary: Processed chunk into 3 sub-parts for recursive processing.
```

The VM tracks locations, not values. This keeps the VM's context lean and enables arbitrarily large intermediate values.

### Shared Concerns

| Concern | Who Handles |
|---------|-------------|
| Schema evolution | Either (use `CREATE TABLE IF NOT EXISTS`, `ALTER TABLE` as needed) |
| Custom tables | Either (prefix with `x_` for extensions) |
| Indexing | Either (add indexes for frequently-queried columns) |
| Cleanup | VM (at run end, optionally delete old data) |

---

## Core Schema

The VM initializes these tables using the `openprose` schema. This is a **minimum viable schema**; extend freely.

```sql
-- Create dedicated schema for OpenProse state
CREATE SCHEMA IF NOT EXISTS openprose;

-- Run metadata
CREATE TABLE IF NOT EXISTS openprose.run (
    id TEXT PRIMARY KEY,
    program_path TEXT,
    program_source TEXT,
    started_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    status TEXT NOT NULL DEFAULT 'running'
        CHECK (status IN ('running', 'completed', 'failed', 'interrupted')),
    state_mode TEXT NOT NULL DEFAULT 'postgres',
    metadata JSONB DEFAULT '{}'::jsonb
);

-- Execution position and history
CREATE TABLE IF NOT EXISTS openprose.execution (
    id SERIAL PRIMARY KEY,
    run_id TEXT NOT NULL REFERENCES openprose.run(id) ON DELETE CASCADE,
    statement_index INTEGER NOT NULL,
    statement_text TEXT,
    status TEXT NOT NULL DEFAULT 'pending'
        CHECK (status IN ('pending', 'executing', 'completed', 'failed', 'skipped')),
    started_at TIMESTAMPTZ,
    completed_at TIMESTAMPTZ,
    error_message TEXT,
    parent_id INTEGER REFERENCES openprose.execution(id) ON DELETE CASCADE,
    metadata JSONB DEFAULT '{}'::jsonb
);

-- All named values (input, output, let, const)
CREATE TABLE IF NOT EXISTS openprose.bindings (
    name TEXT NOT NULL,
    run_id TEXT NOT NULL REFERENCES openprose.run(id) ON DELETE CASCADE,
    execution_id INTEGER,  -- NULL for root scope, non-null for block invocations
    kind TEXT NOT NULL CHECK (kind IN ('input', 'output', 'let', 'const')),
    value TEXT,
    source_statement TEXT,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    attachment_path TEXT,
    metadata JSONB DEFAULT '{}'::jsonb
);

-- Composite key with scope. PRIMARY KEY and UNIQUE constraints cannot contain
-- expressions, so the COALESCE-based key is enforced by a unique index instead.
CREATE UNIQUE INDEX IF NOT EXISTS uq_bindings_scope
    ON openprose.bindings (name, run_id, COALESCE(execution_id, -1));

-- Persistent agent memory
CREATE TABLE IF NOT EXISTS openprose.agents (
    name TEXT NOT NULL,
    run_id TEXT,  -- NULL for project-scoped and user-scoped agents
    scope TEXT NOT NULL CHECK (scope IN ('execution', 'project', 'user', 'custom')),
    memory TEXT,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    metadata JSONB DEFAULT '{}'::jsonb
);

-- One row per agent name and scope
CREATE UNIQUE INDEX IF NOT EXISTS uq_agents_name_scope
    ON openprose.agents (name, COALESCE(run_id, '__project__'));

-- Agent invocation history
CREATE TABLE IF NOT EXISTS openprose.agent_segments (
    id SERIAL PRIMARY KEY,
    agent_name TEXT NOT NULL,
    run_id TEXT,  -- NULL for project-scoped agents
    segment_number INTEGER NOT NULL,
    timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    prompt TEXT,
    summary TEXT,
    metadata JSONB DEFAULT '{}'::jsonb
);

-- One row per agent, scope, and segment number
CREATE UNIQUE INDEX IF NOT EXISTS uq_agent_segments_scope
    ON openprose.agent_segments (agent_name, COALESCE(run_id, '__project__'), segment_number);

-- Import registry
CREATE TABLE IF NOT EXISTS openprose.imports (
    alias TEXT NOT NULL,
    run_id TEXT NOT NULL REFERENCES openprose.run(id) ON DELETE CASCADE,
    source_url TEXT NOT NULL,
    fetched_at TIMESTAMPTZ,
    inputs_schema JSONB,
    outputs_schema JSONB,
    content_hash TEXT,
    metadata JSONB DEFAULT '{}'::jsonb,
    PRIMARY KEY (alias, run_id)
);

-- Indexes for common queries
CREATE INDEX IF NOT EXISTS idx_execution_run_id ON openprose.execution(run_id);
CREATE INDEX IF NOT EXISTS idx_execution_status ON openprose.execution(status);
CREATE INDEX IF NOT EXISTS idx_execution_parent_id ON openprose.execution(parent_id) WHERE parent_id IS NOT NULL;
-- jsonb_path_ops serves containment queries (metadata @> '{"parallel_id": "p1"}'), not ->> comparisons
CREATE INDEX IF NOT EXISTS idx_execution_metadata_gin ON openprose.execution USING GIN (metadata jsonb_path_ops);
CREATE INDEX IF NOT EXISTS idx_bindings_run_id ON openprose.bindings(run_id);
CREATE INDEX IF NOT EXISTS idx_bindings_execution_id ON openprose.bindings(execution_id) WHERE execution_id IS NOT NULL;
CREATE INDEX IF NOT EXISTS idx_agents_run_id ON openprose.agents(run_id) WHERE run_id IS NOT NULL;
CREATE INDEX IF NOT EXISTS idx_agents_project_scoped ON openprose.agents(name) WHERE run_id IS NULL;
CREATE INDEX IF NOT EXISTS idx_agent_segments_lookup ON openprose.agent_segments(agent_name, run_id);
```

### Schema Conventions

- **Timestamps**: Use `TIMESTAMPTZ` with `NOW()` (timezone-aware)
- **JSON fields**: Use `JSONB` for structured data in `metadata` columns (queryable, indexable)
- **Large values**: If a binding value exceeds ~100KB, write to `attachments/{name}.md` and store path
- **Extension tables**: Prefix with `x_` (e.g., `x_metrics`, `x_audit_log`)
- **Anonymous bindings**: Sessions without explicit capture use auto-generated names: `anon_001`, `anon_002`, etc.
- **Import bindings**: Prefix with import alias for scoping: `research.findings`, `research.sources`
- **Scoped bindings**: Use the `execution_id` column: NULL for root scope, non-null for block invocations

### Scope Resolution Query

For recursive blocks, bindings are scoped to their execution frame. Resolve variables by walking up the call stack:

```sql
-- Find binding 'result' starting from execution_id 43 in run '20260116-143052-a7b3c9'
WITH RECURSIVE scope_chain AS (
    -- Start with current execution
    SELECT id, parent_id FROM openprose.execution WHERE id = 43
    UNION ALL
    -- Walk up to parent
    SELECT e.id, e.parent_id
    FROM openprose.execution e
    JOIN scope_chain s ON e.id = s.parent_id
)
SELECT b.* FROM openprose.bindings b
WHERE b.name = 'result'
  AND b.run_id = '20260116-143052-a7b3c9'
  AND (b.execution_id IN (SELECT id FROM scope_chain) OR b.execution_id IS NULL)
ORDER BY
    CASE WHEN b.execution_id IS NULL THEN 1 ELSE 0 END,  -- Prefer scoped over root
    b.execution_id DESC NULLS LAST                       -- Prefer deeper (more local) scope
LIMIT 1;
```

**Simpler version if you know the scope chain:**

```sql
-- Direct lookup: check current scope (43), then parent (42), then root (NULL)
SELECT * FROM openprose.bindings
WHERE name = 'result'
  AND run_id = '20260116-143052-a7b3c9'
  AND (execution_id = 43 OR execution_id = 42 OR execution_id IS NULL)
ORDER BY execution_id DESC NULLS LAST
LIMIT 1;
```

---

## Database Interaction

Both VM and subagents interact via the `psql` CLI.

### From the VM

```bash
# Initialize schema
psql "$OPENPROSE_POSTGRES_URL" -f schema.sql

# Register a new run
psql "$OPENPROSE_POSTGRES_URL" -c "
  INSERT INTO openprose.run (id, program_path, program_source, status)
  VALUES ('20260116-143052-a7b3c9', '/path/to/program.prose', 'program source...', 'running')
"

# Update execution position
psql "$OPENPROSE_POSTGRES_URL" -c "
  INSERT INTO openprose.execution (run_id, statement_index, statement_text, status, started_at)
  VALUES ('20260116-143052-a7b3c9', 3, 'session \"Research AI safety\"', 'executing', NOW())
"

# Read a binding
psql "$OPENPROSE_POSTGRES_URL" -t -A -c "
  SELECT value FROM openprose.bindings WHERE name = 'research' AND run_id = '20260116-143052-a7b3c9'
"

# Check parallel branch status
psql "$OPENPROSE_POSTGRES_URL" -c "
  SELECT metadata->>'branch' AS branch, status FROM openprose.execution
  WHERE run_id = '20260116-143052-a7b3c9' AND metadata->>'parallel_id' = 'p1'
"
```

### From Subagents

The VM provides the connection details and write instructions when spawning a session:

**Root scope (outside block invocations):**

```
Your output goes to PostgreSQL state.

| Property | Value |
|----------|-------|
| Connection | `postgresql://user:***@host:5432/db` |
| Schema | `openprose` |
| Run ID | `20260116-143052-a7b3c9` |
| Binding | `research` |
| Execution ID | (root scope) |

When complete, write your output:

psql "$OPENPROSE_POSTGRES_URL" -c "
  INSERT INTO openprose.bindings (name, run_id, execution_id, kind, value, source_statement)
  VALUES (
    'research',
    '20260116-143052-a7b3c9',
    NULL,  -- root scope
    'let',
    E'AI safety research covers alignment, robustness...',
    'let research = session: researcher'
  )
  ON CONFLICT (name, run_id, COALESCE(execution_id, -1)) DO UPDATE
  SET value = EXCLUDED.value, updated_at = NOW()
"
```

**Inside block invocation (include execution_id):**

```
Your output goes to PostgreSQL state.

| Property | Value |
|----------|-------|
| Connection | `postgresql://user:***@host:5432/db` |
| Schema | `openprose` |
| Run ID | `20260116-143052-a7b3c9` |
| Binding | `result` |
| Execution ID | `43` |
| Block | `process` |
| Depth | `3` |

When complete, write your output:

psql "$OPENPROSE_POSTGRES_URL" -c "
  INSERT INTO openprose.bindings (name, run_id, execution_id, kind, value, source_statement)
  VALUES (
    'result',
    '20260116-143052-a7b3c9',
    43,  -- scoped to this execution
    'let',
    E'Processed chunk into 3 sub-parts...',
    'let result = session \"Process chunk\"'
  )
  ON CONFLICT (name, run_id, COALESCE(execution_id, -1)) DO UPDATE
  SET value = EXCLUDED.value, updated_at = NOW()
"
```

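The templates above embed the value inside a double-quoted shell string, which is fragile when the output contains quotes, `$`, or backslashes. One alternative, sketched here as an assumption rather than the VM's prescribed method, is to pass values as `psql` variables and feed the SQL on stdin, where `:'name'` interpolation quotes each value as a SQL literal (this interpolation is not applied to `-c` commands). The helper name `write_binding` and its argument order are hypothetical.

```bash
# Hypothetical helper: upsert a binding, passing all values as psql variables.
# Usage: write_binding NAME RUN_ID EXECUTION_ID KIND VALUE
#   EXECUTION_ID may be an empty string for root scope (stored as NULL).
write_binding() {
  psql "$OPENPROSE_POSTGRES_URL" -X -q -v ON_ERROR_STOP=1 \
    -v name="$1" -v run_id="$2" -v execution_id="$3" -v kind="$4" -v value="$5" <<'SQL'
INSERT INTO openprose.bindings (name, run_id, execution_id, kind, value)
VALUES (:'name', :'run_id', NULLIF(:'execution_id', '')::integer, :'kind', :'value')
ON CONFLICT (name, run_id, COALESCE(execution_id, -1)) DO UPDATE
SET value = EXCLUDED.value, updated_at = NOW();
SQL
}
```

A root-scope write would then look like `write_binding research 20260116-143052-a7b3c9 "" let "$output"`, and a block-scoped write passes `43` in place of the empty string.
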
For persistent agents (execution-scoped):

```
Your memory is in the database:

Read your current state:
  psql "$OPENPROSE_POSTGRES_URL" -t -A -c "SELECT memory FROM openprose.agents WHERE name = 'captain' AND run_id = '20260116-143052-a7b3c9'"

Update when done:
  psql "$OPENPROSE_POSTGRES_URL" -c "UPDATE openprose.agents SET memory = '...', updated_at = NOW() WHERE name = 'captain' AND run_id = '20260116-143052-a7b3c9'"

Record this segment:
  psql "$OPENPROSE_POSTGRES_URL" -c "INSERT INTO openprose.agent_segments (agent_name, run_id, segment_number, prompt, summary) VALUES ('captain', '20260116-143052-a7b3c9', 3, '...', '...')"
```

For project-scoped agents, use `run_id IS NULL` in queries:

```sql
-- Read project-scoped agent memory
SELECT memory FROM openprose.agents WHERE name = 'advisor' AND run_id IS NULL;

-- Update project-scoped agent memory
UPDATE openprose.agents SET memory = '...', updated_at = NOW() WHERE name = 'advisor' AND run_id IS NULL;
```

---

## Context Preservation in Main Thread

**This is critical.** The database is for persistence and coordination, but the VM must still maintain conversational context.

### What the VM Must Narrate

Even with PostgreSQL state, the VM should narrate key events in its conversation:

```
[Position] Statement 3: let research = session: researcher
   Spawning session, will write to state database
   [Task tool call]
[Success] Session complete, binding written to DB
[Binding] research = <stored in openprose.bindings>
```

### Why Both?

| Purpose | Mechanism |
|---------|-----------|
| **Working memory** | Conversation narration (what the VM "remembers" without re-querying) |
| **Durable state** | PostgreSQL database (survives context limits, enables resumption) |
| **Subagent coordination** | PostgreSQL database (shared access point) |
| **Debugging/inspection** | PostgreSQL database (queryable history) |

The narration is the VM's "mental model" of execution. The database is the "source of truth" for resumption and inspection.

---

## Parallel Execution

For parallel blocks, the VM uses the `metadata` JSONB field to track branches. **Only the VM writes to the `execution` table.**

```sql
-- VM marks parallel start
INSERT INTO openprose.execution (run_id, statement_index, statement_text, status, started_at, metadata)
VALUES ('20260116-143052-a7b3c9', 5, 'parallel:', 'executing', NOW(),
  '{"parallel_id": "p1", "strategy": "all", "branches": ["a", "b", "c"]}'::jsonb)
RETURNING id;  -- Save as parent_id (e.g., 42)

-- VM creates execution record for each branch
INSERT INTO openprose.execution (run_id, statement_index, statement_text, status, started_at, parent_id, metadata)
VALUES
  ('20260116-143052-a7b3c9', 6, 'a = session "Task A"', 'executing', NOW(), 42, '{"parallel_id": "p1", "branch": "a"}'::jsonb),
  ('20260116-143052-a7b3c9', 7, 'b = session "Task B"', 'executing', NOW(), 42, '{"parallel_id": "p1", "branch": "b"}'::jsonb),
  ('20260116-143052-a7b3c9', 8, 'c = session "Task C"', 'executing', NOW(), 42, '{"parallel_id": "p1", "branch": "c"}'::jsonb);

-- Subagents write their outputs to bindings table (see "From Subagents" section)
-- Task tool signals completion to VM via substrate

-- VM marks branch complete after Task returns
UPDATE openprose.execution SET status = 'completed', completed_at = NOW()
WHERE run_id = '20260116-143052-a7b3c9' AND metadata->>'parallel_id' = 'p1' AND metadata->>'branch' = 'a';

-- VM checks if all branches complete
SELECT COUNT(*) AS pending FROM openprose.execution
WHERE run_id = '20260116-143052-a7b3c9'
  AND metadata->>'parallel_id' = 'p1'
  AND parent_id IS NOT NULL
  AND status NOT IN ('completed', 'failed', 'skipped');
```

### The Concurrency Advantage

Each subagent writes to a different row in `openprose.bindings`. PostgreSQL's row-level locking means these writes **do not block one another**, whereas SQLite admits one writer at a time for the whole database file:

```
SQLite (single writer):
  Branch 1 writes  -------|
  Branch 2 waits          ------|
  Branch 3 waits                -----|
  Total time: 3 * write_time (serialized)

PostgreSQL (row locks):
  Branch 1 writes  --|
  Branch 2 writes  --|  (concurrent)
  Branch 3 writes  --|
  Total time: ~1 * write_time (parallel)
```

---

## Loop Tracking

```sql
-- Loop metadata tracks iteration state
INSERT INTO openprose.execution (run_id, statement_index, statement_text, status, started_at, metadata)
VALUES ('20260116-143052-a7b3c9', 10, 'loop until **analysis complete** (max: 5):', 'executing', NOW(),
  '{"loop_id": "l1", "max_iterations": 5, "current_iteration": 0, "condition": "**analysis complete**"}'::jsonb);

-- Update iteration
UPDATE openprose.execution
SET metadata = jsonb_set(metadata, '{current_iteration}', '2')
WHERE run_id = '20260116-143052-a7b3c9' AND metadata->>'loop_id' = 'l1' AND parent_id IS NULL;
```

---

## Error Handling

```sql
-- Record failure
UPDATE openprose.execution
SET status = 'failed',
    error_message = 'Connection timeout after 30s',
    completed_at = NOW()
WHERE id = 15;

-- Track retry attempts in metadata
UPDATE openprose.execution
SET metadata = jsonb_set(jsonb_set(metadata, '{retry_attempt}', '2'), '{max_retries}', '3')
WHERE id = 15;

-- Mark run as failed
UPDATE openprose.run SET status = 'failed' WHERE id = '20260116-143052-a7b3c9';
```

---

## Project-Scoped and User-Scoped Agents

Execution-scoped agents (the default) are stored with `run_id` set to the ID of their run. **Project-scoped agents** (`persist: project`) and **user-scoped agents** (`persist: user`) use `run_id IS NULL` and survive across runs.

For user-scoped agents, the VM maintains a separate connection or uses a naming convention to distinguish them from project-scoped agents. One approach is to prefix user-scoped agent names with `__user__` in the same database, or use a separate user-level database configured via `OPENPROSE_POSTGRES_USER_URL`.

### The run_id Approach

Wrapping `run_id` in `COALESCE` inside the uniqueness key allows both scopes in one table. PostgreSQL does not permit expressions in a `PRIMARY KEY` constraint, so the key takes the form of a unique index:

```sql
CREATE UNIQUE INDEX IF NOT EXISTS uq_agents_name_scope
    ON openprose.agents (name, COALESCE(run_id, '__project__'));
```

This means:
- `name='advisor', run_id=NULL` has key `('advisor', '__project__')`
- `name='advisor', run_id='20260116-143052-a7b3c9'` has key `('advisor', '20260116-143052-a7b3c9')`

The same agent name can exist as both project-scoped and execution-scoped without collision.

### Query Patterns

| Scope | Query |
|-------|-------|
| Execution-scoped | `WHERE name = 'captain' AND run_id = '{RUN_ID}'` |
| Project-scoped | `WHERE name = 'advisor' AND run_id IS NULL` |

### Project-Scoped Memory Guidelines

Project-scoped agents should store generalizable knowledge that accumulates:

- **DO store:** User preferences, project context, learned patterns, decision rationale
- **DO NOT store:** Run-specific details, time-sensitive information, large data

### Agent Cleanup

- **Execution-scoped:** Can be deleted when run completes or after retention period
- **Project-scoped:** Only deleted on explicit user request

```sql
-- Delete execution-scoped agents for a completed run
DELETE FROM openprose.agents WHERE run_id = '20260116-143052-a7b3c9';

-- Delete a specific project-scoped agent (user-initiated)
DELETE FROM openprose.agents WHERE name = 'old_advisor' AND run_id IS NULL;
```

---

## Large Outputs

When a binding value is too large for comfortable database storage (>100KB):

1. Write content to `attachments/{binding_name}.md`
2. Store the path in the `attachment_path` column
3. Leave `value` as a summary

```sql
INSERT INTO openprose.bindings (name, run_id, kind, value, attachment_path, source_statement)
VALUES (
  'full_report',
  '20260116-143052-a7b3c9',
  'let',
  'Full analysis report (847KB) - see attachment',
  'attachments/full_report.md',
  'let full_report = session "Generate comprehensive report"'
)
-- The conflict target must match the scope-aware key used elsewhere in this document
ON CONFLICT (name, run_id, COALESCE(execution_id, -1)) DO UPDATE
SET value = EXCLUDED.value, attachment_path = EXCLUDED.attachment_path, updated_at = NOW();
```

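The size check in the steps above might be scripted as follows. This is a sketch under assumptions: `place_binding_value` is a hypothetical helper, the threshold is taken as 102400 bytes for "100KB", and the caller is expected to put a summary in `value` and the printed path in `attachment_path` when the result is not `inline`.

```bash
# Hypothetical helper: decide where a binding value should live.
# Usage: place_binding_value RUN_DIR BINDING_NAME VALUE_FILE
# Prints "inline" for small values; otherwise copies the file into
# RUN_DIR/attachments/ and prints the relative attachment path.
place_binding_value() {
  run_dir="$1"; name="$2"; value_file="$3"
  limit=102400   # assumed byte threshold for the ~100KB guideline
  size="$(wc -c < "$value_file" | tr -d ' ')"
  if [ "$size" -gt "$limit" ]; then
    mkdir -p "$run_dir/attachments"
    cp "$value_file" "$run_dir/attachments/$name.md"
    printf 'attachments/%s.md\n' "$name"
  else
    printf 'inline\n'
  fi
}
```
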
---

## Resuming Execution

To resume an interrupted run:

```sql
-- Find current position
SELECT statement_index, statement_text, status
FROM openprose.execution
WHERE run_id = '20260116-143052-a7b3c9' AND status = 'executing'
ORDER BY id DESC LIMIT 1;

-- Get all completed bindings
SELECT name, kind, value, attachment_path FROM openprose.bindings
WHERE run_id = '20260116-143052-a7b3c9';

-- Get agent memory states
SELECT name, scope, memory FROM openprose.agents
WHERE run_id = '20260116-143052-a7b3c9' OR run_id IS NULL;

-- Check parallel block status
SELECT metadata->>'branch' AS branch, status
FROM openprose.execution
WHERE run_id = '20260116-143052-a7b3c9'
  AND metadata->>'parallel_id' IS NOT NULL
  AND parent_id IS NOT NULL;
```

---

## Flexibility Encouragement

PostgreSQL state is intentionally **flexible**. The core schema is a starting point. You are encouraged to:

- **Add columns** to existing tables as needed
- **Create extension tables** (prefix with `x_`)
- **Store custom metrics** (timing, token counts, model info)
- **Build indexes** for your query patterns
- **Use JSONB operators** for semi-structured data queries

Example extensions:

```sql
-- Custom metrics table
CREATE TABLE IF NOT EXISTS openprose.x_metrics (
    id SERIAL PRIMARY KEY,
    run_id TEXT REFERENCES openprose.run(id) ON DELETE CASCADE,
    execution_id INTEGER REFERENCES openprose.execution(id) ON DELETE CASCADE,
    metric_name TEXT NOT NULL,
    metric_value NUMERIC,
    recorded_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    metadata JSONB DEFAULT '{}'::jsonb
);

-- Add custom column
ALTER TABLE openprose.bindings ADD COLUMN IF NOT EXISTS token_count INTEGER;

-- Create index for common query
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_bindings_created ON openprose.bindings(created_at);
```

The database is your workspace. Use it.

---

## Comparison with Other Modes

| Aspect | filesystem.md | in-context.md | sqlite.md | postgres.md |
|--------|---------------|---------------|-----------|-------------|
| **State location** | `.prose/runs/{id}/` files | Conversation history | `.prose/runs/{id}/state.db` | PostgreSQL database |
| **Queryable** | Via file reads | No | Yes (SQL) | Yes (SQL) |
| **Atomic updates** | No | N/A | Yes (transactions) | Yes (ACID) |
| **Concurrent writes** | Yes (different files) | N/A | **No (single writer per database)** | **Yes (row locks)** |
| **Network access** | No | No | No | **Yes** |
| **Team collaboration** | Via file sync | No | Via file sync | **Yes** |
| **Schema flexibility** | Rigid file structure | N/A | Flexible | Very flexible (JSONB) |
| **Resumption** | Read state.md | Re-read conversation | Query database | Query database |
| **Complexity ceiling** | High | Low (<30 statements) | High | **Very high** |
| **Dependency** | None | None | sqlite3 CLI | psql CLI + PostgreSQL |
| **Setup friction** | Zero | Zero | Low | Medium-High |
| **Status** | Stable | Stable | Experimental | **Experimental** |

---

## Summary

PostgreSQL state management:

1. Uses a **shared PostgreSQL database** for all runs
2. Provides **true concurrent writes** via row-level locking
3. Enables **network access** for external tools and dashboards
4. Supports **team collaboration** on shared run state
5. Allows **flexible schema evolution** with JSONB and custom tables
6. Requires the **psql CLI** and a running PostgreSQL server
7. Is **experimental**; expect changes

The core contract: the VM manages execution flow and spawns subagents; subagents write their own outputs directly to the database. Completion is signaled through the Task tool return, not database updates. External tools can query execution state in real-time.

**PostgreSQL state is for power users.** If you don't need concurrent writes, network access, or team collaboration, filesystem or SQLite state will be simpler and sufficient.