# Habit Miner
# Excavates your AI session history to find recurring workflows worth automating
# Scans .claude, .opencode, .cursor, etc. — discovers patterns, writes .prose programs
#
# BACKEND: Run with sqlite+ or postgres for incremental processing across runs
#   prose run 48-habit-miner.prose --backend sqlite+
#
# KEY VM FEATURES USED:
# - persist: true on miner — remembers patterns across runs, watches them mature
# - resume: — incremental processing, only analyzes new logs since last run
# - recursive blocks — handle arbitrarily large log corpora
# - reference-based context — agents read from storage, not everything in memory

input mode: "Mode: 'full' (analyze everything), 'incremental' (new logs only), 'check' (see what's new)"
input min_frequency: "Minimum times a pattern must appear to qualify (default: 3)"
input focus: "Optional: filter to a specific area (e.g., 'git', 'testing', 'refactoring')"
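
# Example input values (illustrative only; any focus string works):
#   mode: "incremental"
#   min_frequency: 3
#   focus: "git"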

# ============================================================
# Agents
# ============================================================

agent scout:
  model: sonnet
  prompt: """
    You discover AI assistant log files on the user's system.

    Check common locations:
    - ~/.claude/ (Claude Code)
    - ~/.opencode/ (OpenCode)
    - ~/.cursor/ (Cursor)
    - ~/.continue/ (Continue)
    - ~/.aider/ (Aider)
    - ~/.copilot/ (GitHub Copilot)
    - ~/.codeium/ (Codeium)
    - ~/.tabnine/ (Tabnine)
    - ~/.config/claude-code/
    - ~/.config/github-copilot/
    - ~/.local/share/*/

    For each location found, report:
    - Path
    - Log format (jsonl, sqlite, json, etc.)
    - Approximate size
    - Number of sessions/files
    - Date range (oldest to newest)
    - NEW since last scan (if incremental)

    Be thorough but respect permissions. Don't read content yet, just inventory.
  """
  permissions:
    bash: allow
    read: ["~/.claude/**", "~/.opencode/**", "~/.cursor/**", "~/.continue/**",
           "~/.aider/**", "~/.copilot/**", "~/.codeium/**", "~/.tabnine/**",
           "~/.config/**", "~/.local/share/**"]

agent parser:
  model: sonnet
  prompt: """
    You parse AI assistant log files into a normalized conversation format.

    Handle formats:
    - JSONL: one JSON object per line (Claude Code, many others)
    - SQLite: query conversation tables
    - JSON: array of messages or nested structure
    - Markdown: conversation exports

    Extract for each session:
    - Session ID / timestamp
    - User messages (the requests)
    - Assistant actions (tools used, files modified)
    - Outcome (success/failure indicators)

    Normalize to a common schema regardless of source format.
    Track file modification times for incremental processing.
  """
  permissions:
    bash: allow
    read: ["~/.claude/**", "~/.opencode/**", "~/.cursor/**", "~/.continue/**",
           "~/.aider/**", "~/.copilot/**", "~/.codeium/**", "~/.tabnine/**"]

agent miner:
  model: opus
  persist: true  # <-- KEY: Remembers patterns across runs
  prompt: """
    You find and track patterns in conversation histories over time.

    Your memory contains patterns from previous runs. Each pattern has:
    - name: descriptive identifier
    - maturity: emerging (3-5 hits) → established (6-15) → proven (16+)
    - examples: representative instances
    - last_seen: when pattern last appeared
    - trend: growing / stable / declining

    On each run:
    1. Load your memory of known patterns
    2. Process new sessions
    3. Update pattern frequencies and maturity
    4. Identify NEW emerging patterns
    5. Note patterns that are declining (not seen recently)

    Patterns MATURE over time. Don't rush to automate emerging patterns.
    Wait until they're established before recommending automation.
  """

agent qualifier:
  model: opus
  prompt: """
    You determine which patterns are ready for automation.

    Consider MATURITY (from miner's memory):
    - emerging: Too early. Note it, but don't automate yet.
    - established: Good candidate. Enough data to generalize.
    - proven: Strong candidate. Battle-tested pattern.

    Also consider:
    - COMPLEXITY: Multi-step, not trivial
    - CONSISTENCY: Similar enough across instances
    - AUTOMATABLE: Not too context-dependent
    - VALUE: Would save meaningful time/effort

    Reject patterns that are:
    - Still emerging (wait for more data)
    - Too simple (just a single command)
    - Too variable (every instance is different)
  """

agent author:
  model: opus
  prompt: """
    You write .prose programs from mature workflow patterns.

    For each qualified pattern:
    - Identify the inputs (what varies between instances)
    - Identify the constants (what's always the same)
    - Design appropriate agents for the workflow
    - Structure phases logically
    - Add error handling where needed
    - Include user checkpoints at decision points

    Write idiomatic OpenProse. Follow existing example patterns.
    Reference the pattern's maturity level in a header comment.
  """
  permissions:
    write: ["**/*.prose"]

agent organizer:
  model: sonnet
  prompt: """
    You organize generated .prose programs into a coherent collection.

    Tasks:
    - Group related programs by domain (git, testing, docs, etc.)
    - Suggest directory structure
    - Create an index/README
    - Identify programs that could share blocks or agents
    - Note potential compositions (program A often followed by B)
  """
  permissions:
    write: ["**/*.md", "**/*.prose"]

# ============================================================
# Recursive block for processing large log corpora
# ============================================================

block process_logs(sources, depth):
  # Base case: small enough to process directly
  if **fewer than 50 sessions** or depth <= 0:
    output sources | pmap:
      session: parser
      prompt: "Parse these logs into normalized format"
      context: item

  # Recursive case: chunk and fan out
  let chunks = session "Split sources into batches of ~25 sessions"
    context: sources

  let results = []
  parallel for chunk in chunks:
    let chunk_result = do process_logs(chunk, depth - 1)
    results = results + chunk_result

  output results
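
# Worked example of the fan-out: 1,000 sessions at depth 3 split into
# ~40 chunks of ~25 sessions each; every chunk is under the 50-session
# base case, so the recursion bottoms out after a single split. Depth 3
# just leaves headroom for skewed or oversized chunks.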

# ============================================================
# Phase 0: Discovery
# ============================================================

let inventory = session: scout
  prompt: """
    Scan the system for AI assistant log files.
    Mode: {mode}

    Check all common locations. For each found, report:
    - Full path
    - Format detected
    - Size (human readable)
    - Session/file count
    - Date range
    - If incremental: how many NEW since last scan

    Return a structured inventory.
  """

# For "check" mode, just show what's available and exit
if **mode is check**:
  output result = {
    status: "check-complete",
    inventory: inventory,
    hint: "Run with mode:'incremental' to process new logs, or mode:'full' for everything"
  }

input source_selection: """
  ## AI Assistant Logs Found

  {inventory}

  ---

  Mode: {mode}

  Select which sources to analyze:
  - List the paths you want included
  - Or say "all" to analyze everything found
  - Or say "none" to cancel
"""

if **user selected none or wants to cancel**:
  output result = {
    status: "cancelled",
    inventory: inventory
  }
  throw "User cancelled - no sources selected"

let selected_sources = session: scout
  prompt: "Parse user's selection into a list of paths to analyze"
  context: { inventory, source_selection, mode }

# ============================================================
# Phase 1: Parsing (with recursive chunking for scale)
# ============================================================

let parsed_sessions = do process_logs(selected_sources, 3)

let session_count = session "Count total sessions parsed"
  context: parsed_sessions

# ============================================================
# Phase 2: Mining (with persistent memory)
# ============================================================

# Resume the miner with its accumulated pattern knowledge
let pattern_update = resume: miner
  prompt: """
    Process these new sessions against your pattern memory.

    1. Load your known patterns (with maturity levels)
    2. Match new sessions to existing patterns OR identify new ones
    3. Update frequencies, maturity levels, last_seen dates
    4. Report:
       - Patterns that MATURED (crossed a threshold)
       - NEW patterns emerging
       - Patterns DECLINING (not seen in a while)
       - Current state of all tracked patterns

    Focus area (if specified): {focus}
  """
  context: { parsed_sessions, focus }
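
# The miner's report might read like this (purely illustrative):
#   MATURED:   "rebase-then-rerun-tests" crossed emerging -> established (7 hits)
#   NEW:       "regenerate-types-after-schema-change" (3 hits, emerging)
#   DECLINING: "hand-edit-changelog" (last seen 5 weeks ago)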

# ============================================================
# Phase 3: Qualification
# ============================================================

let qualified = session: qualifier
  prompt: """
    Review the miner's pattern update. Identify patterns ready for automation.

    Minimum frequency threshold: {min_frequency}

    PRIORITIZE:
    1. Patterns that just reached "established" or "proven" maturity
    2. Proven patterns not yet automated
    3. High-value patterns even if just established

    SKIP:
    - Emerging patterns (let them mature)
    - Already-automated patterns (unless significantly evolved)
    - Declining patterns (might be obsolete)

    Return a ranked list with reasoning.
  """
  context: { pattern_update, min_frequency }

if **no patterns ready for automation**:
  output result = {
    status: "no-new-automations",
    sessions_analyzed: session_count,
    pattern_update: pattern_update,
    message: "Patterns are still maturing. Run again later."
  }

# ============================================================
# Phase 4: User Checkpoint
# ============================================================

input pattern_selection: """
  ## Patterns Ready for Automation

  Analyzed {session_count} sessions.

  Pattern Update:
  {pattern_update}

  Ready for automation:
  {qualified}

  ---

  Which patterns should I write .prose programs for?
  - List by name or number
  - Or say "all" for everything qualified
  - Or say "none" to let patterns mature further

  You can also refine any pattern description before I write code.
"""

if **user wants to wait for more maturity**:
  output result = {
    status: "deferred",
    sessions_analyzed: session_count,
    pattern_update: pattern_update,
    qualified: qualified
  }

let patterns_to_automate = session: qualifier
  prompt: "Parse the user's selection into a final list of patterns to automate"
  context: { qualified, pattern_selection }

# ============================================================
# Phase 5: Program Generation
# ============================================================

let programs = patterns_to_automate | map:
  session: author
  prompt: """
    Write a .prose program for this pattern.

    Pattern maturity: {item.maturity}
    Times observed: {item.frequency}
    Representative examples: {item.examples}

    The program should:
    - Parameterize what varies between instances
    - Hardcode what's always the same
    - Use appropriate agents for distinct roles
    - Include error handling
    - Add user checkpoints at decision points

    Include a header comment noting:
    - Pattern maturity level
    - Number of observations it's based on
    - Date generated
  """
  context: item
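
# A generated program's header might look like this (illustrative format,
# not something the author agent is required to produce verbatim):
#
#   # rebase-then-rerun-tests.prose
#   # Pattern maturity: proven | Observations: 17 | Generated: 2025-01-10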

# ============================================================
# Phase 6: Organization
# ============================================================

let organized = session: organizer
  prompt: """
    Organize the generated programs.

    Tasks:
    1. Group by domain (git, testing, docs, refactoring, etc.)
    2. Suggest directory structure
    3. Create an index README with:
       - Program name and one-line description
       - Pattern maturity (established/proven)
       - When to use it
       - Example invocation
    4. Identify shared patterns that could be extracted
    5. Note programs that often chain together
  """
  context: programs

# ============================================================
# Phase 7: Output Location
# ============================================================

input output_location: """
  ## Generated Programs

  {organized}

  ---

  Where should I write these programs?

  Options:
  - A directory path (e.g., ~/my-workflows/)
  - "preview" to just show them without writing
"""

if **user wants preview only**:
  output result = {
    status: "preview",
    sessions_analyzed: session_count,
    pattern_update: pattern_update,
    qualified: qualified,
    programs: programs,
    organization: organized
  }

let written = session: organizer
  prompt: "Write all programs to the specified location with proper structure"
  context: { programs, organized, output_location }
  permissions:
    write: ["**/*.prose", "**/*.md"]

# ============================================================
# Output
# ============================================================

output result = {
  status: "complete",

  # Discovery
  sources_scanned: inventory,
  sources_analyzed: selected_sources,

  # Analysis
  sessions_analyzed: session_count,
  pattern_update: pattern_update,

  # Qualification
  patterns_qualified: qualified,
  patterns_automated: patterns_to_automate,

  # Generation
  programs_written: written,
  organization: organized,

  # For next run
  next_step: "Run again with mode:'incremental' to process new logs and mature patterns"
}