VoiceWake: capture utterance and add prefix
@@ -4,11 +4,11 @@ Author: steipete · Updated: 2025-12-06 · Scope: macOS app (`apps/macos`)
- **Idle:** Normal icon animation (blink, occasional wiggle).
- **Paused:** Status item uses `appearsDisabled`; no motion.
-- **Voice trigger (big ears):** Voice wake detector calls `AppState.triggerVoiceEars()` → `earBoostActive=true` for ~5s. Ears scale up (1.9x), get circular ear holes for readability, then auto-reset. Only fired from the in-app voice pipeline.
+- **Voice trigger (big ears):** Voice wake detector calls `AppState.triggerVoiceEars(ttl: nil)` when the wake word is heard, keeping `earBoostActive=true` while the utterance is captured. Ears scale up (1.9x), get circular ear holes for readability, then drop via `stopVoiceEars()` after 1s of silence. Only fired from the in-app voice pipeline.
- **Working (agent running):** `AppState.isWorking=true` drives a “tail/leg scurry” micro-motion: faster leg wiggle and slight offset while work is in-flight. Currently toggled around WebChat agent runs; add the same toggle around other long tasks when you wire them.
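
A minimal sketch of the ear-boost state described in the list above, assuming `AppState` is an observable store watched by the status item. Only `earBoostActive`, `triggerVoiceEars(ttl:)`, and `stopVoiceEars()` come from this doc; the timer property and default ttl are illustrative.

```swift
// Sketch only: assumes AppState is an ObservableObject the menu bar icon observes.
import Foundation
import Combine

final class AppState: ObservableObject {
    static let shared = AppState()

    /// When true, the status item draws the enlarged "big ears" variant.
    @Published private(set) var earBoostActive = false
    private var earResetTimer: Timer?

    /// Raise the ears. Pass a ttl to auto-reset (old behavior, ~5s);
    /// pass nil to keep them up until stopVoiceEars() is called (new behavior).
    func triggerVoiceEars(ttl: TimeInterval? = 5) {
        earResetTimer?.invalidate()
        earBoostActive = true
        if let ttl {
            earResetTimer = Timer.scheduledTimer(withTimeInterval: ttl, repeats: false) { [weak self] _ in
                self?.earBoostActive = false
            }
        }
    }

    /// Drop the ears once the capture window (1s of silence) has elapsed.
    func stopVoiceEars() {
        earResetTimer?.invalidate()
        earBoostActive = false
    }
}
```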
Wiring points
-- Voice wake: see `VoiceWakeTester.handleResult` in `AppMain.swift`; on detection it calls `triggerVoiceEars()`.
+- Voice wake: runtime/tester call `AppState.triggerVoiceEars(ttl: nil)` on trigger and `stopVoiceEars()` after 1s of silence to match the capture window.
- Agent activity: set `AppStateStore.shared.setWorking(true/false)` around work spans (already done in WebChat agent call). Keep spans short and reset in `defer` blocks to avoid stuck animations.
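
As a hedged illustration of the `setWorking` guidance above: `withWorkingIndicator` is a hypothetical helper, and only `AppStateStore.shared.setWorking(true/false)` comes from this doc.

```swift
// Hypothetical helper: wraps any work span so the working flag always resets,
// even if the work throws or returns early.
func withWorkingIndicator<T>(_ work: () async throws -> T) async rethrows -> T {
    AppStateStore.shared.setWorking(true)
    defer { AppStateStore.shared.setWorking(false) }  // reset in defer to avoid stuck animations
    return try await work()
}

// Usage (hypothetical call site):
// let reply = try await withWorkingIndicator { try await runWebChatAgent(prompt) }
```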
Shapes & sizes
docs/mac/voicewake.md · new file · 28 lines
@@ -0,0 +1,28 @@
# Voice Wake Pipeline
Updated: 2025-12-08 · Owners: mac app
## Runtime behavior
- Always-on listener (Speech framework) waits for any trigger word.
- On first trigger hit: start capture, raise ears immediately via `AppState.triggerVoiceEars(ttl: nil)`, reset capture buffer.
- While capturing: keep buffer in sync with partial transcripts; update `lastHeard` whenever audio arrives.
- End capture when 1.0s of silence is observed (or 8s hard stop), then call `stopVoiceEars()`, prepend the voice-prefix string, send once to Claude, and restart the recognizer for a clean next trigger. A short 350ms debounce prevents double-fires.
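
A simplified sketch of the capture flow described in this list. The Speech framework plumbing (recognizer, audio engine) is omitted; `sendToClaude()`, `restartRecognizer()`, and the polled `tick()` shape are placeholders rather than the real `VoiceWakeRuntime`, and calling `prefixedTranscript` as a static method follows the forwarder sketch later in this doc.

```swift
import Foundation

// Simplified sketch: tracks the trigger → capture → silence-end flow only.
final class VoiceWakeCaptureSketch {
    private let silenceWindow: TimeInterval = 1.0
    private let captureHardStop: TimeInterval = 8.0
    private let debounceAfterSend: TimeInterval = 0.35

    private var capturing = false
    private var buffer = ""
    private var captureStart = Date.distantPast
    private var lastHeard = Date.distantPast
    private var lastSend = Date.distantPast

    /// Called with every partial transcript from the recognizer.
    func handlePartial(_ transcript: String, containsTrigger: Bool) {
        let now = Date()
        if !capturing {
            guard containsTrigger, now.timeIntervalSince(lastSend) > debounceAfterSend else { return }
            capturing = true
            captureStart = now
            buffer = ""                                  // reset capture buffer
            AppState.shared.triggerVoiceEars(ttl: nil)   // ears stay up while capturing
        }
        buffer = transcript                              // keep buffer in sync with partials
        lastHeard = now                                  // any audio pushes the silence window out
    }

    /// Polled on a short timer while capturing.
    func tick() {
        guard capturing else { return }
        let now = Date()
        let silentLongEnough = now.timeIntervalSince(lastHeard) >= silenceWindow
        let hitHardStop = now.timeIntervalSince(captureStart) >= captureHardStop
        guard silentLongEnough || hitHardStop else { return }

        capturing = false
        AppState.shared.stopVoiceEars()
        let payload = VoiceWakeForwarder.prefixedTranscript(buffer)  // prepend the voice prefix
        sendToClaude(payload)                                        // placeholder: forward once
        lastSend = now
        restartRecognizer()                                          // placeholder: clean next trigger
    }

    private func sendToClaude(_ text: String) { /* placeholder */ }
    private func restartRecognizer() { /* placeholder */ }
}
```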
## Visual states
- **Listening for trigger:** idle icon.
- **Wake word detected / capturing:** ears enlarged with holes; stays up until silence end, not a fixed timer.
- **After send:** ears drop immediately when silence window elapses; icon returns to idle.
## Forwarding payload
- Uses `VoiceWakeForwarder.prefixedTranscript(_:)` to prepend the model hint:
`User talked via voice recognition on <machine> - repeat prompt first + remember some words might be incorrectly transcribed.`
- Machine name resolves to `Host.localizedName` or `hostName`; the caller can override it for tests.
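
A sketch of the prefixing described in this section. The type name, method name, and prefix text come from the bullets above; the `machineName` parameter with a host-derived default is an assumed way to model the test override, and the exact joining of prefix and transcript is illustrative.

```swift
import Foundation

// Sketch of the forwarder's prefixing; the machineName default models
// "caller can override for tests" and is an assumption.
enum VoiceWakeForwarder {
    static func prefixedTranscript(_ transcript: String,
                                   machineName: String = Host.current().localizedName
                                       ?? ProcessInfo.processInfo.hostName) -> String {
        let prefix = "User talked via voice recognition on \(machineName) - repeat prompt first + remember some words might be incorrectly transcribed."
        return "\(prefix)\n\n\(transcript)"
    }
}
```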
## Testing hooks
- Settings tester mirrors runtime: same capture/silence flow, same prefix, same ear behavior.
- Unit test: `VoiceWakeForwarderTests.prefixedTranscriptUsesMachineName` covers the prefix format.
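
Roughly what that test could look like, assuming Swift Testing (the un-prefixed test name suggests it) and the parameter-based override from the forwarder sketch above:

```swift
import Testing

struct VoiceWakeForwarderTests {
    @Test func prefixedTranscriptUsesMachineName() {
        // Pass an explicit machine name so the assertion is deterministic.
        let result = VoiceWakeForwarder.prefixedTranscript("open the garage",
                                                           machineName: "TestMac")
        #expect(result.contains("voice recognition on TestMac"))
        #expect(result.contains("open the garage"))
    }
}
```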
## Tuning knobs (Swift constants)
- Silence window: 1.0s (`silenceWindow` in `VoiceWakeRuntime`).
- Hard stop after trigger: 8s (`captureHardStop`).
- Post-send debounce: 0.35s (`debounceAfterSend`).
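
Grouped together, the knobs above read like this; the `VoiceWakeTuning` wrapper is illustrative only, since the doc places these constants inside `VoiceWakeRuntime`.

```swift
import Foundation

// Illustrative grouping; names and values come from the bullets above.
enum VoiceWakeTuning {
    static let silenceWindow: TimeInterval = 1.0       // end capture after this much silence
    static let captureHardStop: TimeInterval = 8.0     // absolute cap after the trigger fires
    static let debounceAfterSend: TimeInterval = 0.35  // ignore re-triggers right after a send
}
```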