feat: share wake gate via SwabbleKit

Peter Steinberger
2025-12-23 01:30:40 +01:00
parent cf48d297dd
commit ef35868bef
2945 changed files with 27887 additions and 122 deletions

Swabble/CHANGELOG.md Normal file
View File

@@ -0,0 +1,11 @@
# Changelog
## 0.2.0 — 2025-12-23
### Highlights
- Added `SwabbleKit` (multi-platform wake-word gate utilities with segment-aware gap detection).
- Swabble package now supports iOS + macOS consumers; CLI remains macOS 26-only.
### Changes
- CLI wake-word matching/stripping routed through `SwabbleKit` helpers.
- Speech pipeline types now explicitly gated to macOS 26 / iOS 26 availability.
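
The segment-aware gap detection named in the highlights can be illustrated with a small standalone sketch. It does not use the `SwabbleKit` types; `TimedWord` and `commandAfterTrigger` are hypothetical names made up for this example, and the sketch returns the first qualifying trigger, which is a simplification.

```swift
import Foundation

// Hypothetical stand-in for a timed speech segment; not the SwabbleKit type.
struct TimedWord {
    let text: String
    let start: TimeInterval
    let duration: TimeInterval
    var end: TimeInterval { start + duration }
}

/// Returns the words spoken after `trigger`, but only when the speaker paused
/// for at least `minGap` seconds between the trigger and the next word.
func commandAfterTrigger(_ words: [TimedWord], trigger: String, minGap: TimeInterval) -> String? {
    for (index, word) in words.enumerated() where word.text.lowercased() == trigger.lowercased() {
        let nextIndex = index + 1
        guard nextIndex < words.count else { continue }
        let gap = words[nextIndex].start - word.end
        guard gap >= minGap else { continue }
        return words[nextIndex...].map(\.text).joined(separator: " ")
    }
    return nil
}

// "clawd do" spoken without a pause: the gate should stay closed.
let rushed = [
    TimedWord(text: "clawd", start: 0.2, duration: 0.1),
    TimedWord(text: "do", start: 0.35, duration: 0.1),
]

// A clear pause after "clawd": the gate opens and yields the command.
let paused = [
    TimedWord(text: "clawd", start: 0.2, duration: 0.1),
    TimedWord(text: "do", start: 0.9, duration: 0.1),
    TimedWord(text: "thing", start: 1.1, duration: 0.1),
]
```

With a `minGap` of 0.3 seconds, `rushed` yields `nil` and `paused` yields `"do thing"`. Requiring a pause is meant to keep the gate from firing when the wake word is merely mentioned mid-sentence.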

View File

@@ -4,10 +4,12 @@ import PackageDescription
let package = Package(
name: "swabble",
platforms: [
.macOS(.v26),
.macOS(.v15),
.iOS(.v17),
],
products: [
.library(name: "Swabble", targets: ["Swabble"]),
.library(name: "SwabbleKit", targets: ["SwabbleKit"]),
.executable(name: "swabble", targets: ["SwabbleCLI"]),
],
dependencies: [
@@ -19,13 +21,30 @@ let package = Package(
name: "Swabble",
path: "Sources/SwabbleCore",
swiftSettings: []),
.target(
name: "SwabbleKit",
path: "Sources/SwabbleKit",
swiftSettings: [
.enableUpcomingFeature("StrictConcurrency"),
]),
.executableTarget(
name: "SwabbleCLI",
dependencies: [
"Swabble",
"SwabbleKit",
.product(name: "Commander", package: "Commander"),
],
path: "Sources/swabble"),
.testTarget(
name: "SwabbleKitTests",
dependencies: [
"SwabbleKit",
.product(name: "Testing", package: "swift-testing"),
],
swiftSettings: [
.enableUpcomingFeature("StrictConcurrency"),
.enableExperimentalFeature("SwiftTesting"),
]),
.testTarget(
name: "swabbleTests",
dependencies: [

View File

@@ -1,9 +1,10 @@
# 🎙️ swabble — Speech.framework wake-word hook daemon (macOS 26)
swabble is a Swift 6.2, macOS 26-only rewrite of the brabble voice daemon. It listens on your mic, gates on a wake word, transcribes locally using Apple's new SpeechAnalyzer + SpeechTranscriber, then fires a shell hook with the transcript. No cloud calls, no Whisper binaries.
swabble is a Swift 6.2 wake-word hook daemon. The CLI targets macOS 26 (SpeechAnalyzer + SpeechTranscriber). The shared `SwabbleKit` target is multi-platform and exposes wake-word gating utilities for iOS/macOS apps.
- **Local-only**: Speech.framework on-device models; zero network usage.
- **Wake word**: Default `clawd` (aliases `claude`), optional `--no-wake` bypass.
- **SwabbleKit**: Shared wake gate utilities (gap-based gating when you provide speech segments).
- **Hooks**: Run any command with prefix/env, cooldown, min_chars, timeout.
- **Services**: launchd helper stubs for start/stop/install.
- **File transcribe**: TXT or SRT with time ranges (using AttributedString splits).
@@ -30,7 +31,7 @@ swift run swabble transcribe /path/to/audio.m4a --format srt --output out.srt
```
## Use as a library
Add swabble as a SwiftPM dependency and import the `Swabble` product to reuse the Speech pipeline, config loader, hook executor, and transcript store in your own app:
Add swabble as a SwiftPM dependency and import the `Swabble` or `SwabbleKit` product:
```swift
// Package.swift
@@ -38,7 +39,10 @@ dependencies: [
.package(url: "https://github.com/steipete/swabble.git", branch: "main"),
],
targets: [
.target(name: "MyApp", dependencies: [.product(name: "Swabble", package: "swabble")]),
.target(name: "MyApp", dependencies: [
.product(name: "Swabble", package: "swabble"), // Speech pipeline (macOS 26+ / iOS 26+)
.product(name: "SwabbleKit", package: "swabble"), // Wake-word gate utilities (iOS 17+ / macOS 15+)
]),
]
```
@@ -93,7 +97,7 @@ Environment variables:
## Speech pipeline
- `AVAudioEngine` tap → `BufferConverter` → `AnalyzerInput` → `SpeechAnalyzer` with a `SpeechTranscriber` module.
- Requests volatile + final results; wake gating is string match on partial/final.
- Requests volatile + final results; the CLI uses text-only wake gating today.
- Authorization requested at first start; requires macOS 26 + new Speech.framework APIs.
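
Text-only wake gating, as the CLI uses it today, amounts to a case-insensitive substring check followed by stripping the trigger. The sketch below is a standalone approximation of that idea; `containsWake`, `removeWake`, and `trimSet` are hypothetical names, not the `SwabbleKit` API.

```swift
import Foundation

// Characters trimmed from triggers and from the final command text.
let trimSet = CharacterSet.whitespacesAndNewlines.union(.punctuationCharacters)

/// True when any non-empty trigger occurs anywhere in `text`, ignoring case.
func containsWake(_ text: String, triggers: [String]) -> Bool {
    let lowered = text.lowercased()
    return triggers
        .map { $0.trimmingCharacters(in: trimSet).lowercased() }
        .contains { !$0.isEmpty && lowered.contains($0) }
}

/// Removes every trigger occurrence, then trims leftover whitespace and punctuation.
func removeWake(_ text: String, triggers: [String]) -> String {
    var out = text
    for trigger in triggers {
        let token = trigger.trimmingCharacters(in: trimSet)
        guard !token.isEmpty else { continue }
        out = out.replacingOccurrences(of: token, with: "", options: [.caseInsensitive])
    }
    return out.trimmingCharacters(in: trimSet)
}

let triggers = ["clawd", "claude"]
let heard = "Clawd, open the garage"
```

Here `containsWake(heard, triggers: triggers)` is true and `removeWake(heard, triggers: triggers)` gives `"open the garage"`. Because this is a plain substring match with no word boundaries, a word such as "clawdia" would also open the gate, which is one reason a timing-based gate can be preferable when segments are available.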
## Development

View File

@@ -2,11 +2,13 @@ import AVFoundation
import Foundation
import Speech
@available(macOS 26.0, iOS 26.0, *)
public struct SpeechSegment: Sendable {
public let text: String
public let isFinal: Bool
}
@available(macOS 26.0, iOS 26.0, *)
public enum SpeechPipelineError: Error {
case authorizationDenied
case analyzerFormatUnavailable
@@ -14,6 +16,7 @@ public enum SpeechPipelineError: Error {
}
/// Live microphone → SpeechAnalyzer → SpeechTranscriber pipeline.
@available(macOS 26.0, iOS 26.0, *)
public actor SpeechPipeline {
private struct UnsafeBuffer: @unchecked Sendable { let buffer: AVAudioPCMBuffer }

View File

@@ -0,0 +1,202 @@
import Foundation

/// A transcribed word or phrase with its timing and, optionally, its range in the full transcript.
public struct WakeWordSegment: Sendable, Equatable {
    public let text: String
    public let start: TimeInterval
    public let duration: TimeInterval
    public let range: Range<String.Index>?

    public init(text: String, start: TimeInterval, duration: TimeInterval, range: Range<String.Index>? = nil) {
        self.text = text
        self.start = start
        self.duration = duration
        self.range = range
    }

    public var end: TimeInterval { start + duration }
}

/// Settings for the gate: trigger phrases, the required pause after a trigger, and the minimum command length.
public struct WakeWordGateConfig: Sendable, Equatable {
    public var triggers: [String]
    public var minPostTriggerGap: TimeInterval
    public var minCommandLength: Int

    public init(
        triggers: [String],
        minPostTriggerGap: TimeInterval = 0.45,
        minCommandLength: Int = 1)
    {
        self.triggers = triggers
        self.minPostTriggerGap = minPostTriggerGap
        self.minCommandLength = minCommandLength
    }
}

/// Result of a successful gate match.
public struct WakeWordGateMatch: Sendable, Equatable {
    public let triggerEndTime: TimeInterval
    public let postGap: TimeInterval
    public let command: String

    public init(triggerEndTime: TimeInterval, postGap: TimeInterval, command: String) {
        self.triggerEndTime = triggerEndTime
        self.postGap = postGap
        self.command = command
    }
}

public enum WakeWordGate {
    private struct Token {
        let normalized: String
        let start: TimeInterval
        let end: TimeInterval
        let range: Range<String.Index>?
        let text: String
    }

    private struct TriggerTokens {
        let tokens: [String]
    }

    /// Finds the latest trigger occurrence that is followed by a pause of at least
    /// `config.minPostTriggerGap` and returns the text after it as the command.
    /// Returns `nil` when no trigger qualifies or the command is too short.
    public static func match(
        transcript: String,
        segments: [WakeWordSegment],
        config: WakeWordGateConfig)
        -> WakeWordGateMatch? {
        let triggerTokens = normalizeTriggers(config.triggers)
        guard !triggerTokens.isEmpty else { return nil }
        let tokens = normalizeSegments(segments)
        guard !tokens.isEmpty else { return nil }

        var bestIndex: Int?
        var bestTriggerEnd: TimeInterval = 0
        var bestGap: TimeInterval = 0

        for trigger in triggerTokens {
            let count = trigger.tokens.count
            // At least one token must follow the trigger so the gap can be measured.
            guard count > 0, tokens.count > count else { continue }
            for i in 0...(tokens.count - count - 1) {
                var matched = true
                for t in 0..<count {
                    if tokens[i + t].normalized != trigger.tokens[t] {
                        matched = false
                        break
                    }
                }
                if !matched { continue }

                let triggerEnd = tokens[i + count - 1].end
                let nextToken = tokens[i + count]
                let gap = nextToken.start - triggerEnd
                if gap < config.minPostTriggerGap { continue }

                // Keep the match that starts latest in the transcript.
                if let bestIndex, i <= bestIndex { continue }
                bestIndex = i
                bestTriggerEnd = triggerEnd
                bestGap = gap
            }
        }

        guard let bestIndex else { return nil }
        let command = commandText(transcript: transcript, segments: segments, triggerEndTime: bestTriggerEnd)
            .trimmingCharacters(in: Self.whitespaceAndPunctuation)
        guard command.count >= config.minCommandLength else { return nil }
        return WakeWordGateMatch(triggerEndTime: bestTriggerEnd, postGap: bestGap, command: command)
    }

    /// Returns the text spoken after `triggerEndTime`. Slices the transcript from the first
    /// non-empty segment's range when available; otherwise joins the remaining segment texts.
    public static func commandText(
        transcript: String,
        segments: [WakeWordSegment],
        triggerEndTime: TimeInterval)
        -> String {
        // Small epsilon so a segment ending exactly at the trigger end is not counted.
        let threshold = triggerEndTime + 0.001
        for segment in segments where segment.start >= threshold {
            if normalizeToken(segment.text).isEmpty { continue }
            if let range = segment.range {
                let slice = transcript[range.lowerBound...]
                return String(slice).trimmingCharacters(in: Self.whitespaceAndPunctuation)
            }
            break
        }

        let text = segments
            .filter { $0.start >= threshold && !normalizeToken($0.text).isEmpty }
            .map { $0.text }
            .joined(separator: " ")
        return text.trimmingCharacters(in: Self.whitespaceAndPunctuation)
    }

    /// Case-insensitive substring check; no timing information is used.
    public static func matchesTextOnly(text: String, triggers: [String]) -> Bool {
        guard !text.isEmpty else { return false }
        let normalized = text.lowercased()
        for trigger in triggers {
            let token = trigger.trimmingCharacters(in: whitespaceAndPunctuation).lowercased()
            if token.isEmpty { continue }
            if normalized.contains(token) { return true }
        }
        return false
    }

    /// Removes every trigger occurrence (case-insensitive) and trims the result.
    public static func stripWake(text: String, triggers: [String]) -> String {
        var out = text
        for trigger in triggers {
            let token = trigger.trimmingCharacters(in: whitespaceAndPunctuation)
            guard !token.isEmpty else { continue }
            out = out.replacingOccurrences(of: token, with: "", options: [.caseInsensitive])
        }
        return out.trimmingCharacters(in: whitespaceAndPunctuation)
    }

    private static func normalizeTriggers(_ triggers: [String]) -> [TriggerTokens] {
        var output: [TriggerTokens] = []
        for trigger in triggers {
            let tokens = trigger
                .split(whereSeparator: { $0.isWhitespace })
                .map { normalizeToken(String($0)) }
                .filter { !$0.isEmpty }
            if tokens.isEmpty { continue }
            output.append(TriggerTokens(tokens: tokens))
        }
        return output
    }

    private static func normalizeSegments(_ segments: [WakeWordSegment]) -> [Token] {
        segments.compactMap { segment in
            let normalized = normalizeToken(segment.text)
            guard !normalized.isEmpty else { return nil }
            return Token(
                normalized: normalized,
                start: segment.start,
                end: segment.end,
                range: segment.range,
                text: segment.text)
        }
    }

    private static func normalizeToken(_ token: String) -> String {
        token
            .trimmingCharacters(in: whitespaceAndPunctuation)
            .lowercased()
    }

    private static let whitespaceAndPunctuation = CharacterSet.whitespacesAndNewlines
        .union(.punctuationCharacters)
}

#if canImport(Speech)
import Speech

public enum WakeWordSpeechSegments {
    /// Converts `SFTranscription` segments into `WakeWordSegment` values with transcript ranges.
    public static func from(transcription: SFTranscription, transcript: String) -> [WakeWordSegment] {
        transcription.segments.map { segment in
            let range = Range(segment.substringRange, in: transcript)
            return WakeWordSegment(
                text: segment.substring,
                start: segment.timestamp,
                duration: segment.duration,
                range: range)
        }
    }
}
#endif

View File

@@ -1,6 +1,7 @@
import Commander
import Foundation
@available(macOS 26.0, *)
@MainActor
enum CLIRegistry {
static var descriptors: [CommandDescriptor] {

View File

@@ -1,7 +1,9 @@
import Commander
import Foundation
import Swabble
import SwabbleKit
@available(macOS 26.0, *)
@MainActor
struct ServeCommand: ParsableCommand {
@Option(name: .long("config"), help: "Path to config JSON") var configPath: String?
@@ -68,17 +70,12 @@ struct ServeCommand: ParsableCommand {
}
private static func matchesWake(text: String, cfg: SwabbleConfig) -> Bool {
let lowered = text.lowercased()
if lowered.contains(cfg.wake.word.lowercased()) { return true }
return cfg.wake.aliases.contains(where: { lowered.contains($0.lowercased()) })
let triggers = [cfg.wake.word] + cfg.wake.aliases
return WakeWordGate.matchesTextOnly(text: text, triggers: triggers)
}
private static func stripWake(text: String, cfg: SwabbleConfig) -> String {
var out = text
out = out.replacingOccurrences(of: cfg.wake.word, with: "", options: [.caseInsensitive])
for alias in cfg.wake.aliases {
out = out.replacingOccurrences(of: alias, with: "", options: [.caseInsensitive])
}
return out.trimmingCharacters(in: .whitespacesAndNewlines)
let triggers = [cfg.wake.word] + cfg.wake.aliases
return WakeWordGate.stripWake(text: text, triggers: triggers)
}
}

View File

@@ -1,6 +1,7 @@
import Commander
import Foundation
@available(macOS 26.0, *)
@MainActor
private func runCLI() async -> Int32 {
do {
@@ -15,6 +16,7 @@ private func runCLI() async -> Int32 {
}
}
@available(macOS 26.0, *)
@MainActor
private func dispatch(invocation: CommandInvocation) async throws {
let parsed = invocation.parsedValues
@@ -95,5 +97,10 @@ private func dispatch(invocation: CommandInvocation) async throws {
}
}
let exitCode = await runCLI()
exit(exitCode)
if #available(macOS 26.0, *) {
let exitCode = await runCLI()
exit(exitCode)
} else {
fputs("error: swabble requires macOS 26 or newer\n", stderr)
exit(1)
}

View File

@@ -0,0 +1,63 @@
import Foundation
import Testing
import SwabbleKit

@Suite struct WakeWordGateTests {
    @Test func matchRequiresGapAfterTrigger() {
        let transcript = "hey clawd do thing"
        let segments = makeSegments(
            transcript: transcript,
            words: [
                ("hey", 0.0, 0.1),
                ("clawd", 0.2, 0.1),
                ("do", 0.35, 0.1),
                ("thing", 0.5, 0.1),
            ])
        let config = WakeWordGateConfig(triggers: ["clawd"], minPostTriggerGap: 0.3)
        #expect(WakeWordGate.match(transcript: transcript, segments: segments, config: config) == nil)
    }

    @Test func matchAllowsGapAndExtractsCommand() {
        let transcript = "hey clawd do thing"
        let segments = makeSegments(
            transcript: transcript,
            words: [
                ("hey", 0.0, 0.1),
                ("clawd", 0.2, 0.1),
                ("do", 0.9, 0.1),
                ("thing", 1.1, 0.1),
            ])
        let config = WakeWordGateConfig(triggers: ["clawd"], minPostTriggerGap: 0.3)
        let match = WakeWordGate.match(transcript: transcript, segments: segments, config: config)
        #expect(match?.command == "do thing")
    }

    @Test func matchHandlesMultiWordTriggers() {
        let transcript = "hey clawd do it"
        let segments = makeSegments(
            transcript: transcript,
            words: [
                ("hey", 0.0, 0.1),
                ("clawd", 0.2, 0.1),
                ("do", 0.8, 0.1),
                ("it", 1.0, 0.1),
            ])
        let config = WakeWordGateConfig(triggers: ["hey clawd"], minPostTriggerGap: 0.3)
        let match = WakeWordGate.match(transcript: transcript, segments: segments, config: config)
        #expect(match?.command == "do it")
    }
}

private func makeSegments(
    transcript: String,
    words: [(String, TimeInterval, TimeInterval)])
    -> [WakeWordSegment] {
    var searchStart = transcript.startIndex
    var output: [WakeWordSegment] = []
    for (word, start, duration) in words {
        let range = transcript.range(of: word, range: searchStart..<transcript.endIndex)
        output.append(WakeWordSegment(text: word, start: start, duration: duration, range: range))
        if let range { searchStart = range.upperBound }
    }
    return output
}

View File

@@ -1,11 +1,12 @@
# swabble — macOS 26 speech hook daemon (Swift 6.2)
Goal: brabble-style always-on voice hook for macOS 26 using Apple Speech.framework (SpeechAnalyzer + SpeechTranscriber) instead of whisper.cpp. Local-only, wake word gated, dispatches a shell hook with the transcript.
Goal: brabble-style always-on voice hook for macOS 26 using Apple Speech.framework (SpeechAnalyzer + SpeechTranscriber) instead of whisper.cpp. Local-only, wake word gated, dispatches a shell hook with the transcript. Shared wake-gate utilities live in `SwabbleKit` for reuse by other apps (iOS/macOS).
## Requirements
- macOS 26+, Swift 6.2, Speech.framework with on-device assets.
- Local only; no network calls during transcription.
- Wake word gating (default "clawd" plus aliases) with bypass flag `--no-wake`.
- `SwabbleKit` target (multi-platform) providing wake-word gating helpers that can use speech segment timing to require a post-trigger gap.
- Hook execution with cooldown, min_chars, timeout, prefix, env vars.
- Simple config at `~/.config/swabble/config.json` (JSON, Codable) — no TOML.
- CLI implemented with Commander (SwiftPM package `steipete/Commander`); core types are available via the SwiftPM library product `Swabble` for embedding.
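
The JSON, Codable config requirement above could look roughly like the sketch below. The type names `ConfigSketch` and `WakeSection`, the JSON key names, and `loadConfig(from:)` are assumptions for illustration; only the `wake.word` and `wake.aliases` properties are visible in the CLI code, and the real `SwabbleConfig` carries more sections.

```swift
import Foundation

// Hypothetical subset of the config; the real `SwabbleConfig` has more fields.
struct WakeSection: Codable, Equatable {
    var enabled: Bool
    var word: String
    var aliases: [String]
}

struct ConfigSketch: Codable, Equatable {
    var wake: WakeSection
}

// Example payload with assumed key names.
let json = """
{ "wake": { "enabled": true, "word": "clawd", "aliases": ["claude"] } }
"""

/// Decodes the sketch config from raw JSON data.
func loadConfig(from data: Data) throws -> ConfigSketch {
    try JSONDecoder().decode(ConfigSketch.self, from: data)
}

let config = try? loadConfig(from: Data(json.utf8))

// Word plus aliases form the trigger list handed to the wake gate.
let allTriggers = config.map { [$0.wake.word] + $0.wake.aliases } ?? []
```

Plain `Codable` keeps the loader dependency-free, which fits the "no TOML" requirement.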
@@ -17,7 +18,7 @@ Goal: brabble-style always-on voice hook for macOS 26 using Apple Speech.framewo
- **CLI layer (Commander)**: Root command `swabble` with subcommands `serve`, `transcribe`, `test-hook`, `mic list|set`, `doctor`, `health`, `tail-log`. Runtime flags from Commander (`-v/--verbose`, `--json-output`, `--log-level`). Custom `--config` path applies everywhere.
- **Config**: `SwabbleConfig` Codable. Fields: audio device name/index, wake (enabled/word/aliases/sensitivity placeholder), hook (command/args/prefix/cooldown/min_chars/timeout/env), logging (level, format), transcripts (enabled, max kept), speech (locale, enableEtiquetteReplacements flag). Stored JSON; default written by `setup`.
- **Audio + Speech pipeline**: `SpeechPipeline` wraps `AVAudioEngine` input → `SpeechAnalyzer` with `SpeechTranscriber` module. Emits partial/final transcripts via async stream. Requests `.audioTimeRange` when transcripts enabled. Handles Speech permission and asset download prompts ahead of capture.
- **Wake gate**: text-based keyword match against latest partial/final; strips wake term before hook dispatch. `--no-wake` disables.
- **Wake gate**: CLI currently uses text-only keyword match; shared `SwabbleKit` gate can enforce a minimum pause between the wake word and the next token when speech segments are available. `--no-wake` disables gating.
- **Hook executor**: async `HookExecutor` spawns `Process` with configured args, prefix substitution `${hostname}`. Enforces cooldown + timeout; injects env `SWABBLE_TEXT`, `SWABBLE_PREFIX` plus user env map.
- **Transcripts store**: in-memory ring buffer; optional persisted JSON lines under `~/Library/Application Support/swabble/transcripts.log`.
- **Logging**: simple structured logger to stderr; respects log level.
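
The cooldown and `min_chars` checks described for the hook executor can be sketched as a small value type. This is a hypothetical illustration under assumed names (`HookGate`, `shouldFire`), not the actual `HookExecutor`, and it leaves out process spawning and timeouts.

```swift
import Foundation

// Hypothetical gate, not the real `HookExecutor`: decides whether a transcript may fire the hook.
struct HookGate {
    let cooldown: TimeInterval
    let minChars: Int
    private(set) var lastFired: Date?

    init(cooldown: TimeInterval, minChars: Int) {
        self.cooldown = cooldown
        self.minChars = minChars
    }

    /// Returns true and records `now` when the text is long enough and the cooldown has elapsed.
    mutating func shouldFire(text: String, now: Date) -> Bool {
        guard text.count >= minChars else { return false }
        if let lastFired, now.timeIntervalSince(lastFired) < cooldown { return false }
        lastFired = now
        return true
    }
}

var gate = HookGate(cooldown: 5, minChars: 3)
let t0 = Date(timeIntervalSince1970: 0)

let first = gate.shouldFire(text: "open the garage", now: t0)                // allowed
let tooSoon = gate.shouldFire(text: "again", now: t0.addingTimeInterval(2))  // inside the cooldown
let tooShort = gate.shouldFire(text: "ok", now: t0.addingTimeInterval(10))   // below minChars
let later = gate.shouldFire(text: "again", now: t0.addingTimeInterval(10))   // allowed again
```

Passing `now` in explicitly keeps the gate deterministic and easy to test; a rejected call does not reset the cooldown.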
@@ -25,7 +26,7 @@ Goal: brabble-style always-on voice hook for macOS 26 using Apple Speech.framewo
## Out of scope (initial cut)
- Model management (Speech handles assets).
- Launchd helper (planned follow-up).
- Advanced wake-word detector (text match only for now).
- Advanced wake-word detector (segment-aware gate now lives in `SwabbleKit`; CLI still text-only until segment timing is plumbed through).
## Open decisions
- Whether to expose a UNIX control socket for `status`/`health` (currently planned as stdin/out direct calls).