feat(discovery): gateway bonjour + node pairing bridge

This commit is contained in:
Peter Steinberger
2025-12-13 03:47:27 +00:00
parent 163080b609
commit eace21dcae
18 changed files with 1780 additions and 29 deletions

106
docs/discovery.md Normal file
View File

@@ -0,0 +1,106 @@
---
summary: "Node discovery and transports (Bonjour, Tailscale, SSH) for finding the master gateway"
read_when:
- Implementing or changing Bonjour discovery/advertising
- Adjusting remote connection modes (direct vs SSH)
- Designing bridge + pairing for remote nodes
---
# Discovery & transports
Clawdis has two distinct problems that look similar on the surface:
1) **Operator remote control**: the macOS menu bar app controlling a “master” gateway running elsewhere.
2) **Node pairing**: Iris/iOS (and future nodes) finding a gateway and pairing securely.
The design goal is to keep all network discovery/advertising in the **Node Gateway** (`clawd` / `clawdis gateway`) and keep clients (mac app, iOS) as consumers.
## Terms
- **Master gateway**: the single, long-running gateway process that owns state (sessions, pairing, node registry) and runs providers.
- **Gateway WS (loopback)**: the existing gateway WebSocket control endpoint on `127.0.0.1:18789`.
- **Bridge (direct transport)**: a LAN/tailnet-facing endpoint owned by the gateway that allows authenticated clients/nodes to call a scoped subset of gateway methods. The bridge exists so the gateway can remain loopback-only.
- **SSH transport (fallback)**: remote control by forwarding `127.0.0.1:18789` over SSH.
## Why we keep both “direct” and SSH
- **Direct bridge** is the best UX on the same network and within a tailnet:
- auto-discovery on LAN via Bonjour
- pairing tokens + ACLs owned by the gateway
- no shell access required; protocol surface can stay tight and auditable
- **SSH** remains the universal fallback:
- works anywhere you have SSH access (even across unrelated networks)
- survives multicast/mDNS issues
- requires no new inbound ports besides SSH
## Discovery inputs (how clients learn where the master is)
### 1) Bonjour / mDNS (LAN only)
Bonjour is best-effort and does not cross networks. It is only used for “same LAN” convenience.
Target direction:
- The **gateway** advertises itself (and/or its bridge) via Bonjour.
- Clients browse and show a “pick a master” list, then store the chosen endpoint.
#### Current implementation
- Service types:
- `_clawdis-master._tcp` (gateway “master” beacon)
- `_clawdis-bridge._tcp` (optional; bridge transport beacon)
- TXT keys (non-secret):
- `role=master`
- `lanHost=<hostname>.local`
- `sshPort=22` (or whatever is advertised)
- `gatewayPort=18789` (loopback WS port; informational)
- `bridgePort=18790` (when bridge is enabled)
- `tailnetDns=<magicdns>` (optional hint)
Disable/override:
- `CLAWDIS_DISABLE_BONJOUR=1` disables advertising.
- `CLAWDIS_BRIDGE_ENABLED=0` disables the bridge listener.
- `CLAWDIS_BRIDGE_HOST` / `CLAWDIS_BRIDGE_PORT` control bind/port.
### 2) Tailnet (cross-network)
For London/Vienna style setups, Bonjour wont help. The recommended “direct” target is:
- Tailscale MagicDNS name (preferred) or a stable tailnet IP.
If the gateway can detect it is running under Tailscale, it can publish `tailnetDns` as an optional hint for clients.
### 3) Manual / SSH target
When there is no direct route (or direct is disabled), clients can always connect via SSH by forwarding the loopback gateway port.
See `docs/remote.md`.
## Transport selection (client policy)
Recommended client behavior:
1) If a paired direct endpoint is configured and reachable, use it.
2) Else, if Bonjour finds a master on LAN, offer a one-tap “Use this master” choice and save it as the direct endpoint.
3) Else, if a tailnet DNS/IP is configured, try direct.
4) Else, fall back to SSH.
## Pairing + auth (direct transport)
The gateway is the source of truth for node/client admission.
- Pairing requests are created/approved/rejected in the gateway (see `docs/gateway/pairing.md`).
- The bridge enforces:
- auth (token / keypair)
- scopes/ACLs (bridge is not a raw proxy to every gateway method)
- rate limits
## Where the code lives (target architecture)
- Node gateway:
- advertises discovery beacons (Bonjour)
- owns pairing storage + decisions
- runs the bridge listener (direct transport)
- macOS app:
- UI for picking a master, showing pairing prompts, and troubleshooting
- SSH tunneling only for the fallback path
- iOS node:
- browses Bonjour (LAN) as a convenience only
- uses direct transport + pairing to connect to the gateway