feat: add reply tags and replyToMode

2026-01-02 23:18:41 +01:00
parent a9ff03acaf
commit 2c92ccd66e
19 changed files with 353 additions and 27 deletions
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -169,6 +169,7 @@ Set `telegram.enabled: false` to disable automatic startup.
  telegram: {
    enabled: true,
    botToken: "your-bot-token",
+    replyToMode: "off",
    groups: {
      "*": { requireMention: true },
      "123456789": { requireMention: false } // group chat id
@@ -183,6 +184,7 @@ Set `telegram.enabled: false` to disable automatic startup.
 }
 ```
 Mention gating precedence (most specific wins): `telegram.groups.<chatId>.requireMention` → `telegram.groups."*".requireMention` → default `true`.
+Reply threading is controlled via `telegram.replyToMode` (`off` | `first` | `all`) and reply tags in the model output.

 ### `discord` (bot transport)

@@ -195,6 +197,7 @@ Configure the Discord bot by setting the bot token and optional gating:
    token: "your-bot-token",
    mediaMaxMb: 8,                          // clamp inbound media size
    enableReactions: true,                  // allow agent-triggered reactions
+    replyToMode: "off",                     // off | first | all
    slashCommand: {                         // user-installed app slash commands
      enabled: true,
      name: "clawd",
@@ -225,6 +228,7 @@ Configure the Discord bot by setting the bot token and optional gating:
 ```

 Clawdis starts Discord only when a `discord` config section exists. The token is resolved from `DISCORD_BOT_TOKEN` or `discord.token` (unless `discord.enabled` is `false`). Use `user:<id>` (DM) or `channel:<id>` (guild channel) when specifying delivery targets for cron/CLI commands.
+Reply threading is controlled via `discord.replyToMode` (`off` | `first` | `all`) and reply tags in the model output.
 Guild slugs are lowercase with spaces replaced by `-`; channel keys use the slugged channel name (no leading `#`). Prefer guild ids as keys to avoid rename ambiguity.
 Use `discord.guilds."*"` for default per-guild settings.

--- a/docs/discord.md
+++ b/docs/discord.md
@@ -40,6 +40,7 @@ Note: Guild context `[from:]` lines include `author.tag` + `id` to make ping-rea
 - File uploads supported up to the configured `discord.mediaMaxMb` (default 8 MB).
 - Mention-gated guild replies by default to avoid noisy bots.
 - Reply context is injected when a message references another message (quoted content + ids).
+- Native reply threading is **off by default**; enable it by setting `discord.replyToMode` to `first` or `all` and having the model emit reply tags.

 ## Config

@@ -50,6 +51,7 @@ Note: Guild context `[from:]` lines include `author.tag` + `id` to make ping-rea
    token: "abc.123",
    mediaMaxMb: 8,
    enableReactions: true,
+    replyToMode: "off",
    slashCommand: {
      enabled: true,
      name: "clawd",
@@ -92,6 +94,18 @@ Note: Guild context `[from:]` lines include `author.tag` + `id` to make ping-rea
 - `mediaMaxMb`: clamp inbound media saved to disk.
 - `historyLimit`: number of recent guild messages to include as context when replying to a mention (default 20, `0` disables).
 - `enableReactions`: allow agent-triggered reactions via the `clawdis_discord` tool (default `true`).
+- `replyToMode`: `off` (default), `first`, or `all`. Takes effect only when the model output includes a reply tag.
+
+## Reply tags
+To request a threaded reply, the model can include one tag in its output:
+- `[[reply_to_current]]` — reply to the triggering Discord message.
+- `[[reply_to:<id>]]` — reply to a specific message id from context/history.
+Current message ids are appended to prompts as `[message_id: …]`; history entries already include ids.
+
+Behavior is controlled by `discord.replyToMode`:
+- `off`: ignore tags.
+- `first`: only the first outbound chunk/attachment is a reply.
+- `all`: every outbound chunk/attachment is a reply.

 Allowlist matching notes:
 - `allowFrom`/`users`/`groupChannels` accept ids, names, tags, or mentions like `<@id>`.
--- a/docs/setup.md
+++ b/docs/setup.md
@@ -113,5 +113,6 @@ pnpm clawdis health

 - `docs/gateway.md` (Gateway runbook; flags, supervision, ports)
 - `docs/configuration.md` (config schema + examples)
+- `docs/discord.md` and `docs/telegram.md` (reply tags + replyToMode settings)
 - `docs/clawd.md` (personal assistant setup)
 - `docs/clawdis-mac.md` (macOS app behavior; gateway lifecycle + “Attach only”)
--- a/docs/telegram.md
+++ b/docs/telegram.md
@@ -31,13 +31,13 @@ Status: ready for bot-mode use with grammY (long-polling by default; webhook sup
 - Sees only messages sent after it’s added to a chat; no pre-history access.
 - Cannot DM users first; they must initiate. Channels are receive-only unless the bot is an admin poster.
 - File size caps follow Telegram Bot API (up to 2 GB for documents; smaller for some media types).
- Typing indicators (`sendChatAction`) supported; outbound replies are sent as native replies to the triggering message (threaded where Telegram allows).
+- Typing indicators (`sendChatAction`) supported; native replies are **off by default** and are enabled by setting `telegram.replyToMode` to `first` or `all` and having the model emit reply tags.

 ## Planned implementation details
 - Library: grammY is the only client for send + gateway (fetch fallback removed); grammY throttler is enabled by default to stay under Bot API limits.
- Inbound normalization: maps Bot API updates to `MsgContext` with `Surface: "telegram"`, `ChatType: direct|group`, `SenderName`, `MediaPath`/`MediaType` when attachments arrive, `Timestamp`, and reply-to metadata (`ReplyToId`, `ReplyToBody`, `ReplyToSender`) when the user replies; reply context is appended to `Body` as a `[Replying to ...]` block; groups require @bot mention by default (override per chat in config).
+- Inbound normalization: maps Bot API updates to `MsgContext` with `Surface: "telegram"`, `ChatType: direct|group`, `SenderName`, `MediaPath`/`MediaType` when attachments arrive, `Timestamp`, and reply-to metadata (`ReplyToId`, `ReplyToBody`, `ReplyToSender`) when the user replies; reply context is appended to `Body` as a `[Replying to ...]` block (includes `id:` when available); groups require @bot mention by default (override per chat in config).
 - Outbound: text and media (photo/video/audio/document) with optional caption; chunked to limits. Typing cue sent best-effort.
- Config: `TELEGRAM_BOT_TOKEN` env or `telegram.botToken` required; `telegram.groups`, `telegram.allowFrom`, `telegram.mediaMaxMb`, `telegram.proxy`, `telegram.webhookSecret`, `telegram.webhookUrl`, `telegram.webhookPath` supported.
+- Config: `TELEGRAM_BOT_TOKEN` env or `telegram.botToken` required; `telegram.groups`, `telegram.allowFrom`, `telegram.mediaMaxMb`, `telegram.replyToMode`, `telegram.proxy`, `telegram.webhookSecret`, `telegram.webhookUrl`, `telegram.webhookPath` supported.
  - Mention gating precedence (most specific wins): `telegram.groups.<chatId>.requireMention` → `telegram.groups."*".requireMention` → default `true`.

 Example config:
@@ -46,6 +46,7 @@ Example config:
  telegram: {
    enabled: true,
    botToken: "123:abc",
+    replyToMode: "off",
    groups: {
      "*": { requireMention: true },
      "123456789": { requireMention: false } // group chat id
@@ -66,6 +67,17 @@ Example config:
 - Make the bot an admin if you need it to send in restricted groups or channels.
 - Mention the bot (`@yourbot`) or use commands to trigger; per-group overrides live in `telegram.groups` if you want always-on behavior.

+## Reply tags
+To request a threaded reply, the model can include one tag in its output:
+- `[[reply_to_current]]` — reply to the triggering Telegram message.
+- `[[reply_to:<id>]]` — reply to a specific message id from context.
+Current message ids are appended to prompts as `[message_id: …]`; reply context includes `id:` when available.
+
+Behavior is controlled by `telegram.replyToMode`:
+- `off`: ignore tags.
+- `first`: only the first outbound chunk/attachment is a reply.
+- `all`: every outbound chunk/attachment is a reply.
+
 ## Roadmap
 - ✅ Design and defaults (this doc)
 - ✅ grammY long-poll gateway + text/media send
--- a/docs/whatsapp.md
+++ b/docs/whatsapp.md
@@ -54,7 +54,7 @@ WhatsApp requires a real mobile number for verification. VoIP and virtual number
 - `Body` is the current message body with envelope.
 - Quoted reply context is **always appended**:
  ```
-  [Replying to +1555]
+  [Replying to +1555 id:ABC123]
  <quoted text or <media:...>>
  [/Replying]
  ```
@@ -81,8 +81,8 @@ WhatsApp requires a real mobile number for verification. VoIP and virtual number
 - Group metadata cached 5 min (subject + participants).

 ## Reply delivery (threading)
- Outbound replies are sent as **native replies** (quoted message).
- Model does not need IDs for threading; gateway attaches quote.
+- WhatsApp Web sends standard (non-quoted) messages; the current gateway does not thread replies.
+- Reply tags are ignored on this surface.

 ## Outbound send (text + media)
 - Uses active web listener; error if gateway not running.
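
Illustrative note on the reply tag and `replyToMode` semantics documented in the `docs/discord.md` and `docs/telegram.md` hunks above: the sketch below shows one way a tag could be parsed out of the model output and then applied per outbound chunk. It is a minimal TypeScript sketch, not the code from this commit. `ParsedReply`, `parseReplyTag`, `replyTargetsForChunks`, and the regular expressions are hypothetical, and it assumes that only the first tag counts if the output contains more than one.

```ts
// Hypothetical sketch; names are illustrative and not taken from the Clawdis source.
type ReplyToMode = "off" | "first" | "all";

interface ParsedReply {
  text: string; // model output with reply tags stripped
  replyToId?: string; // message id to reply to, if a tag was present
}

// Extracts the first reply tag and removes all reply tags from the text.
function parseReplyTag(output: string, currentMessageId?: string): ParsedReply {
  const match = /\[\[(reply_to_current|reply_to:([^\]]+))\]\]/.exec(output);
  const text = output
    .replace(/\[\[(?:reply_to_current|reply_to:[^\]]+)\]\]/g, "")
    .trim();
  if (!match) return { text };
  const replyToId =
    match[1] === "reply_to_current" ? currentMessageId : match[2]?.trim();
  return { text, replyToId };
}

// Decides, per outbound chunk/attachment, which message id (if any) to reply to.
function replyTargetsForChunks(
  chunkCount: number,
  mode: ReplyToMode,
  replyToId?: string,
): (string | undefined)[] {
  return Array.from({ length: chunkCount }, (_unused, index) => {
    if (!replyToId || mode === "off") return undefined; // tags ignored
    if (mode === "first") return index === 0 ? replyToId : undefined;
    return replyToId; // mode === "all"
  });
}
```

Under these assumptions, `parseReplyTag("[[reply_to_current]] Hello", "42")` yields the text `Hello` with reply target `42`. With mode `first`, only the first of several chunks carries that target. With `off`, no chunk does, even when a tag is present.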