feat: add reply tags and replyToMode

Peter Steinberger
2026-01-02 23:18:41 +01:00
parent a9ff03acaf
commit 2c92ccd66e
19 changed files with 353 additions and 27 deletions


@@ -169,6 +169,7 @@ Set `telegram.enabled: false` to disable automatic startup.
telegram: {
enabled: true,
botToken: "your-bot-token",
replyToMode: "off",
groups: {
"*": { requireMention: true },
"123456789": { requireMention: false } // group chat id
@@ -183,6 +184,7 @@ Set `telegram.enabled: false` to disable automatic startup.
}
```
Mention gating precedence (most specific wins): `telegram.groups.<chatId>.requireMention` → `telegram.groups."*".requireMention` → default `true`.
Reply threading is controlled via `telegram.replyToMode` (`off` | `first` | `all`) and reply tags in the model output.
### `discord` (bot transport)
@@ -195,6 +197,7 @@ Configure the Discord bot by setting the bot token and optional gating:
token: "your-bot-token",
mediaMaxMb: 8, // clamp inbound media size
enableReactions: true, // allow agent-triggered reactions
replyToMode: "off", // off | first | all
slashCommand: { // user-installed app slash commands
enabled: true,
name: "clawd",
@@ -225,6 +228,7 @@ Configure the Discord bot by setting the bot token and optional gating:
```
Clawdis starts Discord only when a `discord` config section exists. The token is resolved from `DISCORD_BOT_TOKEN` or `discord.token` (unless `discord.enabled` is `false`). Use `user:<id>` (DM) or `channel:<id>` (guild channel) when specifying delivery targets for cron/CLI commands.
Reply threading is controlled via `discord.replyToMode` (`off` | `first` | `all`) and reply tags in the model output.
Guild slugs are lowercase with spaces replaced by `-`; channel keys use the slugged channel name (no leading `#`). Prefer guild ids as keys to avoid rename ambiguity.
Use `discord.guilds."*"` for default per-guild settings.
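The slug rule described above can be sketched as follows. This is a minimal illustration, not the project's implementation: `slugifyGuildName` and `channelKey` are hypothetical names, and collapsing runs of whitespace into a single `-` is an assumption beyond what the text states.

```typescript
// Hypothetical helper: applies only the documented rule
// (lowercase, spaces replaced by "-"). Collapsing repeated
// whitespace into one "-" is an assumption.
function slugifyGuildName(name: string): string {
  return name.trim().toLowerCase().replace(/\s+/g, "-");
}

// Hypothetical helper: channel keys use the slugged channel
// name without a leading "#".
function channelKey(name: string): string {
  return slugifyGuildName(name.trim().replace(/^#/, ""));
}
```

Because a renamed guild produces a different slug, keying the config by guild id stays stable where a slug key would not.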


@@ -40,6 +40,7 @@ Note: Guild context `[from:]` lines include `author.tag` + `id` to make ping-rea
- File uploads supported up to the configured `discord.mediaMaxMb` (default 8 MB).
- Mention-gated guild replies by default to avoid noisy bots.
- Reply context is injected when a message references another message (quoted content + ids).
- Native reply threading is **off by default**; enable with `discord.replyToMode` and reply tags.
## Config
@@ -50,6 +51,7 @@ Note: Guild context `[from:]` lines include `author.tag` + `id` to make ping-rea
token: "abc.123",
mediaMaxMb: 8,
enableReactions: true,
replyToMode: "off",
slashCommand: {
enabled: true,
name: "clawd",
@@ -92,6 +94,18 @@ Note: Guild context `[from:]` lines include `author.tag` + `id` to make ping-rea
- `mediaMaxMb`: clamp inbound media saved to disk.
- `historyLimit`: number of recent guild messages to include as context when replying to a mention (default 20, `0` disables).
- `enableReactions`: allow agent-triggered reactions via the `clawdis_discord` tool (default `true`).
- `replyToMode`: `off` (default), `first`, or `all`. Applies only when the model includes a reply tag.
## Reply tags
To request a threaded reply, the model can include one tag in its output:
- `[[reply_to_current]]` — reply to the triggering Discord message.
- `[[reply_to:<id>]]` — reply to a specific message id from context/history.
Current message ids are appended to prompts as `[message_id: …]`; history entries already include ids.
Behavior is controlled by `discord.replyToMode`:
- `off`: ignore tags.
- `first`: only the first outbound chunk/attachment is a reply.
- `all`: every outbound chunk/attachment is a reply.
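As a rough sketch of how the two tag forms could be pulled out of model output, the snippet below extracts the first tag and strips all tags from the visible text. The names `parseReplyTag`, `ReplyTarget`, and `TAG_SOURCE` are hypothetical, and both the whitespace tolerance inside the brackets and the "first tag wins" rule are assumptions, not documented behavior.

```typescript
// Hypothetical types: what the model asked to reply to.
type ReplyTarget =
  | { kind: "none" }
  | { kind: "current" }
  | { kind: "id"; id: string };

interface ParsedReply {
  text: string; // model output with reply tags removed
  target: ReplyTarget;
}

// Matches [[reply_to_current]] and [[reply_to:<id>]].
const TAG_SOURCE = String.raw`\[\[\s*reply_to(?:_current|:\s*([^\]\s]+))\s*\]\]`;

function parseReplyTag(output: string): ParsedReply {
  // Assumption: if several tags appear, the first one decides the target.
  const first = new RegExp(TAG_SOURCE).exec(output);
  let target: ReplyTarget = { kind: "none" };
  if (first) {
    target = first[1] ? { kind: "id", id: first[1] } : { kind: "current" };
  }
  // Remove every tag so none leaks into the delivered message.
  const text = output.replace(new RegExp(TAG_SOURCE, "g"), "").trim();
  return { text, target };
}
```

A `current` target would then be resolved to the id of the triggering message before sending.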
Allowlist matching notes:
- `allowFrom`/`users`/`groupChannels` accept ids, names, tags, or mentions like `<@id>`.


@@ -113,5 +113,6 @@ pnpm clawdis health
- `docs/gateway.md` (Gateway runbook; flags, supervision, ports)
- `docs/configuration.md` (config schema + examples)
- `docs/discord.md` and `docs/telegram.md` (reply tags + replyToMode settings)
- `docs/clawd.md` (personal assistant setup)
- `docs/clawdis-mac.md` (macOS app behavior; gateway lifecycle + “Attach only”)


@@ -31,13 +31,13 @@ Status: ready for bot-mode use with grammY (long-polling by default; webhook sup
- Sees only messages sent after it's added to a chat; no pre-history access.
- Cannot DM users first; they must initiate. Channels are receive-only unless the bot is an admin poster.
- File size caps follow Telegram Bot API (up to 2 GB for documents; smaller for some media types).
-- Typing indicators (`sendChatAction`) supported; outbound replies are sent as native replies to the triggering message (threaded where Telegram allows).
+- Typing indicators (`sendChatAction`) supported; native replies are **off by default** and enabled via `telegram.replyToMode` + reply tags.
## Planned implementation details
- Library: grammY is the only client for send + gateway (fetch fallback removed); grammY throttler is enabled by default to stay under Bot API limits.
-- Inbound normalization: maps Bot API updates to `MsgContext` with `Surface: "telegram"`, `ChatType: direct|group`, `SenderName`, `MediaPath`/`MediaType` when attachments arrive, `Timestamp`, and reply-to metadata (`ReplyToId`, `ReplyToBody`, `ReplyToSender`) when the user replies; reply context is appended to `Body` as a `[Replying to ...]` block; groups require @bot mention by default (override per chat in config).
+- Inbound normalization: maps Bot API updates to `MsgContext` with `Surface: "telegram"`, `ChatType: direct|group`, `SenderName`, `MediaPath`/`MediaType` when attachments arrive, `Timestamp`, and reply-to metadata (`ReplyToId`, `ReplyToBody`, `ReplyToSender`) when the user replies; reply context is appended to `Body` as a `[Replying to ...]` block (includes `id:` when available); groups require @bot mention by default (override per chat in config).
- Outbound: text and media (photo/video/audio/document) with optional caption; chunked to limits. Typing cue sent best-effort.
-- Config: `TELEGRAM_BOT_TOKEN` env or `telegram.botToken` required; `telegram.groups`, `telegram.allowFrom`, `telegram.mediaMaxMb`, `telegram.proxy`, `telegram.webhookSecret`, `telegram.webhookUrl`, `telegram.webhookPath` supported.
+- Config: `TELEGRAM_BOT_TOKEN` env or `telegram.botToken` required; `telegram.groups`, `telegram.allowFrom`, `telegram.mediaMaxMb`, `telegram.replyToMode`, `telegram.proxy`, `telegram.webhookSecret`, `telegram.webhookUrl`, `telegram.webhookPath` supported.
- Mention gating precedence (most specific wins): `telegram.groups.<chatId>.requireMention` → `telegram.groups."*".requireMention` → default `true`.
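The mention gating precedence can be read as a chain of fallbacks. The sketch below assumes a config shape matching the examples in this doc; `resolveRequireMention`, `GroupConfig`, and `GroupsConfig` are hypothetical names used only for illustration.

```typescript
// Hypothetical shapes mirroring the `telegram.groups` examples.
interface GroupConfig {
  requireMention?: boolean;
}
type GroupsConfig = Record<string, GroupConfig | undefined>;

// Most specific wins: per-chat entry, then the "*" wildcard, then `true`.
function resolveRequireMention(
  groups: GroupsConfig | undefined,
  chatId: string,
): boolean {
  return (
    groups?.[chatId]?.requireMention ??
    groups?.["*"]?.requireMention ??
    true
  );
}
```

Using `??` rather than `||` matters here: an explicit `requireMention: false` on a chat must not fall through to the wildcard or the default.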
Example config:
@@ -46,6 +46,7 @@ Example config:
telegram: {
enabled: true,
botToken: "123:abc",
replyToMode: "off",
groups: {
"*": { requireMention: true },
"123456789": { requireMention: false } // group chat id
@@ -66,6 +67,17 @@ Example config:
- Make the bot an admin if you need it to send in restricted groups or channels.
- Mention the bot (`@yourbot`) or use commands to trigger; per-group overrides live in `telegram.groups` if you want always-on behavior.
## Reply tags
To request a threaded reply, the model can include one tag in its output:
- `[[reply_to_current]]` — reply to the triggering Telegram message.
- `[[reply_to:<id>]]` — reply to a specific message id from context.
Current message ids are appended to prompts as `[message_id: …]`; reply context includes `id:` when available.
Behavior is controlled by `telegram.replyToMode`:
- `off`: ignore tags.
- `first`: only the first outbound chunk/attachment is a reply.
- `all`: every outbound chunk/attachment is a reply.
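One way to picture the three modes is as a function from chunk index to reply target. This is a sketch under assumptions: `replyTargetsForChunks` is a hypothetical helper, and it presumes the reply tag has already been resolved to a message id (or to `undefined` when the model emitted no tag).

```typescript
type ReplyToMode = "off" | "first" | "all";

// Hypothetical helper: for each outbound chunk/attachment, returns the
// message id it should reply to, or undefined for a plain message.
function replyTargetsForChunks(
  chunkCount: number,
  replyToId: string | undefined,
  mode: ReplyToMode,
): (string | undefined)[] {
  return Array.from({ length: chunkCount }, (_unused, index) => {
    // No tag in the model output, or tags ignored: never reply natively.
    if (!replyToId || mode === "off") return undefined;
    // "first": only chunk 0 carries the reply reference.
    if (mode === "first") return index === 0 ? replyToId : undefined;
    // "all": every chunk carries it.
    return replyToId;
  });
}
```

With three chunks and a resolved id of `"42"`, `first` marks only the first chunk as a reply, `all` marks all three, and `off` marks none.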
## Roadmap
- ✅ Design and defaults (this doc)
- ✅ grammY long-poll gateway + text/media send


@@ -54,7 +54,7 @@ WhatsApp requires a real mobile number for verification. VoIP and virtual number
- `Body` is the current message body with envelope.
- Quoted reply context is **always appended**:
```
-[Replying to +1555]
+[Replying to +1555 id:ABC123]
<quoted text or <media:...>>
[/Replying]
```
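For illustration, a block in the format above could be assembled like this. `formatReplyContext` is a hypothetical helper, not the gateway's actual function; it only reproduces the layout shown, including the optional `id:` suffix.

```typescript
// Hypothetical helper: builds the quoted-reply context block.
// The `id:` suffix is included only when a message id is available.
function formatReplyContext(sender: string, body: string, id?: string): string {
  const header = id
    ? `[Replying to ${sender} id:${id}]`
    : `[Replying to ${sender}]`;
  return `${header}\n${body}\n[/Replying]`;
}
```

Calling `formatReplyContext("+1555", "see you at 5", "ABC123")` yields a three-line block whose first line is `[Replying to +1555 id:ABC123]`.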
@@ -81,8 +81,8 @@ WhatsApp requires a real mobile number for verification. VoIP and virtual number
- Group metadata cached 5 min (subject + participants).
## Reply delivery (threading)
-- Outbound replies are sent as **native replies** (quoted message).
-- Model does not need IDs for threading; gateway attaches quote.
+- WhatsApp Web sends standard messages (no quoted reply threading in the current gateway).
+- Reply tags are ignored on this surface.
## Outbound send (text + media)
- Uses active web listener; error if gateway not running.