refactor: unify markdown formatting pipeline
This commit is contained in:
34
docs/concepts/markdown-formatting.md
Normal file
34
docs/concepts/markdown-formatting.md
Normal file
@@ -0,0 +1,34 @@
|
||||
---
|
||||
summary: "Markdown formatting pipeline for outbound channels"
|
||||
read_when:
|
||||
- You are changing markdown formatting or chunking for outbound channels
|
||||
- You are adding a new channel formatter or style mapping
|
||||
---
|
||||
# Markdown formatting
|
||||
|
||||
Clawdbot formats outbound Markdown by converting it into a shared intermediate
|
||||
representation (IR) before rendering channel-specific output.
|
||||
|
||||
## Pipeline
|
||||
|
||||
1. **Parse Markdown -> IR**
|
||||
- IR is plain text plus style spans (bold/italic/strike/code/spoiler) and link spans.
|
||||
- Offsets are UTF-16 code units so Signal style ranges align with its API.
|
||||
2. **Chunk IR (format-first)**
|
||||
- Chunking happens on the IR text before rendering.
|
||||
- Inline formatting does not split across chunks; spans are sliced per chunk.
|
||||
3. **Render per channel**
|
||||
- **Slack:** mrkdwn tokens (bold/italic/strike/code), links as `<url|label>`.
|
||||
- **Telegram:** HTML tags (`<b>`, `<i>`, `<s>`, `<code>`, `<pre><code>`, `<a href>`).
|
||||
- **Signal:** plain text + `text-style` ranges; links become `label (url)` when label differs.
|
||||
|
||||
## Link policy
|
||||
|
||||
- **Slack:** `[label](url)` -> `<url|label>`; bare URLs are left as-is.
|
||||
- **Telegram:** `[label](url)` -> `<a href="url">label</a>` (HTML parse mode).
|
||||
- **Signal:** `[label](url)` -> `label (url)` unless label matches url.
|
||||
|
||||
## Spoilers
|
||||
|
||||
Spoiler markers (`||spoiler||`) are parsed only for Signal, where they map to
|
||||
SPOILER style ranges. Other channels treat them as plain text.
|
||||
Reference in New Issue
Block a user