MS Teams: refactor provider + replyStyle + reliability
This commit is contained in:
156
tmp/msteams-refactor-plan.md
Normal file
156
tmp/msteams-refactor-plan.md
Normal file
@@ -0,0 +1,156 @@
|
||||
# MS Teams provider refactor plan (production-ready)
|
||||
|
||||
Goal: refactor the MS Teams provider code (`src/msteams/*`) for long-term maintainability and correctness **without changing user-facing behavior** (except incidental bug fixes discovered during refactor).
|
||||
|
||||
Status (2026-01-08): implemented (Phases 1–3) with unit tests; `pnpm lint && pnpm build && pnpm test` pass.
|
||||
|
||||
## Why refactor
|
||||
|
||||
Current pain points in `src/msteams/monitor.ts` / `src/msteams/send.ts` / `src/msteams/conversation-store.ts`:
|
||||
|
||||
- **Mixed concerns**: HTTP server wiring, SDK handler, routing, policy resolution, and outbound delivery live in one file.
|
||||
- **Duplicated outbound logic**: proactive vs in-thread sending is implemented in multiple places (monitor + send).
|
||||
- **Weak typing boundary**: custom “SDK-like” shapes + structural casts make it harder to evolve safely.
|
||||
- **Conversation store is fragile**: JSON file writes are un-locked and non-atomic; no TTL; potential corruption under concurrency.
|
||||
- **Hard to test**: key logic (policy precedence and delivery behavior) is not isolated/pure.
|
||||
|
||||
## Non-goals
|
||||
|
||||
- Rewriting the provider around a different SDK.
|
||||
- Introducing new configuration knobs beyond what already exists (`msteams.replyStyle`, `requireMention`, etc.).
|
||||
- Changing routing semantics, payload envelope format, or session key logic.
|
||||
- Adding new CLI commands (unless needed for validation/testing).
|
||||
|
||||
## Target architecture (module split)
|
||||
|
||||
### 1) Policy resolution (pure + tested)
|
||||
|
||||
Add `src/msteams/policy.ts` (and `src/msteams/policy.test.ts`) containing pure functions:
|
||||
|
||||
- `resolveMSTeamsRouteConfig({ cfg, teamId, conversationId }): { teamConfig?, channelConfig? }`
|
||||
- `resolveMSTeamsReplyPolicy({ isDirectMessage, cfg, teamConfig?, channelConfig? }): { requireMention: boolean; replyStyle: "thread" | "top-level" }`
|
||||
|
||||
Acceptance: precedence is encoded and unit-tested:
|
||||
|
||||
- Channel overrides > team defaults > global defaults > implicit defaults.
|
||||
- DM behavior: `replyStyle` is forced to `"thread"`, mention-gating is bypassed.
|
||||
- Defaulting behavior matches existing runtime logic (e.g. `requireMention -> default replyStyle` heuristic).
|
||||
|
||||
### 2) Outbound delivery (single implementation)
|
||||
|
||||
Add `src/msteams/messenger.ts` (and `src/msteams/messenger.test.ts`) to centralize:
|
||||
|
||||
- chunking (`resolveTextChunkLimit`, `chunkMarkdownText`, `SILENT_REPLY_TOKEN`)
|
||||
- send mode selection (`"thread"` vs `"top-level"`)
|
||||
- media URL message splitting (same semantics as current)
|
||||
- error formatting + consistent structured logs
|
||||
|
||||
Surface (current implementation):
|
||||
|
||||
- `renderReplyPayloadsToMessages(replies, { textChunkLimit, chunkText, mediaMode })`
|
||||
- `sendMSTeamsMessages({ replyStyle, adapter, appId, conversationRef, context?, messages })`
|
||||
- uses `context.sendActivity` for `"thread"`
|
||||
- uses `adapter.continueConversation` for `"top-level"`
|
||||
|
||||
Acceptance: `src/msteams/monitor.ts` and `src/msteams/send.ts` both use the messenger, so there’s exactly one “how do we send a message” implementation.
|
||||
|
||||
### 3) SDK typing boundary (type-only imports; no eager runtime deps)
|
||||
|
||||
Add `src/msteams/sdk-types.ts` exporting the minimal types we depend on:
|
||||
|
||||
- Turn context type (`sendActivity`, `activity` with fields we read)
|
||||
- Conversation reference type for `continueConversation`
|
||||
- Adapter interface subset (`continueConversation`, `process`)
|
||||
|
||||
Implementation note:
|
||||
|
||||
- Use `import type …` from the Microsoft SDK packages (or fallback to minimal structural types if the SDK does not export them cleanly).
|
||||
- Keep current dynamic runtime imports (`await import("@microsoft/agents-hosting")`) intact; type-only imports compile away.
|
||||
|
||||
Acceptance: eliminate bespoke `TeamsTurnContext` / ad-hoc casts where possible, while preserving lazy-load behavior (some casting may remain if SDK typings are stricter than runtime behavior).
|
||||
|
||||
### 4) Conversation store interface + hardened FS implementation
|
||||
|
||||
Introduce a store interface (e.g. `src/msteams/conversation-store.ts`) and move the current file-backed store to `src/msteams/conversation-store-fs.ts`.
|
||||
|
||||
Store interface:
|
||||
|
||||
- `upsert(conversationId, reference)`
|
||||
- `get(conversationId)`
|
||||
- `findByUser({ aadObjectId?, userId? })`
|
||||
- `list()`
|
||||
- `remove(conversationId)`
|
||||
|
||||
FS implementation hardening:
|
||||
|
||||
- **Atomic writes**: write to `*.tmp` then `rename` (or equivalent).
|
||||
- **Locking**: use `proper-lockfile` (already a dependency) to guard read-modify-write.
|
||||
- **TTL + pruning**:
|
||||
- persist `lastSeenAt`
|
||||
- prune on every write and/or on a timer
|
||||
- cap size (keep existing `MAX_CONVERSATIONS` behavior, but deterministic + documented)
|
||||
- **Permissions**:
|
||||
- dir is already `0700`; ensure file is written with `0600`
|
||||
|
||||
Tests:
|
||||
|
||||
- Use an in-memory store implementation for unit tests.
|
||||
- Add FS store tests only where stable (avoid flaky timing issues).
|
||||
|
||||
Acceptance: no store corruption under concurrent writes in-process; behavior preserved for CLI `send` lookup.
|
||||
|
||||
### 5) Monitor wiring becomes “thin”
|
||||
|
||||
Refactor `src/msteams/monitor.ts` so it:
|
||||
|
||||
- loads config + credentials
|
||||
- creates adapter + express routes
|
||||
- routes inbound messages to a smaller `handleInboundMessage(...)` function
|
||||
- delegates:
|
||||
- policy decisions to `policy.ts`
|
||||
- outbound sends to `messenger.ts`
|
||||
- reference persistence to the store abstraction
|
||||
|
||||
Acceptance: `monitor.ts` is mostly wiring and orchestration; logic-heavy parts are tested in isolation.
|
||||
|
||||
## Implementation phases (incremental, safe)
|
||||
|
||||
### Phase 1 (behavior-preserving extraction)
|
||||
|
||||
1. Add `src/msteams/policy.ts` + `src/msteams/policy.test.ts`.
|
||||
2. Add `src/msteams/messenger.ts` + `src/msteams/messenger.test.ts` (unit test chunking + send mode selection; mock context/adapter).
|
||||
3. Refactor `src/msteams/monitor.ts` to use policy + messenger (no behavior change).
|
||||
4. Refactor `src/msteams/send.ts` to use messenger (no behavior change).
|
||||
5. Extract inbound helpers (`stripMentionTags`, mention detection, conversation ID normalization) into `src/msteams/inbound.ts` + tests.
|
||||
6. Ensure `pnpm lint && pnpm build && pnpm test` pass.
|
||||
|
||||
### Phase 2 (store hardening)
|
||||
|
||||
1. Introduce store interface + in-memory test store.
|
||||
2. Move FS store to its own module; add locking + atomic writes + TTL.
|
||||
3. Update `monitor.ts` + `send.ts` to depend on the interface (inject FS store from wiring).
|
||||
4. Add targeted tests.
|
||||
|
||||
### Phase 3 (production reliability)
|
||||
|
||||
1. Add retry/backoff around outbound sends (careful: avoid duplicate posts; only retry safe failures).
|
||||
2. Error classification helpers (auth misconfig, transient network, throttling).
|
||||
3. Improve `probeMSTeams` to validate credentials (optional; can be separate).
|
||||
|
||||
## Done criteria / checkpoints
|
||||
|
||||
- Phase 1 done:
|
||||
- New policy tests cover precedence and DM behavior.
|
||||
- `monitor.ts` + `send.ts` share outbound sending via messenger.
|
||||
- No new runtime imports that break lazy-load behavior.
|
||||
- Phase 2 done:
|
||||
- Store is locked + atomic + bounded.
|
||||
- Clear migration story (keep same file format/version or bump explicitly).
|
||||
- Phase 3 done:
|
||||
- Retries are safe and bounded; logs are structured and actionable.
|
||||
|
||||
## Notes / edge cases to validate during refactor
|
||||
|
||||
- “Channel config” keys: currently based on `conversation.id` (e.g. `19:…@thread.tacv2`). Preserve that.
|
||||
- `replyStyle="top-level"` correctness: ensure the conversation reference normalization is centralized and tested.
|
||||
- Mention-gating: preserve current detection behavior (`entities` mention matching `recipient.id`), but isolate it for future improvements.
|
||||
Reference in New Issue
Block a user