MS Teams: refactor provider + replyStyle + reliability

This commit is contained in:
Onur
2026-01-08 09:47:01 +03:00
committed by Peter Steinberger
parent 81f81be816
commit a2bab7d923
19 changed files with 1834 additions and 389 deletions

View File

@@ -0,0 +1,156 @@
# MS Teams provider refactor plan (production-ready)
Goal: refactor the MS Teams provider code (`src/msteams/*`) for long-term maintainability and correctness **without changing user-facing behavior** (except incidental bug fixes discovered during refactor).
Status (2026-01-08): implemented (Phases 13) with unit tests; `pnpm lint && pnpm build && pnpm test` pass.
## Why refactor
Current pain points in `src/msteams/monitor.ts` / `src/msteams/send.ts` / `src/msteams/conversation-store.ts`:
- **Mixed concerns**: HTTP server wiring, SDK handler, routing, policy resolution, and outbound delivery live in one file.
- **Duplicated outbound logic**: proactive vs in-thread sending is implemented in multiple places (monitor + send).
- **Weak typing boundary**: custom “SDK-like” shapes + structural casts make it harder to evolve safely.
- **Conversation store is fragile**: JSON file writes are un-locked and non-atomic; no TTL; potential corruption under concurrency.
- **Hard to test**: key logic (policy precedence and delivery behavior) is not isolated/pure.
## Non-goals
- Rewriting the provider around a different SDK.
- Introducing new configuration knobs beyond what already exists (`msteams.replyStyle`, `requireMention`, etc.).
- Changing routing semantics, payload envelope format, or session key logic.
- Adding new CLI commands (unless needed for validation/testing).
## Target architecture (module split)
### 1) Policy resolution (pure + tested)
Add `src/msteams/policy.ts` (and `src/msteams/policy.test.ts`) containing pure functions:
- `resolveMSTeamsRouteConfig({ cfg, teamId, conversationId }): { teamConfig?, channelConfig? }`
- `resolveMSTeamsReplyPolicy({ isDirectMessage, cfg, teamConfig?, channelConfig? }): { requireMention: boolean; replyStyle: "thread" | "top-level" }`
Acceptance: precedence is encoded and unit-tested:
- Channel overrides > team defaults > global defaults > implicit defaults.
- DM behavior: `replyStyle` is forced to `"thread"`, mention-gating is bypassed.
- Defaulting behavior matches existing runtime logic (e.g. `requireMention -> default replyStyle` heuristic).
### 2) Outbound delivery (single implementation)
Add `src/msteams/messenger.ts` (and `src/msteams/messenger.test.ts`) to centralize:
- chunking (`resolveTextChunkLimit`, `chunkMarkdownText`, `SILENT_REPLY_TOKEN`)
- send mode selection (`"thread"` vs `"top-level"`)
- media URL message splitting (same semantics as current)
- error formatting + consistent structured logs
Surface (current implementation):
- `renderReplyPayloadsToMessages(replies, { textChunkLimit, chunkText, mediaMode })`
- `sendMSTeamsMessages({ replyStyle, adapter, appId, conversationRef, context?, messages })`
- uses `context.sendActivity` for `"thread"`
- uses `adapter.continueConversation` for `"top-level"`
Acceptance: `src/msteams/monitor.ts` and `src/msteams/send.ts` both use the messenger, so theres exactly one “how do we send a message” implementation.
### 3) SDK typing boundary (type-only imports; no eager runtime deps)
Add `src/msteams/sdk-types.ts` exporting the minimal types we depend on:
- Turn context type (`sendActivity`, `activity` with fields we read)
- Conversation reference type for `continueConversation`
- Adapter interface subset (`continueConversation`, `process`)
Implementation note:
- Use `import type …` from the Microsoft SDK packages (or fallback to minimal structural types if the SDK does not export them cleanly).
- Keep current dynamic runtime imports (`await import("@microsoft/agents-hosting")`) intact; type-only imports compile away.
Acceptance: eliminate bespoke `TeamsTurnContext` / ad-hoc casts where possible, while preserving lazy-load behavior (some casting may remain if SDK typings are stricter than runtime behavior).
### 4) Conversation store interface + hardened FS implementation
Introduce a store interface (e.g. `src/msteams/conversation-store.ts`) and move the current file-backed store to `src/msteams/conversation-store-fs.ts`.
Store interface:
- `upsert(conversationId, reference)`
- `get(conversationId)`
- `findByUser({ aadObjectId?, userId? })`
- `list()`
- `remove(conversationId)`
FS implementation hardening:
- **Atomic writes**: write to `*.tmp` then `rename` (or equivalent).
- **Locking**: use `proper-lockfile` (already a dependency) to guard read-modify-write.
- **TTL + pruning**:
- persist `lastSeenAt`
- prune on every write and/or on a timer
- cap size (keep existing `MAX_CONVERSATIONS` behavior, but deterministic + documented)
- **Permissions**:
- dir is already `0700`; ensure file is written with `0600`
Tests:
- Use an in-memory store implementation for unit tests.
- Add FS store tests only where stable (avoid flaky timing issues).
Acceptance: no store corruption under concurrent writes in-process; behavior preserved for CLI `send` lookup.
### 5) Monitor wiring becomes “thin”
Refactor `src/msteams/monitor.ts` so it:
- loads config + credentials
- creates adapter + express routes
- routes inbound messages to a smaller `handleInboundMessage(...)` function
- delegates:
- policy decisions to `policy.ts`
- outbound sends to `messenger.ts`
- reference persistence to the store abstraction
Acceptance: `monitor.ts` is mostly wiring and orchestration; logic-heavy parts are tested in isolation.
## Implementation phases (incremental, safe)
### Phase 1 (behavior-preserving extraction)
1. Add `src/msteams/policy.ts` + `src/msteams/policy.test.ts`.
2. Add `src/msteams/messenger.ts` + `src/msteams/messenger.test.ts` (unit test chunking + send mode selection; mock context/adapter).
3. Refactor `src/msteams/monitor.ts` to use policy + messenger (no behavior change).
4. Refactor `src/msteams/send.ts` to use messenger (no behavior change).
5. Extract inbound helpers (`stripMentionTags`, mention detection, conversation ID normalization) into `src/msteams/inbound.ts` + tests.
6. Ensure `pnpm lint && pnpm build && pnpm test` pass.
### Phase 2 (store hardening)
1. Introduce store interface + in-memory test store.
2. Move FS store to its own module; add locking + atomic writes + TTL.
3. Update `monitor.ts` + `send.ts` to depend on the interface (inject FS store from wiring).
4. Add targeted tests.
### Phase 3 (production reliability)
1. Add retry/backoff around outbound sends (careful: avoid duplicate posts; only retry safe failures).
2. Error classification helpers (auth misconfig, transient network, throttling).
3. Improve `probeMSTeams` to validate credentials (optional; can be separate).
## Done criteria / checkpoints
- Phase 1 done:
- New policy tests cover precedence and DM behavior.
- `monitor.ts` + `send.ts` share outbound sending via messenger.
- No new runtime imports that break lazy-load behavior.
- Phase 2 done:
- Store is locked + atomic + bounded.
- Clear migration story (keep same file format/version or bump explicitly).
- Phase 3 done:
- Retries are safe and bounded; logs are structured and actionable.
## Notes / edge cases to validate during refactor
- “Channel config” keys: currently based on `conversation.id` (e.g. `19:…@thread.tacv2`). Preserve that.
- `replyStyle="top-level"` correctness: ensure the conversation reference normalization is centralized and tested.
- Mention-gating: preserve current detection behavior (`entities` mention matching `recipient.id`), but isolate it for future improvements.