7.6 KiB
MS Teams provider refactor plan (production-ready)
Goal: refactor the MS Teams provider code (src/msteams/*) for long-term maintainability and correctness without changing user-facing behavior (except incidental bug fixes discovered during refactor).
Status (2026-01-08): implemented (Phases 1–3) with unit tests; pnpm lint && pnpm build && pnpm test pass.
Why refactor
Current pain points in src/msteams/monitor.ts / src/msteams/send.ts / src/msteams/conversation-store.ts:
- Mixed concerns: HTTP server wiring, SDK handler, routing, policy resolution, and outbound delivery live in one file.
- Duplicated outbound logic: proactive vs in-thread sending is implemented in multiple places (monitor + send).
- Weak typing boundary: custom “SDK-like” shapes + structural casts make it harder to evolve safely.
- Conversation store is fragile: JSON file writes are un-locked and non-atomic; no TTL; potential corruption under concurrency.
- Hard to test: key logic (policy precedence and delivery behavior) is not isolated/pure.
Non-goals
- Rewriting the provider around a different SDK.
- Introducing new configuration knobs beyond what already exists (
msteams.replyStyle,requireMention, etc.). - Changing routing semantics, payload envelope format, or session key logic.
- Adding new CLI commands (unless needed for validation/testing).
Target architecture (module split)
1) Policy resolution (pure + tested)
Add src/msteams/policy.ts (and src/msteams/policy.test.ts) containing pure functions:
resolveMSTeamsRouteConfig({ cfg, teamId, conversationId }): { teamConfig?, channelConfig? }resolveMSTeamsReplyPolicy({ isDirectMessage, cfg, teamConfig?, channelConfig? }): { requireMention: boolean; replyStyle: "thread" | "top-level" }
Acceptance: precedence is encoded and unit-tested:
- Channel overrides > team defaults > global defaults > implicit defaults.
- DM behavior:
replyStyleis forced to"thread", mention-gating is bypassed. - Defaulting behavior matches existing runtime logic (e.g.
requireMention -> default replyStyleheuristic).
2) Outbound delivery (single implementation)
Add src/msteams/messenger.ts (and src/msteams/messenger.test.ts) to centralize:
- chunking (
resolveTextChunkLimit,chunkMarkdownText,SILENT_REPLY_TOKEN) - send mode selection (
"thread"vs"top-level") - media URL message splitting (same semantics as current)
- error formatting + consistent structured logs
Surface (current implementation):
renderReplyPayloadsToMessages(replies, { textChunkLimit, chunkText, mediaMode })sendMSTeamsMessages({ replyStyle, adapter, appId, conversationRef, context?, messages })- uses
context.sendActivityfor"thread" - uses
adapter.continueConversationfor"top-level"
- uses
Acceptance: src/msteams/monitor.ts and src/msteams/send.ts both use the messenger, so there’s exactly one “how do we send a message” implementation.
3) SDK typing boundary (type-only imports; no eager runtime deps)
Add src/msteams/sdk-types.ts exporting the minimal types we depend on:
- Turn context type (
sendActivity,activitywith fields we read) - Conversation reference type for
continueConversation - Adapter interface subset (
continueConversation,process)
Implementation note:
- Use
import type …from the Microsoft SDK packages (or fallback to minimal structural types if the SDK does not export them cleanly). - Keep current dynamic runtime imports (
await import("@microsoft/agents-hosting")) intact; type-only imports compile away.
Acceptance: eliminate bespoke TeamsTurnContext / ad-hoc casts where possible, while preserving lazy-load behavior (some casting may remain if SDK typings are stricter than runtime behavior).
4) Conversation store interface + hardened FS implementation
Introduce a store interface (e.g. src/msteams/conversation-store.ts) and move the current file-backed store to src/msteams/conversation-store-fs.ts.
Store interface:
upsert(conversationId, reference)get(conversationId)findByUser({ aadObjectId?, userId? })list()remove(conversationId)
FS implementation hardening:
- Atomic writes: write to
*.tmpthenrename(or equivalent). - Locking: use
proper-lockfile(already a dependency) to guard read-modify-write. - TTL + pruning:
- persist
lastSeenAt - prune on every write and/or on a timer
- cap size (keep existing
MAX_CONVERSATIONSbehavior, but deterministic + documented)
- persist
- Permissions:
- dir is already
0700; ensure file is written with0600
- dir is already
Tests:
- Use an in-memory store implementation for unit tests.
- Add FS store tests only where stable (avoid flaky timing issues).
Acceptance: no store corruption under concurrent writes in-process; behavior preserved for CLI send lookup.
5) Monitor wiring becomes “thin”
Refactor src/msteams/monitor.ts so it:
- loads config + credentials
- creates adapter + express routes
- routes inbound messages to a smaller
handleInboundMessage(...)function - delegates:
- policy decisions to
policy.ts - outbound sends to
messenger.ts - reference persistence to the store abstraction
- policy decisions to
Acceptance: monitor.ts is mostly wiring and orchestration; logic-heavy parts are tested in isolation.
Implementation phases (incremental, safe)
Phase 1 (behavior-preserving extraction)
- Add
src/msteams/policy.ts+src/msteams/policy.test.ts. - Add
src/msteams/messenger.ts+src/msteams/messenger.test.ts(unit test chunking + send mode selection; mock context/adapter). - Refactor
src/msteams/monitor.tsto use policy + messenger (no behavior change). - Refactor
src/msteams/send.tsto use messenger (no behavior change). - Extract inbound helpers (
stripMentionTags, mention detection, conversation ID normalization) intosrc/msteams/inbound.ts+ tests. - Ensure
pnpm lint && pnpm build && pnpm testpass. - If testing manifest/RSC updates, fully quit/relaunch Teams and reinstall the app to flush cached app metadata.
Phase 2 (store hardening)
- Introduce store interface + in-memory test store.
- Move FS store to its own module; add locking + atomic writes + TTL.
- Update
monitor.ts+send.tsto depend on the interface (inject FS store from wiring). - Add targeted tests.
Phase 3 (production reliability)
- Add retry/backoff around outbound sends (careful: avoid duplicate posts; only retry safe failures).
- Error classification helpers (auth misconfig, transient network, throttling).
- Improve
probeMSTeamsto validate credentials (optional; can be separate).
Done criteria / checkpoints
- Phase 1 done:
- New policy tests cover precedence and DM behavior.
monitor.ts+send.tsshare outbound sending via messenger.- No new runtime imports that break lazy-load behavior.
- Phase 2 done:
- Store is locked + atomic + bounded.
- Clear migration story (keep same file format/version or bump explicitly).
- Phase 3 done:
- Retries are safe and bounded; logs are structured and actionable.
Notes / edge cases to validate during refactor
- “Channel config” keys: currently based on
conversation.id(e.g.19:…@thread.tacv2). Preserve that. replyStyle="top-level"correctness: ensure the conversation reference normalization is centralized and tested.- Mention-gating: preserve current detection behavior (
entitiesmention matchingrecipient.id), but isolate it for future improvements. - Teams client caches app manifests; after uploading a new package or changing RSC permissions, fully quit/relaunch Teams (not just close the window) and reinstall the app to force the version + permission refresh.