it0/packages/services/agent-service
hailin e87924563c fix(wecom): health endpoint, fail-closed lease, Redis cross-instance recovery
Fix 1 — Observability (health endpoint):
  WecomRouterService.getStatus() returns { enabled, isLeader, lastPollAt,
  staleSinceMs, consecutiveErrors, pendingCallbacks, queuedUsers }.
  GET /api/v1/agent/channels/wecom/health exposes it.
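
  For illustration, the status payload described in Fix 1 could be assembled roughly as follows. This is a sketch, not the code in the commit: only the field names come from the commit message, while the field types, the RouterState shape, and the buildStatus helper are assumptions.

```typescript
// Field names come from the commit message; the types are assumed.
interface WecomStatus {
  enabled: boolean;
  isLeader: boolean;
  lastPollAt: number | null;   // assumed: epoch ms of the last poll, null if never polled
  staleSinceMs: number | null; // assumed: ms elapsed since lastPollAt
  consecutiveErrors: number;
  pendingCallbacks: number;
  queuedUsers: number;
}

// Hypothetical internal state the router might keep.
interface RouterState {
  enabled: boolean;
  isLeader: boolean;
  lastPollAt: number | null;
  consecutiveErrors: number;
  pendingCallbacks: Map<string, unknown>;
  queuedUsers: Set<string>;
}

// Builds a plain snapshot that a health endpoint could return as JSON.
function buildStatus(state: RouterState, now: number = Date.now()): WecomStatus {
  return {
    enabled: state.enabled,
    isLeader: state.isLeader,
    lastPollAt: state.lastPollAt,
    staleSinceMs: state.lastPollAt === null ? null : now - state.lastPollAt,
    consecutiveErrors: state.consecutiveErrors,
    pendingCallbacks: state.pendingCallbacks.size,
    queuedUsers: state.queuedUsers.size,
  };
}
```

  Reporting counts rather than the underlying collections keeps user identifiers out of the health response.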

Fix 2 — Leader lease fail-closed:
  The catch block in tryClaimLeaderLease() now returns false instead of true.
  DB failure → the instance skips the poll, so a DB outage cannot produce multiple leaders.
  The isLeader flag is tracked and reported in the health status.
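
  The fail-closed behaviour in Fix 2 can be sketched as below. The LeaseGuard class, the injected claim function, and pollOnce are hypothetical stand-ins for illustration; the commit only states that the catch path returns false and that isLeader is tracked.

```typescript
// Hypothetical: claim() stands in for the DB call that tries to take the lease.
type ClaimFn = () => Promise<boolean>;

class LeaseGuard {
  isLeader = false;
  private claim: ClaimFn;

  constructor(claim: ClaimFn) {
    this.claim = claim;
  }

  // Returns true only when the lease was positively confirmed.
  async tryClaimLeaderLease(): Promise<boolean> {
    try {
      this.isLeader = await this.claim();
    } catch {
      // Fail closed: if the DB cannot confirm the lease, assume we are not leader.
      this.isLeader = false;
    }
    return this.isLeader;
  }

  // Runs poll() only when this instance holds the lease; reports whether it ran.
  async pollOnce(poll: () => Promise<void>): Promise<boolean> {
    if (!(await this.tryClaimLeaderLease())) return false;
    await poll();
    return true;
  }
}
```

  Returning true from the catch path would let every instance poll while the DB is down; returning false trades a pause in polling for the guarantee that at most one instance polls.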

Fix 3 — Cross-instance callback recovery via Redis:
  routeToAgent() stores wecom:pending:{msgId} → externalUserId in Redis
  with 200s TTL before waiting for the bridge callback.
  resolveCallbackReply() is now async:
    Fast path  — local pendingCallbacks (same instance, 99% case)
    Recovery   — Redis GET → send reply directly to WeChat user
  onModuleDestroy() cleans up Redis keys on graceful shutdown.
  wecom/bridge-callback handler updated to await resolveCallbackReply.
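
  A minimal sketch of the two-path lookup in Fix 3, under stated assumptions: KeyValueStore and MemoryStore are hypothetical stand-ins for the Redis client, SendFn stands in for whatever sends the WeCom reply, and the string return value is added here only to make the chosen path visible. The key format and the 200s TTL are taken from the commit message.

```typescript
// Hypothetical minimal store interface; the real service uses Redis.
interface KeyValueStore {
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
  get(key: string): Promise<string | null>;
  del(key: string): Promise<void>;
}

// In-memory stand-in so the sketch runs without a Redis server.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, { value: string; expiresAt: number }>();
  async set(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.data.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }
  async get(key: string): Promise<string | null> {
    const entry = this.data.get(key);
    if (!entry || entry.expiresAt <= Date.now()) {
      this.data.delete(key);
      return null;
    }
    return entry.value;
  }
  async del(key: string): Promise<void> {
    this.data.delete(key);
  }
}

type SendFn = (externalUserId: string, reply: string) => Promise<void>;

class CallbackRouter {
  private pendingCallbacks = new Map<string, (reply: string) => void>();
  private store: KeyValueStore;
  private send: SendFn;

  constructor(store: KeyValueStore, send: SendFn) {
    this.store = store;
    this.send = send;
  }

  // Record who is waiting in the shared store before registering the local callback.
  async routeToAgent(msgId: string, externalUserId: string, onReply: (reply: string) => void): Promise<void> {
    await this.store.set(`wecom:pending:${msgId}`, externalUserId, 200);
    this.pendingCallbacks.set(msgId, onReply);
  }

  async resolveCallbackReply(msgId: string, reply: string): Promise<"local" | "recovered" | "unknown"> {
    const key = `wecom:pending:${msgId}`;
    // Fast path: the callback arrived on the instance that is waiting for it.
    const local = this.pendingCallbacks.get(msgId);
    if (local) {
      this.pendingCallbacks.delete(msgId);
      await this.store.del(key);
      local(reply);
      return "local";
    }
    // Recovery: another instance registered the wait; reply to the user directly.
    const externalUserId = await this.store.get(key);
    if (externalUserId === null) return "unknown";
    await this.store.del(key);
    await this.send(externalUserId, reply);
    return "recovered";
  }
}
```

  Deleting the key on either path means a duplicate callback for the same msgId resolves to "unknown" in this sketch instead of sending the reply twice.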

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 06:07:58 -07:00
prisma chore(agent): add empty prisma dir to fix Docker build COPY step 2026-03-08 03:10:19 -07:00
src fix(wecom): health endpoint, fail-closed lease, Redis cross-instance recovery 2026-03-10 06:07:58 -07:00
package.json fix(feishu): commit missing entity field + SDK dependency 2026-03-09 03:20:03 -07:00
tsconfig.json Initial commit: IT0 AI-powered server cluster operations platform 2026-02-08 22:54:37 -08:00