it0/packages/services
hailin e87924563c fix(wecom): health endpoint, fail-closed lease, Redis cross-instance recovery
Fix 1 — Observability (health endpoint):
  WecomRouterService.getStatus() returns { enabled, isLeader, lastPollAt,
  staleSinceMs, consecutiveErrors, pendingCallbacks, queuedUsers }.
  GET /api/v1/agent/channels/wecom/health exposes it.
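
A minimal sketch of how such a status snapshot could be assembled. The field names come from the commit message; the field types, the `RouterState` shape, and the `buildStatus` helper are assumptions made for illustration, not the actual service code.

```typescript
// Field names are from the commit message; the types are assumed.
interface WecomRouterStatus {
  enabled: boolean;
  isLeader: boolean;
  lastPollAt: number | null;   // epoch ms of the last poll (assumed unit)
  staleSinceMs: number | null; // ms elapsed since lastPollAt (assumed meaning)
  consecutiveErrors: number;
  pendingCallbacks: number;
  queuedUsers: number;
}

// Hypothetical internal state the router might keep.
interface RouterState {
  enabled: boolean;
  isLeader: boolean;
  lastPollAt: number | null;
  consecutiveErrors: number;
  pendingCallbacks: Map<string, unknown>;
  queuedUsers: Set<string>;
}

// Builds a plain, serializable snapshot suitable for a health response.
function buildStatus(state: RouterState, now: number = Date.now()): WecomRouterStatus {
  return {
    enabled: state.enabled,
    isLeader: state.isLeader,
    lastPollAt: state.lastPollAt,
    staleSinceMs: state.lastPollAt === null ? null : now - state.lastPollAt,
    consecutiveErrors: state.consecutiveErrors,
    pendingCallbacks: state.pendingCallbacks.size,
    queuedUsers: state.queuedUsers.size,
  };
}
```

Reporting sizes rather than the collections themselves keeps the payload small and avoids leaking user identifiers through the health endpoint.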

Fix 2 — Leader lease fail-closed:
  The catch block in tryClaimLeaderLease() now returns false instead of true.
  On DB failure the instance skips the poll, which prevents multiple leaders
  during a DB outage.
  The isLeader flag is tracked for the health status.
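
The fail-closed behavior can be sketched as follows. Only the name tryClaimLeaderLease() is from the commit; the `claim` and `poll` parameters and the `pollOnce` wrapper are hypothetical stand-ins for the real DB query and polling loop.

```typescript
// `claim` is a hypothetical stand-in for the DB statement that takes the lease.
async function tryClaimLeaderLease(claim: () => Promise<boolean>): Promise<boolean> {
  try {
    return await claim();
  } catch {
    // Fail closed: if the DB cannot confirm the lease, assume we are NOT the leader.
    // Returning true here would let every instance poll during a DB outage.
    return false;
  }
}

// One poll cycle: only the lease holder polls. Returns the isLeader flag
// so the caller can record it for the health status.
async function pollOnce(
  claim: () => Promise<boolean>,
  poll: () => Promise<void>,
): Promise<boolean> {
  const isLeader = await tryClaimLeaderLease(claim);
  if (isLeader) {
    await poll();
  }
  return isLeader;
}
```

The trade-off is availability for safety: during a DB outage no instance polls, but no message is processed twice.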

Fix 3 — Cross-instance callback recovery via Redis:
  routeToAgent() stores wecom:pending:{msgId} → externalUserId in Redis
  with 200s TTL before waiting for the bridge callback.
  resolveCallbackReply() is now async:
    Fast path  — local pendingCallbacks (same instance, 99% case)
    Recovery   — Redis GET → send reply directly to WeChat user
  onModuleDestroy() cleans up Redis keys on graceful shutdown.
  wecom/bridge-callback handler updated to await resolveCallbackReply.
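
A rough sketch of the two-path resolution described above. The key format, the 200s TTL, and the method name resolveCallbackReply() are from the commit; the `KeyValueStore` interface (a stand-in for the Redis client), the `CallbackRouter` class, `register`, `sendToUser`, and the return values are assumptions for illustration.

```typescript
// Minimal store interface; a real Redis client would sit behind it (assumption).
interface KeyValueStore {
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
  get(key: string): Promise<string | null>;
  del(key: string): Promise<void>;
}

const PENDING_TTL_SECONDS = 200;
const pendingKey = (msgId: string): string => `wecom:pending:${msgId}`;

class CallbackRouter {
  private readonly pendingCallbacks = new Map<string, (reply: string) => void>();
  private readonly store: KeyValueStore;
  private readonly sendToUser: (externalUserId: string, reply: string) => Promise<void>;

  constructor(
    store: KeyValueStore,
    sendToUser: (externalUserId: string, reply: string) => Promise<void>,
  ) {
    this.store = store;
    this.sendToUser = sendToUser;
  }

  // Called before waiting for the bridge callback.
  async register(msgId: string, externalUserId: string, onReply: (reply: string) => void): Promise<void> {
    await this.store.set(pendingKey(msgId), externalUserId, PENDING_TTL_SECONDS);
    this.pendingCallbacks.set(msgId, onReply);
  }

  async resolveCallbackReply(msgId: string, reply: string): Promise<"local" | "recovered" | "unknown"> {
    // Fast path: the callback arrived on the instance that is waiting for it.
    const local = this.pendingCallbacks.get(msgId);
    if (local) {
      this.pendingCallbacks.delete(msgId);
      await this.store.del(pendingKey(msgId)); // cleanup here is an assumption
      local(reply);
      return "local";
    }
    // Recovery path: another instance registered the message; look up the user.
    const externalUserId = await this.store.get(pendingKey(msgId));
    if (externalUserId === null) return "unknown"; // expired or never registered
    await this.store.del(pendingKey(msgId));
    await this.sendToUser(externalUserId, reply);
    return "recovered";
  }
}
```

With two routers sharing one store, a message registered on the first and resolved on the second takes the recovery path and is delivered straight to the user.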

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 06:07:58 -07:00
agent-service fix(wecom): health endpoint, fail-closed lease, Redis cross-instance recovery 2026-03-10 06:07:58 -07:00
audit-service fix(auth): allow platform_admin to access all web-admin endpoints 2026-03-07 05:54:05 -08:00
auth-service fix(auth): add name to JWT payload, fix phone-user session restore 2026-03-09 08:23:26 -07:00
billing-service fix: store tenant slug (not UUID) in current_tenant; remove plan trial periods 2026-03-07 09:01:21 -08:00
comm-service fix: release QueryRunner connections to prevent pool exhaustion 2026-02-23 15:55:06 -08:00
inventory-service feat(openclaw): Phase 1 — server pool + agent instance deployment infrastructure 2026-03-07 11:11:21 -08:00
monitor-service fix: release QueryRunner connections to prevent pool exhaustion 2026-02-23 15:55:06 -08:00
notification-service fix(push): fix TypeScript Map type inference error in OfflinePushService 2026-03-10 04:50:11 -07:00
ops-service fix(ops-service): add new TenantInfo quota fields to inline TenantContextService.run calls 2026-03-04 00:04:36 -08:00
presence-service fix(presence-service): use linux-musl-openssl-3.0.x Prisma binary target for Alpine 2026-03-07 18:19:13 -08:00
referral-service feat(referral): add user-level personal circle + points system 2026-03-08 00:18:17 -08:00
version-service feat(auth): add platform_super_admin role for two-level platform access control 2026-03-07 01:17:27 -08:00
voice-agent feat(voice): randomly pick thinking sound from all 7 built-in clips per session 2026-03-09 06:56:31 -07:00
voice-service feat: add engine type selection (Agent SDK / Claude API) for voice calls 2026-03-02 02:11:51 -08:00