Four additional robustness fixes:

1. **Token refresh mutex**: `tokenRefreshPromise` deduplicates concurrent refresh calls. All callers share one in-flight HTTP request instead of each firing their own, which eliminates the race condition.
2. **Distributed leader lease**: the `service_state` table is used for TTL-based leader election (`LEADER_LEASE_TTL_S` = 90s). Only one agent-service instance polls at a time; the others skip polling until the lease expires. The lease is released automatically on graceful shutdown.
3. **Exponential backoff**: consecutive poll errors increment a counter, and the next delay is min(10s × 2^(n-1), 5min), where n is the number of consecutive errors. This prevents log spam and reduces load during sustained WeCom API outages. The counter resets on any successful poll.
4. **Watchdog timer**: a `setInterval` runs every 2min and checks `lastPollAt`. If the poll loop has been silent for more than 5min, the watchdog clears the poll timer and reschedules a poll immediately, recovering from a silently stalled loop.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
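
As a rough illustration of fixes 1 and 3, the sketch below shows one way a shared in-flight promise and the stated backoff formula could look in TypeScript. Only `tokenRefreshPromise` and the 10s / 5min figures come from the commit message; `refreshToken`, `fetchToken`, `backoffDelayMs`, and the constant names are hypothetical, and the actual implementation may differ.

```typescript
// Hypothetical constants; the values mirror the formula in the commit message.
const BASE_DELAY_MS = 10 * 1000;      // 10s
const MAX_DELAY_MS = 5 * 60 * 1000;   // 5min

// Delay before the next poll after `consecutiveErrors` failures in a row.
// Assumption: values below 1 are treated as 1 (the first failure).
function backoffDelayMs(consecutiveErrors: number): number {
  const n = Math.max(1, consecutiveErrors);
  return Math.min(BASE_DELAY_MS * Math.pow(2, n - 1), MAX_DELAY_MS);
}

// Shared in-flight refresh; null when no refresh is running.
let tokenRefreshPromise: Promise<string> | null = null;

// `fetchToken` stands in for the real HTTP call (hypothetical signature).
function refreshToken(fetchToken: () => Promise<string>): Promise<string> {
  if (tokenRefreshPromise === null) {
    const clear = (): void => {
      tokenRefreshPromise = null;
    };
    // The first caller starts the request; the slot is cleared once it
    // settles so that a later call can trigger a fresh refresh.
    tokenRefreshPromise = fetchToken().then(
      (token) => {
        clear();
        return token;
      },
      (err) => {
        clear();
        throw err;
      },
    );
  }
  // Every concurrent caller receives the same promise.
  return tokenRefreshPromise;
}
```

With these definitions, the delays after one to six consecutive errors are 10s, 20s, 40s, 80s, 160s, and 300s (capped). Clearing the shared promise on both success and failure keeps a failed refresh from being cached, so the next caller retries the request.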
| Name |
|---|
| gateway |
| openclaw-bridge |
| services |
| shared |