Root cause of "Bridge call failed" errors: the bridge /task endpoint defaults
to a 25s agent reply timeout, but LLM calls through the iConsulting gateway
can take 30-60s. Fix: pass timeoutSeconds=55 explicitly in the POST body.
Also add batchSend fallback in routeToAgent: if the sessionWebhook has
expired by the time the LLM replies (user sent a message, LLM took >30s,
webhook window closed), the reply is now sent via proactive batchSend
using senderStaffId instead of being silently dropped.
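
The fallback decision described above can be sketched roughly as follows. This is a minimal illustration, not the actual routeToAgent code: the `InboundMsg` and `ReplyRoute` types and the `chooseReplyRoute` name are hypothetical, and the expiry is assumed to already be in epoch milliseconds.

```typescript
// Hypothetical message shape; only the field names come from the commit text.
interface InboundMsg {
  sessionWebhook: string;
  sessionWebhookExpiredTime: number; // assumed to be epoch milliseconds here
  senderStaffId?: string;
}

type ReplyRoute =
  | { kind: "webhook"; url: string }
  | { kind: "batchSend"; userIds: string[] }
  | { kind: "drop"; reason: string };

// Decide how to deliver a reply at the moment the LLM answer is ready.
function chooseReplyRoute(msg: InboundMsg, nowMs: number): ReplyRoute {
  if (nowMs < msg.sessionWebhookExpiredTime) {
    // Webhook window is still open: reply in-session.
    return { kind: "webhook", url: msg.sessionWebhook };
  }
  if (msg.senderStaffId) {
    // Window closed: fall back to a proactive message addressed by staffId.
    return { kind: "batchSend", userIds: [msg.senderStaffId] };
  }
  // Nothing to address the reply to; surface the reason instead of dropping silently.
  return { kind: "drop", reason: "webhook expired and no senderStaffId" };
}
```

Returning an explicit `drop` variant with a reason is a design choice of this sketch, so that the caller can log the case rather than lose the reply without a trace.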
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two binding paths store different DingTalk ID types:
- OAuth binding stores staffId (resolved via unionId→userId at auth time)
- Code binding stores senderId ($:LWCP_v1:$... format from bot message)
DingTalk Stream API senderId != OAuth openId (different encodings), so
primary lookup by senderId always missed OAuth-bound instances, requiring
a fallback every time. Reverse the lookup order: try senderStaffId first
(direct hit for OAuth binding), fall back to senderId (code binding).
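
A minimal sketch of the reversed lookup order, assuming bindings are held in a simple map keyed by the stored DingTalk ID. The `Binding` type, the map, and the `findBinding` name are hypothetical; the real storage layer is not shown in this log.

```typescript
// Hypothetical binding record; the real one likely carries more fields.
interface Binding {
  instanceId: string;
}

// Try senderStaffId first (OAuth bindings store staffId), then fall back to
// senderId (code bindings store the bot-message senderId).
function findBinding(
  bindings: Map<string, Binding>,
  senderStaffId: string | undefined,
  senderId: string | undefined,
): Binding | undefined {
  if (senderStaffId) {
    const hit = bindings.get(senderStaffId);
    if (hit) return hit;
  }
  return senderId ? bindings.get(senderId) : undefined;
}
```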
Also add MAX_RESPONSE_BYTES cap to httpPostJson. It was previously uncapped,
unlike the DingTalk API helpers, which already had the 256KB guard.
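
One way such a cap can work is to count bytes as chunks arrive and stop once the limit would be exceeded. The sketch below assumes that approach; the `CappedBuffer` class is hypothetical and is not the httpPostJson implementation. Only the MAX_RESPONSE_BYTES name and the 256KB figure come from the commit text.

```typescript
const MAX_RESPONSE_BYTES = 256 * 1024;

// Accumulates response chunks up to a byte limit.
class CappedBuffer {
  private chunks: Uint8Array[] = [];
  private total = 0;
  private limit: number;

  constructor(limit: number = MAX_RESPONSE_BYTES) {
    this.limit = limit;
  }

  // Returns false when accepting the chunk would exceed the cap.
  // The chunk is not stored in that case; the caller should abort the request.
  push(chunk: Uint8Array): boolean {
    if (this.total + chunk.length > this.limit) return false;
    this.chunks.push(chunk);
    this.total += chunk.length;
    return true;
  }

  get size(): number {
    return this.total;
  }
}
```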
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Text sessions were not passing sessionId to SystemPromptBuilder, causing
Claude to use the `initiate_dingtalk_binding` custom tool (claude_api only).
When the engine is claude_agent_sdk, this tool does not exist → 404.
Fix: pass session.id as sessionId to systemPromptBuilder.build() in
agent.controller.ts. Claude will now use the wget oauth-trigger endpoint
for ALL session types (text and voice), which works with every engine.
Also: store userId (staffId) as the DingTalk binding ID when resolvable,
falling back to openId. Bot messages deliver senderStaffId which matches
userId, not openId — this prevents the "binding not found" routing failure.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Problem: sendGreeting() was passing openId as `userIds` to batchSend, but
the API requires the enterprise staffId (userId). This caused HTTP 400
"staffId.notExisted" for every OAuth-bound greeting.
Fix:
1. completeOAuthBinding now resolves unionId → userId via
oapi.dingtalk.com/topapi/user/getbyunionid with corp app token.
Non-fatal: if the user has no enterprise context, greeting is skipped
with a clear log explaining why (no Contact.User.Read permission or
user is not an enterprise member).
2. sendGreeting accepts userId (staffId) and openId separately; uses
the correct staffId for batchSend. If userId is undefined, emits a
WARN and skips (user gets greeting on first message instead).
3. routeToAgent now tries senderStaffId as fallback if senderId lookup
misses — handles edge cases where DingTalk delivers staffId in senderId.
4. Added detailed logging: all three IDs (openId, unionId, userId) are
logged at binding time so future issues are immediately diagnosable.
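
The skip-or-send rule from item 2 can be illustrated with a small pure function. This is a sketch under assumptions: `planGreeting` and `GreetingPlan` are hypothetical names, and the real sendGreeting performs the batchSend call itself rather than returning a plan.

```typescript
type GreetingPlan =
  | { action: "send"; userIds: string[] }
  | { action: "skip"; reason: string };

// batchSend needs the enterprise staffId (userId); openId is kept for logging only.
function planGreeting(ids: { openId: string; unionId?: string; userId?: string }): GreetingPlan {
  if (!ids.userId) {
    return {
      action: "skip",
      reason: `no staffId resolved for openId=${ids.openId}; greeting deferred to first message`,
    };
  }
  return { action: "send", userIds: [ids.userId] };
}
```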
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
openclaw-bridge:
- index.ts: /task endpoint now calls chatSendAndWait() with idempotencyKey
(removes broken timeoutSeconds param; uses caller-supplied msgId for dedup)
- openclaw-client.ts: added onEvent() subscription + chatSendAndWait() that
subscribes to 'chat' WS events, waits for state='final' matching runId,
and extracts text from the message payload
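
The matching step inside chatSendAndWait() might look like the filter below. The event shape is an assumption made for illustration: the log only states that 'chat' events carry a state, a runId, and a message payload, so the `ChatEvent` fields and the `finalTextFor` name are hypothetical.

```typescript
// Assumed event shape; the real payload layout is not shown in this log.
interface ChatEvent {
  runId: string;
  state: string; // the log only names the value "final"
  message?: { text?: string };
}

// Returns the reply text if this event is the final event for the given run,
// otherwise undefined so the caller keeps waiting.
function finalTextFor(event: ChatEvent, runId: string): string | undefined {
  if (event.runId !== runId || event.state !== "final") return undefined;
  return event.message?.text ?? "";
}
```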
dingtalk-router:
- After OAuth binding completes, sends a proactive greeting to the user via
DingTalk batchSend API (/v1.0/robot/oToMessages/batchSend) introducing the
agent by name and explaining what it can do
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- DingTalk binding UX replaced with OAuth one-tap flow:
- GET /api/v1/agent/channels/dingtalk/oauth/init returns OAuth URL
- GET /api/v1/agent/channels/dingtalk/oauth/callback (public, no JWT)
exchanges code+state for openId, saves binding, returns HTML page
- oauthStates Map with 10-min TTL; state validated before exchange
- msg.senderId (openId) aligned with OAuth openId for consistent routing
- CODE_TTL_MS extended from 5→15 min (fallback code method preserved)
- Kong: dingtalk-oauth-public service declared before agent-service
so callback path matches without JWT plugin
- Voice sessions: use stored session.systemPrompt + voice rules;
allowedTools includes Bash so Claude can call internal APIs
- Flutter _DingTalkBindSheet: OAuth-first UX with code-based fallback
phases: idle→loadingOAuth→waitingOAuth→success + polling every 2s
- docker-compose: IT0_BASE_URL env var for agent-service (redirect URI)
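
The oauthStates map with its 10-minute TTL could be sketched as below. This is an illustration only: the `OAuthStateStore` class, its method names, and the single-use behaviour are assumptions, and the clock is passed in explicitly to keep the example deterministic.

```typescript
const STATE_TTL_MS = 10 * 60 * 1000; // 10-minute TTL, as stated in the commit

class OAuthStateStore {
  private states = new Map<string, { instanceId: string; expiresAt: number }>();

  // Called from the init endpoint when the OAuth URL is generated.
  issue(state: string, instanceId: string, nowMs: number): void {
    this.states.set(state, { instanceId, expiresAt: nowMs + STATE_TTL_MS });
  }

  // Called from the callback before the code exchange.
  // Assumption: states are single-use, so the entry is deleted either way.
  consume(state: string, nowMs: number): string | undefined {
    const entry = this.states.get(state);
    this.states.delete(state);
    if (!entry || nowMs >= entry.expiresAt) return undefined;
    return entry.instanceId;
  }
}
```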
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Critical fixes:
- ws.on('message') fully wrapped in try/catch — uncaught exception in
wsSend() no longer propagates to EventEmitter boundary and crashes process
- wsSend() helper: checks readyState === OPEN before send(), never throws
- Stale-WS guard: close/message events from old WS ignored after reconnect
(ws !== this.ws check); terminateCurrentWs() closes old WS before new one
- Queue tail: .catch(() => {}) appended to guarantee promise always resolves,
preventing permanently dead queue tail from silently dropping future tasks
- DISCONNECT frame handler: force-close + reconnect immediately
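
A minimal sketch of a non-throwing wsSend() helper along the lines described above. The `SocketLike` interface and the boolean return value are assumptions for the example; the real helper may take different arguments.

```typescript
const WS_OPEN = 1; // numeric value of WebSocket.OPEN

// Minimal structural type so the sketch does not depend on a WebSocket library.
interface SocketLike {
  readyState: number;
  send(data: string): void;
}

// Sends only when the socket is open and swallows any send/serialization error,
// so a failure here cannot escape into the 'message' event handler.
function wsSend(ws: SocketLike | undefined, payload: unknown): boolean {
  if (!ws || ws.readyState !== WS_OPEN) return false;
  try {
    ws.send(JSON.stringify(payload));
    return true;
  } catch {
    return false;
  }
}
```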
High fixes:
- sessionWebhookExpiredTime unit auto-detection: values < 1e11 treated as
seconds (×1000), values >= 1e11 treated as ms. This prevents the reply from
always being blocked
- httpsPost response capped at 256 KB to prevent memory spike on bad response
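
The unit auto-detection rule can be written as a small helper. The function names here are hypothetical; the 1e11 threshold is the one given above. It works because 1e11 milliseconds falls in 1973 while 1e11 seconds is thousands of years in the future, so real timestamps land clearly on one side.

```typescript
const SECONDS_THRESHOLD = 1e11;

// Normalize an expiry timestamp that may be in seconds or milliseconds.
function toEpochMs(expiredTime: number): number {
  return expiredTime < SECONDS_THRESHOLD ? expiredTime * 1000 : expiredTime;
}

// True while the session webhook can still be used for a reply.
function webhookStillValid(expiredTime: number, nowMs: number): boolean {
  return nowMs < toEpochMs(expiredTime);
}
```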
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>