453 Commits

Author SHA1 Message Date
hailin b3180b0727 fix(wecom): move dedup to Redis (shared across instances)
Replace in-memory dedup Map with Redis SET NX EX:
  - Key: wecom:dedup:{msgId}, TTL=600s (auto-expires, no manual cleanup)
  - SET NX returns 'OK' on first write (process), null on duplicate (skip)
  - Shared across all agent-service instances — no inter-process duplicates
  - Fails open (return true) if Redis is unavailable — avoids silent drops
  - Removed dedup Map and its periodicCleanup loop
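
A minimal TypeScript sketch of the dedup check described above. The `NxStore` interface, the `shouldProcess` name, and the in-memory `MemoryStore` are hypothetical stand-ins for illustration; the actual service talks to Redis and its code is not shown in this commit message.

```typescript
// Hypothetical minimal store interface; a real Redis client would be adapted to it
// (for example by issuing SET key value NX EX ttl).
interface NxStore {
  // Resolves to "OK" if the key was newly set, or null if it already existed.
  setIfAbsent(key: string, value: string, ttlSeconds: number): Promise<"OK" | null>;
}

const DEDUP_TTL_S = 600; // TTL taken from the commit message

// Returns true when the message should be processed, false when it is a duplicate.
async function shouldProcess(store: NxStore, msgId: string): Promise<boolean> {
  try {
    const result = await store.setIfAbsent(`wecom:dedup:${msgId}`, "1", DEDUP_TTL_S);
    return result === "OK";
  } catch {
    // Fail open: if the store is unavailable, process rather than silently drop.
    return true;
  }
}

// In-memory stand-in for Redis, for illustration only (it ignores TTL expiry).
class MemoryStore implements NxStore {
  private keys = new Set<string>();
  async setIfAbsent(key: string): Promise<"OK" | null> {
    if (this.keys.has(key)) return null;
    this.keys.add(key);
    return "OK";
  }
}
```

Failing open trades a possible duplicate reply during a Redis outage for the guarantee that no message is silently skipped.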

The WeCom router now covers the full set of planned robustness measures:
  cursor persistence, token mutex, distributed leader lease (fail-closed),
  exponential backoff, watchdog, send retry, Redis dedup, Redis cross-instance
  callback recovery, health endpoint.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 06:19:29 -07:00
hailin e87924563c fix(wecom): health endpoint, fail-closed lease, Redis cross-instance recovery
Fix 1 — Observability (health endpoint):
  WecomRouterService.getStatus() returns { enabled, isLeader, lastPollAt,
  staleSinceMs, consecutiveErrors, pendingCallbacks, queuedUsers }.
  GET /api/v1/agent/channels/wecom/health exposes it.

Fix 2 — Leader lease fail-closed:
  tryClaimLeaderLease() catch now returns false instead of true.
  DB failure → skip poll, preventing multi-master on DB outage.
  isLeader flag tracked for health status.
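
A small TypeScript sketch of the fail-closed behavior described in Fix 2. The `ClaimFn` type and `LeaseState` shape are assumptions made for illustration; only the function name `tryClaimLeaderLease` and the "return false on error" rule come from the commit message.

```typescript
// Hypothetical signature: resolves to true if this instance acquired or renewed the lease.
type ClaimFn = () => Promise<boolean>;

// Hypothetical holder for the flag that the health endpoint reports.
interface LeaseState {
  isLeader: boolean;
}

// Fail-closed wrapper: any error while claiming is treated as "not leader".
async function tryClaimLeaderLease(claim: ClaimFn, state: LeaseState): Promise<boolean> {
  try {
    state.isLeader = await claim();
  } catch {
    // DB failure: skip this poll rather than risk two instances polling at once.
    state.isLeader = false;
  }
  return state.isLeader;
}
```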

Fix 3 — Cross-instance callback recovery via Redis:
  routeToAgent() stores wecom:pending:{msgId} → externalUserId in Redis
  with 200s TTL before waiting for the bridge callback.
  resolveCallbackReply() is now async:
    Fast path  — local pendingCallbacks (same instance, 99% case)
    Recovery   — Redis GET → send reply directly to WeChat user
  onModuleDestroy() cleans up Redis keys on graceful shutdown.
  wecom/bridge-callback handler updated to await resolveCallbackReply.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 06:07:58 -07:00
hailin 0d5441f720 fix(wecom): token mutex, leader lease, backoff, watchdog
Four additional robustness fixes:

1. **Token refresh mutex** — tokenRefreshPromise deduplicates concurrent
   refresh calls. All callers share one in-flight HTTP request instead
   of each firing their own, eliminating the race condition.

2. **Distributed leader lease** — service_state table used for a
   TTL-based leader election (LEADER_LEASE_TTL_S=90s). Only one
   agent-service instance polls at a time; others skip until the lease
   expires. Lease auto-released on graceful shutdown.

3. **Exponential backoff** — consecutive poll errors increment a counter;
   next delay = min(10s × 2^(n-1), 5min). Prevents log spam and
   reduces load during sustained WeCom API outages. Counter resets on
   any successful poll.

4. **Watchdog timer** — setInterval every 2min checks lastPollAt.
   If poll loop has been silent for >5min, clears the timer and
   reschedules immediately, recovering from any silent crash.
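
The backoff formula in item 3 and the staleness test in item 4 can be sketched as two pure functions. The function and constant names are hypothetical; the numeric values (10s base, 5min cap, 5min silence threshold) come from the commit message. Clamping an error count below 1 to 1 is an assumption.

```typescript
const BASE_DELAY_MS = 10_000;          // 10s
const MAX_DELAY_MS = 5 * 60_000;       // 5min cap
const WATCHDOG_STALE_MS = 5 * 60_000;  // poll loop counts as silent after 5min

// Delay before the next poll after `consecutiveErrors` failures in a row:
// min(10s * 2^(n-1), 5min). Values below 1 are treated as 1 (assumption).
function backoffDelayMs(consecutiveErrors: number): number {
  const n = Math.max(1, consecutiveErrors);
  return Math.min(BASE_DELAY_MS * 2 ** (n - 1), MAX_DELAY_MS);
}

// Watchdog check: true if the last poll happened more than 5min before `now`.
function isPollLoopStale(lastPollAt: number, now: number): boolean {
  return now - lastPollAt > WATCHDOG_STALE_MS;
}
```

With these values the delay sequence is 10s, 20s, 40s, 80s, 160s, and then stays at the 5min cap from the sixth consecutive error onward.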

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 05:54:42 -07:00
hailin 9e466549c0 fix(wecom): cursor persistence, send retry, enter_session welcome
Three robustness fixes for the WeCom Customer Service router:

1. **Cursor persistence** — sync_msg cursor now stored in
   public.service_state (auto-created via CREATE TABLE IF NOT EXISTS).
   Survives service restarts; no more duplicate message processing.

2. **send_msg retry** — sendChunkWithRetry() retries once after 2s
   on any API error (non-zero errcode or network failure). Lost
   replies due to transient WeChat API errors are now recovered.

3. **enter_session welcome** — WeCom fires an enter_session event
   (origin=0, msgtype=event) when a user opens the chat for the
   first time. Now handled: bound users get a welcome-back message,
   unbound users get step-by-step onboarding instructions.
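
A hedged TypeScript sketch of the retry-once rule from item 2. The name `sendChunkWithRetry` appears in the commit message, but the signature, the `SendResult` shape, and the injectable delay shown here are assumptions for illustration.

```typescript
// Hypothetical shape of a WeCom API response: errcode 0 means success.
interface SendResult {
  errcode: number;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Tries once, then retries a single time after `retryDelayMs` on a non-zero
// errcode or a thrown (network) error. Returns true if either attempt succeeded.
async function sendChunkWithRetry(
  send: () => Promise<SendResult>,
  retryDelayMs = 2000,
): Promise<boolean> {
  for (let attempt = 0; attempt < 2; attempt++) {
    if (attempt > 0) await sleep(retryDelayMs);
    try {
      const res = await send();
      if (res.errcode === 0) return true;
    } catch {
      // Network failure: fall through to the retry if an attempt remains.
    }
  }
  return false;
}
```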

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 05:37:24 -07:00
hailin 978c534a7e fix(push): fix TypeScript Map type inference error in OfflinePushService
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 04:50:11 -07:00
hailin bc48be1c95 feat(push): log push provider config status on startup
Print ✓/✗ for each platform (FCM/HMS/MI/OPPO/VIVO) so missing credentials
are immediately visible in container logs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 04:48:26 -07:00
hailin 155133a2d6 feat(push): add HMS agconnect-services.json (Huawei Push Kit config)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 04:01:15 -07:00
hailin 3bc35bad64 feat(push): add offline push notification system (FCM + HMS + Mi + OPPO + vivo)
## Backend (notification-service)
- Add `device_push_tokens` table (migration 014) — stores per-user tokens per
  platform (FCM/HMS/MI/OPPO/VIVO) with UNIQUE constraint on (user_id, platform, device_id)
- Add `DevicePushTokenRepository` with upsert/delete/targeting queries
  (by userId, tenantId, plan, tags, segment)
- Add push provider interface with `sendBatch(tokens, message): BatchSendResult`
  returning `invalidTokens[]` for automatic DB cleanup
- Add FCM provider — OAuth2 via RS256 JWT, chunked concurrency (max 20 parallel),
  detects UNREGISTERED/404 as invalid tokens
- Add HMS provider — native batch API (1000 tokens/chunk), OAuth2 token cache
  with 5-min buffer, detects code 80100016
- Add Xiaomi provider — `/v3/message/regids` batch endpoint (1000/chunk),
  parses `bad_regids` field
- Add OPPO provider — single-send with Promise-based mutex to prevent concurrent
  auth token refresh races
- Add vivo provider — `/message/pushToList` batch endpoint, mutex same as OPPO,
  parses `invalidMap`
- Add `OfflinePushService` — groups tokens by platform, sends concurrently,
  auto-deletes invalid tokens; fire-and-forget trigger after notification creation
- Add `DevicePushTokenController` — POST/DELETE `/api/v1/notifications/device-token`
- Wire offline push into `NotificationAdminController` and `EventTriggerService`
- Add Kong route for device-token endpoint (JWT required)
- Add all push provider env vars to docker-compose notification-service
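
The Promise-based mutex mentioned for the OPPO and vivo providers can be illustrated with a generic single-flight wrapper. This is a sketch of the general technique, not the providers' actual code; `singleFlight` and `fetchToken` are hypothetical names, and token caching with expiry is left out.

```typescript
// Wraps an async token fetch so that concurrent callers share one in-flight request.
function singleFlight<T>(fetchToken: () => Promise<T>): () => Promise<T> {
  let inFlight: Promise<T> | null = null;
  return () => {
    if (!inFlight) {
      inFlight = fetchToken().finally(() => {
        inFlight = null; // allow a fresh refresh once this one settles
      });
    }
    return inFlight;
  };
}
```

Callers that arrive while a refresh is pending receive the same Promise, so only one HTTP request is made per refresh cycle.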

## Flutter (it0_app)
- Add `PushService` singleton — detects OEM (Huawei/Xiaomi/OPPO/vivo/FCM),
  initialises correct push SDK, registers token with backend
  - FCM: full init with background handler, foreground local notifications,
    tap stream, iOS APNs support
  - HMS: `HuaweiPush` async token via `onTokenEvent`, no FCM fallback on failure
    (Huawei without GMS cannot use FCM)
  - Mi/OPPO/vivo: MethodChannel bridge to Kotlin receivers; handler set before
    `getToken()` call to avoid race
  - `_initialized` guard prevents double-init on hot-restart
  - `static Stream<void> onNotificationTap` for router navigation
- Add Kotlin OEM bridge classes: `MiPushReceiver`, `OppoPushService`,
  `VivoPushReceiver` — forward token/message/tap events to Flutter via MethodChannel
- Update `MainActivity` — register all three OEM MethodChannels; OEM credentials
  injected from `BuildConfig` (read from `local.properties` at build time)
- Update `build.gradle.kts` — add Google Services + HMS AgConnect plugins,
  BuildConfig fields for OEM credentials, `fileTree("libs")` for OEM AARs
- Update `android/build.gradle.kts` — add buildscript classpath for GMS + HMS,
  Huawei Maven repo
- Update `AndroidManifest.xml` — HMS service, Xiaomi receiver + services,
  vivo receiver; OPPO handled via AAR manifest merge
- Add OEM SDK AARs to `android/app/libs/`:
  MiPush 7.9.2, HeytapPush 3.7.1, vivo Push 4.1.3
- Add `google-services.json` (Firebase project: it0-iagent, package: com.iagent.it0_app)
- Add `firebase_core ^3.6.0`, `firebase_messaging ^15.1.3`, `huawei_push ^6.11.0+300`
  to pubspec.yaml
- Add `ApiEndpoints.notificationDeviceToken` endpoint constant

## Ops
- Add FCM_PROJECT_ID, FCM_CLIENT_EMAIL, FCM_PRIVATE_KEY (+ HMS/Mi/OPPO/vivo placeholders)
  to `.env.example` with comments pointing to each OEM's developer portal

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 02:42:34 -07:00
hailin 0e4159c2fd fix(my-agents): scope instance list to current user
GET /instances returned all tenant instances for admin accounts,
causing cross-user agent visibility. Changed to
GET /instances/user/:userId so each user only sees their own agents.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 23:44:09 -07:00
hailin c9ee93fffd feat(instance-chat): full multimodal attachment support via OpenClaw bridge
After verifying that the OpenClaw gateway's chat.send WebSocket RPC
accepts an 'attachments' array (confirmed from openclaw/openclaw source
and documentation), implement end-to-end image/file attachment support
for instance chat:

Bridge (openclaw-client.ts):
- chatSendAndWait() now accepts optional `attachments[]` parameter
- Passes attachments to chat.send RPC only when non-empty

Bridge (index.ts):
- /task-async accepts `attachments[]` from request body
- Forwards to chatSendAndWait unchanged

Backend (agent.controller.ts):
- executeInstanceTask() accepts IT0 attachment format
  { base64Data, mediaType, fileName? }
- Converts to OpenClaw format { name, mimeType, media: "data:..." }
- Saves attachments to conversation history via contextService
- Forwards to bridge via bridgeAttachments spread

Flutter (agent_instance_chat_remote_datasource.dart):
- createTask() now includes attachments in POST body when present

Flutter (chat_page.dart):
- Reverted Fix 5 (disabled button) — attachment button fully enabled
  in instance mode since the bridge now supports it

Attachment format (OpenClaw wire):
  { name: string, mimeType: string, media: "data:<mime>;base64,<data>" }
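
A minimal TypeScript sketch of the format conversion described above, from the IT0 attachment shape to the OpenClaw wire shape. The function names and the fallback file name are assumptions; the two object shapes are taken from the commit message.

```typescript
// IT0 attachment format, as listed in the commit message.
interface It0Attachment {
  base64Data: string;
  mediaType: string;
  fileName?: string;
}

// OpenClaw wire format, as listed in the commit message.
interface OpenClawAttachment {
  name: string;
  mimeType: string;
  media: string;
}

function toOpenClawAttachment(a: It0Attachment, index: number): OpenClawAttachment {
  return {
    // Fallback name is an assumption; the commit does not say what is used
    // when fileName is absent.
    name: a.fileName ?? `attachment-${index + 1}`,
    mimeType: a.mediaType,
    media: `data:${a.mediaType};base64,${a.base64Data}`,
  };
}

// Converts a possibly missing list; an empty result means "send no attachments".
function toBridgeAttachments(list: It0Attachment[] = []): OpenClawAttachment[] {
  return list.map(toOpenClawAttachment);
}
```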

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 21:18:14 -07:00
hailin ea3cbf64a5 feat(agent): complete instance-chat robustness fixes (Fix 2-6)
Fix 2 — Callback timeout wiring:
- Store callbackTimer in pendingCallbackTimers Map after creation
- handleOpenClawAppCallback clears the timer immediately on arrival,
  preventing spurious "timeout" errors when the bridge replies in time

Fix 3 — Provider scope isolation:
- Override agentStatusProvider and robotStateProvider in child ProviderScope
  so the robot avatar/FAB reflects the instance chat state, not iAgent's

Fix 4 — Voice routing to OpenClaw:
- AgentInstanceChatDatasource.sendVoiceMessage() now calls transcribeAudio()
  then routes the transcript through instance-specific createTask() endpoint,
  ensuring voice messages reach the user's OpenClaw agent, not iAgent

Fix 5 — Attachment UI in instance mode:
- Attachment button shown as disabled (onPressed: null) with explanatory
  tooltip ("附件功能暂不支持智能体对话", i.e. "Attachments are not yet supported in agent chats") when agentName != null
- Prevents misleading UX where attachments appear to work but are silently
  dropped by the OpenClaw bridge

Fix 6 — DB schema template:
- Add agent_instance_id UUID NULL to agent_sessions table in migration 002
  (tenant schema template) so new tenants get the column from creation
- Add covering index idx_agent_sessions_instance for efficient instance queries

All TypeScript and Flutter analyze checks pass clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 19:49:36 -07:00
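hailin 8865985019 feat(agent-instance-chat): let users chat directly with their own OpenClaw agents
## Overview
On the "My Agents" page, users can tap the card of a running OpenClaw instance
to open a dedicated chat page for that agent. The page fully reuses the iAgent
chat UI (streaming output, tool timeline, approval cards, voice input, etc.)
while leaving iAgent conversations completely unaffected.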

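## Architecture
- A child Riverpod ProviderScope overrides chatRemoteDatasourceProvider
  / chatProvider / sessionListProvider, so the iAgent and instance-chat
  providers are fully isolated and share no state.
- The OpenClaw bridge uses the existing /task-async asynchronous callback pattern:
    Flutter → POST /api/v1/agent/instances/:id/tasks (returns sessionId/taskId immediately)
    → subscribes to WS /ws/agent (waits for events)
    → on completion, the bridge POSTs /api/v1/agent/instances/openclaw-app-callback (public endpoint)
    → backend emits WS text+completed events → Flutter receives the reply
- Each instance's sessions are isolated via the agent_sessions.agent_instance_id column;
  the session drawer shows only the current instance's history.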

## Backend changes
### packages/shared/database/src/migrations/013-add-agent-instance-id-to-sessions.sql
- New migration: ALTER TABLE agent_sessions ADD COLUMN agent_instance_id UUID NULL
- Add an index for filtering sessions by instance

### packages/services/agent-service/src/domain/entities/agent-session.entity.ts
- Add optional field agentInstanceId: string (maps to the agent_instance_id column)
- The field is null for iAgent sessions; instance-chat sessions store the corresponding instance UUID

### packages/services/agent-service/src/infrastructure/repositories/session.repository.ts
- Add findByInstanceId(tenantId, agentInstanceId) method
- Used by GET /instances/:id/sessions to filter the session list by instance

### packages/services/agent-service/src/interfaces/rest/controllers/agent.controller.ts
Three new endpoints (note: the known issues listed at the end of this message remain open; see the follow-up fix commit):
1. POST /api/v1/agent/instances/:instanceId/tasks
   - Validate instance ownership (userId match) and running status
   - Create a session (engineType='openclaw', carrying agentInstanceId)
   - Save the user message to the conversation_messages table
   - POST /task-async to the OpenClaw bridge with sessionKey=it0:{sessionId}
   - Return { sessionId, taskId } immediately; Flutter subscribes to WS and waits for the callback
2. GET /api/v1/agent/instances/:instanceId/sessions
   - Return the instance's session list (with title/status/timestamps)
3. POST /api/v1/agent/instances/openclaw-app-callback (public endpoint, no JWT)
   - The bridge calls this endpoint when it finishes
   - Success: emit WS text+completed events, save the assistant message, update the task status
   - Failure/timeout: emit a WS error event, mark the task as FAILED
- Inject the AgentInstanceRepository dependency
- Add private method createInstanceSession()

### packages/gateway/config/kong.yml
- Add the openclaw-app-callback-public service (no JWT plugin)
- Route: POST /api/v1/agent/instances/openclaw-app-callback
- Must be declared before agent-service so that the route matches first (same pattern as wecom-public)

## Flutter changes
### it0_app/lib/core/config/api_endpoints.dart
- Add static methods instanceTasks(instanceId) and instanceSessions(instanceId)

### it0_app/lib/features/chat/presentation/pages/chat_page.dart
- Add optional parameter agentName (default null = iAgent mode)
- When agentName != null: the AppBar shows the agent name and the voice-call button is hidden
- When agentName is omitted, behavior is identical to before; iAgent functionality is unaffected

### it0_app/lib/features/my_agents/presentation/pages/my_agents_page.dart
- _InstanceCard gains an onTap callback parameter
- The card is wrapped in Material+InkWell for a rounded-corner ripple tap effect
- Add top-level function _openInstanceChat():
    running → slide-in navigation to AgentInstanceChatPage
    other states → SnackBar notice (deploying / stopped / error)
- Import AgentInstanceChatPage

### it0_app/lib/features/agent_instance_chat/ (new feature module)
data/datasources/agent_instance_chat_remote_datasource.dart:
- AgentInstanceChatDatasource implements ChatRemoteDatasource
- Wraps a ChatRemoteDatasource via composition and delegates all common operations to it
- Overrides createTask → POST /api/v1/agent/instances/:id/tasks
- Overrides listSessions → GET /api/v1/agent/instances/:id/sessions (current instance's sessions only)

presentation/pages/agent_instance_chat_page.dart:
- AgentInstanceChatPage(instance: AgentInstance)
- A child ProviderScope overrides three providers for full isolation:
    chatRemoteDatasourceProvider → AgentInstanceChatDatasource
    chatProvider → a separate ChatNotifier instance (nothing shared with iAgent)
    sessionListProvider → session list for the current instance only
- child: ChatPage(agentName: instance.name) fully reuses the UI

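## Known issues to fix (next commit)
1. [Security] Authorization check logic: if (userId && ...) should be if (!userId || ...)
2. [Reliability] fetch does not handle HTTP 4xx/5xx errors, so a task may hang forever
3. [Reliability] The bridge callback has no timeout; if the bridge crashes, the task stays RUNNING forever
4. [UX] robotStateProvider is not overridden in the child ProviderScope, so the avatar animation reflects iAgent state
5. [UX] The attachment UI is not disabled in instance chat; uploaded attachments are silently dropped
6. [UX] Voice messages in instance mode are wrongly routed to the iAgent engine (not OpenClaw)
7. [DB] The 002 template lacks the agent_instance_id column, so new tenants are missing this field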
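
Issue 1 above can be illustrated with a small TypeScript sketch. The real condition is elided ("...") in the commit message, so the ownership comparison used here (`userId !== ownerId`) is an assumption, as are the function names.

```typescript
// Hypothetical ownership check; the actual condition is not shown in the commit.
function isForbiddenBuggy(userId: string | undefined, ownerId: string): boolean {
  // Bug: when userId is undefined the whole expression is falsy,
  // so a request with no identified user is allowed through.
  return Boolean(userId && userId !== ownerId);
}

function isForbiddenFixed(userId: string | undefined, ownerId: string): boolean {
  // Fix: a missing userId is rejected outright, as is a mismatched one.
  return !userId || userId !== ownerId;
}
```

The two versions agree whenever a userId is present; they differ only for the unauthenticated case, which is exactly the case the check exists to block.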

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 19:30:38 -07:00
hailin 647df6e42f feat(wecom): add WeChat Customer Service channel (sync_msg polling + code binding + bridge callback)
2026-03-09 10:54:36 -07:00
hailin 233c1c77b2 fix(agent): revert operator-sees-all, restore per-user isolation
Operators now only see their own instances (same as regular users).
Admin role retains superuser view. Orphaned running instances were
reassigned to hailin via DB update.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 09:58:00 -07:00
hailin 4f9f456f85 fix(agent): operator role can see all agent instances
2026-03-09 09:13:24 -07:00
hailin 29c433c7c3 fix(agent): scope instance list to requesting user (multi-user isolation)
GET /api/v1/agent/instances was returning all instances regardless of user.
Now decodes JWT: non-admin users only see their own instances; admins see all.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 08:58:28 -07:00
hailin f186c57afb fix(agent): decode JWT directly to get userId for system prompt
req.user is never populated in agent-service (Kong verifies JWT, no Passport strategy).
This caused userId to always be undefined → system prompt had no 'Current User ID' →
Claude used tenant slug 'shenzhengj' as userId → DB error 'invalid input syntax for
type uuid'.

Fix: decode JWT payload from Authorization header (no signature verify needed — Kong
already verified it) to extract sub (user UUID) for both AgentController and
VoiceSessionController.
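
A minimal sketch of decoding the `sub` claim without signature verification, assuming a Node.js runtime where `Buffer` supports the `base64url` encoding. The function name `extractUserId` is hypothetical. Skipping verification is only acceptable because, as stated above, the gateway has already verified the token.

```typescript
// Extracts the `sub` claim from a Bearer token WITHOUT verifying the signature.
// Only acceptable when an upstream gateway has already verified the token.
function extractUserId(authorizationHeader: string | undefined): string | undefined {
  const token = authorizationHeader?.startsWith("Bearer ")
    ? authorizationHeader.slice("Bearer ".length)
    : undefined;
  // A JWT is header.payload.signature; the payload is the middle segment.
  const payloadPart = token?.split(".")[1];
  if (!payloadPart) return undefined;
  try {
    const json = Buffer.from(payloadPart, "base64url").toString("utf8");
    const payload = JSON.parse(json) as { sub?: unknown };
    return typeof payload.sub === "string" ? payload.sub : undefined;
  } catch {
    return undefined; // malformed payload
  }
}
```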

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 08:51:28 -07:00
hailin da6bfbf896 fix(auth): add name to JWT payload, fix phone-user session restore
JWT payload was missing 'name' field — phone-invited users showed
empty name after app restart (session restore from JWT).
Also added phone fallback in Flutter _decodeUserFromJwt.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 08:23:26 -07:00
hailin 4b2b3dca0c fix(app): make AuthUser.email nullable, add phone field
Phone-invited users have null email — casting null to String crashed login.
email: String → String?, added phone: String? to AuthUser and AuthUserEntity.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 08:20:13 -07:00
hailin 1cf502ef91 fix(app): allow phone number in password login field
Phone-invited users register with phone+password.
Changed identifier field from email-only to email/phone,
removed @ validation so phone numbers pass through.
Backend already auto-detects email vs phone.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 08:06:29 -07:00
hailin 1ccdbc0526 feat(invite): show App download page after phone-invite registration
Phone-invited users are mobile App users, not web admin users.
After accepting a phone invitation, display App download QR + APK link
instead of redirecting to /dashboard.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 08:01:42 -07:00
hailin d73f07d688 fix(sms): remove url param from invite SMS template (SMS_501956050 has no url var)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 07:36:54 -07:00
hailin 6291d6591e fix(feishu): read message_type instead of msg_type (SDK field name mismatch)
The Feishu SDK (@larksuiteoapi/node-sdk) uses message_type, not msg_type (which is the DingTalk field name).
This caused all incoming messages to be treated as non-text, returning
'我目前只能处理文字消息' ("I can only handle text messages for now") for every message.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 07:27:29 -07:00
hailin eb2d73bb7e fix(auth-service): add full Aliyun SMS env vars to docker-compose
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 07:22:05 -07:00
hailin 733e6525e3 fix(auth-service): pass ALIYUN_SMS_INVITE_TEMPLATE_CODE env var to container
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 07:16:17 -07:00
hailin 4a00baa0e3 fix(web-admin): add Next.js proxy route for /api/app/version/check
Fixes QR code not showing on my-org and register pages.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 07:13:29 -07:00
hailin afc1ae6fbe feat(voice): randomly pick thinking sound from all 7 built-in clips per session
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 06:56:31 -07:00
hailin 38bea33074 fix(flutter): voice OAuth sheet shows correct channel (Feishu vs DingTalk)
The _showOAuthBottomSheet title/subtitle were hardcoded to 钉钉 (DingTalk). The sheet now detects
the channel from the URL (feishu.cn → 飞书 (Feishu), otherwise → 钉钉) and shows the correct text
and button color (#3370FF for Feishu, #1677FF for DingTalk).
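
The detection rule can be sketched as a pure function. The app code itself is Flutter/Dart; this TypeScript version only illustrates the logic, and the `OAuthChannel` type, the function name, and the use of a substring match are assumptions.

```typescript
type OAuthChannel = { label: string; buttonColor: string };

// "feishu.cn" in the URL selects Feishu; anything else falls back to DingTalk.
function detectOAuthChannel(url: string): OAuthChannel {
  return url.includes("feishu.cn")
    ? { label: "飞书", buttonColor: "#3370FF" }
    : { label: "钉钉", buttonColor: "#1677FF" };
}
```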

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 06:48:16 -07:00
hailin 79d6e0b98a feat(invite): support phone number invitation with SMS notification
- TenantInvite entity: email nullable + phone field added
- createInvite() auto-detects email vs phone, routes to email/SMS
- SmsService: add sendInviteSms() with ALIYUN_SMS_INVITE_TEMPLATE_CODE
- acceptInvite(): handle phone-based invites (uniqueness check + insert)
- my-org page: email/phone toggle on invite form
- /invite/[token] page: display phone or email from invite info
- DB migration: phone column added, email made nullable

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 06:42:49 -07:00
hailin e2057bfe68 feat(feishu): add Feishu OAuth trigger for voice sessions
- Add POST /sessions/:sessionId/feishu/oauth-trigger endpoint (mirrors DingTalk)
  which emits oauth_prompt WS event so Flutter opens the Feishu authorization
  page automatically instead of asking the user to enter a bind code
- Update SystemPromptBuilder: voice sessions now use the Feishu OAuth trigger
  endpoint; text sessions still use the code-based flow as fallback

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 06:31:45 -07:00
hailin 75f20075f6 fix(auth-service): pass EMAIL_* env vars into container via docker-compose
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 06:25:42 -07:00
hailin 7555f1ad5a feat(register): move app download banner to top with QR code
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 06:18:16 -07:00
hailin 146b396427 feat(invite): send email notification on invite + QR codes in user management
- Add EmailService (nodemailer/SMTP) with invite email HTML template
- createInvite() now fires email notification after saving (fire-and-forget)
- my-org page: add App download QR code + invite link QR code panels
- Install react-qr-code in web-admin, nodemailer in auth-service

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 06:01:22 -07:00
hailin d9a785d49d fix(iagent): make dingtalk/feishu endpoint separation explicit in system prompt
Add CRITICAL note and clear IF/ELSE branching so Claude never calls
dingtalk endpoints for feishu binding or vice versa.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 05:57:38 -07:00
hailin 6d1e31dd36 feat(iagent): add Feishu to channel binding flow in system prompt
After creating an instance, iAgent now asks the user to choose:
钉钉 / 飞书 / 都绑定 / 跳过 (DingTalk / Feishu / bind both / skip)
- DingTalk: existing OAuth card push flow
- Feishu: bind-code flow (user sends code to Feishu bot)
- Also adds Feishu status/unbind API references

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 05:51:21 -07:00
hailin 83ed55ce1c feat(flutter): add Feishu OAuth binding UI — mirrors DingTalk flow
- AgentInstance model: add feishuUserId field
- Instance card: show 飞书 (Feishu) binding badge (blue #3370FF) alongside DingTalk badge
- Context menu: add 绑定飞书 / 重新绑定飞书 / 解绑飞书 (bind / rebind / unbind Feishu) options
- _FeishuBindSheet: full OAuth-first binding sheet with polling, code fallback,
  countdown timer, success/expired/error states — same UX pattern as DingTalk

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 05:44:11 -07:00
hailin bbcf1d742d fix(feishu): WSClient appId + EventDispatcher pattern
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 03:45:02 -07:00
hailin 042e49988c fix(feishu): commit findByFeishuUserId repository method (missed)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 03:21:39 -07:00
hailin 9e906367f6 fix(feishu): update pnpm lockfile with @larksuiteoapi/node-sdk
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 03:21:10 -07:00
hailin 97bd8f0dde fix(feishu): commit missing entity field + SDK dependency
- AgentInstance: add feishuUserId column (missed in previous commit)
- package.json: add @larksuiteoapi/node-sdk dependency (missed in previous commit)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 03:20:03 -07:00
hailin 70e13d4f13 feat(feishu): add Feishu channel integration — long-connection bot + OAuth binding
- FeishuRouterService: WSClient long-connection, code binding, OAuth, async bridge, UX (thinking timer, queue feedback, error messages)
- AgentChannelController: add feishu/bind, status, unbind, oauth/init, oauth/callback, bridge-callback endpoints
- AgentModule: register FeishuRouterService
- kong.yml: add feishu-oauth-public route (no JWT, must be before agent-service)
- docker-compose: add IT0_FEISHU_APP_ID / IT0_FEISHU_APP_SECRET env vars
- DB migration 012: feishu_user_id column + index on agent_instances

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 03:15:38 -07:00
hailin 96e336dd18 feat(openclaw): OpenClaw skill injection pipeline — iAgent → bridge → SKILL.md
## Changes

### openclaw-bridge: POST /skill-inject
- New endpoint writes SKILL.md to ~/.openclaw/skills/{name}/ inside the container volume
- OpenClaw gateway file watcher picks it up within 250ms (no restart needed)
- Optionally calls sessions.delete RPC after write so the next user message starts
  a fresh session that loads the new skill directory immediately (zero-downtime)
- Path traversal guard on skill name (rejects names with / .. \)
- OPENCLAW_HOME env var configurable (default: /home/node/.openclaw)
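
A hedged TypeScript sketch of the skill-name guard and target path described in the list above. The function names are hypothetical, and rejecting empty names is an added assumption; the rejected character sequences and the default home directory come from the commit message.

```typescript
// Rejects empty names and names containing "/", "\" or "..".
function isSafeSkillName(name: string): boolean {
  if (name.length === 0) return false; // assumption: empty names are invalid
  return !name.includes("/") && !name.includes("\\") && !name.includes("..");
}

// Builds the SKILL.md path under the OpenClaw home directory.
function skillFilePath(name: string, openclawHome = "/home/node/.openclaw"): string {
  if (!isSafeSkillName(name)) {
    throw new Error(`invalid skill name: ${name}`);
  }
  return `${openclawHome}/skills/${name}/SKILL.md`;
}
```

Validating the name before building the path keeps every write inside the skills directory, regardless of what the caller sends.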

### agent-service: POST /api/v1/agent/instances/:id/skills
- New endpoint in AgentInstanceController proxies skill injection requests to the
  instance's bridge (http://{serverHost}:{hostPort}/skill-inject)
- Guards: instance must be 'running', serverHost/hostPort must be set, content ≤ 100KB
- iAgent calls this internally (localhost:3002) via Python urllib — no Kong auth needed
- sessionKey format for DingTalk users: "agent:main:dt-{dingTalkUserId}"

### agent-service: remove dead SkillManagerService
- Deleted skill-manager.service.ts (file-system .md loader, never called by anything)
- Removed from agent.module.ts provider list
- The live skill path is ClaudeAgentSdkEngine.loadTenantSkills() which reads directly
  from the DB (it0_t_{tenantId}.skills) at task-execution time

### agent-service: clean up SystemPromptBuilder
- Removed unused skills?: string[] from SystemPromptContext (was never populated)
- Added clarifying comment: SDK engine handles skill injection, not this builder

## DB
- Inserted iAgent meta-skill "为小龙虾安装技能" ("Install skills for the crayfish") into it0_t_default.skills
  (id: 79ac23ed-78c2-4d5f-8652-a99cf5185b61)
- Content instructs iAgent to: query user instances → generate SKILL.md → call
  POST /api/v1/agent/instances/:id/skills via Python urllib heredoc

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 02:16:47 -07:00
hailin fa2212c7bb feat(dingtalk): UX pass — progress hints, queue position, error distinction
- Bridge: tag isTimeout=true in timeout callbacks for semantic error routing
- Agent-service: show " 还在努力想呢" ("still thinking hard") progress batchSend after 25s silence
- Agent-service: queue position feedback ("前面还有 N 条", i.e. "N messages ahead of yours") via sessionWebhook
- Agent-service: buildErrorReply() maps timeout/disconnect/abort to distinct msgs
- Agent-service: instance status hints (stopped/starting/error) with action guidance
- Agent-service: all user-facing strings rewritten for conversational, friendly tone
- Agent-channel: pass isTimeout from bridge callback through to resolveCallbackReply

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 01:03:07 -07:00
hailin f5f051bcab fix(bridge): bake AGENTS.md symlink into Docker image
Add RUN step to create /app/openclaw/docs/reference/templates symlink
at image build time. Previously only done as post-deploy SSH step,
leaving re-created containers broken until next full redeploy.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 00:44:05 -07:00
hailin 865b246345 feat(dingtalk): async callback pattern for LLM tasks (no 55s timeout)
Bridge:
- Add /task-async endpoint: returns immediately, POSTs result to callbackUrl
- Supports arbitrarily long LLM tasks (2 min default timeout)

Agent-service:
- Add POST /api/v1/agent/channels/dingtalk/bridge-callback endpoint
- DingTalkRouterService: pendingCallbacks map + resolveCallbackReply()
- routeToAgent: fire /task-async, register callback Promise, await result
- Serial queue preserved: next message starts only after callback resolves
- CALLBACK_TIMEOUT_MS = 3 min (was effectively 55s before)
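
The pending-callback pattern in the list above can be sketched as follows. The names `pendingCallbacks`, `resolveCallbackReply`, and `CALLBACK_TIMEOUT_MS` appear in the commit message, but the signatures, the `Pending` shape, and `waitForCallback` are assumptions made for illustration.

```typescript
interface Pending {
  resolve: (reply: string) => void;
  timer: ReturnType<typeof setTimeout>;
}

const CALLBACK_TIMEOUT_MS = 3 * 60_000; // 3 min, per the commit message
const pendingCallbacks = new Map<string, Pending>();

// Registers a pending callback and returns a Promise that resolves when the
// bridge calls back, or rejects once the timeout elapses.
function waitForCallback(msgId: string, timeoutMs = CALLBACK_TIMEOUT_MS): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    const timer = setTimeout(() => {
      pendingCallbacks.delete(msgId);
      reject(new Error(`callback timeout for ${msgId}`));
    }, timeoutMs);
    pendingCallbacks.set(msgId, { resolve, timer });
  });
}

// Called by the bridge-callback endpoint. Returns false if nothing was waiting.
function resolveCallbackReply(msgId: string, reply: string): boolean {
  const pending = pendingCallbacks.get(msgId);
  if (!pending) return false;
  clearTimeout(pending.timer); // prevent a spurious timeout after a timely reply
  pendingCallbacks.delete(msgId);
  pending.resolve(reply);
  return true;
}
```

Because the caller awaits the returned Promise before taking the next queued message, the serial ordering described above is preserved.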

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 00:36:00 -07:00
hailin be477c73c6 fix(dingtalk): add observability logging to routing success paths
- Log when routing starts (instance found, bridge URL)
- Log bridge OK with reply length
- Log bridge error response
- Log instance not-running status
- Log batchSend OK with chunk count

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 00:19:02 -07:00
hailin 5aaa8600c5 fix(dingtalk): async reply pattern — immediate ack + batchSend for LLM response
- Send '🤔 小虾米正在思考,稍等...' ("Little Shrimp is thinking, one moment...") immediately via sessionWebhook on each message
- Await LLM bridge call (serial queue preserved) then deliver response via batchSend
- batchSend decoupled from sessionWebhook — works regardless of webhook state
- Fix duplicate const staffId declaration (TS compile error)
- TASK_TIMEOUT_S=55 passed explicitly to bridge (was using bridge default 25s)
- senderStaffId-first routing (OAuth binding) with senderId fallback (code binding)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 23:50:34 -07:00
hailin 440819add8 fix(dingtalk): 55s bridge timeout + batchSend fallback for expired webhooks
Root cause of "Bridge call failed" errors: bridge /task endpoint defaults
to 25s agent reply timeout, but LLM calls through the iConsulting gateway
can take 30-60s. Fix: pass timeoutSeconds=55 explicitly in POST body.

Also add batchSend fallback in routeToAgent: if the sessionWebhook has
expired by the time the LLM replies (user sent a message, LLM took >30s,
webhook window closed), the reply is now sent via proactive batchSend
using senderStaffId instead of being silently dropped.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 23:33:56 -07:00
hailin 5874907300 fix(voice): suppress session terminate during DingTalk OAuth flow
When the voice agent triggers DingTalk OAuth, the user leaves the app
to authorize in DingTalk/browser, causing the LiveKit participant to
disconnect. The voice-agent then calls DELETE /voice to terminate the
session — but the user intends to return after completing OAuth.

Fix: mark the session as "oauth_pending" in VoiceSessionController when
oauth-trigger fires. If terminateVoiceSession is called while the flag
is active (10-min grace), suppress the terminate and return 200 OK so
the voice-agent exits cleanly. The session stays alive; when the user
returns to the voice screen, voice/start + inject auto-resume it.
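
A minimal TypeScript sketch of the grace-window check described above, with the clock passed in so the logic stays testable. The map and function names are hypothetical; the 10-minute window comes from the commit message.

```typescript
const OAUTH_GRACE_MS = 10 * 60_000; // 10-min grace, per the commit message

// sessionId -> timestamp (ms) at which oauth-trigger fired
const oauthPendingSince = new Map<string, number>();

function markOAuthPending(sessionId: string, now: number): void {
  oauthPendingSince.set(sessionId, now);
}

// True if a terminate request should be acknowledged (200 OK) but not acted on.
function shouldSuppressTerminate(sessionId: string, now: number): boolean {
  const since = oauthPendingSince.get(sessionId);
  if (since === undefined) return false;
  if (now - since > OAUTH_GRACE_MS) {
    oauthPendingSince.delete(sessionId); // grace expired; clear the stale flag
    return false;
  }
  return true;
}
```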

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 23:14:53 -07:00
hailin 5a66f85235 fix(dingtalk): senderStaffId-first routing + bridge response size cap
Two binding paths store different DingTalk ID types:
- OAuth binding stores staffId (resolved via unionId→userId at auth time)
- Code binding stores senderId ($:LWCP_v1:$... format from bot message)

DingTalk Stream API senderId != OAuth openId (different encodings), so
primary lookup by senderId always missed OAuth-bound instances, requiring
a fallback every time. Reverse the lookup order: try senderStaffId first
(direct hit for OAuth binding), fall back to senderId (code binding).
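
The reversed lookup order can be sketched as below. The `Instance` shape, the map-based index, and the function name are assumptions for illustration; the real service queries a repository rather than an in-memory Map.

```typescript
// Hypothetical instance record keyed by whichever DingTalk ID was stored at bind time.
interface Instance {
  id: string;
  dingTalkUserId: string;
}

// Tries senderStaffId first (OAuth binding), then senderId (code binding).
function findBoundInstance(
  byDingTalkUserId: Map<string, Instance>,
  senderStaffId: string | undefined,
  senderId: string,
): Instance | undefined {
  if (senderStaffId) {
    const hit = byDingTalkUserId.get(senderStaffId);
    if (hit) return hit;
  }
  return byDingTalkUserId.get(senderId);
}
```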

Also add MAX_RESPONSE_BYTES cap to httpPostJson — previously uncapped
unlike the DingTalk API helpers which already had the 256KB guard.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 22:48:03 -07:00