Commit Graph

18 Commits

Author SHA1 Message Date
hailin 96e336dd18 feat(openclaw): OpenClaw skill injection pipeline — iAgent → bridge → SKILL.md
## Changes

### openclaw-bridge: POST /skill-inject
- New endpoint writes SKILL.md to ~/.openclaw/skills/{name}/ inside the container volume
- OpenClaw gateway file watcher picks it up within 250ms (no restart needed)
- Optionally calls sessions.delete RPC after write so the next user message starts
  a fresh session that loads the new skill directory immediately (zero-downtime)
- Path traversal guard on skill name (rejects names with / .. \)
- OPENCLAW_HOME env var configurable (default: /home/node/.openclaw)

### agent-service: POST /api/v1/agent/instances/:id/skills
- New endpoint in AgentInstanceController proxies skill injection requests to the
  instance's bridge (http://{serverHost}:{hostPort}/skill-inject)
- Guards: instance must be 'running', serverHost/hostPort must be set, content ≤ 100KB
- iAgent calls this internally (localhost:3002) via Python urllib — no Kong auth needed
- sessionKey format for DingTalk users: "agent:main:dt-{dingTalkUserId}"

### agent-service: remove dead SkillManagerService
- Deleted skill-manager.service.ts (file-system .md loader, never called by anything)
- Removed from agent.module.ts provider list
- The live skill path is ClaudeAgentSdkEngine.loadTenantSkills() which reads directly
  from the DB (it0_t_{tenantId}.skills) at task-execution time

### agent-service: clean up SystemPromptBuilder
- Removed unused skills?: string[] from SystemPromptContext (was never populated)
- Added clarifying comment: SDK engine handles skill injection, not this builder

## DB
- Inserted iAgent meta-skill "为小龙虾安装技能" into it0_t_default.skills
  (id: 79ac23ed-78c2-4d5f-8652-a99cf5185b61)
- Content instructs iAgent to: query user instances → generate SKILL.md → call
  POST /api/v1/agent/instances/:id/skills via Python urllib heredoc

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 02:16:47 -07:00
hailin fa2212c7bb feat(dingtalk): UX pass — progress hints, queue position, error distinction
- Bridge: tag isTimeout=true in timeout callbacks for semantic error routing
- Agent-service: show " 还在努力想呢" progress batchSend after 25s silence
- Agent-service: queue position feedback ("前面还有 N 条") via sessionWebhook
- Agent-service: buildErrorReply() maps timeout/disconnect/abort to distinct msgs
- Agent-service: instance status hints (stopped/starting/error) with action guidance
- Agent-service: all user-facing strings rewritten for conversational, friendly tone
- Agent-channel: pass isTimeout from bridge callback through to resolveCallbackReply

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 01:03:07 -07:00
hailin f5f051bcab fix(bridge): bake AGENTS.md symlink into Docker image
Add RUN step to create /app/openclaw/docs/reference/templates symlink
at image build time. Previously only done as post-deploy SSH step,
leaving re-created containers broken until next full redeploy.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 00:44:05 -07:00
hailin 865b246345 feat(dingtalk): async callback pattern for LLM tasks (no 55s timeout)
Bridge:
- Add /task-async endpoint: returns immediately, POSTs result to callbackUrl
- Supports arbitrarily long LLM tasks (2 min default timeout)

Agent-service:
- Add POST /api/v1/agent/channels/dingtalk/bridge-callback endpoint
- DingTalkRouterService: pendingCallbacks map + resolveCallbackReply()
- routeToAgent: fire /task-async, register callback Promise, await result
- Serial queue preserved: next message starts only after callback resolves
- CALLBACK_TIMEOUT_MS = 3 min (was effectively 55s before)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 00:36:00 -07:00
hailin 5cf72c4780 feat(dingtalk+bridge): event-based agent reply + greeting on binding
openclaw-bridge:
- index.ts: /task endpoint now calls chatSendAndWait() with idempotencyKey
  (removes broken timeoutSeconds param; uses caller-supplied msgId for dedup)
- openclaw-client.ts: added onEvent() subscription + chatSendAndWait() that
  subscribes to 'chat' WS events, waits for state='final' matching runId,
  and extracts text from the message payload

dingtalk-router:
- After OAuth binding completes, sends a proactive greeting to the user via
  DingTalk batchSend API (/v1.0/robot/oToMessages/batchSend) introducing the
  agent by name and explaining what it can do

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 13:18:52 -07:00
hailin aca4dd0177 feat(openclaw-bridge): patch models.generated.js to route Anthropic calls via LLM gateway
Replace hardcoded https://api.anthropic.com with http://154.84.135.121:3008 in
@mariozechner/pi-ai's models.generated.js at image build time. Uses find+xargs
to be version-agnostic. The gateway key sk-gw-oc-881239445c76b8d349a13be9fc4507a3
is configured in auth-profiles.json on the mounted volume.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 13:15:29 -07:00
hailin 20e96f31cd fix(openclaw-bridge): use regex replace for base64url (ES2020 compat)
replaceAll() requires ES2021+, tsconfig targets ES2020.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 07:40:56 -07:00
hailin a7c5b264fa fix(openclaw-bridge): implement correct v3 device auth protocol
- device.id = SHA256(rawPubKey, hex) — not a random UUID
- device.publicKey = raw 32-byte key encoded as base64url (not SPKI DER)
- sign the full v3 payload string (not just raw nonce bytes):
  "v3|{deviceId}|{clientId}|{mode}|{role}|{scopes}|{ts}|{token}|{nonce}|{platform}|"
- device.signature encoded as base64url

Matches buildDeviceAuthPayloadV3/verifyDeviceSignature from openclaw dist.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 07:40:15 -07:00
hailin bb4b73f847 fix(openclaw-bridge): include nonce in device handshake params
The gateway expects device.nonce (the challenge nonce) to be echoed
back in the connect request. Without it the connection is rejected with
'device: must have required property nonce'.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 07:32:29 -07:00
hailin 00a944c9a9 fix(openclaw-bridge): use 'gateway-client'/'backend' as WS client id/mode
openclaw gateway validates client.id against a fixed set of known IDs.
Using a random UUID caused the connection to be rejected immediately with
'client/id must be equal to constant'. Use 'gateway-client' + 'backend'.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 07:30:06 -07:00
hailin d13b9b7df9 fix(openclaw-bridge): add --allow-unconfigured to gateway command
openclaw gateway requires either a config file or --allow-unconfigured
flag to start without prior setup. Without it the process exits with
'Missing config' and enters a restart loop.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 07:25:39 -07:00
hailin 72825d5526 fix(openclaw-bridge): add 'gateway --port 18789' subcommand to openclaw supervisord entry
Without a subcommand the CLI prints help and exits (status 1), causing
supervisord to restart it in a loop and the OC gateway to never start.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 07:07:39 -07:00
hailin a47e894d4b fix(bridge): upgrade Node from 20 to 22 — openclaw requires >=22.12.0
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 06:23:46 -07:00
hailin 805cd849fb fix(bridge): use correct openclaw entry point dist/index.js
openclaw package.json main=dist/index.js, not dist/openclaw.mjs.
The bin wrapper file is not copied into the Docker image.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 06:21:56 -07:00
hailin c3b31ebf47 fix(bridge): remove environment= from supervisord programs, inherit from container
supervisord %(ENV_VAR)s expansion fails hard if the variable is absent.
For optional vars like DINGTALK_CLIENT_ID this crashes the entire supervisor.
Fix: remove all per-program environment= directives. All vars are injected
via docker run -e and supervisord child processes inherit them automatically.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 06:12:52 -07:00
hailin b0801e0983 feat(bridge): DingTalk channel plugin + OpenClaw Protocol v3 rewrite
Core changes:
- src/channels/dingtalk.ts: DingTalk Stream SDK channel (no public IP needed)
  - TokenManager: auto-refresh with refreshPromise mutex (prevents race condition)
  - UserQueue: per-user serial queue, max depth 5
  - MsgDedup: O(1) Map<string,timestamp> with 10min TTL + 10k cap
  - RateLimiter: sliding window 10 msg/min per user
  - ResilientOcClient: 10s heartbeat poll + atomic reconnect guard
  - DingTalkStream: exponential backoff reconnect (2s→60s), immediate ACK
  - replyToUser: sessionWebhook expiry check + 4800-char chunking

- src/openclaw-client.ts: rewritten for correct Protocol v3 wire format
  - Request frame: { type:"req", id, method, params }
  - Challenge-response Ed25519 handshake (connect.challenge → connect req)
  - Correct rpc() with configurable timeoutMs

- src/index.ts: fixed RPC method names
  - agent.run → chat.send with { sessionKey, message, timeoutSeconds }
  - metrics.get → gateway.status

- Dockerfile: adds start-dingtalk.sh COPY + chmod
- supervisord.conf: dingtalk-channel program block (autorestart=unexpected)
- start-dingtalk.sh: exits 0 if DINGTALK_CLIENT_ID unset (no restart loop)
- CHANNEL_DEV_GUIDE.md: full dev guide for future channel integrations

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 05:10:01 -07:00
hailin 688219ab74 fix(bridge): remove user= from supervisord.conf to fix non-root startup
Container runs as 'node' user (USER node in Dockerfile). Setting user=root
in [supervisord] causes "Can't drop privilege as nonroot user" error.
Remove all user= directives — user is managed at the Docker/container level.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 05:07:25 -07:00
hailin 7d5840c245 feat(openclaw): Phase 1 — server pool + agent instance deployment infrastructure
## inventory-service
- New: pool_servers table (public schema, platform-admin managed)
- New: PoolServer entity, PoolServerRepository, PoolServerController
- CRUD endpoints at /api/v1/inventory/pool-servers
- Internal /deploy-creds endpoint (x-internal-api-key protected) for SSH key retrieval
- increment/decrement endpoints for capacity tracking

## agent-service
- New: agent_instances table (tenant schema)
- New: AgentInstance entity, AgentInstanceRepository, AgentInstanceController
- New: AgentInstanceDeployService — SSH-based docker deployment
  - Queries pool server availability from inventory-service
  - AES-256 encrypts OpenClaw gateway token at rest
  - Allocates host ports in range 20000-29999
  - Fires docker run for it0hub/openclaw-bridge:latest
  - Async deploy with error capture
- Added ssh2 dependency for SSH execution
- Added INVENTORY_SERVICE_URL, INTERNAL_API_KEY, VAULT_MASTER_KEY to docker-compose

## openclaw-bridge (new package)
- packages/openclaw-bridge/ — custom Docker image
- Two processes via supervisord: OpenClaw gateway + IT0 Bridge (Node.js)
- IT0 Bridge exposes REST API on port 3000:
  GET /health, GET /status, POST /task, GET /sessions, GET /metrics
- Connects to OpenClaw gateway at ws://127.0.0.1:18789 via WebSocket RPC
- Sends heartbeat to IT0 agent-service every 60s
- Dockerfile: multi-stage build (openclaw source + bridge TS compilation)

## Web Admin
- New: /server-pool page — list/add/edit/delete pool servers with capacity bars
- New: /openclaw-instances page — cross-tenant instance monitoring with status filter
- Sidebar: added 服务器池 (Database icon) + OpenClaw 实例 (Boxes icon) to platform_admin nav

## Flutter App
- my_agents_page: rewritten to show real AgentInstance data from /api/v1/agent/instances
- Added AgentInstance model with status-driven UI (running/deploying/stopped/error)
- Status badges with color coding + spinner for deploying state
- Summary chips showing running vs stopped counts
- api_endpoints.dart: added agentInstances endpoint

## Design docs
- OPENCLAW_INTEGRATION_PLAN.md: complete architecture document with all confirmed decisions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-07 11:11:21 -08:00