hailin/it0 - it0 - AI Wolves Team

Commit Graph

Author	SHA1	Message	Date
hailin	8865985019	feat(agent-instance-chat): let users chat directly with their own OpenClaw agents ## Overview On the "My Agents" page, users can tap the card of a running OpenClaw instance to open a dedicated chat page for that agent. The page fully reuses the iAgent chat UI (streaming output, tool timeline, approval cards, voice input, etc.) while leaving iAgent conversations completely unaffected. ## Architecture - A nested Riverpod ProviderScope overrides chatRemoteDatasourceProvider / chatProvider / sessionListProvider, so the providers for iAgent and instance chat are fully isolated with no shared state. - The OpenClaw bridge uses the existing /task-async asynchronous callback pattern: Flutter → POST /api/v1/agent/instances/:id/tasks (returns sessionId/taskId immediately) → subscribe to WS /ws/agent (wait for events) → on completion the bridge calls POST /api/v1/agent/instances/openclaw-app-callback (public endpoint) → backend emits WS text+completed events → Flutter receives the reply - Each instance's sessions are isolated by the agent_sessions.agent_instance_id column; the session drawer shows only the current instance's history. ## Backend changes ### packages/shared/database/src/migrations/013-add-agent-instance-id-to-sessions.sql - New migration: ALTER TABLE agent_sessions ADD COLUMN agent_instance_id UUID NULL - Adds an index for filtering sessions by instance ### packages/services/agent-service/src/domain/entities/agent-session.entity.ts - New optional field agentInstanceId: string (maps to the agent_instance_id column) - null for iAgent sessions; instance chat sessions store the corresponding instance UUID ### packages/services/agent-service/src/infrastructure/repositories/session.repository.ts - New findByInstanceId(tenantId, agentInstanceId) method - Used by GET /instances/:id/sessions to filter the session list by instance ### packages/services/agent-service/src/interfaces/rest/controllers/agent.controller.ts Three new endpoints (note: these have known issues pending a fix; see the follow-up fix commit): 1. POST /api/v1/agent/instances/:instanceId/tasks - Validates instance ownership (userId match) and running status - Creates a session (engineType='openclaw', carrying agentInstanceId) - Saves the user message to the conversation_messages table - POSTs /task-async to the OpenClaw bridge with sessionKey=it0:{sessionId} - Returns { sessionId, taskId } immediately; Flutter subscribes to WS and waits for the callback 2. GET /api/v1/agent/instances/:instanceId/sessions - Returns the instance's session list (with title/status/timestamps) 3. POST /api/v1/agent/instances/openclaw-app-callback (public endpoint, no JWT) - Called back by the bridge on completion - Success: emits WS text+completed events, saves the assistant message, updates the task status - Failure/timeout: emits a WS error event, marks the task FAILED - Injects the AgentInstanceRepository dependency - Adds private method createInstanceSession() ### packages/gateway/config/kong.yml - New openclaw-app-callback-public service (no JWT plugin) - Route: POST /api/v1/agent/instances/openclaw-app-callback - Must be declared before agent-service so the route matches first (same pattern as wecom-public) ## Flutter changes ### it0_app/lib/core/config/api_endpoints.dart - New static methods instanceTasks(instanceId) and instanceSessions(instanceId) ### it0_app/lib/features/chat/presentation/pages/chat_page.dart - New optional parameter agentName (default null = iAgent mode) - When agentName != null: the AppBar shows the agent name and the voice call button is hidden - When agentName is omitted, behavior is identical to before; iAgent is unaffected ### it0_app/lib/features/my_agents/presentation/pages/my_agents_page.dart - _InstanceCard gains an onTap callback parameter - The card is wrapped in Material+InkWell for a rounded-corner ripple tap effect - New top-level function _openInstanceChat(): running → slide-in navigation to AgentInstanceChatPage; other states → SnackBar message (deploying/stopped/error) - Imports AgentInstanceChatPage ### it0_app/lib/features/agent_instance_chat/ (new feature module) data/datasources/agent_instance_chat_remote_datasource.dart: - AgentInstanceChatDatasource implements ChatRemoteDatasource - Wraps a ChatRemoteDatasource by composition and delegates all common operations to it - Overrides createTask → POST /api/v1/agent/instances/:id/tasks - Overrides listSessions → GET /api/v1/agent/instances/:id/sessions (current instance's sessions only) presentation/pages/agent_instance_chat_page.dart: - AgentInstanceChatPage(instance: AgentInstance) - A nested ProviderScope overrides three providers for full isolation: chatRemoteDatasourceProvider → AgentInstanceChatDatasource; chatProvider → a separate ChatNotifier instance (nothing shared with iAgent); sessionListProvider → the current instance's session list only - child: ChatPage(agentName: instance.name) reuses the UI in full ## Known issues to fix (next commit) 1. [Security] Authorization check logic: if (userId && ...) should be if (!userId || ...) 2. [Reliability] fetch does not handle HTTP 4xx/5xx errors, so tasks can hang forever 3. [Reliability] The bridge callback has no timeout; if the bridge crashes, the task stays RUNNING forever 4. [UX] robotStateProvider is not overridden in the nested ProviderScope, so the avatar animation reflects iAgent state 5. [UX] The attachment UI is not disabled in instance chat; uploaded attachments are silently dropped 6. [UX] Voice messages in instance mode are wrongly routed to the iAgent engine (not OpenClaw) 7. [DB] The 002 template lacks the agent_instance_id column, so new tenants are missing this field Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-09 19:30:38 -07:00
hailin	f186c57afb	fix(agent): decode JWT directly to get userId for system prompt req.user is never populated in agent-service (Kong verifies JWT, no Passport strategy). This caused userId to always be undefined → system prompt had no 'Current User ID' → Claude used tenant slug 'shenzhengj' as userId → DB error 'invalid input syntax for type uuid'. Fix: decode JWT payload from Authorization header (no signature verify needed — Kong already verified it) to extract sub (user UUID) for both AgentController and VoiceSessionController. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-09 08:51:28 -07:00
hailin	495407d25b	fix(agent): pass sessionId to system prompt for text chat OAuth trigger Text sessions were not passing sessionId to SystemPromptBuilder, causing Claude to use the `initiate_dingtalk_binding` custom tool (claude_api only). When the engine is claude_agent_sdk, this tool does not exist → 404. Fix: pass session.id as sessionId to systemPromptBuilder.build() in agent.controller.ts. Claude will now use the wget oauth-trigger endpoint for ALL session types (text and voice), which works with every engine. Also: store userId (staffId) as the DingTalk binding ID when resolvable, falling back to openId. Bot messages deliver senderStaffId which matches userId, not openId — this prevents the "binding not found" routing failure. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-08 14:20:58 -07:00
hailin	b87cebf465	feat(agent): inject userId into system prompt + fix agent-instance nullable columns - SystemPromptBuilder: add userId/userEmail to context, expose internal API curl commands for OpenClaw creation - agent.controller.ts: extract userId from JWT, build system prompt via SystemPromptBuilder so iAgent knows current user - agent.module.ts: register SystemPromptBuilder as provider - agent-instance.entity.ts: make serverHost/sshUser nullable (pool mode doesn't set these upfront) - DB: ALTER TABLE agent_instances DROP NOT NULL on server_host/ssh_user Now iAgent can create 小龙虾 ("crayfish") instances autonomously when the user asks in natural language. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-08 03:05:15 -07:00
hailin	4c7c05eb37	feat(stt): support auto language detection for mixed Chinese-English input - Flutter: language='auto' omits the language field → backend receives none - Backend: no language field → passes undefined to STT service - STT service: language=undefined → omits language param from Whisper request - Whisper auto-detects language per utterance when no hint is provided Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-06 08:13:26 -08:00
hailin	2182149c4c	feat(chat): voice-to-text fills input box instead of auto-sending - Add POST /api/v1/agent/transcribe endpoint (STT only, no agent trigger) - Add transcribeAudio() to chat datasource and provider - VoiceMicButton now fills the text input field with transcript; user reviews and sends manually - Add OPENAI_API_KEY/OPENAI_BASE_URL to agent-service in docker-compose Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-06 07:01:39 -08:00
hailin	a2af76bcd7	feat(agent-service): add voice message endpoint with Whisper STT and async interrupt New endpoint: POST /api/v1/agent/sessions/:sessionId/voice-message - Accepts multipart/form-data audio file (any format Whisper supports) - Transcribes via OpenAI Whisper API (routed through existing proxy) - If a task is currently running in the session → hard-interrupts it first (same cancel+inject pattern as text inject, triggered by voice command) - Otherwise → starts a fresh task with the transcript - Returns { sessionId, taskId, transcript } so client can subscribe to WS stream This enables WhatsApp-style push-to-talk and doubles as an async voice interrupt into any active agent workflow, bypassing the need for speaker diarization (whoever presses record owns the message). New files: infrastructure/stt/openai-stt.service.ts — OpenAI Whisper client, manually builds multipart/form-data, supports self-signed proxy cert Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-06 03:12:03 -08:00
hailin	6ca8aab243	fix(agent-service): store proper title in session metadata, exclude systemPrompt from list API Two issues fixed: 1. agent.controller.ts — on the FIRST task of each session, write title+voiceMode into session.metadata so the client can display a meaningful conversation title: - Text sessions: metadata.title = first 40 chars of user prompt - Voice sessions: metadata.title = '' + metadata.voiceMode = true (Flutter renders these as '语音对话 M/D HH:mm', i.e. "Voice conversation" plus a timestamp) titleSet flag prevents overwriting the title on subsequent turns of the same session. 2. session.controller.ts — listSessions() now returns a DTO instead of the raw entity. systemPrompt is an internal engine instruction and is explicitly excluded from the response. The client receives { id, status, engineType, metadata, createdAt, updatedAt }.	2026-03-04 02:39:47 -08:00
hailin	9ed80cd0bc	feat: implement complete commercial monetization loop (Phases 1-4) ## Phase 1 - Token Metering + Quota Enforcement ### Usage Tracking - agent-service: add UsageRecord entity (per-tenant schema) tracking inputTokens/outputTokens/costUsd per AI task - Modify all 3 AI engines (claude-api, claude-code-cli, claude-agent-sdk) to emit separate input/output token counts in the `completed` event - claude-api-engine: costUsd = (input * 3 + output * 15) / 1,000,000 (claude-sonnet-4-5 pricing: $3/MTok in, $15/MTok out) - agent.controller: persist UsageRecord and publish `usage.recorded` event to Redis Streams on every task completion (non-blocking) - shared/events: new events UsageRecordedEvent, SubscriptionChangedEvent, QuotaExceededEvent, PaymentReceivedEvent ### Quota Enforcement - TenantInfo: add maxServers, maxUsers, maxStandingOrders, maxAgentTokensPerMonth fields - TenantContextMiddleware: rewritten to query public.tenants table for real quota values; 5-min in-memory cache; plan-based fallback on error - TenantContextService: getTenant() returns null instead of throwing; added getTenantOrThrow() for strict callers - inventory-service/server.controller: 429 when maxServers exceeded - ops-service/standing-order.controller: 429 when maxStandingOrders exceeded - auth-service/auth.service: 429 when maxUsers exceeded - 002-create-tenant-schema-template.sql: add usage_records table ## Phase 2 - billing-service (New Microservice, port 3010) ### Domain Layer (public schema, all UUIDs) Entities: Plan, Subscription, Invoice, InvoiceItem, Payment, PaymentMethod, UsageAggregate Domain services: - SubscriptionLifecycleService: full state machine (trialing -> active -> past_due -> cancelled/expired); upgrades immediate, downgrades at period end - InvoiceGeneratorService: monthly invoice = base fee + overage charges; proration item for mid-cycle upgrades - OverageCalculatorService: (totalTokens - includedTokens) * overageRate ### Infrastructure (all repos use DataSource directly, NOT TenantAwareRepository) - PlanRepository, SubscriptionRepository, InvoiceRepository (atomic transaction for invoice+items), PaymentRepository (payments + methods), UsageAggregateRepository (UPSERT via ON CONFLICT for atomic accumulation) ### Application Use Cases - CreateSubscriptionUseCase: called on tenant registration - ChangePlanUseCase: upgrade (immediate + proration) or downgrade (scheduled) - CancelSubscriptionUseCase: immediate or at-period-end - GenerateMonthlyInvoiceUseCase: cron target (1st of month 00:05 UTC); generates invoices, renews periods, applies scheduled downgrades - AggregateUsageUseCase: Redis Streams consumer group billing-service, upserts monthly usage aggregates from usage.recorded events - CheckTokenQuotaUseCase: hard limit enforcement per plan - CreatePaymentSessionUseCase + HandlePaymentWebhookUseCase ### REST API - GET /api/v1/billing/plans - GET/POST /api/v1/billing/subscription (+ /upgrade, /cancel) - GET /api/v1/billing/invoices (paginated) - GET /api/v1/billing/invoices/:id - POST /api/v1/billing/invoices/:id/pay - GET /api/v1/billing/usage/current + /history - CRUD /api/v1/billing/payment-methods - POST /api/v1/billing/webhooks/{stripe,alipay,wechat,crypto} ### Plan Seed (auto on startup via PlanSeedService) - free: $0/mo, 100K tokens, no overage, hard limit 100% - pro: $49.99/mo, 1M tokens, $8/MTok, hard limit 150% - enterprise: $199.99/mo, 10M tokens, $5/MTok, no hard limit ## Phase 3 - Payment Provider Integration ### PaymentProviderRegistry (Strategy Pattern, mirrors EngineRegistry) All providers use @Optional() injection; unconfigured providers omitted - StripeProvider: PaymentIntent API; webhook via stripe.webhooks.constructEvent - AlipayProvider: alipay-sdk; Native QR (precreate); RSA2 signature verify - WeChatPayProvider: v3 REST; Native Pay code_url; AES-256-GCM decrypt; HMAC-SHA256 request signing and webhook verification - CryptoProvider: Coinbase Commerce; hosted checkout; HMAC-SHA256 verify ### WebhookController All 4 webhook endpoints are public (no JWT) for payment provider callbacks. rawBody: true enabled in main.ts for signature verification. ## Infrastructure Changes - docker-compose.yml: billing-service container (port 13010); added as dependency of api-gateway - kong.yml: /api/v1/billing routes (JWT); /api/v1/billing/webhooks (public) - 005-create-billing-tables.sql: 7 billing tables + invoice sequence + ALTER tenants to add quota columns - run-migrations.ts: 005 runs as part of shared schema step ## Phase 4 - Frontend ### Web Admin (Next.js) New pages: - /billing: subscription card + token usage bar + warning banner + invoices - /billing/plans: comparison grid with USD/CNY toggle + upgrade/downgrade flow - /billing/invoices: paginated table with Pay Now button Sidebar: Billing group (CreditCard icon, 3 sub-items) i18n: billing keys added to en + zh sidebar translations ### Flutter App New feature module it0_app/lib/features/billing/: - BillingOverviewPage: plan card + token LinearProgressIndicator + latest invoice + upgrade button - BillingProvider (FutureProvider): parallel fetch subscription/quota/invoice Settings page: "订阅与用量" ("Subscription & Usage") entry card Router: /settings/billing sub-route Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-03 21:09:17 -08:00
hailin	da17488389	feat: voice mode event filtering — skip tool/thinking events for Agent SDK 1. Remove on_enter greeting entirely (no more race condition) 2. voice-agent sends voiceMode: true when engine_type is claude_agent_sdk 3. AgentController.runTaskStream() filters thinking, tool_use, tool_result events in voice mode — only text, completed, error reach the client 4. Detailed logging: each event logged with [FILTERED-voice] tag when skipped Claude API mode is completely unaffected (voiceMode defaults to false). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-02 02:56:41 -08:00
hailin	e4c2505048	feat: add multimodal image input with streaming markdown optimization Two major features in this commit: 1. Streaming Markdown Rendering Optimization - Replace deprecated flutter_markdown with gpt_markdown (active, AI-optimized) - Real-time markdown rendering during streaming (was showing raw syntax) - Solid block cursor (█) instead of AnimationController blink - 80ms token throttle buffer reducing rebuilds from per-token to ~12.5/sec - RepaintBoundary isolation for markdown widget repaints - StreamTextWidget simplified from StatefulWidget to StatelessWidget 2. Multimodal Image Input (camera + gallery + display) - Flutter: image_picker for gallery/camera, base64 encoding, attachment preview strip with delete, thumbnails in sent messages - Data layer: List<String>? → List<Map<String, dynamic>>? for structured attachment payloads through datasource/repository/usecase - ChatAttachment model with base64Data, mediaType, fileName - ChatMessage entity + ChatMessageModel both support attachments field - Backend DTO, Entity (JSONB), Controller, ConversationContextService all extended to receive, store, and reconstruct Anthropic image content blocks in loadContext() - Claude API engine skips duplicate user message when history already ends with multimodal content blocks - NestJS body parser limit raised to 10MB for base64 image payloads - Android CAMERA permission added to manifest - Image.memory uses cacheWidth/cacheHeight for memory efficiency - Max 5 images per message enforced in UI Data flow: ImagePicker → base64Encode → ChatAttachment → POST body → DB (JSONB) → loadContext → Anthropic image content blocks → Claude API Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-28 03:24:17 -08:00
hailin	50dbb641a3	fix: comprehensive hardening of agent task cancel/inject/approve flows 6 rounds of systematic audit identified and fixed 14 bugs across backend controller and Flutter client: ## Backend (agent.controller.ts) Security & Tenant Isolation: - Add @TenantId + ForbiddenException check to cancelTask, injectMessage, approveCommand — all 4 write endpoints now enforce tenant isolation - Add tenantId check on session reuse in executeTask to prevent cross-tenant session hijacking Architecture & Correctness: - Extract shared runTaskStream() from inline fire-and-forget block, used by both executeTask and injectMessage to reduce duplication - Use session.engineType (not getActiveEngine()) in cancelTask, injectMessage, approveCommand — fixes wrong-engine-cancel when global engine config is switched after task creation - Add concurrent task prevention: executeTask checks for existing RUNNING task on same session and cancels it before starting new one - Add runningTasks Map to track task promises, awaitTaskCleanup() helper with 3s timeout for inject to wait for partial text save - captureSdkSessionId() captures SDK session ID into metadata without DB save (callers persist), preventing fire-and-forget race Cancel/Reject Improvements: - cancelTask: idempotent (returns early if already CANCELLED/COMPLETED), session stays 'active' (was 'cancelled'), emits cancelled WS event - approveCommand reject: session stays 'active' (was 'cancelled'), now emits cancelled WS event so Flutter stream listeners clean up - approveCommand approved: collect text events and save assistant response to conversation history on completion (was missing) Minor: - task.result! non-null assertion → task.result ?? 'Unknown error' - Add findRunningBySessionId() to TaskRepository ## Flutter API Contract Fix: - approveCommand: route changed from /api/v1/ops/approvals/:id/approve to /api/v1/agent/tasks/:id/approve with {approved: true} body - rejectCommand: route changed from /api/v1/ops/approvals/:id/reject to /api/v1/agent/tasks/:id/approve with {approved: false} body Resource Management: - ChatNotifier.dispose() now disconnects WebSocket to prevent connection leak when navigating away from chat Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-27 22:20:46 -08:00
hailin	cc0f06e2be	feat: SDK engine native resume with per-tenant HOME isolation Replace prompt-prefix workaround with SDK's native resume mechanism. Each tenant gets isolated HOME directory (/data/claude-tenants/{tenantId}) to prevent cross-tenant session file mixing. SDK session IDs are persisted in session.metadata for cross-request resume support. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-25 02:27:38 -08:00
hailin	2403ce5636	feat: multi-turn conversation context management with session history UI Implement DB-based conversation message storage (engine-agnostic) that works across both Claude API and Agent SDK engines. Add ChatGPT/Claude-style conversation history drawer in Flutter with date-grouped session list, session switching, and new chat functionality. Backend: entity, repository, context service, migration 004, session/message API endpoints. Flutter: ConversationDrawer, sessionId flow from backend response via SessionInfoEvent, session list/switch/delete support. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-24 19:04:35 -08:00
hailin	5d4fd96d43	feat: streaming claude-api engine, engineType override, fix voice test page - Claude API engine now uses streaming API (messages.stream) for real-time text delta output instead of waiting for full response - Agent controller accepts optional engineType body parameter to allow callers (e.g. voice pipeline) to select a specific engine - Fix voice_test_page.dart compilation error: replace audioplayers (not installed) with flutter_sound (already in pubspec.yaml) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-24 05:30:11 -08:00
hailin	2a150dcff5	fix: prevent error event from overriding completed status in controller Add finished guard so that once a task reaches completed/error terminal state, subsequent events don't flip the status back. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-24 03:49:21 -08:00
hailin	a7b42e6b98	feat: add detailed logging to agent engine and task controller Log every SDK message type, event emission, and stream lifecycle to diagnose why text events are missing in voice-agent flow. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-24 02:56:09 -08:00
hailin	806113554b	fix: remove AuthGuard('jwt') from all service controllers Kong handles JWT validation at the gateway level. Service-level AuthGuard('jwt') fails because services don't register a Passport JWT strategy (only auth-service does). Removed from 17 controllers across ops, inventory, monitor, comm, audit, and agent services. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-21 23:42:37 -08:00
hailin	00f8801d51	Initial commit: IT0 AI-powered server cluster operations platform Full-stack monorepo with DDD + Clean Architecture: - Backend: 7 NestJS microservices + 5 shared libraries (TypeScript) - Mobile: Flutter app with Riverpod (Dart) - Web Admin: Next.js dashboard with Zustand + React Query - Voice: Python voice service (STT/TTS/VAD) - Infra: Docker Compose, K8s manifests, Turborepo build Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-08 22:54:37 -08:00
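
Commit f186c57afb describes reading the user UUID from the `sub` claim of the JWT in the Authorization header, without verifying the signature because Kong has already done so. A minimal sketch of that idea follows, assuming a Node.js runtime with `Buffer`; the helper name `decodeJwtSub` is hypothetical and not taken from the repository.

```typescript
// Hypothetical helper (name and shape are assumptions, not the repository's code).
// Reads the `sub` claim from a Bearer token WITHOUT verifying the signature.
// This is only reasonable when a gateway in front of the service has already
// verified the token, as the commit message says Kong does.
function decodeJwtSub(authorizationHeader: string | undefined): string | undefined {
  if (!authorizationHeader || !authorizationHeader.startsWith("Bearer ")) {
    return undefined;
  }
  // A JWT is three base64url segments: header.payload.signature
  const parts = authorizationHeader.slice("Bearer ".length).split(".");
  if (parts.length !== 3) {
    return undefined;
  }
  try {
    const json = Buffer.from(parts[1], "base64url").toString("utf8");
    const payload = JSON.parse(json) as { sub?: unknown };
    return typeof payload.sub === "string" ? payload.sub : undefined;
  } catch {
    // Malformed payload: treat as "no user" rather than throwing.
    return undefined;
  }
}
```

A caller would pass the raw header value, for example `decodeJwtSub(req.headers.authorization)`, and treat `undefined` as an unauthenticated request.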

19 Commits
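
The metering arithmetic in commit 9ed80cd0bc can be sketched as below. The per-task formula and the $3/$15 per-MTok prices come from the commit message; the function names, the per-MTok unit for the overage rate, and the clamp to zero when usage is under the included amount are assumptions made for illustration.

```typescript
// Prices quoted in the commit message for claude-sonnet-4-5, in USD per million tokens.
const INPUT_USD_PER_MTOK = 3;
const OUTPUT_USD_PER_MTOK = 15;

// costUsd = (input * 3 + output * 15) / 1,000,000
// Function name is hypothetical.
function taskCostUsd(inputTokens: number, outputTokens: number): number {
  return (inputTokens * INPUT_USD_PER_MTOK + outputTokens * OUTPUT_USD_PER_MTOK) / 1_000_000;
}

// (totalTokens - includedTokens) * overageRate, with two assumptions:
// the rate is expressed per million tokens (as in the plan seed, e.g. $8/MTok),
// and usage below the included amount produces no charge.
function overageUsd(totalTokens: number, includedTokens: number, overageUsdPerMTok: number): number {
  const extraTokens = Math.max(0, totalTokens - includedTokens);
  return (extraTokens * overageUsdPerMTok) / 1_000_000;
}
```

Under these assumptions, a pro-plan tenant (1M tokens included, $8/MTok overage) who uses 1.5M tokens in a month would be charged for 500K extra tokens via `overageUsd(1_500_000, 1_000_000, 8)`.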