- Add injectSystemPromptOpenAI() for OpenAI messages format (role: system)
- Integrate injection into createOpenAIChatProxy before the upstream call
- Update audit logs to track injection status
- Enables brand identity override for both Anthropic and OpenAI endpoints
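A minimal sketch of what such an injection helper could look like (the message shape and merge behavior here are assumptions, not the actual implementation):

```typescript
type OpenAIMessage = { role: 'system' | 'user' | 'assistant'; content: string };

// Hypothetical sketch: ensure the brand identity prompt is present as the
// leading system message before the request is forwarded upstream.
function injectSystemPromptOpenAI(
  messages: OpenAIMessage[],
  brandPrompt: string,
): OpenAIMessage[] {
  const [first, ...rest] = messages;
  if (first?.role === 'system') {
    // Merge: brand prompt first, so it takes precedence over the caller's prompt.
    return [{ role: 'system', content: `${brandPrompt}\n\n${first.content}` }, ...rest];
  }
  // No system message yet: prepend one.
  return [{ role: 'system', content: brandPrompt }, ...messages];
}
```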
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace Anthropic msg_xxx IDs with opaque IDs, strip cache_creation,
service_tier, inference_geo fields. Replace OpenAI chatcmpl-xxx IDs,
strip system_fingerprint. Applied to both streaming and non-streaming responses.
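A sketch of the scrubbing step (function name, opaque-ID format, and field locations are assumptions based on the description above):

```typescript
// Hypothetical sketch: replace provider-identifying response IDs with opaque
// ones and strip provider-specific metadata fields.
function sanitizeUpstreamResponse(body: Record<string, unknown>): Record<string, unknown> {
  const out = { ...body };
  const id = out.id;
  if (typeof id === 'string' && (id.startsWith('msg_') || id.startsWith('chatcmpl-'))) {
    // Opaque replacement ID; the real format is unspecified here.
    out.id = `resp_${Math.random().toString(36).slice(2, 14)}`;
  }
  delete out.system_fingerprint; // OpenAI
  for (const k of ['cache_creation', 'service_tier', 'inference_geo']) {
    delete out[k]; // Anthropic
  }
  return out;
}
```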
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Admin can configure modelOverride (actual upstream model) and modelAlias
(name shown to users) per API key. When set, users don't need to specify
the real model — the gateway substitutes it transparently in both requests
and responses (including SSE streams).
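The substitution can be sketched as two small mappings (config shape and function names are assumptions):

```typescript
interface KeyConfig { modelOverride?: string; modelAlias?: string }

// Request direction: the upstream always receives the real model.
function applyModelOverride(requestModel: string, cfg: KeyConfig): string {
  return cfg.modelOverride ?? requestModel;
}

// Response direction: users only ever see the alias.
function aliasResponseModel(responseModel: string, cfg: KeyConfig): string {
  if (cfg.modelOverride && cfg.modelAlias && responseModel === cfg.modelOverride) {
    return cfg.modelAlias;
  }
  return responseModel;
}
```

The same `aliasResponseModel` mapping would need to run on every SSE event that carries a `model` field, not just the final response body.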
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When users ask "香港有哪些移民途径" (what immigration routes does Hong Kong
offer?) or similar overview questions, the AI must use the exact standard
description for each category.
QMAS: "基本门槛条件12项满足6项即可申请" (meet 6 of the 12 basic threshold
conditions to apply) — no mention of 综合计分 (the composite points score).
Explicit ❌ prohibition on adding extra scoring criteria in overview answers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The QMAS brief description in the category comparison table (Section 10.7)
should only state the 12-item threshold requirement, not the composite score.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Both directive chat drawers were rendering AI responses as plain text.
Apply the same ReactMarkdown + remark-gfm treatment used in supervisor.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Admins can now write natural-language directives that get injected into the
assessment expert's system prompt. Directives are stored in DB, loaded per
execution, and support incremental additions, toggling, and full reset.
Backend:
- New assessment_directives table + ORM entity
- Admin CRUD API at /conversations/admin/assessment-directives
- buildAssessmentExpertPrompt() accepts optional adminDirectives param
- AssessmentExpertService loads active directives from DB before each execution
- Fail-safe: missing repo/tenant context → default prompt (no directives)
Frontend (admin-client):
- New "评估指令" (Assessment Directives) page with table, create/edit modals, toggle switches
- Prompt preview panel showing assembled directive text
- Reset-to-default with confirmation
- React Query hooks for all CRUD operations
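The prompt-assembly side might look like this sketch (directive shape, heading text, and the base-prompt parameter are assumptions; only the optional-param and fail-safe behavior come from the description above):

```typescript
interface Directive { content: string; isActive: boolean }

// Hypothetical sketch: append active admin directives to the default expert
// prompt; with no directives (or no repo/tenant context upstream), the
// default prompt is returned unchanged.
function buildAssessmentExpertPrompt(basePrompt: string, adminDirectives?: Directive[]): string {
  const active = (adminDirectives ?? []).filter((d) => d.isActive);
  if (active.length === 0) return basePrompt; // fail-safe: default prompt
  const block = active.map((d, i) => `${i + 1}. ${d.content}`).join('\n');
  return `${basePrompt}\n\n## Admin Directives\n${block}`;
}
```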
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previously the payment check on invoke_assessment_expert used
`&& this.paymentClient`, which silently skipped the entire gate when
PaymentClientService was unavailable (DI failure / optional inject).
The gate now returns an explicit error when the payment service is
unreachable, preventing unpaid assessments from executing.
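The fail-closed shape of the fix, as a sketch (client interface and error messages are assumptions):

```typescript
// Before (simplified): `if (isAssessmentTool && this.paymentClient) { ...check... }`
// meant a missing client made the whole gate disappear.
// After (sketch): an unavailable payment service blocks the tool instead.
async function assertPaid(
  paymentClient: { isPaid(userId: string): Promise<boolean> } | undefined,
  userId: string,
): Promise<void> {
  if (!paymentClient) {
    // Fail closed: no client means we cannot verify payment.
    throw new Error('Payment service unavailable — assessment blocked');
  }
  if (!(await paymentClient.isPaid(userId))) {
    throw new Error('Payment required before assessment');
  }
}
```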
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Full abort signal chain: Gateway → ConversationService → Coordinator →
ToolExecutor → BaseSpecialist → Claude API stream. Admin can toggle between
v1 (post-completion re-evaluation) and v2 (interruptible) via REST API.
Changes:
- Gateway: add cancel_stream WebSocket handler + active stream tracking
- Gateway: abort active stream on client disconnect
- ConversationService: accept + forward AbortSignal
- CoordinatorAgentService: link external AbortSignal to internal controller,
thread through tool executor, read assessment mode from Redis feature flag
- BaseSpecialistService: hard abort (throw) instead of soft break,
add abort signal to Promise.race in callClaude(), abort stream on cancel
- ImmigrationToolsService: thread abortSignal to assessment expert
- AdminObservabilityController: GET/PUT feature-flags/assessment-mode
(Redis-backed, defaults to v1)
v1 and v2 coexist — admin controls which mode is active.
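The signal-linking step in the chain can be sketched like this (a generic pattern, assumed to approximate what CoordinatorAgentService does):

```typescript
// Sketch: link an external AbortSignal (e.g. from the gateway's cancel_stream
// handler or a client disconnect) to an internal controller, so either source
// can abort the downstream Claude stream.
function linkAbortSignals(external?: AbortSignal): AbortController {
  const internal = new AbortController();
  if (external) {
    if (external.aborted) {
      internal.abort(); // already cancelled before we started
    } else {
      external.addEventListener('abort', () => internal.abort(), { once: true });
    }
  }
  return internal;
}
```

`internal.signal` is then what gets threaded through the tool executor and raced against the stream in `callClaude()`.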
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When users correct or update personal info after assessment completion,
Coordinator can now re-run run_professional_assessment with forceReassess: true
to bypass the 30-day dedup and produce an updated report.
Changes:
- Add forceReassess boolean param to run_professional_assessment tool definition
- Skip already_assessed check when forceReassess=true in handler
- Add prompt rules for identifying info corrections and triggering re-evaluation
- Document the re-evaluation flow in sections 3.5 and 4.4
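The dedup bypass reduces to a small predicate, sketched here (the parameter name comes from the tool definition above; the surrounding names are assumptions):

```typescript
interface AssessArgs { forceReassess?: boolean }

// Sketch: return the cached report only when one exists, is under 30 days
// old, and the caller did not explicitly force a re-run.
function shouldReturnCached(existingReportAgeDays: number | null, args: AssessArgs): boolean {
  if (args.forceReassess) return false; // explicit re-run bypasses dedup
  return existingReportAgeDays !== null && existingReportAgeDays < 30;
}
```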
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two hardening fixes for the professional assessment pipeline:
1. Code-level payment verification before dispatching invoke_assessment_expert
(prevents bypassing the prompt-only gate)
2. Thread onProgress callback through direct tool chain so run_professional_assessment
streams agent_progress events during the 30-45s assessment expert execution
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replaces ad-hoc assessment flow with structured pipeline:
- Code-level payment verification (checks PAID ASSESSMENT order)
- Info completeness validation (age, nationality, education, work exp)
- Assessment expert invocation with result parsing
- Automatic persistence as UserArtifact (assessment_report type)
- 30-day dedup (existing report within 30 days returns cached)
- Frontend rendering for all status codes (completed, payment_required,
info_incomplete, already_assessed, error)
- System prompt updated to mandate new tool for paid assessments
- Post-assessment auto-generation of checklist + timeline
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add 4 new tools (generate_document, manage_checklist, create_timeline,
query_user_artifacts) enabling the agent to create and manage persistent
user artifacts. Artifacts are saved to PostgreSQL and support dedup by
title, update-in-place, and cross-session querying. Frontend renders
rich UI cards for each artifact type.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Backend: expose circuit breaker status via new AdminObservabilityController
(health, circuit-breakers, redis endpoints). Frontend: new observability
feature in admin-client with auto-refreshing status cards.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- New RedisClientService: optional ioredis wrapper, gracefully degrades without REDIS_URL
- New RedisModule: global NestJS module providing Redis connectivity
- AgentCheckpoint interface: captures turn, messages, cost, agents, timestamp
- Agent loop saves checkpoint after each tool execution batch (TTL=10min)
- On restart with same conversationId+requestId, loads checkpoint and resumes from saved state
- Checkpoint auto-deleted after load to prevent stale recovery
- Coordinator injects @Optional() RedisClientService, builds save/load callbacks
- Zero impact when Redis is not configured — checkpoint silently skipped
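The save path might be sketched as follows (key layout and interfaces are assumptions; the `'EX'` argument form matches ioredis's `set`):

```typescript
interface AgentCheckpoint { turn: number; messages: unknown[]; costUsd: number; timestamp: number }

const CHECKPOINT_TTL_SECONDS = 600; // TTL=10min, per the description above

// Hypothetical key layout for conversationId+requestId addressing.
function checkpointKey(conversationId: string, requestId: string): string {
  return `agent:checkpoint:${conversationId}:${requestId}`;
}

async function saveCheckpoint(
  redis: { set(k: string, v: string, mode: 'EX', ttl: number): Promise<unknown> } | null,
  key: string,
  cp: AgentCheckpoint,
): Promise<void> {
  if (!redis) return; // zero impact when Redis is not configured
  await redis.set(key, JSON.stringify(cp), 'EX', CHECKPOINT_TTL_SECONDS);
}
```

The load side would mirror this: `GET` then `DEL` (or `GETDEL`) so a checkpoint is consumed at most once, matching the stale-recovery guard above.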
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Extend TenantStore with traceId + traceStartTime in AsyncLocalStorage
- Generate traceId (UUID-12) at WebSocket gateway entry point
- Propagate traceId through AgentLoopParams → agentLoop → specialists
- Add [trace:xxx] prefix to all logger calls in agent-loop, coordinator, and specialists
- Replace console.log with NestJS Logger in ConversationGateway
- Include traceId in stream_start event for frontend correlation
- Add traceId to AgentExecutionRecord and BaseStreamEvent
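A sketch of the ID generation and log prefix (reading "UUID-12" as the first 12 hex characters of a UUID — an interpretation, not confirmed by the source):

```typescript
import { randomUUID } from 'crypto';

// Sketch: 12-char trace id derived from a v4 UUID.
function newTraceId(): string {
  return randomUUID().replace(/-/g, '').slice(0, 12);
}

// Sketch of the [trace:xxx] prefix applied to logger calls.
function withTrace(traceId: string, message: string): string {
  return `[trace:${traceId}] ${message}`;
}
```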
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Prevent the AI from revealing the underlying model name (Claude/GPT/etc.) under
any circumstances, including jailbreak attempts. Two defense layers:
- Prompt: anti-jailbreak rules + "小艾引擎" branding in coordinator system prompt
- Code: sanitizeModelLeaks() regex filter in agent-loop.ts streaming output
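The code layer might look like this sketch (the pattern list and replacement are assumptions; the real filter is presumably broader):

```typescript
// Sketch: replace leaked model/vendor names in streamed text chunks with the
// product branding.
const MODEL_LEAK_RE = /\b(Claude|GPT-?\d[\w.-]*|Anthropic|OpenAI)\b/gi;

function sanitizeModelLeaks(chunk: string, brand = '小艾引擎'): string {
  return chunk.replace(MODEL_LEAK_RE, brand);
}
```

Note that a per-chunk regex can miss a name split across chunk boundaries ("Cla" + "ude"), so a real streaming filter would also need a small carry-over buffer.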
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1. ClaudeModule missing ConversationORM in TypeOrmModule.forFeature —
ImmigrationToolsService now depends on ConversationORMRepository
(added in query_user_profile), but ClaudeModule only had TokenUsageORM.
Fix: add ConversationORM to ClaudeModule's TypeORM imports.
2. Historical messages show "支付创建失败" (payment creation failed) for
   payment QR codes — toolCall.result is stored as a JSON string in the DB
   metadata JSONB. Live streaming (useChat.ts) parses it correctly, but the
   REST API load path (chatStore.ts → MessageBubble.tsx) does not.
   Fix: normalize toolCall.result in the ToolCallResult component —
   JSON.parse if string, pass through if already an object.
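The normalization described in fix 2 reduces to a small helper, sketched here (the helper name is an assumption):

```typescript
// Sketch: accept either the live in-memory object (streaming path) or the
// JSON string loaded from the metadata JSONB column (REST path).
function normalizeToolResult(result: unknown): unknown {
  if (typeof result !== 'string') return result; // already an object
  try {
    return JSON.parse(result); // DB-loaded JSON string
  } catch {
    return result; // plain-text result — pass through unchanged
  }
}
```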
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Alipay/WeChat adapters now return the source payment URL alongside the
QR base64. The generate_payment tool only returns paymentUrl (short text)
to Claude API — base64 qrCodeUrl is stripped to prevent AI from dumping
raw data:image into text responses. Frontend QRCodeSVG renders from
paymentUrl instead of base64.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
payment-service uses setGlobalPrefix('api/v1'), so all routes are
under /api/v1/orders, /api/v1/payments, etc. PaymentClientService
was calling /orders directly, resulting in 404:
Cannot POST /orders → 创建订单失败 (order creation failed)
Fixed all 7 endpoint URLs to include the /api/v1 prefix.
Same pattern as file-service fix (FILE_SERVICE_URL).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Problem
User asks "我适合哪种移民方式?" (which immigration route suits me?) as their
first message → the AI immediately mentions the 99元 paid assessment. This is
aggressive and off-putting for new users who are just exploring.
## Root Cause
The intent table classified "适合什么" (what suits me) as assessment_request,
with an instruction to immediately mention 99元. This conflicts with the
conversion philosophy section that says "免费问答建立信任 → 付费评估" (free
Q&A builds trust → paid assessment).
## Fix (4 changes in coordinator-system-prompt.ts)
1. **Intent table**: assessment_request no longer says "immediately mention
99元". Instead references new handling rules below the table.
2. **New "评估请求处理规则" (assessment-request handling rules) section**
   (after the intent table):
   - Early conversation + no user info → exploratory question, NOT an
     assessment request. Collect info first, give initial direction.
   - User shared info + explicitly asks "做个评估" (run an assessment) →
     real assessment request, mention 99元.
   - User shared info but didn't ask → give free initial direction,
     don't proactively mention payment.
3. **Assessment suggestion timing** (section 5.6):
- Added 3 prerequisites before mentioning 99元:
a. At least 3 key info items collected
b. Already gave free initial direction (user felt value)
c. Conversation has gone 3-4+ rounds
- Added absolute prohibition: never mention 99元 in first response.
4. **Conversion boundary example**: Changed the misleading "我适合走高才通吗
   → 需要评估" ("Am I suited to 高才通?" → "needs an assessment") example to
   nuanced guidance that distinguishes exploration from genuine assessment
   requests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Problem
SDK's Zod validation for `output_config` occasionally fails with:
"Failed to parse structured output: invalid_value at path [intent]"
This crashes the entire response — user sees nothing despite model
generating a valid answer.
## Root Cause
The Anthropic SDK validates streamed structured output against the Zod
schema (CoordinatorResponseSchema) after streaming completes. When the
model returns an intent value not in the z.enum() (rare but happens),
the SDK throws during stream iteration or finalMessage().
## Fix
1. Catch "Failed to parse structured output" errors in both:
- Stream iteration catch block (for-await loop)
- stream.finalMessage() catch block
2. Recover by extracting accumulated text from assistantBlocks
3. Manual JSON.parse (skips Zod validation — intent enum mismatch
doesn't affect user-facing content)
4. Yield parsed.answer + parsed.followUp normally
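The recovery step (fix 2-4) can be sketched as follows (helper name and return shape are assumptions; only the manual-parse-skips-Zod behavior comes from the description):

```typescript
// Sketch: after catching the SDK's "Failed to parse structured output"
// error, re-parse the text accumulated from assistantBlocks manually,
// skipping Zod validation — an intent enum mismatch does not affect the
// user-facing answer/followUp content.
function recoverStructuredOutput(
  accumulatedText: string,
): { answer: string; followUp?: string } | null {
  try {
    const parsed = JSON.parse(accumulatedText);
    if (typeof parsed?.answer === 'string') {
      return { answer: parsed.answer, followUp: parsed.followUp };
    }
  } catch {
    /* fall through: not recoverable */
  }
  return null;
}
```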
## Also Included (from previous commit)
- Removed the INTENT_MAX_ANSWER_LENGTH hard truncation (it did more harm than good)
- Only 2000-char safety net remains for extreme edge cases
- followUp: non-question content always appended (prevents content loss)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Hard-coded INTENT_MAX_ANSWER_LENGTH limits caused mid-sentence truncation and
content loss. Length control now relies on prompt + schema description + LLM-Judge
(3 layers). Only a 2000-char safety net remains for extreme edge cases.
Also simplified followUp: non-question followUp is now always appended (prevents
model content split from silently dropping text).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
INTENT_MAX_ANSWER_LENGTH was too tight (the 200-char objection_expression limit
truncated good responses). Bumped all limits by ~25-50%. Also fixed a followUp filter that silently
dropped content when model split answer across answer+followUp fields — now appends
followUp as continuation when answer ends mid-sentence.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
P0: Enrich Chapter 10 with detailed policy facts (QMAS scoring, GEP A/B/C
conditions, FAQ quick answers) so Claude can answer common questions directly
without tool calls. Replace absolute rule "never answer from memory" with
3-tier system: Tier 1 (direct from Ch10), Tier 2 (search_knowledge), Tier 3
(invoke_policy_expert).
P1: Context injector now always returns a kb_coverage_hint block — when KB has
results it tells Claude to prefer KB over web_search; when KB has no results
it suggests considering web_search. Web_search tool description updated to
reference the hint.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Short follow-up answers like "计算机,信息技术" (computer science, IT) were being classified as
OFF_TOPIC (0.85) because the InputGate has no conversation context. Now the
gate only runs when there are no previous messages (first message in conversation).
Mid-conversation topic management is handled by the Coordinator prompt.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add section 5.6 "隐性转化引导" (implicit conversion guidance) with a trust-first conversion philosophy:
- Free facts vs paid analysis boundary
- "Taste-then-sell" strategy with positive but vague hints
- Assessment suggestion limited to max once per conversation
- Natural urgency only when fact-supported
- Post-assessment → full service transition only when user asks
- Anti-annoyance red line: never make user feel pushed to pay
Recalibrate info exchange (4.3): warm acknowledgment without deep analysis.
Add value framing (4.4) and post-assessment guidance (4.5).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update coordinator system prompt to enforce pricing rules:
- All assessments cost 99 RMB (one-time per user), no free assessments
- Must collect payment before calling assessment expert
- Add fee inquiry intent type to response strategy table
- Update generate_payment tool description with fixed pricing
- Replace "免费初步咨询" (free initial consultation) with a tiered service model
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix auto-scroll by adding missing dependencies (currentConversationId, isStreaming,
completedAgents). Completed agent badges now show for 2.5s then smoothly fade out
instead of accumulating, keeping the status area clean.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of showing agent status in a separate panel below the chat,
display it inline beneath the typing dots ("...") in the message flow.
The dots remain the primary waiting indicator; agent status appears
below as supplementary context during specialist agent invocations.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Haiku sometimes returns JSON wrapped in ```json ... ``` code blocks,
causing JSON.parse to fail. Strip markdown fences before parsing.
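The fence-stripping step is roughly (helper name is an assumption):

```typescript
// Sketch: remove a leading ```json (or bare ```) fence and a trailing ```
// fence before handing the payload to JSON.parse.
function stripJsonFences(raw: string): string {
  return raw
    .trim()
    .replace(/^```(?:json)?\s*/i, '')
    .replace(/\s*```$/, '');
}
```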
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>