Commit Graph

214 Commits

Author SHA1 Message Date
hailin 1a1573dda3 feat(resilience): add circuit breaker for downstream services
- New CircuitBreaker class: CLOSED → OPEN → HALF_OPEN three-state model
- Zero external dependencies, ~90 lines, fail-open semantics
- KnowledgeClientService: threshold=5, cooldown=60s, protects all 9 endpoints
- PaymentClientService: threshold=3, cooldown=30s, protects all 7 endpoints
- Both services refactored to use protectedFetch() — cleaner code, fewer try-catch
- Replaces verbose per-method error handling with centralized circuit breaker
- When tripped: returns null/empty fallback instantly, no network call
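The tripping logic described above can be sketched in a few lines. This is a hypothetical reconstruction for illustration only; the real class (including its `protectedFetch()` integration) is not shown in the log:

```typescript
// Minimal circuit breaker sketch: CLOSED → OPEN → HALF_OPEN.
// Hypothetical reconstruction; the shipped class may differ in detail.
type State = "CLOSED" | "OPEN" | "HALF_OPEN";

class CircuitBreaker {
  private state: State = "CLOSED";
  private failures = 0;
  private openedAt = 0;

  constructor(
    private threshold: number,    // failures before tripping, e.g. 5
    private cooldownMs: number,   // how long to stay OPEN, e.g. 60_000
    private now: () => number = Date.now,
  ) {}

  /** Run fn; when tripped, return the fallback instantly without calling fn. */
  async exec<T>(fn: () => Promise<T>, fallback: T): Promise<T> {
    if (this.state === "OPEN") {
      if (this.now() - this.openedAt < this.cooldownMs) return fallback; // fail open
      this.state = "HALF_OPEN"; // cooldown elapsed: allow one probe call
    }
    try {
      const result = await fn();
      this.state = "CLOSED"; // probe (or normal call) succeeded: reset
      this.failures = 0;
      return result;
    } catch {
      this.failures += 1;
      if (this.state === "HALF_OPEN" || this.failures >= this.threshold) {
        this.state = "OPEN";
        this.openedAt = this.now();
      }
      return fallback;
    }
  }

  get currentState(): State {
    return this.state;
  }
}
```

Per the commit, KnowledgeClientService would wrap each call with threshold 5 and a 60 s cooldown, PaymentClientService with 3 and 30 s.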

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 04:21:30 -08:00
hailin 2ebc8e6da6 feat(agents): stream specialist agent progress to frontend
- Convert BaseSpecialistService.callClaude() from sync .create() to streaming .stream()
- Add onProgress callback to SpecialistExecutionOptions for real-time text delta reporting
- All 6 specialist convenience methods now accept optional options parameter
- Coordinator creates throttled progress callback (every 300 chars) pushing agent_progress events
- Agent loop drains accumulated progress events after each tool execution batch
- WebSocket gateway forwards agent_progress events to frontend
- Progress event sink shared between tool executor and agent loop via closure
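The 300-character throttling can be illustrated with a small standalone sketch; the event and callback shapes here are assumptions, not the repo's actual types:

```typescript
// Sketch: throttle streamed text deltas so an agent_progress event is
// pushed only once per `every` accumulated characters.
type ProgressEvent = { type: "agent_progress"; agent: string; snippet: string };

function makeThrottledProgress(
  agent: string,
  sink: ProgressEvent[],   // shared event sink drained by the agent loop
  every = 300,             // emit once per 300 accumulated chars
): (delta: string) => void {
  let buffer = "";
  return (delta: string) => {
    buffer += delta;
    if (buffer.length >= every) {
      sink.push({ type: "agent_progress", agent, snippet: buffer });
      buffer = "";
    }
  };
}
```

The agent loop would then drain `sink` after each tool execution batch and forward the events over the WebSocket gateway.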

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 04:17:37 -08:00
hailin 198ff4b349 feat(agents): add PreToolUse/PostToolUse hook system for tool call interception
- New ToolHooksService with dynamic hook registration (pre/post)
- Built-in audit logging: tool name, type, user, duration, success/failure
- Fail-open design: individual hook failures don't block tool execution
- Integrated into coordinator's createToolExecutor with full context
- Hook context includes: toolName, toolType (agent/direct/mcp), traceId, timing
- Supports future extensions: rate limiting, permission checks, analytics
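A fail-open hook runner of the kind described might look like this sketch (class and context shapes are illustrative; the real ToolHooksService is a NestJS service with audit logging built in):

```typescript
// Sketch of a fail-open pre/post hook runner. Shapes are assumptions.
type HookContext = {
  toolName: string;
  toolType: "agent" | "direct" | "mcp";
  traceId: string;
  startedAt: number;
};
type Hook = (ctx: HookContext) => void | Promise<void>;

class ToolHooks {
  private pre: Hook[] = [];
  private post: Hook[] = [];

  registerPre(h: Hook) { this.pre.push(h); }
  registerPost(h: Hook) { this.post.push(h); }

  /** Run hooks around fn; a throwing hook never blocks the tool itself. */
  async run<T>(ctx: HookContext, fn: () => Promise<T>): Promise<T> {
    for (const h of this.pre) {
      try { await h(ctx); } catch { /* fail open: ignore hook errors */ }
    }
    const result = await fn();
    for (const h of this.post) {
      try { await h(ctx); } catch { /* fail open */ }
    }
    return result;
  }
}
```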

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 04:09:14 -08:00
hailin 02ee4311dc feat(observability): add trace ID propagation across agent pipeline
- Extend TenantStore with traceId + traceStartTime in AsyncLocalStorage
- Generate traceId (UUID-12) at WebSocket gateway entry point
- Propagate traceId through AgentLoopParams → agentLoop → specialists
- Add [trace:xxx] prefix to all logger calls in agent-loop, coordinator, and specialists
- Replace console.log with NestJS Logger in ConversationGateway
- Include traceId in stream_start event for frontend correlation
- Add traceId to AgentExecutionRecord and BaseStreamEvent
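The AsyncLocalStorage pattern behind this commit can be sketched as follows; the store shape and the `newTraceId` helper are assumptions based on the bullet points above:

```typescript
// Sketch: carry a traceId through async calls with AsyncLocalStorage so
// loggers deep in the pipeline can prefix [trace:xxx] without threading
// the id through every function signature.
import { AsyncLocalStorage } from "node:async_hooks";
import { randomUUID } from "node:crypto";

type TraceStore = { traceId: string; traceStartTime: number };
const als = new AsyncLocalStorage<TraceStore>();

// "UUID-12": first 12 hex chars of a UUID — enough to correlate logs.
const newTraceId = () => randomUUID().replace(/-/g, "").slice(0, 12);

// Entry point (e.g. the WebSocket gateway) opens the trace scope.
function withTrace<T>(fn: () => T): T {
  return als.run({ traceId: newTraceId(), traceStartTime: Date.now() }, fn);
}

function log(msg: string): string {
  const traceId = als.getStore()?.traceId ?? "none";
  return `[trace:${traceId}] ${msg}`;
}
```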

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 04:05:24 -08:00
hailin ea6dff3a4e feat(agents): add model identity protection — prompt rules + code-level output filter
Prevent AI from revealing underlying model name (Claude/GPT/etc.) under any
circumstance including jailbreak attempts. Two defense layers:
- Prompt: anti-jailbreak rules + "小艾引擎" branding in coordinator system prompt
- Code: sanitizeModelLeaks() regex filter in agent-loop.ts streaming output
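A minimal version of such an output filter might look like this. The pattern list and brand constant are illustrative, not the shipped `sanitizeModelLeaks()`:

```typescript
// Sketch: replace model/vendor names in model output with the product
// brand. Patterns are examples, not the production list.
const BRAND = "小艾引擎";
const MODEL_LEAK_PATTERNS: RegExp[] = [
  /claude(\s*[-\s]?\s*(opus|sonnet|haiku)[\w.\- ]*)?/gi,
  /gpt[-\s]?\d[\w.\-]*/gi,
  /anthropic/gi,
  /openai/gi,
];

function sanitizeModelLeaks(text: string): string {
  return MODEL_LEAK_PATTERNS.reduce((t, re) => t.replace(re, BRAND), text);
}
```

Note that applying such a filter to streaming output requires buffering across chunk boundaries, since a model name can be split between two text deltas.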

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 13:02:22 -08:00
hailin 2a8a15fcb6 fix: resolve ClaudeModule DI crash + historical QR code display bug
1. ClaudeModule missing ConversationORM in TypeOrmModule.forFeature —
   ImmigrationToolsService now depends on ConversationORMRepository
   (added in query_user_profile), but ClaudeModule only had TokenUsageORM.
   Fix: add ConversationORM to ClaudeModule's TypeORM imports.

2. Historical messages show "支付创建失败" for payment QR codes —
   toolCall.result is stored as JSON string in DB metadata JSONB.
   Live streaming (useChat.ts) parses it correctly, but REST API
   load (chatStore.ts → MessageBubble.tsx) does not.
   Fix: normalize toolCall.result in ToolCallResult component —
   JSON.parse if string, pass through if already object.
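The normalization described in fix 2 reduces to a small helper (a stand-in for the component-level logic in ToolCallResult):

```typescript
// Sketch: toolCall.result may be a JSON string (history loaded via REST
// from the JSONB metadata) or an object (live WebSocket stream).
function normalizeToolResult(result: unknown): unknown {
  if (typeof result === "string") {
    try {
      return JSON.parse(result);
    } catch {
      return result; // plain-text result: pass through unchanged
    }
  }
  return result; // already an object
}
```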

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 12:24:15 -08:00
hailin 43d4102e1f feat(agents): add query_user_profile tool for user info lookup
Add a query_user_profile tool so the AI agent can answer users' questions about
their own data, e.g. "How many times have I consulted you?", "Do you remember
my info?", "What have I asked about before?"

## Background
When a user asks "how many consultations is this?", the AI could not answer,
because no tool could query the user's consultation history.

## Design: two layers

### Layer 1: passive context injection (Context Injector)
- context-injector.service.ts injects the ConversationORM repo + TenantContextService
- buildConversationStatsBlock() now queries the user's cumulative consultation count
- Every conversation automatically injects `用户累计咨询次数: N 次(含本次对话)` ("cumulative consultations: N, including this one")
- Simple questions ("which consultation is this?") are answered straight from context, zero tool calls

### Layer 2: active tool call (query_user_profile)
When the user wants details, the AI calls this tool, which returns a full profile:
- Consultation stats: total count, first/most recent consultation time, category distribution
- Recent conversations: title, category, and stage of the last 10 conversations
- User profile: facts from system memory (education/age/occupation), preferences, intents
- Order stats: total orders, paid, pending payment

## Files changed
- agents.module.ts: add ConversationORM to the TypeORM imports
- coordinator-tools.ts: add the query_user_profile tool definition (read-only)
- immigration-tools.service.ts: inject ConversationORM repo + TenantContextService,
  implement queryUserProfile() (parallel queries: conversations + memory + orders)
- coordinator-system-prompt.ts: add tool documentation and usage guidance in section 3.3
- context-injector.service.ts: inject the repo; add the cumulative consultation count to the conversation_stats block

## Dependencies
- No circular dependency: uses TypeORM Repository<ConversationORM> directly (data
  access layer) rather than ConversationService (avoids an AgentsModule ↔ ConversationModule cycle)
- TenantContextService is globally available, ensuring multi-tenant isolation
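The parallel lookup inside queryUserProfile() can be sketched like this; the ProfileSources interface is a stand-in for the real repositories and services:

```typescript
// Sketch: fan out the three lookups in parallel, as the commit describes.
// Interfaces here are illustrative, not the real TypeORM/service types.
interface ProfileSources {
  countConversations(userId: string): Promise<number>;
  loadMemories(userId: string): Promise<string[]>;
  countOrders(userId: string): Promise<{ total: number; paid: number }>;
}

async function queryUserProfile(userId: string, src: ProfileSources) {
  // Parallel queries: conversations + memory + orders.
  const [conversationCount, memories, orders] = await Promise.all([
    src.countConversations(userId),
    src.loadMemories(userId),
    src.countOrders(userId),
  ]);
  return { conversationCount, memories, orders };
}
```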

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 12:17:23 -08:00
hailin 389f975e33 fix(payment): return paymentUrl from adapters, strip base64 from tool output
Alipay/WeChat adapters now return the source payment URL alongside the
QR base64. The generate_payment tool only returns paymentUrl (short text)
to Claude API — base64 qrCodeUrl is stripped to prevent AI from dumping
raw data:image into text responses. Frontend QRCodeSVG renders from
paymentUrl instead of base64.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 11:44:54 -08:00
hailin 6609e50100 fix(payment): add /api/v1 prefix to PaymentClientService URLs
payment-service uses setGlobalPrefix('api/v1'), so all routes are
under /api/v1/orders, /api/v1/payments, etc. PaymentClientService
was calling /orders directly, resulting in 404:

  Cannot POST /orders → 创建订单失败

Fixed all 7 endpoint URLs to include the /api/v1 prefix.
Same pattern as file-service fix (FILE_SERVICE_URL).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 11:26:26 -08:00
hailin 6767215f83 refactor(agents): remove Structured Output (Layer 2) to enable true streaming
Background:
Commit bb1a113 introduced a 4-layer response quality control system:
- Layer 1: System Prompt (1,095 lines of detailed guidance)
- Layer 2: Structured Output (Zod schema → output_config)
- Layer 3: LLM-as-Judge (scored by Haiku 4.5)
- Layer 4: Per-intent hard truncation (already removed in db8617d)

Problems with Layer 2 (Structured Output):
1. Blocks streaming — output_config forces the model to emit JSON, and JSON
   fragments can't be shown to the user, so the whole response is buffered
   and emitted at once
2. Frequent Zod validation crashes — the SDK throws when the intent enum
   value doesn't match; four hotfixes so far (b55cd4b, db8617d, 7af8c4d, and this one)
3. The followUp field loses content — the model splits answer content into followUp, which is then filtered out
4. The intent classification is only used for logging and adds no user-facing value
5. z.string() has no .max() constraint — it doesn't actually bound answer length

After removal, answer quality is still guaranteed by (all retained):
- Layer 1: System Prompt — intent classification table, answer style, length guidance
- Layer 3: LLM-Judge — relevance/conciseness/noise scoring with automatic retry on failure
- API max_tokens: 2048 — hard cap on output length

Changes:
- coordinator-agent.service.ts: remove the zodOutputFormat/CoordinatorResponseSchema
  import and the outputConfig parameter
- agent-loop.ts: remove the outputConfig guard on text_delta (text now streams
  directly), the output_config API parameter, the two Structured Output
  validation-failure recovery catch blocks, and the JSON parsing + safety-net block
- agent.types.ts: remove the outputConfig field from the AgentLoopParams interface
- coordinator-response.schema.ts: empty out the Zod schema/helpers, keep a historical note

Result:
- Users now see token-by-token streaming output
- All Structured Output crash risk is eliminated
- Net reduction of ~130 lines

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 11:15:48 -08:00
hailin 913a3fd375 fix(prompt): defer 99元 mention — never in first response to new users
## Problem
User asks "我适合哪种移民方式?" as their first message → AI immediately
mentions 99元 paid assessment. This is aggressive and off-putting for new
users who are just exploring.

## Root Cause
Intent table classified "适合什么" as assessment_request with instruction
to immediately mention 99元. This conflicts with the conversion philosophy
section that says "免费问答建立信任 → 付费评估".

## Fix (4 changes in coordinator-system-prompt.ts)

1. **Intent table**: assessment_request no longer says "immediately mention
   99元". Instead references new handling rules below the table.

2. **New "评估请求处理规则" section** (after intent table):
   - Early conversation + no user info → exploratory question, NOT
     assessment request. Collect info first, give initial direction.
   - User shared info + explicitly asks "做个评估" → real assessment
     request, mention 99元.
   - User shared info but didn't ask → give free initial direction,
     don't proactively mention payment.

3. **Assessment suggestion timing** (section 5.6):
   - Added 3 prerequisites before mentioning 99元:
     a. At least 3 key info items collected
     b. Already gave free initial direction (user felt value)
     c. Conversation has gone 3-4+ rounds
   - Added absolute prohibition: never mention 99元 in first response.

4. **Conversion boundary example**: Changed the misleading example "我适合走高才通吗
   → 需要评估" ("Am I suited to 高才通? → needs an assessment") to nuanced
   guidance that distinguishes exploration from genuine assessment requests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 10:52:37 -08:00
hailin 7af8c4d8de fix(agents): graceful recovery from structured output validation errors
## Problem
SDK's Zod validation for `output_config` occasionally fails with:
  "Failed to parse structured output: invalid_value at path [intent]"
This crashes the entire response — user sees nothing despite model
generating a valid answer.

## Root Cause
The Anthropic SDK validates streamed structured output against the Zod
schema (CoordinatorResponseSchema) after streaming completes. When the
model returns an intent value not in the z.enum() (rare but happens),
the SDK throws during stream iteration or finalMessage().

## Fix
1. Catch "Failed to parse structured output" errors in both:
   - Stream iteration catch block (for-await loop)
   - stream.finalMessage() catch block
2. Recover by extracting accumulated text from assistantBlocks
3. Manual JSON.parse (skips Zod validation — intent enum mismatch
   doesn't affect user-facing content)
4. Yield parsed.answer + parsed.followUp normally

## Also Included (from previous commit)
- Removed INTENT_MAX_ANSWER_LENGTH hard truncation (did more harm than good)
- Only 2000-char safety net remains for extreme edge cases
- followUp: non-question content always appended (prevents content loss)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 10:42:52 -08:00
hailin db8617dda8 refactor(agents): remove per-intent hard truncation, keep 2000-char safety net
Hard-coded INTENT_MAX_ANSWER_LENGTH limits caused mid-sentence truncation and
content loss. Length control now relies on prompt + schema description + LLM-Judge
(3 layers). Only a 2000-char safety net remains for extreme edge cases.

Also simplified followUp: non-question followUp is now always appended (prevents
model content split from silently dropping text).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 10:32:53 -08:00
hailin b55cd4bc1e fix(agents): widen answer length limits and preserve followUp continuations
INTENT_MAX_ANSWER_LENGTH was too tight (objection_expression 200 chars truncated
good responses). Bumped all limits ~25-50%. Also fixed followUp filter that silently
dropped content when model split answer across answer+followUp fields — now appends
followUp as continuation when answer ends mid-sentence.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 10:25:59 -08:00
hailin 366a9cda3a feat(agents): tiered tool-calling system + KB coverage hint for smart routing
P0: Enrich Chapter 10 with detailed policy facts (QMAS scoring, GEP A/B/C
conditions, FAQ quick answers) so Claude can answer common questions directly
without tool calls. Replace absolute rule "never answer from memory" with
3-tier system: Tier 1 (direct from Ch10), Tier 2 (search_knowledge), Tier 3
(invoke_policy_expert).

P1: Context injector now always returns a kb_coverage_hint block — when KB has
results it tells Claude to prefer KB over web_search; when KB has no results
it suggests considering web_search. Web_search tool description updated to
reference the hint.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 10:06:29 -08:00
hailin b7e84ba3b6 fix(agents): only run InputGate on first message to prevent mid-conversation misclassification
Short follow-up answers like "计算机,信息技术" were being classified as
OFF_TOPIC (0.85) because the InputGate has no conversation context. Now the
gate only runs when there are no previous messages (first message in conversation).
Mid-conversation topic management is handled by the Coordinator prompt.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 09:43:33 -08:00
hailin f5820f9c7f feat(agents): add subtle conversion guidance to coordinator prompt
Add section 5.6 "隐性转化引导" (implicit conversion guidance) with a trust-first conversion philosophy:
- Free facts vs paid analysis boundary
- "Taste-then-sell" strategy with positive but vague hints
- Assessment suggestion limited to max once per conversation
- Natural urgency only when fact-supported
- Post-assessment → full service transition only when user asks
- Anti-annoyance red line: never make user feel pushed to pay

Recalibrate info exchange (4.3): warm acknowledgment without deep analysis.
Add value framing (4.4) and post-assessment guidance (4.5).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 09:33:52 -08:00
hailin fb966244bc fix(agents): enforce 99 RMB assessment fee — remove "free assessment" language
Update coordinator system prompt to enforce pricing rules:
- All assessments cost 99 RMB (one-time per user), no free assessments
- Must collect payment before calling assessment expert
- Add fee inquiry intent type to response strategy table
- Update generate_payment tool description with fixed pricing
- Replace "免费初步咨询" ("free initial consultation") with a tiered service model

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 09:22:37 -08:00
hailin 636b10b733 feat(web): auto-scroll on all state changes + completed agent badges auto-fade
Fix auto-scroll by adding missing dependencies (currentConversationId, isStreaming,
completedAgents). Completed agent badges now show for 2.5s then smoothly fade out
instead of accumulating, keeping the status area clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 09:12:32 -08:00
hailin af8aea6b03 feat(web): move agent status inline with typing indicator for better UX
Instead of showing agent status in a separate panel below the chat,
display it inline beneath the typing dots ("...") in the message flow.
The dots remain the primary waiting indicator; agent status appears
below as supplementary context during specialist agent invocations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 09:05:41 -08:00
hailin 40a0513b05 fix(agents): strip markdown code fences from InputGate Haiku response
Haiku sometimes returns JSON wrapped in ```json ... ``` code blocks,
causing JSON.parse to fail. Strip markdown fences before parsing.
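The fence stripping can be done with two regex replaces; this is a sketch, not necessarily the exact patterns used:

````typescript
// Sketch: strip a leading ```json (or bare ```) fence and a trailing
// fence before JSON.parse.
function stripCodeFences(raw: string): string {
  return raw
    .trim()
    .replace(/^```(?:json)?\s*/i, "")
    .replace(/```\s*$/, "")
    .trim();
}
````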

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 08:59:19 -08:00
hailin fa835e4f56 fix(agents): preserve image content blocks in context injection — fixes 209K token overflow
injectIntoMessages() was JSON.stringify-ing array content (with image blocks),
turning base64 data into text tokens (~170K) instead of image tokens (~1,600).
Fix: append context as a new text block in the array, preserving image block format.

Also fixes token estimation to count images at ~1,600 tokens instead of base64 char length,
and adds debug logging for API call token composition.
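The block-preserving injection can be sketched as follows; the block types are simplified versions of the Claude API content block shapes:

```typescript
// Sketch of the fix: when message content is a block array, append the
// injected context as a NEW text block instead of JSON.stringify-ing the
// whole array (which turns base64 image data into text tokens).
type TextBlock = { type: "text"; text: string };
type ImageBlock = {
  type: "image";
  source: { type: "base64"; media_type: string; data: string };
};
type Block = TextBlock | ImageBlock;

function injectContext(content: string | Block[], context: string): string | Block[] {
  if (typeof content === "string") {
    return `${context}\n\n${content}`;
  }
  // Preserve existing blocks (images stay image blocks), add context as text.
  return [...content, { type: "text", text: context }];
}
```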

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 08:51:24 -08:00
hailin 7dc364d9b3 fix(agents): raise compaction threshold to 160K (80% of 200K limit)
80K was too aggressive and caused premature context loss. Now triggers
at 160K tokens with a target of 80K after compaction.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 08:40:18 -08:00
hailin 033e476a6f fix(agents): wire up autoCompactIfNeeded to prevent token overflow
The auto-compaction logic (threshold 80K tokens, summarize older
messages via Haiku) existed but was never called in sendMessage flow.
Now called after context injection, before agent loop.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 08:29:36 -08:00
hailin f2903bd67a fix(agents): use text placeholders for historical attachments to avoid token overflow
Historical images/PDFs were being re-downloaded and base64-encoded for
every API call, causing 200K+ token requests. Now only the current
message includes full attachment blocks; historical ones use text
placeholders like "[用户上传了图片: photo.png]".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 08:22:02 -08:00
hailin 06226a3d74 fix(db): drop files_user_id_fkey — anonymous users may not exist in users table
The conversations table has no FK on user_id, but files had one, causing
500 errors on file upload when the anonymous user wasn't registered.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 08:11:37 -08:00
hailin 308dd7798e fix(web): load historical messages when opening a conversation
ChatPage only set currentConversationId but never fetched messages from
the API, causing historical conversations to show the welcome screen.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 08:00:26 -08:00
hailin 15d42315ed fix(docling): align volume mount with HF default cache path
Build preloads models to /root/.cache/huggingface (HF default).
Volume must mount there too, not a separate /models path.
Remove HF_HOME env override to keep paths consistent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 07:38:28 -08:00
hailin 9b357fe01c chore(docling): add .gitignore and .dockerignore
Exclude __pycache__, .pyc files from git tracking and Docker build context.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 07:30:05 -08:00
hailin 73dee93d19 feat(docling): persist model cache via Docker volume
- Add docling_models volume mounted at /models in container
- Set HF_HOME=/models/huggingface at runtime (via docker-compose env)
- Models download once → persist in volume → survive container rebuilds
- Build-time preload uses || to not block build if network unavailable

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 07:18:14 -08:00
hailin 764613bd86 fix(docling): use standalone script for model pre-download
Inline Python one-liner had syntax errors (try/except/finally can't be
single-line). Move to scripts/preload_models.py for reliable execution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 07:16:20 -08:00
hailin d725864cd6 fix(docling): pre-download models during Docker build
DocumentConverter() constructor only sets up config, models are lazily
downloaded on first convert(). Fix by running an actual PDF conversion
during build to trigger HuggingFace model download and cache.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 07:13:54 -08:00
hailin 0985214ab7 feat(deploy): add docling service support to deploy.sh
Add docling (Python document parsing service) to all deploy.sh operations:
- SERVICE_PORTS, DOCKER_SERVICES maps
- build, rebuild, start, stop, restart commands
- start_all_backend (ordered before knowledge-service, which depends on it)
- Help text and examples updated

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 05:34:23 -08:00
hailin 57d21526a5 feat(knowledge): add Docling document parsing microservice
Add IBM Docling as a Python FastAPI microservice for high-quality document
parsing with table structure recognition (TableFormer ~94% accuracy) and
OCR support, replacing pdf-parse/mammoth as the primary text extractor.

Architecture:
- New docling-service (Python FastAPI, port 3007) in Docker network
- knowledge-service calls docling-service via HTTP POST multipart/form-data
- Graceful fallback: if Docling fails, falls back to pdf-parse/mammoth
- Text/Markdown files skip Docling (no benefit for plain text)

Changes:
- New: packages/services/docling-service/ (main.py, Dockerfile, requirements.txt)
- docker-compose.yml: add docling-service, wire DOCLING_SERVICE_URL to
  knowledge-service, add missing FILE_SERVICE_URL to conversation-service
- text-extraction.service.ts: inject ConfigService, add extractViaDocling()
  with automatic fallback to legacy extractors
- .env.example: add FILE_SERVICE_PORT/URL and DOCLING_SERVICE_PORT/URL

Inter-service communication map:
  conversation-service → file-service (FILE_SERVICE_URL, attachments)
  conversation-service → knowledge-service (KNOWLEDGE_SERVICE_URL, RAG)
  knowledge-service → docling-service (DOCLING_SERVICE_URL, document parsing)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 05:24:10 -08:00
hailin 470ec9a64e feat(web-client): add per-type file size validation on upload
Enforce Claude API file size limits at upload time with user-friendly
error messages:
- Images: max 5MB (Claude API hard limit)
- PDF: max 25MB (32MB request limit minus headroom)
- Other documents: max 50MB (general upload limit)

Replaced duplicate ALLOWED_TYPES/MAX_FILE_SIZE in InputArea with shared
validateFile() from fileService, showing alert() on rejection.
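The shared validateFile() check might look roughly like this — the limits come from the commit, but the signature and error messages are assumptions:

```typescript
// Sketch: per-type upload size validation with the limits described above.
const MB = 1024 * 1024;

function validateFile(name: string, mime: string, size: number): string | null {
  if (mime.startsWith("image/") && size > 5 * MB) {
    return `${name}: images are limited to 5MB (Claude API hard limit)`;
  }
  if (mime === "application/pdf" && size > 25 * MB) {
    return `${name}: PDFs are limited to 25MB`;
  }
  if (size > 50 * MB) {
    return `${name}: files are limited to 50MB`;
  }
  return null; // valid
}
```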

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 04:57:54 -08:00
hailin 5338bdfc0f fix(agents): correct Claude API file size limits (image 5MB, PDF 25MB)
Claude API enforces a hard 5MB limit per image (not 20MB as previously
set). PDFs have a 32MB total request limit; set individual PDF cap to
25MB to leave room for prompt/messages. The downloadAsBase64 method now
accepts a per-type maxSize parameter.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 04:55:54 -08:00
hailin e867ba5529 fix(files): replace MinIO presigned URLs with API proxy + base64 for Claude
MinIO presigned URLs use Docker-internal hostname (minio:9000), making
them inaccessible from both Claude API servers and user browsers.

Changes:
- file-service: add /files/:id/content and /files/:id/thumbnail proxy
  endpoints that stream file data from MinIO
- file-service: toResponseDto now returns API proxy paths instead of
  MinIO presigned URLs
- coordinator: buildAttachmentBlocks now downloads files via file-service
  internal API (http://file-service:3006) and converts to base64 for
  Claude API (images, PDFs) or embeds text content directly
- Configurable FILE_SERVICE_URL env var for service-to-service calls

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 04:49:39 -08:00
hailin 8f7b633041 fix(agents): download text file content for Claude instead of passing URL
Claude API cannot fetch arbitrary URLs. Text-based attachments (txt, csv,
json, md) are now downloaded via their presigned MinIO URL and embedded
directly as text blocks. PDF uses Claude's native document block. Added
50KB size limit with truncation for large text files.

buildMessages() is now async to support text content fetching.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 04:36:17 -08:00
hailin 5076e16cc7 feat: add document support for Claude + image thumbnail generation
1. Coordinator now sends all attachment types to Claude:
   - Images → native image blocks (existing)
   - PDF → native document blocks (Claude PDF support)
   - Text files (txt, csv, json, md) → text blocks with filename
   Extracted common buildAttachmentBlocks() helper.

2. File-service generates thumbnails on image upload:
   - Uses sharp to resize to 400x400 max (inside fit, no upscale)
   - Output as WebP at 80% quality for smaller file size
   - Stored in MinIO under thumbnails/ prefix
   - Generated for both direct upload and presigned URL confirm
   - Non-blocking: thumbnail failure doesn't break upload

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 04:31:53 -08:00
hailin cf2fd07ead fix(file-service): sync tenantId in domain entity and add migration
FileORM had tenant_id column but FileEntity domain class was missing it,
causing "column FileORM.tenant_id does not exist" errors on production.

- Add tenantId to FileEntity (constructor, create, fromPersistence)
- Pass tenantId in repository toEntity() mapping
- Add idempotent migration script for files.tenant_id + indexes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 04:21:43 -08:00
hailin 6592e72758 fix(agents): filter out internal strategy notes from followUp output
The model was putting internal notes like "引导回移民话题" ("steer back to the
immigration topic") in the followUp field instead of actual user-facing
questions. Two fixes:

1. Schema: describe followUp as "必须以?结尾,禁止填写内部策略备注"
2. agent-loop: only yield followUp if it contains "?" or "?" (full-width or ASCII question mark)
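The question-mark guard from fix 2 is essentially a contains check (a stand-in for the agent-loop logic):

```typescript
// Sketch: surface followUp only when it reads as a question —
// i.e. contains a full-width or ASCII question mark.
function isUserFacingFollowUp(followUp: string | undefined): boolean {
  if (!followUp) return false;
  return followUp.includes("?") || followUp.includes("?");
}
```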

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 02:06:08 -08:00
hailin d608403535 fix(agents): upgrade coordinator model to Sonnet 4.5 for structured output support
claude-sonnet-4-20250514 does not support output_config (structured outputs).
Changed coordinator model to claude-sonnet-4-5-20250929 which supports it.
Specialist agents remain on their original models (no output_config needed).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 01:58:31 -08:00
hailin 9e9865acb0 fix(tools): fix input_schema mismatches between coordinator-tools and immigration-tools
## Problem

A full-pipeline audit found 4 input_schema mismatches between
coordinator-tools.ts (the tool definitions Claude actually sees) and
immigration-tools.service.ts (the actual executor), causing parameters sent by
Claude to be parsed incorrectly.

## Fixes

### check_off_topic
- coordinator sends `query`, handler reads `question`
- Fix: handler now accepts both `query` and `question`

### collect_assessment_info
- coordinator sends `{ userId, field, value }` (single-field mode)
- handler reads `{ category, age, education, ... }` (batch mode)
- Fix: handler now supports both input formats

### generate_payment
- old coordinator schema: `{ userId, serviceType, amount, description }`
- handler expects: `{ serviceType, category, paymentMethod }`
- Fix: update the coordinator schema to `{ serviceType, category, paymentMethod }`
  - serviceType enum changed to ASSESSMENT/CONSULTATION/DOCUMENT_REVIEW (matches payment-service)
  - add category and paymentMethod enums
  - drop userId (taken from context) and amount (priced by payment-service)

### save_user_memory
- old coordinator schema had a redundant `userId` (handler uses context.userId)
- coordinator sends `importance` but the handler never reads it
- handler supports `category` but the coordinator didn't define it
- Fix: coordinator schema drops userId and importance, adds category
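The check_off_topic fix, for example, reduces to accepting either field name (a sketch; the real handler does more):

```typescript
// Sketch: backward-compatible handler input — accept `query` (what the
// coordinator schema sends) or `question` (what the handler originally read).
type CheckOffTopicInput = { query?: string; question?: string };

function resolveQuestion(input: CheckOffTopicInput): string {
  const q = input.query ?? input.question;
  if (!q) throw new Error("check_off_topic: missing query/question");
  return q;
}
```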

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 01:31:38 -08:00
hailin a3f2be078b feat(payment): P2 — order management: support order cancellation and order detail queries
## Backend changes

### PaymentClientService additions
- new `getOrderDetail(orderId)` — fetch full order info (including payment details)
- new `cancelOrder(orderId)` — cancel an unpaid order (calls POST /orders/:id/cancel)

### New cancel_order tool
- tool definition: takes an orderId, cancels an unpaid order
- implementation: calls PaymentClientService.cancelOrder()
- on success returns { success, orderId, status, message }
- on failure returns a friendly error (e.g. "only unpaid orders can be cancelled")
- registered in coordinator-tools.ts; marked false in the concurrency map (write operation)

## Frontend changes

### cancel_order result rendering
- success: green card + CheckCircle icon + success message
- failure: red card + AlertCircle icon + error reason
- shows the order number

## Notes
- payment-service has no refund API yet, so cancel_order only covers unpaid orders
- refund support will be added once payment-service implements it

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 01:23:28 -08:00
hailin db7964a461 feat(chat): P1 — assessment result visualization: Assessment Expert output rendered as a structured report
## New AssessmentResultCard component

- Renders the structured assessment data returned by the Assessment Expert
- Overall suitability score (large, color-coded, shown at the top)
- Cards for the 6 immigration categories (QMAS/GEP/IANG/TTPS/CIES/TECHTAS)
  - score bar charts (pure CSS, no chart library)
  - color gradient: green (90+) → blue (70+) → yellow (50+) → orange (30+) → red
  - recommended category highlighted (primary border + Award icon)
  - grouped display of highlights, concerns, and missingInfo
  - subcategory tags (e.g. Category A, points-based scheme)
- Sorting: recommended category first, then by score descending
- Suggestions block at the bottom

## ToolCallResult integration

- Detects invoke_assessment_expert tool results
- Auto JSON.parse (the assessment expert returns a JSON string)
- Renders AssessmentResultCard when an assessments array is present
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 01:20:12 -08:00
hailin df754ce8b8 feat(payment): P0 — close the payment loop: the agent can create real orders and generate payment QR codes
## Backend changes

### New PaymentClientService
- new `infrastructure/payment/payment-client.service.ts`
  - HTTP client wrapper for the payment-service API (port 3002)
  - methods: createOrder, createPayment, checkPaymentStatus, getOrderStatus, getUserOrders
  - built on native fetch, same pattern as KnowledgeClientService
- new `infrastructure/payment/payment.module.ts`
- AgentsModule imports PaymentModule

### generate_payment tool rewritten
- removed all MOCK data (fake orderId, placeholder QR URL)
- actually calls payment-service: createOrder → createPayment → returns a real QR URL
- returns orderId, paymentId, qrCodeUrl, paymentUrl, expiresAt

### New check_payment_status tool
- queries order payment status (calls payment-service GET /orders/:id/status)
- returns status, statusLabel (Chinese label mapping), paidAt
- registered in coordinator-tools.ts and the concurrency map (read-only, safe=true)

### New query_order_history tool
- queries the user's order history (calls payment-service GET /orders)
- returns an orders array with orderId, serviceType, amount, status, createdAt
- registered in coordinator-tools.ts and the concurrency map (read-only, safe=true)

## Frontend changes

### QR code rendering
- install qrcode.react 4.2.0
- ToolCallResult renders a real QR code with QRCodeSVG
- supports both qrCodeUrl (QR code) and paymentUrl (redirect link) payment paths
- shows order number, amount, and expiry time

### Payment status card
- check_payment_status results render as a color-coded status card
- paid=green, pending=yellow, cancelled=red, refunded=orange

### Order history list
- query_order_history results render as an order list card
- each row shows category, date, amount, and a status label

### WebSocket tool event handling
- tool_result events are collected into pendingToolResults (new chatStore state)
- on stream_end, toolResults are injected into message metadata.toolCalls
- on stream_start, pendingToolResults is cleared
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 01:17:14 -08:00
hailin bb1a1139a3 feat(agents): add 4-layer response quality control — structured outputs, LLM judge, smart truncation
A hard-constraint system for AI response quality, addressing the core problem: the AI could not answer user questions precisely with minimal wording.

## Four-layer defense architecture

### Layer 1 — prompt optimization (soft constraint)
- coordinator-system-prompt.ts: new "top-priority principle: precise answers" chapter
  - intent classification table (7 types) + per-type length and answer strategy
  - wrong vs. correct example pairs
  - "better too short than too long" principle
  - three closing reminders: precise answers > accuracy > brevity is professionalism
- policy-expert-prompt.ts: trimmed output format
- objection-handler-prompt.ts: minor tweaks

### Layer 2 — Structured Outputs (format constraint)
- new file coordinator-response.schema.ts: Zod schema definition
  - intent: 7 intent types (factual/yes_no/comparison/assessment/objection/detailed/casual)
  - answer: response text
  - followUp: optional follow-up question
- agent-loop.ts: passed to the Claude API via output_config, forcing JSON output
  - in streaming mode, text deltas are suppressed (JSON fragments aren't shown to users)
  - after the stream ends, the JSON is parsed and the answer field is yielded to the frontend
  - falls back to the raw text if JSON parsing fails (safe degradation)
- coordinator-agent.service.ts: passes zodOutputFormat(CoordinatorResponseSchema)
- agent.types.ts: AgentLoopParams gains an outputConfig field

### Layer 3 — LLM-as-Judge (semantic quality check)
- evaluation-rule.entity.ts: new LLM_JUDGE rule type (the 9th)
- evaluation-gate.service.ts:
  - inject ConfigService + initialize an Anthropic client (Haiku 4.5)
  - evaluateRule becomes async (supports async LLM calls)
  - new checkLlmJudge(): scores the relevance/conciseness/noise dimensions
  - configurable thresholds: minRelevance(7), minConciseness(6), maxNoise(3)
  - 5s timeout + pass-by-default on exceptions (non-blocking)
  - EvaluationContext gains a userMessage field
- coordinator-agent.service.ts: passes userMessage to the evaluation gate

### Layer 4 — programmatic hard truncation (physical constraint)
- coordinator-response.schema.ts:
  - INTENT_MAX_ANSWER_LENGTH: per-intent character limits
    factual=200, yes_no=120, comparison=250, assessment=400,
    objection=200, detailed=500, casual=80
  - MAX_FOLLOWUP_LENGTH: 80 characters
  - smartTruncate(): truncates at sentence boundaries (Chinese and English punctuation)
- agent-loop.ts: after JSON parsing, answer and followUp are force-truncated by intent
- max_tokens lowered from 4096 to 2048
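The sentence-boundary truncation in smartTruncate() can be sketched as follows (the punctuation set and fallback behavior are assumptions):

```typescript
// Sketch: truncate at the last Chinese or English sentence-ending
// punctuation mark within the limit; hard-cut if none is found.
const SENTENCE_ENDS = ["。", "!", "?", ".", "!", "?", ";", ";"];

function smartTruncate(text: string, maxLen: number): string {
  if (text.length <= maxLen) return text;
  const slice = text.slice(0, maxLen);
  // Walk back to the last sentence boundary.
  for (let i = slice.length - 1; i >= 0; i--) {
    if (SENTENCE_ENDS.includes(slice[i])) return slice.slice(0, i + 1);
  }
  return slice; // no boundary found: hard cut at maxLen
}
```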

## Bug fix
- agent-loop.ts: currentTextContent was reset to an empty string on
  content_block_stop, so the evaluation gate received empty text. Fixed by
  extracting responseText from finalMessage.content instead.

## Dependency upgrades
- @anthropic-ai/sdk: 0.52.0 → 0.73.0 (supports output_config)
- add zod@4.3.6 (Structured Output schema definitions)

## File list (1 new + 10 modified)
- NEW: agents/schemas/coordinator-response.schema.ts
- MOD: agents/coordinator/agent-loop.ts (core rework)
- MOD: agents/coordinator/coordinator-agent.service.ts
- MOD: agents/coordinator/evaluation-gate.service.ts
- MOD: agents/types/agent.types.ts
- MOD: agents/prompts/coordinator-system-prompt.ts
- MOD: agents/prompts/policy-expert-prompt.ts
- MOD: agents/prompts/objection-handler-prompt.ts
- MOD: domain/entities/evaluation-rule.entity.ts
- MOD: package.json + pnpm-lock.yaml

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 01:01:05 -08:00
hailin 93ed3343de refactor(knowledge): separate file upload into independent entry point
Split the knowledge base's "new article" and "upload file" flows into two independent entry points:

UI changes:
- remove the Segmented toggle; the "new article" modal returns to pure manual input
- add a standalone "upload file" button + upload modal (Upload.Dragger)
- after extraction completes, a "confirm extracted content" modal opens automatically with title + content prefilled
- the admin edits, confirms, and saves; the article source is marked EXTRACT

Backend changes:
- CreateArticleDto gains an optional source field
- controller uses dto.source || MANUAL (no longer hard-codes MANUAL)

Flow:
- new article → manual input → source = MANUAL
- upload file → extract text → edit and confirm → source = EXTRACT
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 23:29:37 -08:00
hailin fc9e0cd17b fix(nginx): fix admin SPA routing fallback to correct index.html
The last try_files fallback, /index.html, resolved to web-client's root
index.html, so /admin/* subroutes (e.g. /admin/knowledge) loaded web-client
instead of admin-client. Changed to /admin/index.html so the admin SPA entry point is returned correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 23:24:22 -08:00
hailin e16ec7930d feat(knowledge): add file upload with text extraction for knowledge base
Support uploading files (PDF, Word, TXT, Markdown) on the admin knowledge base
page, automatically extracting the text so the admin can preview, edit, and save it as a knowledge base article.

## Backend (knowledge-service)

- New TextExtractionService for file text extraction
  - PDF: pdf-parse v2 (PDFParse class API)
  - Word (.docx): mammoth.extractRawText()
  - TXT/Markdown: direct UTF-8 decoding
  - word counting for mixed Chinese/English text
  - 200MB file size limit, type validation (MIME allowlist)
  - PDFs with no extractable text (scans/images) return a friendly error

- New upload endpoint: POST /knowledge/articles/upload
  - uses NestJS FileInterceptor for multipart/form-data
  - extracts and returns text only; does not create the article directly (two-step flow)
  - returns: extractedText, suggestedTitle, wordCount, pageCount

- New ExtractedTextResponse DTO
- KnowledgeModule registers TextExtractionService

## Frontend (admin-client)

- knowledge.api.ts: new uploadFile() method (FormData + 120s timeout)
- useKnowledge.ts: new useUploadKnowledgeFile hook
- KnowledgePage.tsx:
  - new Segmented toggle (manual input / file upload), shown only when creating
  - file upload mode shows an Upload.Dragger drag-and-drop area
  - after upload, the extracted text fills the title + content fields
  - once extraction completes, the UI switches back to manual mode so the admin can preview, edit, and save
  - shows extraction results (word count, page count)

## User flow

New article → switch to "file upload" → drag in/select a file → system extracts
text → title + content auto-filled → admin edits and confirms → save

## Dependencies

- pdf-parse@^2.4.5 (PDF text extraction)
- mammoth@^1.8.0 (Word document text extraction)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 22:58:19 -08:00