Hard-coded INTENT_MAX_ANSWER_LENGTH limits caused mid-sentence truncation and
content loss. Length control now relies on prompt + schema description + LLM-Judge
(3 layers). Only a 2000-char safety net remains for extreme edge cases.
Also simplified followUp handling: a non-question followUp is now always appended,
so a model-side answer/followUp content split no longer silently drops text.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
INTENT_MAX_ANSWER_LENGTH was too tight (the 200-char objection_expression limit
truncated good responses). Bumped all limits by ~25-50%. Also fixed the followUp
filter that silently dropped content when the model split an answer across the
answer and followUp fields; followUp is now appended as a continuation when the
answer ends mid-sentence.
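A minimal sketch of the continuation fix, with illustrative names (the real handling lives in the agent loop): re-join followUp onto the answer when the answer does not end at a sentence boundary.

```typescript
// Hypothetical sketch of the followUp continuation fix; names are
// illustrative, not the actual implementation.
const SENTENCE_ENDINGS = ['。', '!', '?', '.', '!', '?', '…'];

function mergeFollowUp(answer: string, followUp?: string): string {
  if (!followUp) return answer;
  const trimmed = answer.trimEnd();
  const endsCleanly = SENTENCE_ENDINGS.some((mark) => trimmed.endsWith(mark));
  // If the model split one sentence across answer + followUp, re-join it
  // instead of dropping the followUp half.
  return endsCleanly ? answer : trimmed + followUp;
}
```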
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
P0: Enrich Chapter 10 with detailed policy facts (QMAS scoring, GEP A/B/C
conditions, FAQ quick answers) so Claude can answer common questions directly
without tool calls. Replace absolute rule "never answer from memory" with
3-tier system: Tier 1 (direct from Ch10), Tier 2 (search_knowledge), Tier 3
(invoke_policy_expert).
P1: Context injector now always returns a kb_coverage_hint block — when KB has
results it tells Claude to prefer KB over web_search; when KB has no results
it suggests considering web_search. The web_search tool description now
references the hint.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Short follow-up answers like "计算机,信息技术" ("computer science, information
technology") were being classified as OFF_TOPIC (0.85) because the InputGate has
no conversation context. The gate now runs only when there are no previous
messages, i.e. on the first message of a conversation.
Mid-conversation topic management is handled by the Coordinator prompt.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add section 5.6 "隐性转化引导" (implicit conversion guidance) with a trust-first conversion philosophy:
- Free facts vs paid analysis boundary
- "Taste-then-sell" strategy with positive but vague hints
- Assessment suggestion limited to max once per conversation
- Natural urgency only when fact-supported
- Post-assessment → full service transition only when user asks
- Anti-annoyance red line: never make user feel pushed to pay
Recalibrate info exchange (4.3): warm acknowledgment without deep analysis.
Add value framing (4.4) and post-assessment guidance (4.5).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update coordinator system prompt to enforce pricing rules:
- All assessments cost 99 RMB (one-time per user), no free assessments
- Must collect payment before calling assessment expert
- Add fee inquiry intent type to response strategy table
- Update generate_payment tool description with fixed pricing
- Replace "免费初步咨询" (free initial consultation) with a tiered service model
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Haiku sometimes returns JSON wrapped in ```json ... ``` code blocks,
causing JSON.parse to fail. Strip markdown fences before parsing.
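A sketch of the fence-stripping step (helper name assumed): remove a wrapping ```json ... ``` fence before handing the string to JSON.parse.

```typescript
// Strip a wrapping ```json ... ``` (or bare ``` ... ```) fence, if present,
// so JSON.parse receives plain JSON. Helper name is illustrative.
function stripMarkdownFences(raw: string): string {
  const trimmed = raw.trim();
  const match = trimmed.match(/^```(?:json)?\s*([\s\S]*?)\s*```$/);
  return match ? match[1] : trimmed;
}

const parsed = JSON.parse(stripMarkdownFences('```json\n{"ok": true}\n```'));
```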
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
injectIntoMessages() was JSON.stringify-ing array content (with image blocks),
turning base64 data into text tokens (~170K) instead of image tokens (~1,600).
Fix: append context as a new text block in the array, preserving image block format.
Also fixes token estimation to count images at ~1,600 tokens instead of base64 char length,
and adds debug logging for API call token composition.
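A sketch of the corrected injection, with hypothetical type and function names: keep image blocks intact and append the injected context as one more text block, instead of JSON.stringify-ing the whole array into a text message.

```typescript
// Hypothetical shapes mirroring the Anthropic content-block format.
type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'image'; source: { type: 'base64'; media_type: string; data: string } };

// The related estimation fix: count each image at a flat ~1,600 tokens
// rather than the length of its base64 payload.
const IMAGE_TOKEN_ESTIMATE = 1600;

function injectContext(content: string | ContentBlock[], context: string): ContentBlock[] {
  // Preserve existing blocks (including base64 image blocks) as-is...
  const blocks: ContentBlock[] =
    typeof content === 'string' ? [{ type: 'text', text: content }] : [...content];
  // ...and carry the injected context as a separate text block.
  blocks.push({ type: 'text', text: context });
  return blocks;
}
```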
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
80K was too aggressive and caused premature context loss. Now triggers
at 160K tokens with a target of 80K after compaction.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The auto-compaction logic (threshold 80K tokens, summarize older
messages via Haiku) existed but was never called in sendMessage flow.
Now called after context injection, before agent loop.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Historical images/PDFs were being re-downloaded and base64-encoded for
every API call, causing 200K+ token requests. Now only the current
message includes full attachment blocks; historical ones use text
placeholders like "[用户上传了图片: photo.png]" ("[User uploaded an image: photo.png]").
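A sketch of the placeholder construction (function name illustrative): historical attachments are reduced to text placeholders, while only the current message keeps full base64 blocks.

```typescript
// Build the lightweight placeholder used for historical attachments.
// Name and label mapping are illustrative.
function attachmentPlaceholder(kind: 'image' | 'pdf', filename: string): { type: 'text'; text: string } {
  const label = kind === 'image' ? '图片' : '文件'; // "image" / "file"
  return { type: 'text', text: `[用户上传了${label}: ${filename}]` };
}
```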
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Claude API enforces a hard 5MB limit per image (not 20MB as previously
set). PDFs have a 32MB total request limit; set individual PDF cap to
25MB to leave room for prompt/messages. The downloadAsBase64 method now
accepts a per-type maxSize parameter.
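The caps above, wired through a hypothetical per-type helper of the kind downloadAsBase64 could consume:

```typescript
// Per-type size caps from the commit; helper name is illustrative.
const MAX_IMAGE_BYTES = 5 * 1024 * 1024;  // Claude API hard limit per image
const MAX_PDF_BYTES = 25 * 1024 * 1024;   // headroom under the 32MB total request limit

function maxSizeFor(mimeType: string): number {
  return mimeType.startsWith('image/') ? MAX_IMAGE_BYTES : MAX_PDF_BYTES;
}
```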
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MinIO presigned URLs use Docker-internal hostname (minio:9000), making
them inaccessible from both Claude API servers and user browsers.
Changes:
- file-service: add /files/:id/content and /files/:id/thumbnail proxy
endpoints that stream file data from MinIO
- file-service: toResponseDto now returns API proxy paths instead of
MinIO presigned URLs
- coordinator: buildAttachmentBlocks now downloads files via file-service
internal API (http://file-service:3006) and converts to base64 for
Claude API (images, PDFs) or embeds text content directly
- Configurable FILE_SERVICE_URL env var for service-to-service calls
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Claude API cannot fetch arbitrary URLs. Text-based attachments (txt, csv,
json, md) are now downloaded via their presigned MinIO URL and embedded
directly as text blocks. PDF uses Claude's native document block. Added
50KB size limit with truncation for large text files.
buildMessages() is now async to support text content fetching.
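A sketch of the 50KB cap (helper name and truncation marker are illustrative, not the actual strings used):

```typescript
const MAX_TEXT_BYTES = 50 * 1024;

// Truncate text-attachment content at 50KB of UTF-8 bytes. A multi-byte
// character split at the cut point becomes U+FFFD under the default
// non-fatal TextDecoder.
function truncateTextAttachment(content: string): string {
  const bytes = new TextEncoder().encode(content);
  if (bytes.byteLength <= MAX_TEXT_BYTES) return content;
  const head = new TextDecoder('utf-8').decode(bytes.subarray(0, MAX_TEXT_BYTES));
  return head + '\n[内容已截断]'; // illustrative "content truncated" marker
}
```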
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1. Coordinator now sends all attachment types to Claude:
- Images → native image blocks (existing)
- PDF → native document blocks (Claude PDF support)
- Text files (txt, csv, json, md) → text blocks with filename
Extracted common buildAttachmentBlocks() helper.
2. File-service generates thumbnails on image upload:
- Uses sharp to resize to 400x400 max (inside fit, no upscale)
- Output as WebP at 80% quality for smaller file size
- Stored in MinIO under thumbnails/ prefix
- Generated for both direct upload and presigned URL confirm
- Non-blocking: thumbnail failure doesn't break upload
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The model was putting internal notes like "引导回移民话题" ("steer back to the
immigration topic") in the followUp field instead of actual user-facing
questions. Two fixes:
1. Schema: describe followUp as "必须以?结尾,禁止填写内部策略备注" ("must end
   with a question mark; internal strategy notes are forbidden")
2. agent-loop: only yield followUp if it contains '?' or '?' (a half- or
   full-width question mark)
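The agent-loop guard can be sketched as follows (function name assumed):

```typescript
// Only surface followUp strings that contain a question mark (half- or
// full-width); anything else is treated as an internal note and dropped.
function isUserFacingFollowUp(followUp: string): boolean {
  return followUp.includes('?') || followUp.includes('?');
}
```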
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
claude-sonnet-4-20250514 does not support output_config (structured outputs).
Changed coordinator model to claude-sonnet-4-5-20250929 which supports it.
Specialist agents remain on their original models (no output_config needed).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The AdminEvaluationRuleController in ConversationModule needs the
EVALUATION_RULE_REPOSITORY token. Even though AgentsModule is @Global(),
Symbol-based providers must be explicitly exported to be available
in other modules.
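An illustrative wiring sketch of the fix (the repository class body is a stand-in): a @Global() module only shares what it exports, so the Symbol token must appear in the exports array.

```typescript
import { Global, Injectable, Module } from '@nestjs/common';

export const EVALUATION_RULE_REPOSITORY = Symbol('EVALUATION_RULE_REPOSITORY');

@Injectable()
class EvaluationRuleRepository {} // stand-in for the real repository

@Global()
@Module({
  providers: [
    { provide: EVALUATION_RULE_REPOSITORY, useClass: EvaluationRuleRepository },
  ],
  // @Global() makes the module's *exports* visible everywhere; a provider
  // that is not exported stays private to the module, so the Symbol token
  // must be listed here explicitly.
  exports: [EVALUATION_RULE_REPOSITORY],
})
export class AgentsModule {}
```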
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
claude-haiku-3-5-20241022 returns 404 on the proxy. Updated to
claude-haiku-4-5-20251001 in agent configs and context injector.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The mapEventToStreamChunk was mapping both 'usage' (per-turn) and 'end'
(final) events to type 'end', causing the gateway to emit multiple
stream_end events. This made the frontend create a separate message
bubble (with its own bot avatar) for each agent loop turn.
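The fix can be sketched with hypothetical event/chunk shapes: per-turn 'usage' events get their own chunk type, so only the final 'end' event produces a stream_end.

```typescript
type AgentEvent = { type: 'text' | 'usage' | 'end' };

function mapEventToStreamChunk(event: AgentEvent): { type: string } {
  switch (event.type) {
    case 'usage':
      return { type: 'usage' }; // was { type: 'end' }, causing duplicate stream_end
    case 'end':
      return { type: 'end' };
    default:
      return { type: 'text' };
  }
}
```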
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add tenant_id column to conversations, messages, token_usages tables
- Create standalone migration SQL script for production deployment
- Add agent_executions table to init-db.sql for new installations
- Fix MessageORM created_at nullable mismatch with database schema
- Backfill existing data with default tenant ID
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace IKnowledgeClient TypeScript interface (erased at runtime) with KnowledgeClientService concrete class injection
- Fix method signatures to match KnowledgeClientService API:
- getUserMemories() → getUserTopMemories(), field type → memoryType
- retrieveForPrompt(query, userId) → retrieveForPrompt({ query, userId })
- getRelevantExperiences(query, n) → searchExperiences({ query, limit: n }), field type → experienceType
- Remove unused ContextData import
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add 'error' chunk type to StreamChunk for partial token capture
- Record partial tokens to token_usage table even on API errors
- Capture error chunk tokens in conversation.service.ts
- Save partial response and tokens before re-throwing errors
- Add token aggregation from token_usage table for accurate stats
- Display detailed token info in admin (cache tokens, cost, API calls)
- Export TokenDetails type for frontend consumption
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Problem:
- Token usage was recorded to token_usage table but not to conversation entity
- Message count was not being incremented
- Dashboard showed 0 tokens for all conversations
Solution:
- Add inputTokens/outputTokens fields to StreamChunk interface
- Return token usage in 'end' chunk from ClaudeAgentServiceV2
- Capture token usage in conversation.service.ts sendMessage
- Call conversation.addTokens() and incrementMessageCount() after each exchange
- Consolidate conversation updates into single repo.update() call
Files changed:
- claude-agent-v2.service.ts: Add token fields to StreamChunk, return in 'end'
- conversation.service.ts: Track tokens and message counts properly
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add 8-stage consulting workflow (greeting → handoff)
- Create StrategyEngineService for state management and transitions
- Add ClaudeAgentServiceV2 with integrated strategy guidance
- Support old user recognition via get_user_context tool
- Add device info (IP, fingerprint) for new user icebreaking
- Extend ConversationEntity with consulting state fields
- Add database migration for new JSONB columns
Stages: greeting, needs_discovery, info_collection, assessment,
recommendation, objection_handling, conversion, handoff
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix Usage type cast by using unknown intermediate type
- Add PricingTier interface and proper Record type for PRICING
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add TokenUsageEntity to store per-request token consumption
- Add TokenUsageService with cost calculation and statistics APIs
- Record input/output/cache tokens per API call
- Calculate estimated cost based on Claude pricing
- Provide user/conversation/global stats aggregation
- Support daily stats and top users ranking
- Integrate token tracking in ClaudeAgentService
- Track latency, tool calls, response length
- Accumulate tokens across tool loop iterations
- Add token_usages table to init-db.sql with proper indexes
This enables:
- Per-user token consumption tracking
- Cost analysis and optimization
- Future billing/quota features
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Reduce tokensPerChar from 2 to 1.8 for more accurate Chinese token estimation
- Use min() instead of max() to enforce upper limits on token counts
- CHAT: max 200 tokens (was min 256)
- SIMPLE_QUERY: max 600 tokens (was min 512)
- CLARIFICATION: max 300 tokens (was min 256)
- CONFIRMATION: max 400 tokens (was min 384)
- DEEP_CONSULTATION: 800-1600 tokens (was 1024-4096)
- ACTION_NEEDED: 500-1000 tokens (was 768-2048)
This should result in more concise AI responses that better match
the intent classifier's suggested length limits.
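The estimation change can be sketched as follows (names are hypothetical; only four intent caps are shown):

```typescript
// 1.8 tokens/char for Chinese text, and min() so the per-intent number is
// a ceiling, not a floor.
const TOKENS_PER_CHAR = 1.8;

const INTENT_MAX_TOKENS: Record<string, number> = {
  CHAT: 200,
  SIMPLE_QUERY: 600,
  CLARIFICATION: 300,
  CONFIRMATION: 400,
};

function suggestedMaxTokens(intent: string, expectedChars: number): number {
  const estimate = Math.ceil(expectedChars * TOKENS_PER_CHAR);
  const cap = INTENT_MAX_TOKENS[intent] ?? 1600;
  return Math.min(estimate, cap); // max() here silently turned caps into floors
}
```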
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add the following real-time tools to ImmigrationToolsService:
- get_current_datetime: Get current date/time with timezone support
- web_search: Search internet for latest immigration news/policies (Google CSE)
- get_exchange_rate: Query real-time currency exchange rates (for investment immigration)
- fetch_immigration_news: Fetch latest immigration announcements
All tools include graceful degradation with fallback responses when external APIs are unavailable.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add MinIO object storage to docker-compose infrastructure
- Create file-service microservice for upload management with presigned URLs
- Add files table to database schema
- Update nginx and Kong for MinIO proxy routes
- Implement file upload UI in chat InputArea with drag-and-drop
- Add attachment preview in MessageBubble component
- Update conversation-service to handle multimodal messages
- Add Claude Vision API integration for image analysis
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>