Hard-coded INTENT_MAX_ANSWER_LENGTH limits caused mid-sentence truncation and
content loss. Length control now relies on three layers: the prompt, the schema
description, and an LLM judge. Only a 2000-char safety net remains for extreme
edge cases.
Also simplified followUp handling: a non-question followUp is now always
appended, preventing a model content split from silently dropping text.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
INTENT_MAX_ANSWER_LENGTH was too tight (the 200-char objection_expression cap
truncated good responses). Bumped all limits by ~25-50%. Also fixed the followUp
filter that silently dropped content when the model split an answer across the
answer and followUp fields; followUp is now appended as a continuation when the
answer ends mid-sentence.
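A minimal sketch of the new behavior, assuming a `mergeFollowUp` helper (the name and the exact punctuation rules are illustrative, not the actual service code):

```typescript
// If followUp is a real question, surface it on its own line; if it is
// not a question but the answer ends mid-sentence, treat it as a split
// continuation and append it; otherwise drop it as before.
function mergeFollowUp(answer: string, followUp?: string): string {
  if (!followUp) return answer;
  const isQuestion = followUp.includes('?') || followUp.includes('？');
  const endsMidSentence = !/[.!?。！？]\s*$/.test(answer);
  if (!isQuestion && endsMidSentence) {
    return `${answer}${followUp}`; // continuation of the same sentence
  }
  return isQuestion ? `${answer}\n${followUp}` : answer;
}
```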
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
P0: Enrich Chapter 10 with detailed policy facts (QMAS scoring, GEP A/B/C
conditions, FAQ quick answers) so Claude can answer common questions directly
without tool calls. Replace absolute rule "never answer from memory" with
3-tier system: Tier 1 (direct from Ch10), Tier 2 (search_knowledge), Tier 3
(invoke_policy_expert).
P1: Context injector now always returns a kb_coverage_hint block — when KB has
results it tells Claude to prefer KB over web_search; when KB has no results
it suggests considering web_search. Web_search tool description updated to
reference the hint.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Short follow-up answers like "计算机,信息技术" ("computer science, information
technology") were being classified as OFF_TOPIC (0.85) because the InputGate
has no conversation context. Now the
gate only runs when there are no previous messages (first message in conversation).
Mid-conversation topic management is handled by the Coordinator prompt.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add section 5.6 "隐性转化引导" (implicit conversion guidance) with a trust-first
conversion philosophy:
- Free facts vs paid analysis boundary
- "Taste-then-sell" strategy with positive but vague hints
- Assessment suggestion limited to max once per conversation
- Natural urgency only when fact-supported
- Post-assessment → full service transition only when user asks
- Anti-annoyance red line: never make user feel pushed to pay
Recalibrate info exchange (4.3): warm acknowledgment without deep analysis.
Add value framing (4.4) and post-assessment guidance (4.5).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update coordinator system prompt to enforce pricing rules:
- All assessments cost 99 RMB (one-time per user), no free assessments
- Must collect payment before calling assessment expert
- Add fee inquiry intent type to response strategy table
- Update generate_payment tool description with fixed pricing
- Replace "免费初步咨询" (free initial consultation) with a tiered service model
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Haiku sometimes returns JSON wrapped in ```json ... ``` code blocks,
causing JSON.parse to fail. Strip markdown fences before parsing.
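The stripping step can be sketched as follows; `stripJsonFences` and `parseModelJson` are illustrative names, not the actual service code:

```typescript
// Remove a leading ```json (or bare ```) fence and the trailing fence
// before handing the string to JSON.parse; pass unfenced input through.
function stripJsonFences(raw: string): string {
  const trimmed = raw.trim();
  const match = trimmed.match(/^```(?:json)?\s*\n?([\s\S]*?)\n?```$/);
  return match ? match[1].trim() : trimmed;
}

function parseModelJson(raw: string): unknown {
  return JSON.parse(stripJsonFences(raw));
}
```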
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
injectIntoMessages() was JSON.stringify-ing array content (with image blocks),
turning base64 data into text tokens (~170K) instead of image tokens (~1,600).
Fix: append context as a new text block in the array, preserving image block format.
Also fixes token estimation to count images at ~1,600 tokens instead of base64 char length,
and adds debug logging for API call token composition.
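The corrected estimator can be sketched like this; the block shapes follow the Anthropic Messages API, while the ~1,600-token figure comes from this commit and the chars/4 heuristic for text is an assumption of the sketch:

```typescript
type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'image'; source: { type: 'base64'; media_type: string; data: string } };

const IMAGE_TOKEN_ESTIMATE = 1_600;

// Count images at a flat estimate instead of their base64 char length,
// which previously inflated estimates by two orders of magnitude.
function estimateTokens(blocks: ContentBlock[]): number {
  return blocks.reduce((sum, block) => {
    if (block.type === 'image') return sum + IMAGE_TOKEN_ESTIMATE; // not data.length!
    return sum + Math.ceil(block.text.length / 4);
  }, 0);
}
```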
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The 80K compaction threshold was too aggressive and caused premature context
loss. Compaction now triggers at 160K tokens with a target of 80K after
compaction.
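The trigger check reduces to the following (constants from this commit; the actual Haiku summarization step is elided):

```typescript
const COMPACTION_THRESHOLD = 160_000; // start compacting above this
const COMPACTION_TARGET = 80_000;     // aim for this many tokens afterwards

// Decide whether older messages should be summarized before the next call.
function shouldCompact(estimatedTokens: number): boolean {
  return estimatedTokens > COMPACTION_THRESHOLD;
}
```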
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The auto-compaction logic (threshold 80K tokens, summarize older
messages via Haiku) existed but was never called in sendMessage flow.
Now called after context injection, before agent loop.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Historical images/PDFs were being re-downloaded and base64-encoded for
every API call, causing 200K+ token requests. Now only the current
message includes full attachment blocks; historical ones use text
placeholders like "[用户上传了图片: photo.png]" ("user uploaded an image:
photo.png").
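A sketch of the placeholder substitution; the image placeholder format comes from this commit, while the PDF variant and the type shapes are assumptions:

```typescript
interface Attachment {
  filename: string;
  kind: 'image' | 'pdf';
}

// Historical messages keep only a short text marker instead of
// re-downloaded, re-encoded base64 attachment blocks.
function historicalPlaceholder(att: Attachment): string {
  const label = att.kind === 'image' ? '图片' : 'PDF';
  return `[用户上传了${label}: ${att.filename}]`;
}
```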
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Claude API enforces a hard 5MB limit per image (not 20MB as previously
set). PDFs have a 32MB total request limit; set individual PDF cap to
25MB to leave room for prompt/messages. The downloadAsBase64 method now
accepts a per-type maxSize parameter.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MinIO presigned URLs use Docker-internal hostname (minio:9000), making
them inaccessible from both Claude API servers and user browsers.
Changes:
- file-service: add /files/:id/content and /files/:id/thumbnail proxy
endpoints that stream file data from MinIO
- file-service: toResponseDto now returns API proxy paths instead of
MinIO presigned URLs
- coordinator: buildAttachmentBlocks now downloads files via file-service
internal API (http://file-service:3006) and converts to base64 for
Claude API (images, PDFs) or embeds text content directly
- Configurable FILE_SERVICE_URL env var for service-to-service calls
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Claude API cannot fetch arbitrary URLs. Text-based attachments (txt, csv,
json, md) are now downloaded via their presigned MinIO URL and embedded
directly as text blocks. PDF uses Claude's native document block. Added
50KB size limit with truncation for large text files.
buildMessages() is now async to support text content fetching.
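The truncation step might look like this; the 50KB limit is from this commit, while the helper name, the filename header, and the char-based slicing are simplifications:

```typescript
const MAX_TEXT_BYTES = 50 * 1024; // 50KB cap for embedded text attachments

// Embed a text attachment as a text block, truncating oversized files.
// (Slicing by chars rather than bytes is a simplification of the sketch.)
function toTextBlock(filename: string, content: string): { type: 'text'; text: string } {
  let body = content;
  if (Buffer.byteLength(body, 'utf8') > MAX_TEXT_BYTES) {
    body = body.slice(0, MAX_TEXT_BYTES) + '\n...[truncated]';
  }
  return { type: 'text', text: `[附件 ${filename}]\n${body}` };
}
```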
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1. Coordinator now sends all attachment types to Claude:
- Images → native image blocks (existing)
- PDF → native document blocks (Claude PDF support)
- Text files (txt, csv, json, md) → text blocks with filename
Extracted common buildAttachmentBlocks() helper.
2. File-service generates thumbnails on image upload:
- Uses sharp to resize to 400x400 max (inside fit, no upscale)
- Output as WebP at 80% quality for smaller file size
- Stored in MinIO under thumbnails/ prefix
- Generated for both direct upload and presigned URL confirm
- Non-blocking: thumbnail failure doesn't break upload
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The model was putting internal notes like "引导回移民话题" ("steer back to the
immigration topic") in the followUp field instead of actual user-facing
questions. Two fixes:
1. Schema: describe followUp as "必须以?结尾,禁止填写内部策略备注" ("must end
with a question mark; internal strategy notes are forbidden")
2. agent-loop: only yield followUp if it contains ? or ？ (a half- or
full-width question mark)
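The guard in fix 2 can be sketched as (the name is illustrative):

```typescript
// A followUp is only surfaced to the user if it is an actual question,
// i.e. contains a half-width or full-width question mark.
function isUserFacingFollowUp(followUp: string): boolean {
  return followUp.includes('?') || followUp.includes('？');
}
```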
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
claude-sonnet-4-20250514 does not support output_config (structured outputs).
Changed coordinator model to claude-sonnet-4-5-20250929 which supports it.
Specialist agents remain on their original models (no output_config needed).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- findForEvolution() now excludes DELETED conversations (should not learn from deleted data)
- getConversation() rejects DELETED conversations for user-facing operations (sendMessage, getMessages, etc.)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The delete conversation endpoint was a no-op — it verified ownership but
never actually modified the record. Users saw conversations disappear
(frontend optimistic removal) but they reappeared on refresh.
Changes:
- conversation.entity.ts: Add DELETED status, softDelete() and isDeleted()
- conversation.service.ts: Call softDelete() + update instead of no-op
- conversation-postgres.repository.ts: Exclude DELETED conversations
from findByUserId() queries so they don't appear in user's list
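The entity change reduces to roughly the following (the real entity has many more fields; this condensed shape is an assumption):

```typescript
enum ConversationStatus {
  ACTIVE = 'ACTIVE',
  DELETED = 'DELETED',
}

class Conversation {
  constructor(public status: ConversationStatus = ConversationStatus.ACTIVE) {}

  // Soft delete: flip the status instead of removing the row, so the
  // record survives for auditing but disappears from user-facing queries.
  softDelete(): void {
    this.status = ConversationStatus.DELETED;
  }

  isDeleted(): boolean {
    return this.status === ConversationStatus.DELETED;
  }
}
```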
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
AdminConversationController's GET /:id was intercepting requests to
AdminEvaluationRuleController (matching "evaluation-rules" as an id param).
Similarly, DELETE /:id was matching "cache" as an id.
Changes:
- conversation.module.ts: Register AdminMcpController and
AdminEvaluationRuleController before AdminConversationController
(more specific prefixes must come first in NestJS)
- admin-evaluation-rule.controller.ts: Move static routes (POST /test,
DELETE /cache) before dynamic routes (GET /:id, DELETE /:id)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The AdminEvaluationRuleController in ConversationModule needs the
EVALUATION_RULE_REPOSITORY token. Even though AgentsModule is @Global(),
Symbol-based providers must be explicitly exported to be available
in other modules.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
claude-haiku-3-5-20241022 returns 404 on the proxy. Updated to
claude-haiku-4-5-20251001 in agent configs and context injector.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mapEventToStreamChunk was mapping both 'usage' (per-turn) and 'end'
(final) events to type 'end', causing the gateway to emit multiple
stream_end events. This made the frontend create a separate message
bubble (with its own bot avatar) for each agent loop turn.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
WebSocket gateway was missing AsyncLocalStorage tenant context setup,
causing 'Tenant context not set' error on every message. Now extracts
tenantId from handshake and wraps handleMessage in tenantContext.runAsync().
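A self-contained sketch of the pattern, using Node's AsyncLocalStorage (names like `handleMessageWithTenant` are illustrative, not the gateway's actual API):

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Tenant context propagated implicitly to everything the handler calls.
const tenantContext = new AsyncLocalStorage<{ tenantId: string }>();

// Wrap a message handler so downstream code (repositories, services)
// can resolve the tenant without it being threaded through arguments.
async function handleMessageWithTenant<T>(
  tenantId: string,
  handler: () => Promise<T>,
): Promise<T> {
  return tenantContext.run({ tenantId }, handler);
}

function currentTenantId(): string {
  const store = tenantContext.getStore();
  if (!store) throw new Error('Tenant context not set');
  return store.tenantId;
}
```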
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add tenant_id column to conversations, messages, token_usages tables
- Create standalone migration SQL script for production deployment
- Add agent_executions table to init-db.sql for new installations
- Fix MessageORM created_at nullable mismatch with database schema
- Backfill existing data with default tenant ID
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace IKnowledgeClient TypeScript interface (erased at runtime) with KnowledgeClientService concrete class injection
- Fix method signatures to match KnowledgeClientService API:
- getUserMemories() → getUserTopMemories(), field type → memoryType
- retrieveForPrompt(query, userId) → retrieveForPrompt({ query, userId })
- getRelevantExperiences(query, n) → searchExperiences({ query, limit: n }), field type → experienceType
- Remove unused ContextData import
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
All services were providing TenantContextService directly without
making it global, causing DI resolution failures in child modules.
Now using TenantContextModule.forRoot() which exports TenantContextService
globally so all repositories can access it.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
npm install was clearing the @iconsulting/shared folder that was
copied before it. Moving the COPY command after npm install ensures
the shared package remains in node_modules.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Apply TenantContextMiddleware to all 6 services
- Add SimpleTenantFinder for services without direct tenant DB access
- Add TenantFinderService for evolution-service with database access
- Refactor 8 repositories to extend BaseTenantRepository:
- user-postgres.repository.ts
- verification-code-postgres.repository.ts
- conversation-postgres.repository.ts
- message-postgres.repository.ts
- token-usage-postgres.repository.ts
- file-postgres.repository.ts
- order-postgres.repository.ts
- payment-postgres.repository.ts
- Add @iconsulting/shared dependency to evolution-service and knowledge-service
- Configure middleware to exclude health and super-admin paths
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add token aggregation to statistics/overview endpoint
- Include total tokens, cost, and API calls for all time
- Include today's token usage and cost breakdown
- Display token stats in ConversationsPage with 2 rows of cards
- Add formatNumber helper for K/M number formatting
- Export GlobalTokenStats and TodayTokenStats types
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add 'error' chunk type to StreamChunk for partial token capture
- Record partial tokens to token_usage table even on API errors
- Capture error chunk tokens in conversation.service.ts
- Save partial response and tokens before re-throwing errors
- Add token aggregation from token_usage table for accurate stats
- Display detailed token info in admin (cache tokens, cost, API calls)
- Export TokenDetails type for frontend consumption
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Problem:
- Token usage was recorded to token_usage table but not to conversation entity
- Message count was not being incremented
- Dashboard showed 0 tokens for all conversations
Solution:
- Add inputTokens/outputTokens fields to StreamChunk interface
- Return token usage in 'end' chunk from ClaudeAgentServiceV2
- Capture token usage in conversation.service.ts sendMessage
- Call conversation.addTokens() and incrementMessageCount() after each exchange
- Consolidate conversation updates into single repo.update() call
Files changed:
- claude-agent-v2.service.ts: Add token fields to StreamChunk, return in 'end'
- conversation.service.ts: Track tokens and message counts properly
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
## Device Tracking (conversation-service)
- Add DeviceInfoDto class for validating device information
- Extract client IP from X-Forwarded-For and X-Real-IP headers
- Capture User-Agent header automatically on conversation creation
- Support optional fingerprint and region from client
- Pass deviceInfo through service layer to entity for persistence
Files changed:
- conversation.controller.ts: Add extractClientIp() method and header capture
- conversation.dto.ts: Add DeviceInfoDto with validation decorators
- conversation.service.ts: Update CreateConversationParams interface
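The IP extraction can be sketched as follows (the helper name comes from the commit; the socket-address fallback is an assumption):

```typescript
// X-Forwarded-For may hold a comma-separated proxy chain; the original
// client is the first entry. Fall back to X-Real-IP, then the socket IP.
function extractClientIp(
  headers: Record<string, string | undefined>,
  socketIp = '',
): string {
  const forwarded = headers['x-forwarded-for'];
  if (forwarded) return forwarded.split(',')[0].trim();
  return headers['x-real-ip'] ?? socketIp;
}
```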
## Build Optimization (admin-client)
- Implement code splitting via Rollup manualChunks
- Separate vendor libraries into cacheable chunks:
- vendor-react: react, react-dom, react-router-dom (160KB)
- vendor-antd: antd, @ant-design/icons (1013KB)
- vendor-charts: recharts (409KB)
- vendor-data: @tanstack/react-query, axios, zustand (82KB)
- Main bundle reduced from 1732KB to 61KB (96% reduction)
- Set chunkSizeWarningLimit to 1100KB for antd
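The chunking rule can be sketched as a manualChunks function (module-id matching is simplified; the real config lives in the Vite/Rollup build config):

```typescript
// Map a module id to a named vendor chunk; app code and unlisted
// dependencies fall through to Rollup's default chunking.
function manualChunks(id: string): string | undefined {
  if (!id.includes('node_modules')) return undefined;
  if (/[\\/](react|react-dom|react-router-dom)[\\/]/.test(id)) return 'vendor-react';
  if (/[\\/](antd|@ant-design)[\\/]/.test(id)) return 'vendor-antd';
  if (/[\\/]recharts[\\/]/.test(id)) return 'vendor-charts';
  if (/[\\/](@tanstack|axios|zustand)[\\/]/.test(id)) return 'vendor-data';
  return undefined;
}
```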
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add TransactionService for atomic database operations with optimistic lock retry
- Implement pessimistic locking in payment callback handling to prevent race conditions
- Add idempotency check via transactionId unique index to prevent duplicate processing
- Add version columns to PaymentORM and OrderORM for optimistic locking
- Add composite indexes for performance (order_status, transaction_id)
- Optimize connection pool settings for both payment and conversation services
- Update init-db.sql with version columns and new indexes
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>