iconsulting

hailin 04dbc61131 feat(agents): add capability boundary guardrails — input gate, cascading fallback, output gate rules

Four guardrail improvements to enforce agent capability boundaries:

1. Cascading Fallback (Fix 1+4):
   - Rewrite searchKnowledge() in immigration-tools.service.ts with 3-tier fallback:
     KB (similarity >= 0.55) → Web Search → Built-in Knowledge (clearly labeled)
   - Rewrite executeTool() in policy-expert.service.ts to use retrieveKnowledge()
     with confidence threshold; returns [KB_EMPTY]/[KB_LOW_CONFIDENCE]/[KB_ERROR]
     markers so the model knows to label source reliability

2. Input Gate (Fix 2):
   - New InputGateService using Haiku for lightweight pre-classification
   - Classifications: ON_TOPIC / OFF_TOPIC (threshold >= 0.7) / HARMFUL (>= 0.6)
   - Short messages (< 5 chars) fast-path to ON_TOPIC
   - Gate failure is non-fatal (allows message through)
   - Integrated in CoordinatorAgentService.sendMessage() before agent loop entry
   - OFF_TOPIC/HARMFUL messages get fixed responses without entering agent loop

3. Output Gate Enhancement (Fix 3):
   - Add TOPIC_BOUNDARY and NO_FABRICATION to EvaluationRuleType
   - TOPIC_BOUNDARY: regex detection for code blocks, programming keywords,
     AI identity exposure, off-topic indicators in agent responses
   - NO_FABRICATION: detects policy claims without policy_expert invocation
     or source markers; ensures factual claims are knowledge-backed
   - Both rule types are admin-configurable (zero rules = zero checks)
   - No DB migration needed (ruleType is varchar(50))

Files changed:
- NEW: agents/coordinator/input-gate.service.ts
- MOD: agents/coordinator/coordinator-agent.service.ts (inject InputGate + gate check)
- MOD: agents/agents.module.ts (register InputGateService)
- MOD: agents/coordinator/evaluation-gate.service.ts (2 new evaluators)
- MOD: domain/entities/evaluation-rule.entity.ts (2 new rule types)
- MOD: agents/specialists/policy-expert.service.ts (RAG confidence threshold)
- MOD: claude/tools/immigration-tools.service.ts (cascading fallback)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-06 21:59:10 -08:00
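
The input gate rules listed in the commit message (Fix 2) can be read as a small decision function. The sketch below is illustrative only and is not the code from input-gate.service.ts: the names decideGate, GateClassification, and the constants are hypothetical, and the classifier is passed in as a synchronous callback to keep the example self-contained, whereas the real Haiku call would presumably be asynchronous. Only the thresholds (0.7, 0.6), the 5-character fast path, and the fail-open behaviour come from the commit message.

```typescript
// Hypothetical types; the commit does not show the real InputGateService API.
type GateLabel = "ON_TOPIC" | "OFF_TOPIC" | "HARMFUL";

interface GateClassification {
  label: GateLabel;
  confidence: number; // assumed to be in the range 0..1
}

// Thresholds taken from the commit message.
const OFF_TOPIC_THRESHOLD = 0.7;
const HARMFUL_THRESHOLD = 0.6;
const MIN_CLASSIFIED_LENGTH = 5;

// Returns the label the coordinator should act on. Anything that does not
// clear its threshold is treated as ON_TOPIC and enters the agent loop.
function decideGate(
  message: string,
  classify: (text: string) => GateClassification, // stand-in for the Haiku call
): GateLabel {
  // Short messages skip classification entirely.
  if (message.length < MIN_CLASSIFIED_LENGTH) {
    return "ON_TOPIC";
  }

  let result: GateClassification;
  try {
    result = classify(message);
  } catch {
    // Gate failure is non-fatal: let the message through.
    return "ON_TOPIC";
  }

  if (result.label === "HARMFUL" && result.confidence >= HARMFUL_THRESHOLD) {
    return "HARMFUL";
  }
  if (result.label === "OFF_TOPIC" && result.confidence >= OFF_TOPIC_THRESHOLD) {
    return "OFF_TOPIC";
  }
  return "ON_TOPIC";
}
```

Under these assumptions, a caller such as sendMessage() would return a fixed response when the result is OFF_TOPIC or HARMFUL and continue into the agent loop otherwise. Failing open keeps a classifier outage from blocking legitimate questions, at the cost of relying on the output gate for anything that slips through.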
admin-client	feat(mcp): add MCP Server management — backend API + admin UI	2026-02-06 18:29:02 -08:00
services	feat(agents): add capability boundary guardrails — input gate, cascading fallback, output gate rules	2026-02-06 21:59:10 -08:00
shared	feat(agents): implement multi-agent collaboration architecture	2026-02-06 04:26:39 -08:00
web-client	feat(agents): implement multi-agent collaboration architecture	2026-02-06 04:26:39 -08:00