hailin
4ef5fce924
fix(llm-gateway): fix TS error — embeddings handler has no injection variable
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 02:03:17 -08:00
hailin
5683185a47
feat(llm-gateway): add system prompt injection to OpenAI chat proxy
...
- Add injectSystemPromptOpenAI() for OpenAI messages format (role: system)
- Integrate injection into createOpenAIChatProxy before upstream call
- Update audit logs to track injection status
- Enables brand identity override for both Anthropic and OpenAI endpoints
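The injection step above can be sketched as follows. `injectSystemPromptOpenAI` is the function named in this commit, but the signature, the `ChatMessage` type, and the merge-into-existing-system-message behavior are assumptions for illustration, not the repo's actual code:

```typescript
// Sketch: OpenAI chat requests carry the system prompt as a message with
// role "system", so injection means editing the messages array.
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

function injectSystemPromptOpenAI(
  messages: ChatMessage[],
  injected: string,
  position: 'prepend' | 'append',
): ChatMessage[] {
  const existing = messages.find((m) => m.role === 'system');
  if (existing) {
    // Merge into the caller's system message so only one system entry remains.
    const content =
      position === 'prepend'
        ? `${injected}\n\n${existing.content}`
        : `${existing.content}\n\n${injected}`;
    return messages.map((m) => (m === existing ? { ...m, content } : m));
  }
  // No system message yet: insert one at the front.
  return [{ role: 'system', content: injected }, ...messages];
}
```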
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 02:01:29 -08:00
hailin
a4fa4f47d6
fix(gateway): strip service_tier and usage details from OpenAI responses
...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 01:52:10 -08:00
hailin
00056c5405
feat(gateway): deep response sanitization to mask provider identity
...
Replace Anthropic msg_xxx IDs with opaque IDs, strip cache_creation,
service_tier, inference_geo fields. Replace OpenAI chatcmpl-xxx IDs,
strip system_fingerprint. Applied to both streaming and non-streaming.
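A minimal sketch of this sanitization pass for non-streaming bodies. The field names come from the commit message; the function names and the opaque-ID scheme are assumptions:

```typescript
// Generate an opaque replacement id; a real implementation might use crypto.
function opaqueId(prefix: string): string {
  return `${prefix}_${Math.random().toString(36).slice(2, 14)}`;
}

function sanitizeResponseBody(body: Record<string, unknown>): Record<string, unknown> {
  const out = { ...body };
  const id = typeof out['id'] === 'string' ? (out['id'] as string) : '';
  // Anthropic ids look like msg_xxx, OpenAI ids like chatcmpl-xxx.
  if (id.startsWith('msg_') || id.startsWith('chatcmpl-')) {
    out['id'] = opaqueId('resp');
  }
  // Provider-identifying fields listed in the commit.
  delete out['service_tier'];
  delete out['system_fingerprint'];
  delete out['inference_geo'];
  if (out['usage'] && typeof out['usage'] === 'object') {
    const usage = { ...(out['usage'] as Record<string, unknown>) };
    delete usage['cache_creation'];
    out['usage'] = usage;
  }
  return out;
}
```

For streaming, the same pass would be applied per SSE event before it is written to the client.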
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 00:50:20 -08:00
hailin
e898e6551d
feat(gateway): add per-key model override and alias for transparent model routing
...
Admin can configure modelOverride (actual upstream model) and modelAlias
(name shown to users) per API key. When set, users don't need to specify
the real model — the gateway substitutes it transparently in both requests
and responses (including SSE streams).
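The two substitution directions can be sketched as below; `KeyConfig`, `applyModelOverride`, and `applyModelAlias` are illustrative names, not the actual implementation:

```typescript
interface KeyConfig {
  modelOverride?: string; // actual upstream model
  modelAlias?: string; // name shown to users
}

// Request direction: whatever model the caller sent, the upstream sees the real one.
function applyModelOverride(requestBody: { model: string }, key: KeyConfig) {
  return key.modelOverride ? { ...requestBody, model: key.modelOverride } : requestBody;
}

// Response direction: the real model name is replaced with the alias, so the
// caller never learns which upstream model actually served the request.
function applyModelAlias(responseBody: { model?: string }, key: KeyConfig) {
  return key.modelAlias && responseBody.model
    ? { ...responseBody, model: key.modelAlias }
    : responseBody;
}
```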
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 00:31:26 -08:00
hailin
dd765ed7a4
fix(llm-gateway): strip trailing /v1 from OpenAI upstream URL to avoid double path
...
OPENAI_BASE_URL may already include /v1, causing requests to hit /v1/v1/embeddings.
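The normalization amounts to one line; the function name here is illustrative:

```typescript
// Strip a trailing slash and a trailing /v1 from the configured base URL, so
// appending paths like /v1/embeddings never produces /v1/v1/embeddings.
function normalizeOpenAIBaseUrl(base: string): string {
  return base.replace(/\/+$/, '').replace(/\/v1$/, '');
}
```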
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 23:25:16 -08:00
hailin
6476bd868f
feat(llm-gateway): add external-facing LLM API proxy service with regulatory injection, content moderation, and admin console
...
## New microservice: llm-gateway (port 3008)
Exposes fully Anthropic/OpenAI-compatible API endpoints to external users, intercepting in the middle to implement:
- API key auth: we issue keys to external users; keys are stored as SHA-256 hashes
- System prompt injection: regulatory compliance content injected before the request is forwarded (prepend/append supported)
- Content moderation: keyword/regex matching against user messages, with block/warn/log actions
- Usage tracking: async batched writes recording token consumption and cost estimates
- Audit logging: per-request source IP, filter status, injection status, etc.
- Rate limiting: in-memory sliding-window RPM limits
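The in-memory sliding-window RPM limit above can be sketched as follows; the class and method names are assumptions:

```typescript
// Keep one timestamp list per API key; a request is allowed if fewer than
// rpmLimit requests fall inside the trailing window.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();

  constructor(private windowMs = 60_000) {}

  /** Returns true if the request is allowed under the key's RPM limit. */
  allow(apiKeyId: string, rpmLimit: number, now = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    const recent = (this.hits.get(apiKeyId) ?? []).filter((t) => t > cutoff);
    if (recent.length >= rpmLimit) {
      this.hits.set(apiKeyId, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(apiKeyId, recent);
    return true;
  }
}
```

Being in-memory, these counters are per-process and reset on restart, which suits a single-instance gateway.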
### Technology choices
- Fastify (not NestJS): a pure proxy needs no DI container; routing overhead ~2ms
- SSE streaming pipeline: zero-buffering passthrough, supporting both Anthropic and OpenAI streaming
- Rule caching: 30-second TTL, avoiding a DB query on every request
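The 30-second TTL rule cache can be sketched like this; the generic class and its loader callback are assumptions about the implementation:

```typescript
// Cache one loaded value (e.g. the full rule list) and refresh it from the
// loader only when the TTL has expired.
class TtlCache<T> {
  private value?: { data: T; loadedAt: number };

  constructor(
    private ttlMs: number,
    private load: () => Promise<T>,
  ) {}

  async get(now = Date.now()): Promise<T> {
    if (this.value && now - this.value.loadedAt < this.ttlMs) {
      return this.value.data; // fresh: no DB round-trip
    }
    const data = await this.load(); // stale or empty: reload from the DB
    this.value = { data, loadedAt: now };
    return data;
  }
}
```

Usage would look like `new TtlCache(30_000, loadInjectionRulesFromDb)`, where the loader is whatever query fetches the rules.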
### API endpoints
- POST /v1/messages — Anthropic Messages API proxy (streaming + non-streaming)
- POST /v1/embeddings — OpenAI Embeddings API proxy
- POST /v1/chat/completions — OpenAI Chat Completions API proxy
- GET /health — health check
## Database (5 new tables)
- gateway_api_keys: external-user API keys (permissions, rate limits, budget, expiry)
- gateway_injection_rules: regulatory content injection rules (position, model match, key match)
- gateway_content_rules: content moderation rules (keyword/regex, block/warn/log)
- gateway_usage_logs: token usage records (by key, model, provider)
- gateway_audit_logs: request audit logs (IP, filter status, injection status)
## Admin backend (conversation-service)
4 NestJS controllers mounted under /conversations/admin/gateway/:
- AdminGatewayKeysController: key CRUD + toggle
- AdminGatewayInjectionRulesController: injection rule CRUD + toggle
- AdminGatewayContentRulesController: content moderation rule CRUD + toggle
- AdminGatewayDashboardController: dashboard summary, usage queries, audit log queries
5 ORM entity files mapping to the 5 database tables.
## Admin frontend (admin-client)
New features/llm-gateway module, a Tabs layout with 5 management panels:
- API Keys tab: create/delete/enable/disable keys; the full key is shown once at creation
- Injection Rules tab: configure regulatory content (prepend/append to the system prompt)
- Content Moderation tab: configure keyword/regex filter rules
- Usage Stats tab: view token consumption, cost, and response times
- Audit Logs tab: view request records, filter hits, and injection status
Menu item: GatewayOutlined + "LLM Gateway", placed between "System Director" and "Data Analysis".
## Infrastructure
- docker-compose.yml: add the llm-gateway service definition
- kong.yml: add /v1/messages, /v1/embeddings, /v1/chat/completions routes
- 300-second timeout (for long LLM responses)
- CORS: allow the X-Api-Key, anthropic-version, anthropic-beta headers
- init-db.sql: add CREATE TABLE statements for the 5 gateway tables
## Architecture notes
Internal services (conversation-service, knowledge-service, evolution-service) continue to call the API directly;
llm-gateway serves external users only. The two share configuration through the shared PostgreSQL database.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 22:32:25 -08:00