Go to file
hailin 94a14b3104 feat: migrate voice call from WebSocket/PCM to LiveKit WebRTC
实时语音对话架构迁移:WebSocket → LiveKit WebRTC

## 背景
原语音通话架构基于 FastAPI WebSocket 传输原始 PCM,管道串行执行
(VAD → 批量STT → Agent → 攒句 → 批量TTS),首音频延迟约 6 秒。
迁移到 LiveKit Agents 框架后,利用 WebRTC 传输 + 流水线并行,
预期延迟降至 1.5-2 秒。

## 架构
Flutter App ←── WebRTC (Opus/UDP) ──→ LiveKit Server ←──→ Voice Agent
  livekit_client                      (自部署, Go)       (Python, LiveKit Agents SDK)
                                                          ├─ VAD (Silero)
                                                          ├─ STT (faster-whisper / OpenAI)
                                                          ├─ LLM (自定义插件 → agent-service)
                                                          └─ TTS (Kokoro / OpenAI)

关键设计:LLM 不直接调用 Claude API,而是通过自定义插件代理到现有
agent-service,保留 Tool Use、会话历史、租户隔离等能力。

## 新增服务

### voice-agent (packages/services/voice-agent/)
LiveKit Agent Worker,包含:
- agent.py: 入口,prewarm() 预加载模型,entrypoint() 编排会话
- plugins/agent_llm.py: 自定义 LLM 插件,代理 agent-service API
  - POST /api/v1/agent/tasks 创建任务
  - WS /ws/agent 订阅流式事件 (stream_event)
  - 跨轮复用 session_id 保持对话上下文
- plugins/whisper_stt.py: 本地 faster-whisper STT (批量识别)
- plugins/kokoro_tts.py: 本地 Kokoro-82M TTS (24kHz PCM)
- config.py: pydantic-settings 配置

### LiveKit Server (deploy/docker/)
- livekit.yaml: 信令端口 7880, RTC TCP 7881, UDP 50000-50200
- docker-compose.yml: 新增 livekit-server + voice-agent 容器

### LiveKit Token 端点
- voice-service/src/api/livekit_token.py:
  POST /api/v1/voice/livekit/token
  生成 Room JWT,嵌入 auth_header 到 AgentDispatch metadata

## Flutter 客户端改造
- agent_call_page.dart: 从 ~814 行简化到 ~380 行
  - 替换: WebSocketChannel, AudioRecorder, PcmPlayer, 手动心跳/重连
  - 使用: Room.connect(), setMicrophoneEnabled(true), LiveKit 事件监听
  - 波形动画改用 participant.audioLevel
- pubspec.yaml: 添加 livekit_client: ^2.3.0
- app_config.dart: 增加 livekitUrl 字段
- api_endpoints.dart: 增加 livekitToken 端点

## 配置说明 (环境变量)
- STT_PROVIDER: local (默认, faster-whisper) / openai
- TTS_PROVIDER: local (默认, Kokoro) / openai
- WHISPER_MODEL: base (默认) / small / medium / large
- WHISPER_LANGUAGE: zh (默认)
- KOKORO_VOICE: zf_xiaoxiao (默认)
- DEVICE: cpu (默认) / cuda

## 不变的部分
- agent-service: 完全不改,voice-agent 通过现有 API 调用
- voice-service 核心: pipeline/STT/TTS/VAD 保留 (Twilio 备用)
- Kong 网关: 现有路由不变
- 数据库: 无 schema 变更

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 08:55:33 -08:00
deploy feat: migrate voice call from WebSocket/PCM to LiveKit WebRTC 2026-02-28 08:55:33 -08:00
docs docs: add comprehensive deployment guide 2026-02-18 16:54:00 -08:00
it0-web-admin feat: complete tenant member management (CRUD + delete tenant) 2026-02-26 10:00:09 -08:00
it0_app feat: migrate voice call from WebSocket/PCM to LiveKit WebRTC 2026-02-28 08:55:33 -08:00
packages feat: migrate voice call from WebSocket/PCM to LiveKit WebRTC 2026-02-28 08:55:33 -08:00
.dockerignore fix: add Dockerfiles and fix docker-compose build configuration 2026-02-19 04:31:23 -08:00
.env.example Initial commit: IT0 AI-powered server cluster operations platform 2026-02-08 22:54:37 -08:00
.gitignore fix: 修复 .gitignore 误忽略 Flutter data/models/ 源码导致构建失败 2026-02-22 16:29:03 -08:00
Dockerfile.service refactor: clean up agent SSH setup after fixing host-local routing 2026-02-26 18:11:44 -08:00
README.md Initial commit: IT0 AI-powered server cluster operations platform 2026-02-08 22:54:37 -08:00
entrypoint.sh refactor: clean up agent SSH setup after fixing host-local routing 2026-02-26 18:11:44 -08:00
logo.svg feat: rename app from IT0 to iAgent (我智能体) 2026-02-22 06:39:40 -08:00
package.json Initial commit: IT0 AI-powered server cluster operations platform 2026-02-08 22:54:37 -08:00
pnpm-lock.yaml chore: upgrade claude-agent-sdk to ^0.2.52 2026-02-24 04:12:03 -08:00
pnpm-workspace.yaml Initial commit: IT0 AI-powered server cluster operations platform 2026-02-08 22:54:37 -08:00
tsconfig.base.json Initial commit: IT0 AI-powered server cluster operations platform 2026-02-08 22:54:37 -08:00
turbo.json fix: rename turbo.json pipeline to tasks for Turbo 2.x compatibility 2026-02-19 04:44:25 -08:00

README.md

IT0 — AI-Powered Server Cluster Operations Platform

Intelligent operations platform that combines AI agents with human oversight for managing server clusters.

Architecture

  • Backend: NestJS microservices (TypeScript) with DDD + Clean Architecture
  • Mobile: Flutter app with Riverpod state management
  • Web Admin: Next.js dashboard with Zustand + React Query
  • Voice: Python service for voice-based interaction (STT/TTS/VAD)

Services

Service Description
auth-service Authentication, RBAC, API key management
agent-service AI agent orchestration (Claude CLI + API)
inventory-service Server, cluster, credential management
monitor-service Metrics collection, alerting, health checks
ops-service Task execution, approvals, standing orders
comm-service Multi-channel notifications, escalation
audit-service Audit logging, compliance trail
voice-service Voice pipeline (Python)

Quick Start

# Backend
pnpm install
pnpm dev

# Flutter
cd it0_app && flutter pub get && flutter run

# Web Admin
cd it0-web-admin && pnpm install && pnpm dev

Tech Stack

  • Runtime: Node.js 20+, Dart 3.x, Python 3.11+
  • Database: PostgreSQL (schema-per-tenant)
  • Cache/Events: Redis Streams
  • AI: Anthropic Claude (CLI + API)
  • Build: pnpm workspaces + Turborepo