hailin a6cd3c20d9 feat: add WebSocket robustness to voice call (heartbeat, reconnect, jitter buffer)
Addresses reliability gaps in the real-time voice WebSocket connection
between Flutter client and Python voice-service backend.

Backend (voice-service):
- Heartbeat: new _heartbeat_sender coroutine sends JSON ping text frames
  every 15s alongside the Pipecat pipeline; failed send = dead connection
- Session preservation: on WebSocket disconnect, sessions are now marked
  "disconnected" with a timestamp instead of being deleted, allowing
  reconnection within a configurable TTL (default 60s)
- Reconnect endpoint: POST /sessions/{id}/reconnect verifies the session
  is alive and in "disconnected" state, returns fresh websocket_url
- Reconnect-aware WS handler: detects "disconnected" sessions, cancels
  stale pipeline tasks, creates a new pipeline, sends "session.resumed"
- Background cleanup: asyncio loop every 30s removes sessions that have
  been disconnected longer than session_ttl
- Structured event protocol: text frames = JSON control messages
  (ping/pong/session.resumed/session.ended/error), binary = PCM audio
- New settings: session_ttl (60s), heartbeat_interval (15s),
  heartbeat_timeout (45s)
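
The backend behaviour listed above could be sketched roughly as follows. This is a minimal illustration, not the actual voice-service code: `SessionStore`, `Session`, `heartbeat_sender`, the `send_text` callable, and the `{"type": "ping"}` payload shape are all assumptions made for the example. Only the 60s TTL, the 15s interval, and the "disconnected" state come from the commit message.

```python
import asyncio
import json
import time
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, List, Optional

SESSION_TTL = 60.0          # seconds; session_ttl default from the commit message
HEARTBEAT_INTERVAL = 15.0   # seconds; heartbeat_interval default


@dataclass
class Session:
    session_id: str
    status: str = "active"                  # "active" or "disconnected"
    disconnected_at: Optional[float] = None


class SessionStore:
    """Hypothetical in-memory store; the real service may be organised differently."""

    def __init__(self, ttl: float = SESSION_TTL) -> None:
        self.ttl = ttl
        self.sessions: Dict[str, Session] = {}

    def create(self, session_id: str) -> Session:
        session = Session(session_id)
        self.sessions[session_id] = session
        return session

    def mark_disconnected(self, session_id: str, now: Optional[float] = None) -> None:
        # Keep the session (instead of deleting it) so the client can reconnect.
        session = self.sessions[session_id]
        session.status = "disconnected"
        session.disconnected_at = time.monotonic() if now is None else now

    def can_reconnect(self, session_id: str, now: Optional[float] = None) -> bool:
        # Mirrors the check described for POST /sessions/{id}/reconnect.
        now = time.monotonic() if now is None else now
        session = self.sessions.get(session_id)
        if session is None or session.status != "disconnected":
            return False
        return now - session.disconnected_at <= self.ttl

    def cleanup_expired(self, now: Optional[float] = None) -> List[str]:
        # Meant to run from a periodic background loop (every 30s in the commit).
        now = time.monotonic() if now is None else now
        expired = [
            sid for sid, s in self.sessions.items()
            if s.status == "disconnected" and now - s.disconnected_at > self.ttl
        ]
        for sid in expired:
            del self.sessions[sid]
        return expired


async def heartbeat_sender(
    send_text: Callable[[str], Awaitable[None]],
    interval: float = HEARTBEAT_INTERVAL,
) -> None:
    """Send a JSON ping text frame every `interval` seconds until a send fails."""
    while True:
        await asyncio.sleep(interval)
        try:
            await send_text(json.dumps({"type": "ping"}))  # payload shape is assumed
        except Exception:
            # A failed send is treated as a dead connection; stop the loop.
            return
```

Passing `now` explicitly keeps the TTL logic testable without waiting for real time to elapse; in a running service the defaults based on `time.monotonic()` would apply.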

Flutter (agent_call_page.dart):
- Heartbeat monitoring: tracks last server ping timestamp, triggers
  reconnect if no ping received in 45s (3 missed intervals)
- Auto-reconnect: exponential backoff (1s→2s→4s→8s→16s), max 5 attempts;
  calls /reconnect endpoint to verify session, rebuilds WebSocket,
  resets audio buffer, restarts heartbeat
- Reconnecting UI: yellow warning banner "重新连接中... (N/5)"
  ("Reconnecting... (N/5)") with spinner overlay during reconnection attempts
- WebSocket data routing: _onWsData distinguishes String (JSON control)
  from binary (audio) frames, handles ping/session.resumed/session.ended
- User-initiated disconnect guard: _userEndedCall flag prevents reconnect
  attempts when user intentionally hangs up
- session_id field compatibility: supports session_id/sessionId/id
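
The client-side decisions listed above are implemented in Dart in agent_call_page.dart; the sketch below restates them in Python purely for illustration. The function names and the `type` field of control messages are assumptions. The 45s timeout, the 1s to 16s backoff schedule, the 5-attempt limit, and the three accepted session ID keys come from the commit message.

```python
import json
from typing import Iterator, Optional, Tuple, Union

HEARTBEAT_TIMEOUT = 45.0   # seconds without a server ping (3 missed 15s intervals)
MAX_ATTEMPTS = 5


def backoff_delays(base: float = 1.0, attempts: int = MAX_ATTEMPTS) -> Iterator[float]:
    """Yield the delay before each reconnect attempt, doubling every time."""
    for attempt in range(attempts):
        yield base * (2 ** attempt)


def classify_frame(frame: Union[str, bytes]) -> Tuple[str, object]:
    """Binary frames are PCM audio; text frames are JSON control messages."""
    if isinstance(frame, (bytes, bytearray)):
        return "audio", bytes(frame)
    message = json.loads(frame)
    # The "type" key is an assumed field name for ping/session.resumed/etc.
    return message.get("type", "unknown"), message


def extract_session_id(payload: dict) -> Optional[str]:
    """Accept any of the three spellings the commit says the client supports."""
    for key in ("session_id", "sessionId", "id"):
        if payload.get(key):
            return str(payload[key])
    return None


def should_reconnect(last_ping_at: float, now: float, user_ended_call: bool) -> bool:
    """Reconnect only on heartbeat timeout, never after an intentional hang-up."""
    if user_ended_call:
        return False
    return now - last_ping_at >= HEARTBEAT_TIMEOUT
```

With the defaults, `list(backoff_delays())` yields `[1.0, 2.0, 4.0, 8.0, 16.0]`, so a client that exhausts all five attempts has waited 31 seconds in total, which fits inside the 60s session TTL.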

Flutter (pcm_player.dart):
- Jitter buffer: queues incoming PCM chunks, starts playback only after
  accumulating 4800 bytes (150ms at 16kHz 16-bit mono) to smooth out
  network timing variance
- reset() method: clears buffer on reconnect to discard stale audio
- Buffer underrun handling: re-enters buffering phase if queue empties
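
A jitter buffer with this behaviour might look like the sketch below. The real implementation lives in pcm_player.dart; this Python version is only an illustration, and the `JitterBuffer` class with its `push`/`pop` methods is hypothetical. The 4800-byte threshold, the `reset()` method, and the re-buffering on underrun are taken from the commit message.

```python
from collections import deque
from typing import Deque, Optional

SAMPLE_RATE = 16_000       # Hz
BYTES_PER_SAMPLE = 2       # 16-bit mono
PREBUFFER_BYTES = 4800     # 4800 / (16000 * 2) s = 150 ms of audio


class JitterBuffer:
    """Queue PCM chunks and release them only after a pre-buffer has filled."""

    def __init__(self, threshold: int = PREBUFFER_BYTES) -> None:
        self.threshold = threshold
        self.queue: Deque[bytes] = deque()
        self.buffered_bytes = 0
        self.buffering = True

    def push(self, chunk: bytes) -> None:
        self.queue.append(chunk)
        self.buffered_bytes += len(chunk)
        # Leave the buffering phase once enough audio has accumulated.
        if self.buffering and self.buffered_bytes >= self.threshold:
            self.buffering = False

    def pop(self) -> Optional[bytes]:
        """Return the next chunk to play, or None while still buffering."""
        if self.buffering or not self.queue:
            return None
        chunk = self.queue.popleft()
        self.buffered_bytes -= len(chunk)
        if not self.queue:
            # Underrun: wait for the threshold again before resuming playback.
            self.buffering = True
        return chunk

    def reset(self) -> None:
        """Discard stale audio, for example after a reconnect."""
        self.queue.clear()
        self.buffered_bytes = 0
        self.buffering = True
```

The threshold trades latency for smoothness: a larger pre-buffer absorbs more network timing variance but delays the start of playback by the same amount.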

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 07:32:19 -08:00
deploy fix: run services as non-root user for SDK bypassPermissions 2026-02-23 06:41:10 -08:00
docs docs: add comprehensive deployment guide 2026-02-18 16:54:00 -08:00
it0-web-admin feat: rename app from IT0 to iAgent (我智能体) 2026-02-22 06:39:40 -08:00
it0_app feat: add WebSocket robustness to voice call (heartbeat, reconnect, jitter buffer) 2026-02-23 07:32:19 -08:00
packages feat: add WebSocket robustness to voice call (heartbeat, reconnect, jitter buffer) 2026-02-23 07:32:19 -08:00
.dockerignore fix: add Dockerfiles and fix docker-compose build configuration 2026-02-19 04:31:23 -08:00
.env.example Initial commit: IT0 AI-powered server cluster operations platform 2026-02-08 22:54:37 -08:00
.gitignore fix: .gitignore wrongly ignored Flutter data/models/ source code, causing build failures 2026-02-22 16:29:03 -08:00
Dockerfile.service fix: install bash in Alpine container for Agent SDK shell access 2026-02-23 06:52:23 -08:00
README.md Initial commit: IT0 AI-powered server cluster operations platform 2026-02-08 22:54:37 -08:00
logo.svg feat: rename app from IT0 to iAgent (我智能体) 2026-02-22 06:39:40 -08:00
package.json Initial commit: IT0 AI-powered server cluster operations platform 2026-02-08 22:54:37 -08:00
pnpm-lock.yaml fix: update pnpm-lock.yaml for @anthropic-ai/claude-agent-sdk dependency 2026-02-21 22:07:21 -08:00
pnpm-workspace.yaml Initial commit: IT0 AI-powered server cluster operations platform 2026-02-08 22:54:37 -08:00
tsconfig.base.json Initial commit: IT0 AI-powered server cluster operations platform 2026-02-08 22:54:37 -08:00
turbo.json fix: rename turbo.json pipeline to tasks for Turbo 2.x compatibility 2026-02-19 04:44:25 -08:00

README.md

IT0 — AI-Powered Server Cluster Operations Platform

IT0 is an intelligent operations platform that combines AI agents with human oversight to manage server clusters.

Architecture

  • Backend: NestJS microservices (TypeScript) with DDD + Clean Architecture
  • Mobile: Flutter app with Riverpod state management
  • Web Admin: Next.js dashboard with Zustand + React Query
  • Voice: Python service for voice-based interaction (STT/TTS/VAD)

Services

| Service | Description |
|---|---|
| auth-service | Authentication, RBAC, API key management |
| agent-service | AI agent orchestration (Claude CLI + API) |
| inventory-service | Server, cluster, credential management |
| monitor-service | Metrics collection, alerting, health checks |
| ops-service | Task execution, approvals, standing orders |
| comm-service | Multi-channel notifications, escalation |
| audit-service | Audit logging, compliance trail |
| voice-service | Voice pipeline (Python) |

Quick Start

# Backend
pnpm install
pnpm dev

# Flutter
cd it0_app && flutter pub get && flutter run

# Web Admin
cd it0-web-admin && pnpm install && pnpm dev

Tech Stack

  • Runtime: Node.js 20+, Dart 3.x, Python 3.11+
  • Database: PostgreSQL (schema-per-tenant)
  • Cache/Events: Redis Streams
  • AI: Anthropic Claude (CLI + API)
  • Build: pnpm workspaces + Turborepo