Flutter (agent_call_page.dart): - Add ConnectOptions with 15s timeouts for connection/peerConnection/iceRestart - Add RoomReconnectingEvent/RoomAttemptReconnectEvent/RoomReconnectedEvent listeners with "网络重连中" UI indicator during reconnection - Add TimeoutException detection in _friendlyError() voice-agent (agent.py): - Wrap entrypoint() in try-except with full traceback logging - Register room disconnect listener to close httpx clients (instead of finally block, since session.start() returns while session runs in bg) - Add asyncio import for ensure_future cleanup voice-agent LLM proxy (agent_llm.py): - Add retry with exponential backoff (max 2 retries, 1s/3s delays) for network errors (ConnectError/ConnectTimeout/OSError) and WS InvalidStatusCode - Extract _do_stream() method for single-attempt logic - Add WebSocket connection params: open_timeout=10, ping_interval=20, ping_timeout=10 for keepalive and faster dead-connection detection - Use granular httpx.Timeout(connect=10, read=30, write=10, pool=10) - Increase WS recv timeout from 5s to 30s to reduce unnecessary loops Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|---|---|---|
| deploy | ||
| docs | ||
| it0-web-admin | ||
| it0_app | ||
| packages | ||
| .dockerignore | ||
| .env.example | ||
| .gitignore | ||
| Dockerfile.service | ||
| README.md | ||
| entrypoint.sh | ||
| logo.svg | ||
| package.json | ||
| pnpm-lock.yaml | ||
| pnpm-workspace.yaml | ||
| tsconfig.base.json | ||
| turbo.json | ||
README.md
IT0 — AI-Powered Server Cluster Operations Platform
Intelligent operations platform that combines AI agents with human oversight for managing server clusters.
Architecture
- Backend: NestJS microservices (TypeScript) with DDD + Clean Architecture
- Mobile: Flutter app with Riverpod state management
- Web Admin: Next.js dashboard with Zustand + React Query
- Voice: Python service for voice-based interaction (STT/TTS/VAD)
Services
| Service | Description |
|---|---|
| auth-service | Authentication, RBAC, API key management |
| agent-service | AI agent orchestration (Claude CLI + API) |
| inventory-service | Server, cluster, credential management |
| monitor-service | Metrics collection, alerting, health checks |
| ops-service | Task execution, approvals, standing orders |
| comm-service | Multi-channel notifications, escalation |
| audit-service | Audit logging, compliance trail |
| voice-service | Voice pipeline (Python) |
Quick Start
# Backend
pnpm install
pnpm dev
# Flutter
cd it0_app && flutter pub get && flutter run
# Web Admin
cd it0-web-admin && pnpm install && pnpm dev
Tech Stack
- Runtime: Node.js 20+, Dart 3.x, Python 3.11+
- Database: PostgreSQL (schema-per-tenant)
- Cache/Events: Redis Streams
- AI: Anthropic Claude (CLI + API)
- Build: pnpm workspaces + Turborepo