Commit Graph

6 Commits

Author SHA1 Message Date
hailin 7afbd54fce fix: rewrite voice pipeline for direct WebSocket I/O, fix TTS and navigation
Root cause: Pipecat's WebsocketServerTransport creates its own WebSocket
server on (host,port) and expects FrameProcessor subclasses. Our code was
passing a FastAPI WebSocket object as 'host' and using plain STT/TTS/VAD
service classes that aren't FrameProcessors. The pipeline crashed immediately
when receiving audio, causing "disconnects when speaking".

Changes:
- **base_pipeline.py**: Complete rewrite — replaced Pipecat Pipeline with
  direct async loop: WebSocket → VAD → STT → Claude LLM → TTS → WebSocket.
  Supports barge-in (interrupt TTS when user speaks), audio chunking, and
  24kHz→16kHz TTS resampling.
- **session_router.py**: Pass WebSocket directly to pipeline instead of
  wrapping in AppTransport.
- **app_transport.py**: Deprecated (no longer needed).
- **kokoro_service.py**: Fix misaki compatibility (MutableToken→MToken
  rename), use correct Chinese voice 'zf_xiaoxiao', handle torch tensors.
- **main.py**: Apply misaki monkey-patch before importing kokoro.
- **settings.py**: Change default TTS voice from 'zh_female_1' (non-existent)
  to 'zf_xiaoxiao' (valid Kokoro-82M Chinese female voice).
- **requirements.txt**: Remove pipecat-ai dependency, pin kokoro==0.3.5 +
  misaki==0.7.17, add Chinese NLP deps (pypinyin, cn2an, jieba, ordered-set).
- **agent_call_page.dart**: Wrap each cleanup step in try/catch to ensure
  Navigator.pop() always executes after call ends. Add 3s timeout on session
  delete request.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 23:34:35 -08:00
hailin 9cdc4933dc fix: add python-multipart dependency for voice-service
Required by FastAPI for form/file upload parsing. Missing dependency
may cause import errors and container restart loops.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 16:10:50 -08:00
hailin 39718a9a09 fix: resolve runtime errors for NestJS, Kong, and voice-service
- Dockerfile.service: fix entry point path (dist/services/{name}/src/main)
  due to tsconfig paths widening rootDir during compilation
- Kong config: remove unsupported ws/wss protocols (WebSocket works
  automatically over http/https in Kong 3.7)
- voice-service: fix pipecat import path for v0.0.30 API
  (pipecat.transports.network.websocket_server with lowercase class names)
- voice-service: add openai dependency required by pipecat anthropic service

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 19:00:03 -08:00
hailin 93c4a21f06 fix: upgrade faster-whisper to 1.2.1 to resolve av build failure
faster-whisper 1.0.0 depends on av==11.* which has no prebuilt wheels
and fails to compile. Version 1.2.1 uses av 12+ with prebuilt wheels.
Also removed unnecessary FFmpeg dev libraries from Dockerfile.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 16:40:04 -08:00
hailin 9a95cdc4a9 fix: update numpy to 1.26.4 for pipecat-ai compatibility
pipecat-ai==0.0.30 requires numpy~=1.26.4, conflicting with 1.26.0.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 05:09:01 -08:00
hailin 00f8801d51 Initial commit: IT0 AI-powered server cluster operations platform
Full-stack monorepo with DDD + Clean Architecture:
- Backend: 7 NestJS microservices + 5 shared libraries (TypeScript)
- Mobile: Flutter app with Riverpod (Dart)
- Web Admin: Next.js dashboard with Zustand + React Query
- Voice: Python voice service (STT/TTS/VAD)
- Infra: Docker Compose, K8s manifests, Turborepo build

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 22:54:37 -08:00