The await on sink.close() blocks indefinitely when the server doesn't
respond to the close handshake. Use fire-and-forget with unawaited()
so the new connection can proceed immediately.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When sending a second message in the same session, the old WebSocket
connection was not closed, causing both connections to subscribe to the
same session room. This resulted in each text event being received twice,
producing garbled/duplicated output text.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
WebSocketChannel.connect does not accept headers parameter in
web_socket_channel 2.4.0. Use IOWebSocketChannel.connect instead.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
WebSocket connections to /ws/agent were rejected by Kong (401)
because the Authorization header was not included. Now reads
access_token from secure storage and passes it in the WebSocket
upgrade request headers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Addresses reliability gaps in the real-time voice WebSocket connection
between Flutter client and Python voice-service backend.
Backend (voice-service):
- Heartbeat: new _heartbeat_sender coroutine sends JSON ping text frames
every 15s alongside the Pipecat pipeline; failed send = dead connection
- Session preservation: on WebSocket disconnect, sessions are now marked
"disconnected" with a timestamp instead of being deleted, allowing
reconnection within a configurable TTL (default 60s)
- Reconnect endpoint: POST /sessions/{id}/reconnect verifies the session
is alive and in "disconnected" state, returns fresh websocket_url
- Reconnect-aware WS handler: detects "disconnected" sessions, cancels
stale pipeline tasks, creates a new pipeline, sends "session.resumed"
- Background cleanup: asyncio loop every 30s removes sessions that have
been disconnected longer than session_ttl
- Structured event protocol: text frames = JSON control messages
(ping/pong/session.resumed/session.ended/error), binary = PCM audio
- New settings: session_ttl (60s), heartbeat_interval (15s),
heartbeat_timeout (45s)
Flutter (agent_call_page.dart):
- Heartbeat monitoring: tracks last server ping timestamp, triggers
reconnect if no ping received in 45s (3 missed intervals)
- Auto-reconnect: exponential backoff (1s→2s→4s→8s→16s), max 5 attempts;
calls /reconnect endpoint to verify session, rebuilds WebSocket,
resets audio buffer, restarts heartbeat
- Reconnecting UI: yellow warning banner "重新连接中... (N/5)" with
spinner overlay during reconnection attempts
- WebSocket data routing: _onWsData distinguishes String (JSON control)
from binary (audio) frames, handles ping/session.resumed/session.ended
- User-initiated disconnect guard: _userEndedCall flag prevents reconnect
attempts when user intentionally hangs up
- session_id field compatibility: supports session_id/sessionId/id
Flutter (pcm_player.dart):
- Jitter buffer: queues incoming PCM chunks, starts playback only after
accumulating 4800 bytes (150ms at 16kHz 16-bit mono) to smooth out
network timing variance
- reset() method: clears buffer on reconnect to discard stale audio
- Buffer underrun handling: re-enters buffering phase if queue empties
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace traditional on-device speech_to_text with a modern pipeline:
- Record audio via `record` package with hardware noise suppression
- Apply GTCRN neural denoising (sherpa-onnx, ICASSP 2024, 48K params)
- Trim silence, POST to backend /voice/transcribe (faster-whisper)
Changes:
- Add /transcribe endpoint to voice-service for audio file upload
- Add SpeechEnhancer wrapper for sherpa-onnx GTCRN model (523KB)
- Rewrite chat_page.dart voice input: record → denoise → transcribe
- Keep NoiseReducer.trimSilence for silence removal only
- Upgrade record to v6.2.0, add sherpa_onnx, path_provider
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>