Commit Graph

66 Commits

Author SHA1 Message Date
hailin 5460be8c04 feat: add TTS voice and style settings to Flutter app
Add user-configurable TTS voice and tone style settings that flow from
the Flutter app through the backend to the voice-agent at call time.

## Flutter App (it0_app)

### Domain Layer
- app_settings.dart: Add `ttsVoice` (default: 'coral') and `ttsStyle`
  (default: '') fields to AppSettings entity with copyWith support

### Data Layer
- settings_datasource.dart: Add SharedPreferences keys
  `settings_tts_voice` and `settings_tts_style` for local persistence
  in loadSettings(), saveSettings(), and clearSettings()

### Presentation Layer
- settings_providers.dart: Add `setTtsVoice()` and `setTtsStyle()`
  methods to SettingsNotifier for Riverpod state management
- settings_page.dart: Add "语音" (Voice) settings group between
  Notifications and Security groups with:
  - Voice picker: 13 OpenAI voices with gender/style labels
    (e.g. "女 · 温暖" (female, warm), "男 · 沉稳" (male, steady),
    "中性" (neutral)) in a BottomSheet
  - Style picker: 5 presets (专业干练/温柔耐心/轻松活泼/严肃正式/科幻AI, i.e.
    professional and crisp / gentle and patient / relaxed and lively /
    serious and formal / sci-fi AI) as ChoiceChips + custom text input
    field + reset button

### Call Flow
- agent_call_page.dart: Send `tts_voice` and `tts_style` in the POST
  body when requesting a LiveKit token at call initiation

## Backend

### voice-service (Python/FastAPI)
- livekit_token.py: Accept optional `tts_voice` and `tts_style` via
  Pydantic TokenRequest body model; embed them in RoomAgentDispatch
  metadata JSON alongside auth_header (backward compatible)

### voice-agent (Python/LiveKit Agents)
- agent.py: Extract `tts_voice` and `tts_style` from ctx.job.metadata;
  use them when creating openai_plugin.TTS() — user-selected voice
  overrides config default, user-selected style overrides default
  instructions. Falls back to config defaults when not provided.

## Data Flow
Flutter Settings → SharedPreferences → POST /livekit/token body →
voice-service embeds in RoomAgentDispatch metadata →
voice-agent reads from ctx.job.metadata → TTS creation
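
The fallback rule at the end of this flow can be sketched in Python as follows. This is a minimal illustration, not the actual agent.py: the helper name `resolve_tts_settings` and the `DEFAULT_INSTRUCTIONS` text are hypothetical, and treating 'coral' as the agent-side default is an assumption (the commit only states it as the Flutter-side default).

```python
import json

DEFAULT_VOICE = "coral"                  # assumed agent-side default
DEFAULT_INSTRUCTIONS = "Speak clearly."  # hypothetical config default


def resolve_tts_settings(metadata_json,
                         default_voice=DEFAULT_VOICE,
                         default_instructions=DEFAULT_INSTRUCTIONS):
    """Return (voice, instructions) from dispatch metadata, with fallbacks."""
    try:
        meta = json.loads(metadata_json) if metadata_json else {}
    except json.JSONDecodeError:
        meta = {}
    if not isinstance(meta, dict):
        meta = {}
    # Missing or empty values fall back to the config defaults, which keeps
    # older clients that send only auth_header working unchanged.
    voice = meta.get("tts_voice") or default_voice
    instructions = meta.get("tts_style") or default_instructions
    return voice, instructions
```

Treating an empty `tts_style` the same as a missing one matches the Flutter default of `''` for that field.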

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 09:38:15 -08:00
hailin 46a2d06be3 fix: implement speaker/earpiece toggle on voice call page
Use Hardware.instance.setSpeakerphoneOn() to switch between speaker
and earpiece modes. Default to speaker on.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 11:11:29 -08:00
hailin 19efeec26d fix: remove unsupported audioBitrate param from AudioPublishOptions
livekit_client 2.6.4 no longer has audioBitrate parameter.
Default AudioPublishOptions auto-selects optimal speech bitrate.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 10:13:11 -08:00
hailin 94a14b3104 feat: migrate voice call from WebSocket/PCM to LiveKit WebRTC
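Real-time voice conversation architecture migration: WebSocket → LiveKit WebRTC

## Background
The original voice call architecture streamed raw PCM over a FastAPI
WebSocket and ran the pipeline serially (VAD → batch STT → Agent →
sentence accumulation → batch TTS), with a first-audio latency of about
6 seconds. After migrating to the LiveKit Agents framework, which uses
WebRTC transport plus pipeline parallelism, latency is expected to drop
to 1.5-2 seconds.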

## Architecture
Flutter App ←── WebRTC (Opus/UDP) ──→ LiveKit Server ←──→ Voice Agent
  livekit_client                      (self-hosted, Go)  (Python, LiveKit Agents SDK)
                                                          ├─ VAD (Silero)
                                                          ├─ STT (faster-whisper / OpenAI)
                                                          ├─ LLM (custom plugin → agent-service)
                                                          └─ TTS (Kokoro / OpenAI)

Key design: the LLM does not call the Claude API directly. A custom
plugin proxies to the existing agent-service, preserving Tool Use,
conversation history, tenant isolation, and related capabilities.

## New services

### voice-agent (packages/services/voice-agent/)
LiveKit Agent Worker, containing:
- agent.py: entry point; prewarm() preloads models, entrypoint()
  orchestrates the session
- plugins/agent_llm.py: custom LLM plugin that proxies the agent-service API
  - POST /api/v1/agent/tasks creates a task
  - WS /ws/agent subscribes to streaming events (stream_event)
  - Reuses session_id across turns to keep conversation context
- plugins/whisper_stt.py: local faster-whisper STT (batch recognition)
- plugins/kokoro_tts.py: local Kokoro-82M TTS (24kHz PCM)
- config.py: pydantic-settings configuration

### LiveKit Server (deploy/docker/)
- livekit.yaml: signaling port 7880, RTC TCP 7881, UDP 50000-50200
- docker-compose.yml: adds livekit-server + voice-agent containers

### LiveKit Token endpoint
- voice-service/src/api/livekit_token.py:
  POST /api/v1/voice/livekit/token
  Generates a Room JWT and embeds auth_header in the AgentDispatch metadata

## Flutter client changes
- agent_call_page.dart: simplified from ~814 lines to ~380 lines
  - Replaced: WebSocketChannel, AudioRecorder, PcmPlayer, manual heartbeat/reconnect
  - Now uses: Room.connect(), setMicrophoneEnabled(true), LiveKit event listeners
  - Waveform animation now driven by participant.audioLevel
- pubspec.yaml: add livekit_client: ^2.3.0
- app_config.dart: add livekitUrl field
- api_endpoints.dart: add livekitToken endpoint

## Configuration (environment variables)
- STT_PROVIDER: local (default, faster-whisper) / openai
- TTS_PROVIDER: local (default, Kokoro) / openai
- WHISPER_MODEL: base (default) / small / medium / large
- WHISPER_LANGUAGE: zh (default)
- KOKORO_VOICE: zf_xiaoxiao (default)
- DEVICE: cpu (default) / cuda

## Unchanged
- agent-service: not modified at all; voice-agent calls it through the existing API
- voice-service core: pipeline/STT/TTS/VAD retained (as a Twilio fallback)
- Kong gateway: existing routes unchanged
- Database: no schema changes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 08:55:33 -08:00
hailin 7e44ddc358 fix: file picker now shows subdirectories on Android
FileType.custom with allowedExtensions causes Android system picker
to hide subdirectories on some devices. Changed to FileType.any with
post-selection extension validation instead.

- Unsupported file types are skipped with a SnackBar hint
- Allowed: jpg, jpeg, png, gif, webp, pdf
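
The post-selection check described above amounts to a case-insensitive extension filter. A minimal Python sketch of the idea follows; the function name `split_by_extension` is hypothetical, and the real change lives in the Dart file-picker code.

```python
from pathlib import Path

# Extension allow-list taken from the commit message.
ALLOWED_EXTENSIONS = {"jpg", "jpeg", "png", "gif", "webp", "pdf"}


def split_by_extension(paths):
    """Split picked paths into (accepted, skipped) by file extension."""
    accepted, skipped = [], []
    for path in paths:
        ext = Path(path).suffix.lower().lstrip(".")
        if ext in ALLOWED_EXTENSIONS:
            accepted.append(path)
        else:
            # The caller would surface these in a SnackBar-style hint.
            skipped.append(path)
    return accepted, skipped
```

Validating after selection trades a stricter picker for one that still shows every directory, which is the point of switching to FileType.any.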

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 06:02:47 -08:00
hailin 3025910095 ui: transparent compact AppBar (64dp → 44dp)
- AppBar background transparent, merges with scaffold for seamless look
- toolbarHeight reduced from 64dp to 44dp (~20dp screen space saved)
- scrolledUnderElevation: 0 prevents Material 3 shadow on scroll
- Icons 24→20px with VisualDensity.compact for tighter action buttons
- Title fontSize 16 w600, less visual weight
- Both dark and light themes updated consistently

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 05:20:23 -08:00
hailin ed39518a71 feat: floating pill input bar + auto-scroll on history load
Input area redesign (ChatGPT/Claude App style):
- Replace fixed bottom bar with floating pill overlay using Stack+Positioned
- Semi-transparent background (surface 92% opacity) with rounded corners (28px)
- Drop shadow for depth separation from content
- Remove inner TextField border (InputBorder.none) for cleaner look
- ListView bottom padding increased to 80px to leave room under the pill
- Input pill floats 12px from edges, 8px from bottom

History scroll fix:
- Add jump parameter to _scrollToBottom() for instant positioning
- When loading conversation history (empty→many messages), use jumpTo
  instead of animateTo to avoid incomplete scroll on large message lists
- Double-frame jumpTo ensures layout settles before final scroll position

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 05:15:18 -08:00
hailin 1f1bf18a75 fix: remove clipboard paste menu item, fix timeline line overlap, dim input placeholder
- Remove redundant "从剪贴板粘贴" (Paste from clipboard) option from attachment menu (long-press to paste natively)
- Remove super_clipboard dependency (no longer needed)
- Fix timeline vertical line overlapping icon nodes by using dynamic dotRadius
- Dim input field placeholder color to AppColors.textMuted

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 05:05:27 -08:00
hailin cfc0a97da7 fix: correct super_clipboard getFile API call signature
getFile requires two positional args: format and callback.
Wrapped in Completer for async/await usage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 04:45:19 -08:00
hailin 5f28605e13 feat: add clipboard paste, multi-image select, and file picker
- Add super_clipboard and file_picker dependencies
- Clipboard paste: reads PNG/JPEG image data from system clipboard
- Multi-image: pickMultiImage with remaining count limit
- File picker: supports images (jpg/png/gif/webp) and PDF files
- Updated attachment preview to show file icon for non-image types
- Bottom sheet now shows 4 options: gallery, camera, clipboard, file

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 04:32:16 -08:00
hailin e4c2505048 feat: add multimodal image input with streaming markdown optimization
Two major features in this commit:

1. Streaming Markdown Rendering Optimization
   - Replace deprecated flutter_markdown with gpt_markdown (active, AI-optimized)
   - Real-time markdown rendering during streaming (was showing raw syntax)
   - Solid block cursor (█) instead of AnimationController blink
   - 80ms token throttle buffer reducing rebuilds from per-token to ~12.5/sec
   - RepaintBoundary isolation for markdown widget repaints
   - StreamTextWidget simplified from StatefulWidget to StatelessWidget

2. Multimodal Image Input (camera + gallery + display)
   - Flutter: image_picker for gallery/camera, base64 encoding, attachment
     preview strip with delete, thumbnails in sent messages
   - Data layer: List<String>? → List<Map<String, dynamic>>? for structured
     attachment payloads through datasource/repository/usecase
   - ChatAttachment model with base64Data, mediaType, fileName
   - ChatMessage entity + ChatMessageModel both support attachments field
   - Backend DTO, Entity (JSONB), Controller, ConversationContextService
     all extended to receive, store, and reconstruct Anthropic image
     content blocks in loadContext()
   - Claude API engine skips duplicate user message when history already
     ends with multimodal content blocks
   - NestJS body parser limit raised to 10MB for base64 image payloads
   - Android CAMERA permission added to manifest
   - Image.memory uses cacheWidth/cacheHeight for memory efficiency
   - Max 5 images per message enforced in UI

Data flow:
  ImagePicker → base64Encode → ChatAttachment → POST body →
  DB (JSONB) → loadContext → Anthropic image content blocks → Claude API
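
The 80ms token throttle from part 1 can be illustrated with a small buffer that coalesces tokens and releases them at most once per window, which caps rebuilds at 1000 / 80 = 12.5 per second. This is a hedged Python sketch, not the Dart implementation: the class name `ThrottleBuffer` and the explicit `now_ms` argument are assumptions made so the example is deterministic.

```python
class ThrottleBuffer:
    """Coalesce streamed tokens so the UI rebuilds at most once per window."""

    def __init__(self, interval_ms=80):
        self.interval_ms = interval_ms
        self._pending = []
        self._last_flush_ms = None

    def push(self, token, now_ms):
        """Buffer a token; return text to render if the window elapsed, else None."""
        self._pending.append(token)
        if (self._last_flush_ms is None
                or now_ms - self._last_flush_ms >= self.interval_ms):
            return self.flush(now_ms)
        return None

    def flush(self, now_ms):
        """Release everything buffered so far (also call this at stream end)."""
        text = "".join(self._pending)
        self._pending.clear()
        self._last_flush_ms = now_ms
        return text
```

A production version would also need a timer so that trailing tokens are flushed even when no further token arrives; here the caller is expected to call `flush()` when the stream completes.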

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 03:24:17 -08:00
hailin 89f0f6134d fix: resolve bottom overflow issues in chat page timeline rendering
Three root causes fixed:

1. TimelineEventNode: Replaced IntrinsicHeight (which forces intrinsic
   height calculation on unbounded content) with CustomPaint-based
   _TimelineLinePainter that draws vertical lines based on actual
   rendered widget size. Also added maxLines/ellipsis to label text
   and mainAxisSize.min on inner Column.

2. ApprovalActionCard: Changed countdown + action buttons layout from
   Row with Spacer (which requires infinite width) to Wrap with
   spacing, preventing horizontal overflow on narrow screens.

3. AnimatedCrossFade in _CollapsibleCodeBlock and _CollapsibleThinking:
   Wrapped with ClipRect and added sizeCurve: Curves.easeInOut to
   prevent the outgoing child from extending beyond parent bounds
   during the cross-fade transition.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 01:38:37 -08:00
hailin 50dbb641a3 fix: comprehensive hardening of agent task cancel/inject/approve flows
6 rounds of systematic audit identified and fixed 14 bugs across
backend controller and Flutter client:

## Backend (agent.controller.ts)

Security & Tenant Isolation:
- Add @TenantId + ForbiddenException check to cancelTask, injectMessage,
  approveCommand — all 4 write endpoints now enforce tenant isolation
- Add tenantId check on session reuse in executeTask to prevent
  cross-tenant session hijacking

Architecture & Correctness:
- Extract shared runTaskStream() from inline fire-and-forget block,
  used by both executeTask and injectMessage to reduce duplication
- Use session.engineType (not getActiveEngine()) in cancelTask,
  injectMessage, approveCommand — fixes wrong-engine-cancel when
  global engine config is switched after task creation
- Add concurrent task prevention: executeTask checks for existing
  RUNNING task on same session and cancels it before starting new one
- Add runningTasks Map to track task promises, awaitTaskCleanup()
  helper with 3s timeout for inject to wait for partial text save
- captureSdkSessionId() captures SDK session ID into metadata
  without DB save (callers persist), preventing fire-and-forget race

Cancel/Reject Improvements:
- cancelTask: idempotent (returns early if already CANCELLED/COMPLETED),
  session stays 'active' (was 'cancelled'), emits cancelled WS event
- approveCommand reject: session stays 'active' (was 'cancelled'),
  now emits cancelled WS event so Flutter stream listeners clean up
- approveCommand approved: collect text events and save assistant
  response to conversation history on completion (was missing)

Minor:
- task.result! non-null assertion → task.result ?? 'Unknown error'
- Add findRunningBySessionId() to TaskRepository

## Flutter

API Contract Fix:
- approveCommand: route changed from /api/v1/ops/approvals/:id/approve
  to /api/v1/agent/tasks/:id/approve with {approved: true} body
- rejectCommand: route changed from /api/v1/ops/approvals/:id/reject
  to /api/v1/agent/tasks/:id/approve with {approved: false} body

Resource Management:
- ChatNotifier.dispose() now disconnects WebSocket to prevent
  connection leak when navigating away from chat

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 22:20:46 -08:00
hailin d5f663f7af feat: inject-message support for mid-stream task interruption
Backend (agent-engine.port.ts):
- Add `cancelled` event type: emitted when a task is cancelled (user-initiated
  or injection), so Flutter can close the old stream cleanly
- Add `task_info` event type: emitted after inject to pass the new taskId to
  the client, enabling cancel/re-inject on the replacement task

Flutter (features/chat/):
- ChatState: track current `taskId` alongside `sessionId`; clear on completion
  or error
- Handle `TaskInfoEvent`: update taskId in state when server issues a new task
- Handle `CancelledEvent`: treat as stream termination (agentStatus → idle)
- MessageType.interrupted: new UI node (warning style) for mid-stream cancels
- _inject(): send text as an inject request while streaming; backend cancels
  the current task and starts a new one with the injected message
- Input area: during streaming, hint changes to "追加指令..." (Add instruction...), Enter key calls
  _inject() instead of _send(), and both inject-send + stop buttons are shown
- isAwaitingApproval kept separate from isStreaming so approval flow is not
  blocked by inject mode

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 21:33:50 -08:00
hailin f5d9b1f04f feat: add app upgrade system with self-hosted APK update support
- Add core/updater module: version checker, download manager (resumable + SHA-256),
  APK installer, app market detector, self-hosted updater with progress dialogs
- Add Android native MethodChannels for APK installation and market detection
- Add FileProvider config and REQUEST_INSTALL_PACKAGES permission
- Wire UpdateService singleton into main.dart initialization
- Add auto-check on home entry with cooldown + app resume detection
- Add manual "检查更新" (Check for updates) button and dynamic version display in settings
- Fix chat page: bottom overflow, bash spinner persistence, collapsible results
- Merge standing orders into tasks page as second tab

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 22:35:01 -08:00
hailin bc7e32061a fix: improve voice call reconnection robustness
Server side (session_router.py):
- /reconnect now accepts sessions in "active" state (not just "disconnected")
- When client reconnects to an active session, the old WebSocket/pipeline is
  automatically replaced when the new WebSocket connects
- Only truly terminal states (e.g. "ended") return 409

Flutter side (agent_call_page.dart):
- Distinguish terminal errors (404 session gone, 409 ended) from transient
  errors (network timeout, server unreachable) in reconnect loop
- Terminal errors break immediately instead of wasting retry attempts
- Extract _connectWebSocket() helper for cleaner reconnect flow
- Add DioException handling for proper HTTP status code inspection
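
The terminal-versus-transient distinction can be sketched as a small retry loop. This Python version is illustrative only: `run_reconnect` and its `attempt_fn` callback are hypothetical names, a status of `None` stands in for a network timeout or unreachable server, and the limit of 5 attempts is borrowed from the earlier reconnect commit.

```python
# HTTP statuses that mean the session can never be resumed.
TERMINAL_STATUS = {404, 409}  # 404: session gone, 409: session ended


def is_terminal(status_code):
    """None (timeout, unreachable) and other codes count as transient."""
    return status_code in TERMINAL_STATUS


def run_reconnect(attempt_fn, max_attempts=5):
    """Call attempt_fn(n) -> (ok, status_code) until success or give-up."""
    for attempt in range(1, max_attempts + 1):
        ok, status = attempt_fn(attempt)
        if ok:
            return "reconnected", attempt
        if is_terminal(status):
            # Retrying cannot help, so stop instead of burning attempts.
            return "terminal", attempt
    return "exhausted", max_attempts
```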

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 07:33:34 -08:00
hailin 57fabb4653 fix: set interleaved=true for PcmPlayer streaming playback
FlutterSoundPlayer.feedUint8FromStream() requires interleaved mode.
With interleaved=false, every feed() call threw:
  "Cannot feed with UInt8 with non interleaved mode"

- feedUint8FromStream (Uint8List) → requires interleaved: true
- feedFromStream (Float32List) → requires interleaved: false
Since we feed raw PCM bytes (Uint8List), interleaved must be true.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 06:59:06 -08:00
hailin e706a4cdc7 fix: enable simultaneous playback + recording in voice call
Root cause: PcmPlayer called openPlayer() without audio session config,
so Android defaulted to earpiece-only mode. When the mic was actively
recording, playback was silently suppressed — the agent's TTS audio was
sent successfully over WebSocket but never reached the speaker.

Changes:

1. PcmPlayer (pcm_player.dart):
   - Added audio_session package for proper audio session management
   - Configure AudioSession with playAndRecord category so mic + speaker
     work simultaneously
   - Set voiceCommunication usage to enable Android hardware AEC (echo
     cancellation) — prevents feedback loops when speaker is active
   - defaultToSpeaker routes output to loudspeaker instead of earpiece
   - Restored setSpeakerOn() method stub (used by UI toggle)

2. AgentCallPage (agent_call_page.dart):
   - Fixed fire-and-forget bug: _pcmPlayer.feed() returns Future but was
     called without await, causing interleaved feedUint8FromStream calls
   - Added _feedChain serializer to guarantee sequential audio feeding

3. Dependencies:
   - Added audio_session package to pubspec.yaml
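
The `_feedChain` idea, where each feed waits for the previous one to finish, can be shown with an asyncio analogue. This is a sketch of the serialization technique in Python rather than the Dart code; the class `FeedChain` and the `demo()` helper are hypothetical.

```python
import asyncio


class FeedChain:
    """Run async feed calls strictly one after another, in enqueue order."""

    def __init__(self, feed):
        self._feed = feed
        self._tail = None  # task for the most recently enqueued chunk

    def enqueue(self, chunk):
        prev = self._tail

        async def run():
            if prev is not None:
                try:
                    await prev  # wait for the previous feed to finish
                except Exception:
                    pass        # one failed chunk must not block the rest
            await self._feed(chunk)

        self._tail = asyncio.create_task(run())
        return self._tail

    async def drain(self):
        """Wait until every chunk enqueued so far has been fed."""
        if self._tail is not None:
            await self._tail


async def demo():
    order = []

    async def feed(chunk):
        # Later chunks sleep less, so unserialized calls would finish
        # out of order.
        await asyncio.sleep(0.03 - 0.01 * chunk)
        order.append(chunk)

    chain = FeedChain(feed)
    for chunk in range(3):
        chain.enqueue(chunk)
    await chain.drain()
    return order
```

Running `asyncio.run(demo())` yields the chunks in enqueue order even though the later feeds are faster.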

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 06:48:16 -08:00
hailin 4456550393 feat: lazy-load local TTS/STT models on first request
Local /synthesize and /transcribe endpoints now auto-load Kokoro/Whisper
models on first call instead of returning 503 when not pre-loaded at
startup. This allows switching between Local and OpenAI providers in the
Flutter test page without requiring server restart.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 04:38:49 -08:00
hailin 7b71a4f2fc fix: properly close WebSocket with subscription cancel + fire-and-forget
Root cause: IOWebSocketChannel.sink.close() can hang indefinitely
(dart-lang/web_socket_channel#185). Previous fix used unawaited close
but didn't cancel the stream subscription, so the old listener could
still push events to _messageController.

Fix: Extract _closeCurrentConnection() that:
1. Cancels StreamSubscription first (stops duplicate events immediately)
2. Fire-and-forget sink.close(goingAway) (frees underlying socket)

This follows the workaround recommended in the official issue tracker.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 03:45:43 -08:00
hailin 45eb6bc453 fix: use unawaited close to prevent WebSocket reconnect hang
The await on sink.close() blocks indefinitely when the server doesn't
respond to the close handshake. Use fire-and-forget with unawaited()
so the new connection can proceed immediately.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 03:41:13 -08:00
hailin 3185438f36 fix: close previous WebSocket before opening new connection
When sending a second message in the same session, the old WebSocket
connection was not closed, causing both connections to subscribe to the
same session room. This resulted in each text event being received twice,
producing garbled/duplicated output text.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 03:37:16 -08:00
hailin 2403ce5636 feat: multi-turn conversation context management with session history UI
Implement DB-based conversation message storage (engine-agnostic) that
works across both Claude API and Agent SDK engines. Add ChatGPT/Claude-style
conversation history drawer in Flutter with date-grouped session list,
session switching, and new chat functionality.

Backend: entity, repository, context service, migration 004, session/message
API endpoints. Flutter: ConversationDrawer, sessionId flow from backend
response via SessionInfoEvent, session list/switch/delete support.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 19:04:35 -08:00
hailin 7cda482e49 fix: simplify _dioBinary in voice test page to avoid interceptor conflicts
Remove shared interceptors from the binary Dio instance to prevent
request dedup/retry interceptors from interfering with audio downloads.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 17:57:58 -08:00
hailin f7d39d8544 fix: use theme-aware colors in voice test page for dark mode readability
Replace hardcoded Colors.grey with Theme.of(context).colorScheme for
result containers and status text so they're readable in both light
and dark themes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 09:21:06 -08:00
hailin d43baed3a5 feat: add OpenAI TTS/STT API endpoints for comparison testing
- Add openai package to voice-service requirements
- Add /api/v1/test/tts/synthesize-openai (tts-1/tts-1-hd/gpt-4o-mini-tts)
- Add /api/v1/test/stt/transcribe-openai (gpt-4o-transcribe/whisper-1)
- Add OPENAI_API_KEY and OPENAI_BASE_URL env vars to voice-service
- Flutter test page: SegmentedButton to toggle Local/OpenAI provider
- All endpoints maintain same response format for easy comparison

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 07:20:03 -08:00
hailin ac0b8ee1c6 fix: rewrite voice test page using flutter_sound for both record and play
- Remove record package dependency, use FlutterSoundRecorder instead
- Use permission_handler for microphone permission (already in pubspec)
- Proper temp file path via path_provider
- Cleanup temp files after upload
- Single package (flutter_sound) handles both recording and playback

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 05:41:10 -08:00
hailin d4783a3497 fix: use temp directory path for audio recording instead of empty string
The record package requires a valid file path. Empty string caused
ENOENT (No such file or directory) on Android.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 05:39:07 -08:00
hailin 5d4fd96d43 feat: streaming claude-api engine, engineType override, fix voice test page
- Claude API engine now uses streaming API (messages.stream) for real-time
  text delta output instead of waiting for full response
- Agent controller accepts optional engineType body parameter to allow
  callers (e.g. voice pipeline) to select a specific engine
- Fix voice_test_page.dart compilation error: replace audioplayers (not
  installed) with flutter_sound (already in pubspec.yaml)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 05:30:11 -08:00
hailin 6e832c7615 feat: add voice I/O test page in Flutter settings
- TTS: text input → Kokoro synthesis → audio playback
- STT: long-press record → faster-whisper transcription
- Round-trip: record → STT → TTS → playback
- Added /api/v1/test route to Kong gateway for voice-service
- Accessible from Settings → 语音 I/O 测试 (Voice I/O Test)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 05:16:10 -08:00
hailin e20936ee2a feat: collapsible thinking node in chat timeline
Thinking content auto-expands while streaming, auto-collapses when done.
User can toggle with "Thinking ∨" button, matching Claude Code VSCode UX.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 03:34:58 -08:00
hailin 7afbd54fce fix: rewrite voice pipeline for direct WebSocket I/O, fix TTS and navigation
Root cause: Pipecat's WebsocketServerTransport creates its own WebSocket
server on (host,port) and expects FrameProcessor subclasses. Our code was
passing a FastAPI WebSocket object as 'host' and using plain STT/TTS/VAD
service classes that aren't FrameProcessors. The pipeline crashed immediately
when receiving audio, causing "disconnects when speaking".

Changes:
- **base_pipeline.py**: Complete rewrite — replaced Pipecat Pipeline with
  direct async loop: WebSocket → VAD → STT → Claude LLM → TTS → WebSocket.
  Supports barge-in (interrupt TTS when user speaks), audio chunking, and
  24kHz→16kHz TTS resampling.
- **session_router.py**: Pass WebSocket directly to pipeline instead of
  wrapping in AppTransport.
- **app_transport.py**: Deprecated (no longer needed).
- **kokoro_service.py**: Fix misaki compatibility (MutableToken→MToken
  rename), use correct Chinese voice 'zf_xiaoxiao', handle torch tensors.
- **main.py**: Apply misaki monkey-patch before importing kokoro.
- **settings.py**: Change default TTS voice from 'zh_female_1' (non-existent)
  to 'zf_xiaoxiao' (valid Kokoro-82M Chinese female voice).
- **requirements.txt**: Remove pipecat-ai dependency, pin kokoro==0.3.5 +
  misaki==0.7.17, add Chinese NLP deps (pypinyin, cn2an, jieba, ordered-set).
- **agent_call_page.dart**: Wrap each cleanup step in try/catch to ensure
  Navigator.pop() always executes after call ends. Add 3s timeout on session
  delete request.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 23:34:35 -08:00
hailin 2a87dd346e fix: send empty JSON body to voice session endpoint (fixes 422)
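FastAPI's create_session endpoint declares a Pydantic request body
(CreateSessionRequest). Although every field has a default value,
FastAPI still requires the request to carry a valid JSON body (at
least {}). On the Flutter side, dio.post was called without a data
argument, so the Content-Type header was missing and FastAPI returned
422 Unprocessable Entity. Fix: add data: {} to send an empty JSON object.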

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 21:21:48 -08:00
hailin 45c54acb87 fix: improve voice call page UI centering and error display
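1. Centered layout: wrap the Column in a Center and add
   textAlign: TextAlign.center to all text, so the avatar, title, and
   subtitle stay centered on every screen size.

2. Error display: replace the large red SnackBar block with an inline
   error card (rounded container + error icon + concise copy) that
   blends in better visually. Add an _errorMessage state field and a
   _friendlyError() method that converts DioException and similar
   exceptions into friendly Chinese messages (e.g. "语音服务暂不可用 (503)",
   "Voice service temporarily unavailable (503)"), so users no longer
   see long English stack traces.

3. Error state cleanup: tapping answer automatically clears the
   previous _errorMessage.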

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 21:10:49 -08:00
hailin b7814d42a9 fix: resolve 3 timeline UI bugs (blank start, spinner style, tool status)
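1. Blank state at task start: when agent streaming has started but no
   assistant response exists yet, insert a virtual "处理中..."
   (Processing...) node with a pulsing * animation at the end of the
   timeline, so the UI is not left without feedback after the user
   sends a prompt.
   (chat_page.dart: _needsWorkingNode + ListView itemCount+1)

2. * animation changed from rotation to pulse: _SpinnerDot switches
   from Transform.rotate to Transform.scale (scale 0.6x↔1.2x plus
   opacity breathing 0.6↔1.0). Duration drops from 1000ms to 800ms with
   reverse enabled, so it looks like a twinkling star rather than a
   mechanical rotation.
   (timeline_event_node.dart: _SpinnerDotState)

3. Tool status stuck on spinner after execution completes: when a
   ToolResultEvent arrived, the handler only created a new toolResult
   message and never went back to update the ToolStatus of the matching
   toolUse message, so the tool node on the timeline showed the
   executing spinner forever. Fix: in the ToolResultEvent handler,
   search backward for the most recent toolUse message in executing
   state and update its status to completed/error.
   (chat_providers.dart: ToolResultEvent case)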
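
The backward search in the ToolResultEvent handler can be sketched in a few lines. This Python version is an illustration under assumptions: messages are modeled as plain dicts with `type` and `status` keys, and `mark_tool_finished` is a hypothetical name.

```python
def mark_tool_finished(messages, is_error=False):
    """Update the most recent executing toolUse message in place.

    Returns True if a message was updated, False if none was executing.
    """
    for msg in reversed(messages):  # newest message first
        if msg.get("type") == "toolUse" and msg.get("status") == "executing":
            msg["status"] = "error" if is_error else "completed"
            return True
    return False
```

Searching from the end means an older tool call that is still marked executing is left untouched when a newer one completes.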

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 18:13:30 -08:00
hailin 20325a84bd feat: redesign chat UI from bubble style to timeline workflow
Replace traditional chat bubble layout with a Claude Code-inspired
timeline/workflow design:
- Vertical gray line connecting sequential event nodes
- Colored dots for each event (green=done, red=error, yellow=warning)
- Animated spinning asterisk (*) on active nodes
- Streaming text with blinking cursor in timeline nodes
- Tool execution shown as code blocks within timeline
- User prompts as distinct nodes with person icon

New file: timeline_event_node.dart (TimelineEventNode, CodeBlock)
Rewritten: chat_page.dart (timeline layout, no more bubbles)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 17:33:42 -08:00
hailin 74be945e4a feat: enable token-level streaming and fix duplicate message bubble
Backend:
- Add includePartialMessages: true to SDK query options
- Handle stream_event/content_block_delta for real-time text streaming
- Skip text/thinking blocks from complete assistant messages (already
  streamed via deltas) to avoid duplication
- Change default result summary to empty string

Flutter:
- Only show CompletedEvent summary when no assistant text was streamed
  (prevents duplicate message bubble)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 17:24:48 -08:00
hailin 5e31b15dcf fix: use IOWebSocketChannel for headers support
WebSocketChannel.connect does not accept headers parameter in
web_socket_channel 2.4.0. Use IOWebSocketChannel.connect instead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 16:45:35 -08:00
hailin 803cea0fe4 fix: pass JWT token in WebSocket connection headers
WebSocket connections to /ws/agent were rejected by Kong (401)
because the Authorization header was not included. Now reads
access_token from secure storage and passes it in the WebSocket
upgrade request headers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 16:43:31 -08:00
hailin a6cd3c20d9 feat: add WebSocket robustness to voice call (heartbeat, reconnect, jitter buffer)
Addresses reliability gaps in the real-time voice WebSocket connection
between Flutter client and Python voice-service backend.

Backend (voice-service):
- Heartbeat: new _heartbeat_sender coroutine sends JSON ping text frames
  every 15s alongside the Pipecat pipeline; failed send = dead connection
- Session preservation: on WebSocket disconnect, sessions are now marked
  "disconnected" with a timestamp instead of being deleted, allowing
  reconnection within a configurable TTL (default 60s)
- Reconnect endpoint: POST /sessions/{id}/reconnect verifies the session
  is alive and in "disconnected" state, returns fresh websocket_url
- Reconnect-aware WS handler: detects "disconnected" sessions, cancels
  stale pipeline tasks, creates a new pipeline, sends "session.resumed"
- Background cleanup: asyncio loop every 30s removes sessions that have
  been disconnected longer than session_ttl
- Structured event protocol: text frames = JSON control messages
  (ping/pong/session.resumed/session.ended/error), binary = PCM audio
- New settings: session_ttl (60s), heartbeat_interval (15s),
  heartbeat_timeout (45s)

Flutter (agent_call_page.dart):
- Heartbeat monitoring: tracks last server ping timestamp, triggers
  reconnect if no ping received in 45s (3 missed intervals)
- Auto-reconnect: exponential backoff (1s→2s→4s→8s→16s), max 5 attempts;
  calls /reconnect endpoint to verify session, rebuilds WebSocket,
  resets audio buffer, restarts heartbeat
- Reconnecting UI: yellow warning banner "重新连接中... (N/5)" (Reconnecting...) with
  spinner overlay during reconnection attempts
- WebSocket data routing: _onWsData distinguishes String (JSON control)
  from binary (audio) frames, handles ping/session.resumed/session.ended
- User-initiated disconnect guard: _userEndedCall flag prevents reconnect
  attempts when user intentionally hangs up
- session_id field compatibility: supports session_id/sessionId/id
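
The timing numbers above follow from two small formulas, sketched here in Python. The function and constant names are hypothetical; the values (15s interval, 3 missed pings, 1s base delay, 5 attempts) come from the commit message.

```python
HEARTBEAT_INTERVAL_S = 15
MISSED_PINGS_BEFORE_RECONNECT = 3


def heartbeat_timeout_s():
    """Seconds without a ping before the client starts reconnecting."""
    return HEARTBEAT_INTERVAL_S * MISSED_PINGS_BEFORE_RECONNECT


def backoff_delays(base_s=1, factor=2, max_attempts=5):
    """Delay before each reconnect attempt: base * factor ** attempt_index."""
    return [base_s * factor ** i for i in range(max_attempts)]
```

With these defaults the worst case waits 31 seconds in total across all five attempts, which fits inside the 60s session TTL.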

Flutter (pcm_player.dart):
- Jitter buffer: queues incoming PCM chunks, starts playback only after
  accumulating 4800 bytes (150ms at 16kHz 16-bit mono) to smooth out
  network timing variance
- reset() method: clears buffer on reconnect to discard stale audio
- Buffer underrun handling: re-enters buffering phase if queue empties
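
The 4800-byte threshold is 16000 samples/s × 2 bytes × 1 channel × 0.15 s. A minimal Python sketch of the buffering behavior follows; it is an illustration rather than the Dart pcm_player.dart code, and the names `bytes_for_ms` and `JitterBuffer` are assumptions.

```python
from collections import deque

SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2  # 16-bit PCM
CHANNELS = 1          # mono


def bytes_for_ms(ms):
    """Number of PCM bytes that hold `ms` milliseconds of audio."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * ms // 1000


class JitterBuffer:
    def __init__(self, threshold=bytes_for_ms(150)):
        self.threshold = threshold
        self._chunks = deque()
        self._size = 0
        self.buffering = True  # hold playback until the threshold is reached

    def push(self, chunk):
        self._chunks.append(chunk)
        self._size += len(chunk)
        if self.buffering and self._size >= self.threshold:
            self.buffering = False

    def pop(self):
        """Return the next chunk to play, or None while buffering."""
        if self.buffering:
            return None
        if not self._chunks:
            self.buffering = True  # underrun: re-enter the buffering phase
            return None
        chunk = self._chunks.popleft()
        self._size -= len(chunk)
        return chunk

    def reset(self):
        """Discard stale audio, e.g. after a reconnect."""
        self._chunks.clear()
        self._size = 0
        self.buffering = True
```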

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 07:32:19 -08:00
hailin dfc541b571 feat: add Markdown rendering and phone-call voice entry to chat UI
Chat message rendering:
- MessageBubble: replace plain SelectableText with MarkdownBody for
  assistant messages, with full dark-theme stylesheet (headers, code
  blocks, tables, blockquotes, list bullets)
- StreamTextWidget: render completed messages as MarkdownBody, keep
  plain-text + blinking cursor for actively streaming messages

Voice interaction redesign:
- Remove all long-press-to-record code (~100 lines): AudioRecorder,
  SpeechEnhancer, mic pulse animation, voice indicator bar,
  SingleTickerProviderStateMixin
- Add phone-call button in AppBar (Icons.call) that navigates to the
  existing AgentCallPage for full-duplex voice conversation
- Add prominent "语音通话" entry button on empty chat state
- AgentCallPage was already fully implemented (ringing → connecting →
  active → ended, dual-direction WebSocket audio, GTCRN denoise,
  Kokoro TTS playback, waveform visualization) but previously unused

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 07:31:40 -08:00
hailin 6dcfe7cd9a chore: gitignore iOS auto-generated files
- ios/Flutter/ephemeral/: temporary build files
- ios/Runner/GeneratedPluginRegistrant.*: plugin registrant

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 05:10:54 -08:00
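hailin 4651291468 style: remove the blue pill background from the navigation bar, highlight icon/label instead
- indicatorColor: transparent removes the default Material 3 selected-item background
- Selected item: icon and label use the primary purple, font weight w600
- Unselected items: icon and label are gray (textSecondary), font weight w400
- Matches the navigation bar style of WeChat/Alipay/Feishu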

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 05:09:34 -08:00
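hailin 6a84519090 refactor: remove AppBar refresh buttons in favor of pull-to-refresh
On 6 pages (dashboard, servers, tasks, alerts, approvals, standing orders),
delete the top-right IconButton(Icons.refresh) and keep the existing RefreshIndicator pull-to-refresh.
The Terminal page's refresh button is a "reconnect" action and stays unchanged.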

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 05:08:16 -08:00
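hailin 666b173906 fix: eliminate Unhandled Exception at the root with async void interceptor fixes + global error fallback
Root cause: Dio interceptor onError/onRequest are declared void.
Marking them async makes them return a Future<void> that nobody awaits, so every
exception thrown inside becomes an Unhandled Exception crash.

Fix:
- RetryInterceptor: onError now schedules synchronously; retry logic moved to a
  separate _retry() method with every path wrapped in try/catch
- DedupInterceptor: guard against completing a Completer twice; retry requests
  skip deduplication to avoid conflicting with the original request
- TokenInterceptor: all async lambdas in onRequest and onError are wrapped in
  try/catch, falling back to handler.next() on exception
- main.dart: three layers of global error fallback:
  1) FlutterError.onError catches framework errors
  2) PlatformDispatcher.onError catches platform-channel errors
  3) runZonedGuarded catches any remaining async exceptions
- receiveTimeout/sendTimeout no longer trigger a retry (the server may already have received the request)
- Timeouts adjusted: connect 10s, send 30s, receive 30s
- Dashboard cards aligned to equal height with IntrinsicHeight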

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 04:37:39 -08:00
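hailin 68004409a3 fix: equal-height alignment for dashboard overview cards
- Wrap the Row in IntrinsicHeight + CrossAxisAlignment.stretch so the three cards share the same height automatically
- Loading/Error cards drop the fixed height:140 in favor of padding + stretch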

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 04:28:40 -08:00
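hailin 4e55e9a616 feat: bring the network layer to big-tech grade with a 401 concurrency lock, request deduplication, and structured error logging
## 1. TokenRefreshLock (fixes the concurrent 401 refresh race)
- Add `core/network/token_refresh_lock.dart`
- Implements a mutex with a Completer: when several requests get a 401 at once,
  only the first triggers refreshToken(); the rest wait for the same result
- Prevents 5 pages hitting 401 simultaneously → 5 refreshes → 4 failures that log the user out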
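
A minimal sketch of the single-flight refresh idea, using Python's asyncio in place of Dart's Completer. The class shape and the `refresh` callable are assumptions for illustration and are not taken from `token_refresh_lock.dart`.

```python
import asyncio


class TokenRefreshLock:
    """Lets concurrent callers share one in-flight token refresh."""

    def __init__(self, refresh):
        self._refresh = refresh   # hypothetical async callable returning a new token
        self._inflight = None     # the shared task while a refresh is running

    async def refresh(self):
        if self._inflight is None:
            # First caller starts the refresh; later callers reuse the same task.
            self._inflight = asyncio.create_task(self._run())
        return await self._inflight

    async def _run(self):
        try:
            return await self._refresh()
        finally:
            # Clear the slot so a later 401 can trigger a fresh refresh.
            self._inflight = None
```

With this shape, five requests that all receive a 401 at once would await the same task, so the refresh endpoint is called once and all five see the same outcome.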

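## 2. DedupInterceptor (request deduplication)
- Add `core/network/dedup_interceptor.dart`
- While a GET to the same URL is in flight, later requests reuse the first one's result
- Prevents duplicates from rapid retry taps, repeated loads on page switches, and repeated pull-to-refresh
- GET only; write operations such as POST/PUT/DELETE always pass through

## 3. ErrorLogInterceptor + ErrorLogger (structured error logging)
- Add `core/network/error_log_interceptor.dart`: Dio interceptor
- Add `core/services/error_logger.dart`: persistent log service
- Each failed request records: timestamp, method, URL, status code, error type, retry count
- Stores the latest 50 entries locally in SharedPreferences, with summary statistics
- In debug mode, also prints via debugPrint
- Reserves a flush hook for Sentry/Crashlytics

## 4. Dio interceptor pipeline tuning
Interceptor order is adjusted to the standard big-tech pipeline:
1. DedupInterceptor: deduplication (first, so duplicates never enter the pipeline)
2. TokenInterceptor: token injection + 401 refresh (with the concurrency lock)
3. TenantInterceptor: injects X-Tenant-Id
4. RetryInterceptor: exponential-backoff retry
5. ErrorLogInterceptor: error logging (last, records the final failure)

Remove LogInterceptor (superseded by ErrorLogInterceptor, and request bodies are
no longer printed in release mode, which avoids the performance cost)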

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 04:05:53 -08:00
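hailin 94652857cd feat: production-grade API error handling with retry interceptor, friendly error messages, network monitoring, WebSocket backoff
## Problem
Users saw raw DioException stack traces (e.g. "DioException [unknown]: null Error:
HttpException: Connection reset by peer"), and with no retry mechanism any network jitter surfaced as an error.

## Changes

### 1. RetryInterceptor (automatic retry with exponential backoff)
- Add `core/network/retry_interceptor.dart`
- Auto-retries: connect timeout, send timeout, Connection reset, 502/503/504/429
- Exponential backoff (800ms → 1.6s → 3.2s) + random jitter to avoid a thundering herd
- At most 3 retries; non-transient errors (401/403/404) are not retried
- Integrated into dio_client, with tuned timeouts: connect 8s, send 15s, receive 20s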
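
The retry policy above can be sketched as two small functions, in Python rather than Dart. The function names and the 200ms jitter range are assumptions (the commit does not state the jitter size); the status codes, the 800ms base, and the 3-retry cap come from the commit message. Timeout and connection-reset classification is left out of this sketch.

```python
import random

RETRYABLE_STATUS = {429, 502, 503, 504}  # transient statuses named in the commit
MAX_RETRIES = 3
BASE_DELAY_MS = 800


def should_retry(status, attempt):
    """attempt is the number of retries already made (0 for the first failure)."""
    return attempt < MAX_RETRIES and status in RETRYABLE_STATUS


def backoff_delay_ms(attempt, jitter_ms=200, rng=random.random):
    """Delay before retry number attempt + 1: 800ms, 1.6s, 3.2s plus jitter."""
    base = BASE_DELAY_MS * (2 ** attempt)
    # jitter_ms is a hypothetical value; jitter spreads out simultaneous retries.
    return base + int(rng() * jitter_ms)
```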

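### 2. ErrorHandler overhaul (friendly Chinese error messages)
- Rewrite `core/errors/error_handler.dart`, adding a static `friendlyMessage()` method
- Every DioExceptionType maps to a specific Chinese message:
  - Connection reset → "连接被服务器重置,请稍后重试" ("Connection was reset by the server, please try again later")
  - Connection refused → "服务器拒绝连接,请确认服务是否启动" ("Server refused the connection, please check that the service is running")
  - Timeout → "连接超时,服务器无响应" ("Connection timed out, the server is not responding")
  - 401 → "登录已过期,请重新登录" ("Login has expired, please log in again")
  - 403/404/429/500/502/503 each have their own message
- Add a TimeoutFailure type
- All Failure.message values default to Chinese

### 3. Network connectivity monitoring + offline banner
- Add `core/network/connectivity_provider.dart`: probes server reachability every 30 seconds
- Add `core/widgets/offline_banner.dart`: yellow warning banner "网络连接不可用" ("Network connection unavailable")
- Integrated into ScaffoldWithNav, so every page shows the offline state at the top automatically

### 4. Unified error display (no more e.toString())
- Add `core/widgets/error_view.dart`: unified error UI (icon + friendly message + retry button)
- Replace the inline error Column on 6 pages with ErrorView:
  tasks_page / servers_page / alerts_page / approvals_page / standing_orders_page
- Replace 3 uses of _SummaryCardError(message: e.toString()) in the dashboard
- Replace e.toString() in 4 providers: chat / auth / settings / approvals
- No e.toString() left anywhere in the project (only time.minute.toString() for time formatting remains)

### 5. WebSocket reconnect improvements
- Exponential backoff (1s → 2s → 4s → ... → capped at 60s) + random jitter
- At most 10 automatic reconnect attempts, then stop
- disconnect() prevents automatic reconnection
- Add a reconnect() method for manual reconnection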

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 04:01:04 -08:00
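hailin 1075a6b265 refactor: rework chat_page onto the full architecture and integrate all stream event widgets
## Problem
chat_page.dart contained inline, simplified versions of ChatMessage/ChatState/ChatNotifier (about 180 lines)
that bypassed the Clean Architecture layers already implemented:
- domain/entities/chat_message.dart (includes ToolExecution, ApprovalRequest)
- domain/entities/stream_event.dart (9 sealed-class event types)
- chat_providers.dart (full ChatNotifier with approval/tool/standing-order support)
- 5 standalone widgets, all sitting unused

## Changes
1. Delete the duplicated inline code (~180 lines): ChatRole, ChatContentType, inline ChatMessage,
   inline ChatState, inline ChatNotifier, chatMessagesProvider
2. Switch to the real chatProvider (chat_providers.dart), which supports all 9 StreamEvent types
3. Integrate the 5 existing widgets:
   - MessageBubble: user/AI message bubbles (with timestamps)
   - StreamTextWidget: animated cursor for streaming AI replies
   - ToolExecutionCard: tool execution details (name/input/output/status/risk level)
   - ApprovalActionCard: approval card (countdown/approve/reject/expiry handling)
   - AgentThinkingIndicator: animated thinking indicator
4. Add _AgentStatusBar: live status bar (thinking/executing/awaiting approval)
5. Add _StandingOrderDraftCard: renders standing-order drafts
6. Add a stop button to the AppBar and the input area; show a hint while awaiting approval
7. Message rendering dispatches on MessageType: text/thinking/toolUse/toolResult/approval/standingOrderDraft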

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 03:21:44 -08:00
hailin 15e6fca6c0 fix: translate all remaining English UI strings to Chinese and remove dead code
- Translate approval_action_card (Approve/Reject/Cancel/Expired)
- Translate tool_execution_card status labels (Executing/Completed/Error)
- Translate chat_providers error messages and stream content
- Translate message_bubble "Thinking..." indicator
- Translate terminal page tooltips (Reconnect/Disconnect)
- Translate fallback values (Untitled/Unknown/No message) across all pages
- Translate auth error "Login failed" and stream error messages
- Remove dead voice_providers.dart (used speech_to_text which is not installed)
- Remove dead voice_input_button.dart (not referenced anywhere)
- Fix widget_test.dart (was referencing non-existent MyApp class)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 02:07:57 -08:00