Commit Graph

9 Commits

Author SHA1 Message Date
hailin 75eff6e8e7 fix(flutter): pin livekit_client to ^2.6.4, upgrade device_info_plus to ^12.3.0
livekit_client 2.3.1+hotfix.1 removed the `subscribe` parameter from Timeouts,
causing build failure. Pinning to 2.6.4 (which has subscribe) and bumping
device_info_plus to ^12.3.0 as required by livekit_client >=2.6.0.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-07 18:16:53 -08:00
hailin 5460be8c04 feat: add TTS voice and style settings to Flutter app
Add user-configurable TTS voice and tone style settings that flow from
the Flutter app through the backend to the voice-agent at call time.

## Flutter App (it0_app)

### Domain Layer
- app_settings.dart: Add `ttsVoice` (default: 'coral') and `ttsStyle`
  (default: '') fields to AppSettings entity with copyWith support

### Data Layer
- settings_datasource.dart: Add SharedPreferences keys
  `settings_tts_voice` and `settings_tts_style` for local persistence
  in loadSettings(), saveSettings(), and clearSettings()

### Presentation Layer
- settings_providers.dart: Add `setTtsVoice()` and `setTtsStyle()`
  methods to SettingsNotifier for Riverpod state management
- settings_page.dart: Add "语音" settings group between Notifications
  and Security groups with:
  - Voice picker: 13 OpenAI voices with gender/style labels
    (e.g. "女 · 温暖", "男 · 沉稳", "中性") in a BottomSheet
  - Style picker: 5 presets (专业干练/温柔耐心/轻松活泼/严肃正式/科幻AI)
    as ChoiceChips + custom text input field + reset button

### Call Flow
- agent_call_page.dart: Send `tts_voice` and `tts_style` in the POST
  body when requesting a LiveKit token at call initiation

## Backend

### voice-service (Python/FastAPI)
- livekit_token.py: Accept optional `tts_voice` and `tts_style` via
  Pydantic TokenRequest body model; embed them in RoomAgentDispatch
  metadata JSON alongside auth_header (backward compatible)

### voice-agent (Python/LiveKit Agents)
- agent.py: Extract `tts_voice` and `tts_style` from ctx.job.metadata;
  use them when creating openai_plugin.TTS() — user-selected voice
  overrides config default, user-selected style overrides default
  instructions. Falls back to config defaults when not provided.

## Data Flow
Flutter Settings → SharedPreferences → POST /livekit/token body →
voice-service embeds in RoomAgentDispatch metadata →
voice-agent reads from ctx.job.metadata → TTS creation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 09:38:15 -08:00
hailin e4c2505048 feat: add multimodal image input with streaming markdown optimization
Two major features in this commit:

1. Streaming Markdown Rendering Optimization
   - Replace deprecated flutter_markdown with gpt_markdown (active, AI-optimized)
   - Real-time markdown rendering during streaming (was showing raw syntax)
   - Solid block cursor (█) instead of AnimationController blink
   - 80ms token throttle buffer reducing rebuilds from per-token to ~12.5/sec
   - RepaintBoundary isolation for markdown widget repaints
   - StreamTextWidget simplified from StatefulWidget to StatelessWidget

2. Multimodal Image Input (camera + gallery + display)
   - Flutter: image_picker for gallery/camera, base64 encoding, attachment
     preview strip with delete, thumbnails in sent messages
   - Data layer: List<String>? → List<Map<String, dynamic>>? for structured
     attachment payloads through datasource/repository/usecase
   - ChatAttachment model with base64Data, mediaType, fileName
   - ChatMessage entity + ChatMessageModel both support attachments field
   - Backend DTO, Entity (JSONB), Controller, ConversationContextService
     all extended to receive, store, and reconstruct Anthropic image
     content blocks in loadContext()
   - Claude API engine skips duplicate user message when history already
     ends with multimodal content blocks
   - NestJS body parser limit raised to 10MB for base64 image payloads
   - Android CAMERA permission added to manifest
   - Image.memory uses cacheWidth/cacheHeight for memory efficiency
   - Max 5 images per message enforced in UI

Data flow:
  ImagePicker → base64Encode → ChatAttachment → POST body →
  DB (JSONB) → loadContext → Anthropic image content blocks → Claude API

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 03:24:17 -08:00
hailin f5d9b1f04f feat: add app upgrade system with self-hosted APK update support
- Add core/updater module: version checker, download manager (resumable + SHA-256),
  APK installer, app market detector, self-hosted updater with progress dialogs
- Add Android native MethodChannels for APK installation and market detection
- Add FileProvider config and REQUEST_INSTALL_PACKAGES permission
- Wire UpdateService singleton into main.dart initialization
- Add auto-check on home entry with cooldown + app resume detection
- Add manual "检查更新" button and dynamic version display in settings
- Fix chat page: bottom overflow, bash spinner persistence, collapsible results
- Merge standing orders into tasks page as second tab

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 22:35:01 -08:00
hailin e706a4cdc7 fix: enable simultaneous playback + recording in voice call
Root cause: PcmPlayer called openPlayer() without audio session config,
so Android defaulted to earpiece-only mode. When the mic was actively
recording, playback was silently suppressed — the agent's TTS audio was
sent successfully over WebSocket but never reached the speaker.

Changes:

1. PcmPlayer (pcm_player.dart):
   - Added audio_session package for proper audio session management
   - Configure AudioSession with playAndRecord category so mic + speaker
     work simultaneously
   - Set voiceCommunication usage to enable Android hardware AEC (echo
     cancellation) — prevents feedback loops when speaker is active
   - defaultToSpeaker routes output to loudspeaker instead of earpiece
   - Restored setSpeakerOn() method stub (used by UI toggle)

2. AgentCallPage (agent_call_page.dart):
   - Fixed fire-and-forget bug: _pcmPlayer.feed() returns Future but was
     called without await, causing interleaved feedUint8FromStream calls
   - Added _feedChain serializer to guarantee sequential audio feeding

3. Dependencies:
   - Added audio_session package to pubspec.yaml

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 06:48:16 -08:00
hailin 8015154a3e feat: replace default Flutter icon with iAgent robot logo
使用项目自有的绿色机器人 SVG logo 生成各分辨率 Android 启动图标,
替换默认的 Flutter 蓝色纸飞机。支持 Android Adaptive Icon(白底 + 机器人前景)。
同时生成 iOS 图标。

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 01:41:36 -08:00
hailin 092a561867 feat: 完成 iAgent App 三大功能 + 修复租户上下文
## 功能一:设置页(完整实现)
- 新增浅色主题(lightTheme),支持深色/浅色/跟随系统三种模式
- app.dart 接入 themeMode 动态切换
- 设置页完整重写:个人信息编辑、修改密码、主题切换、通知开关
- 新增 settings_remote_datasource 对接后端 admin/settings API
- settings_providers 新增 AccountProfileNotifier 管理远程个人资料

## 功能二:语音通话(音频集成)
- 添加 flutter_sound 依赖,创建 PcmPlayer 流式 PCM 播放器
- agent_call_page 替换空壳:真实麦克风采集(record + GTCRN 降噪)
- 真实 PCM 16kHz 流式播放,基于 RMS 能量驱动波形动画
- 修复 WebSocket URL 路径:/ws/voice/ → /api/v1/voice/ws/
- voice_repository_impl 支持后端返回相对路径自动拼接

## 功能三:推送通知(WebSocket MVP)
- 添加 flutter_local_notifications + socket_io_client 依赖
- 创建 AppNotification 实体、NotificationService(Socket.IO 连接 comm-service)
- 通知 providers:列表管理 + 未读计数
- 登录后自动连接通知服务,登出断开
- 底部导航 Alerts 标签添加未读角标(Badge)
- AndroidManifest 添加 POST_NOTIFICATIONS 权限
- main.dart 初始化本地通知插件

## 修复:租户上下文未初始化(500错误)
- 根因:登录后未设置 currentTenantIdProvider,导致 X-Tenant-Id 头缺失
- Flutter 端:login() 成功后从 JWT 设置 tenantId,logout 时清除
- 后端:tenant-context.middleware 增加 JWT tenantId 回退逻辑
- AuthUser 模型新增 tenantId 字段解析

新增 5 个文件,修改 16 个文件,添加 3 个依赖包

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 01:10:52 -08:00
hailin a568558585 feat: replace speech_to_text with GTCRN ML noise reduction + backend STT
Replace traditional on-device speech_to_text with a modern pipeline:
- Record audio via `record` package with hardware noise suppression
- Apply GTCRN neural denoising (sherpa-onnx, ICASSP 2024, 48K params)
- Trim silence, POST to backend /voice/transcribe (faster-whisper)

Changes:
- Add /transcribe endpoint to voice-service for audio file upload
- Add SpeechEnhancer wrapper for sherpa-onnx GTCRN model (523KB)
- Rewrite chat_page.dart voice input: record → denoise → transcribe
- Keep NoiseReducer.trimSilence for silence removal only
- Upgrade record to v6.2.0, add sherpa_onnx, path_provider

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 07:59:15 -08:00
hailin 00f8801d51 Initial commit: IT0 AI-powered server cluster operations platform
Full-stack monorepo with DDD + Clean Architecture:
- Backend: 7 NestJS microservices + 5 shared libraries (TypeScript)
- Mobile: Flutter app with Riverpod (Dart)
- Web Admin: Next.js dashboard with Zustand + React Query
- Voice: Python voice service (STT/TTS/VAD)
- Infra: Docker Compose, K8s manifests, Turborepo build

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 22:54:37 -08:00