it0/it0_app
hailin ce63ece340 feat: add mixed-mode input (text + images + files) during voice calls
Enable users to send text messages, images, and files to the Agent
while an active voice call is in progress. This addresses the case
where spoken instructions are unclear or screenshots/documents need
to be shared for analysis.

## Architecture

Data flows through the LiveKit data channel (not direct HTTP):
  Flutter → publishData(topic='text_inject') → voice-agent
  → llm.inject_text_message() → POST /api/v1/agent/tasks (same session)
  → collect streamed response → session.say() → TTS playback

This preserves the constraint that voice-agent owns the agent-service
sessionId — Flutter never contacts agent-service directly.

## Flutter UI (agent_call_page.dart)
- Add keyboard toggle button to active call controls (4-button row)
- Collapsible text input area with attachment picker (+) and send button
- Attachment support: gallery multi-select, camera, file picker
  (images max 1024x1024 quality 80%, PDF supported, max 5 attachments)
- Horizontal scrolling attachment preview with delete buttons
- 200KB payload size check before LiveKit data channel send
- Layout adapts: Spacer flex 1/3 toggle, reduced bottom padding
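
The attachment and payload limits listed above could be enforced roughly as below. The real check lives in Dart in agent_call_page.dart; this Python sketch only illustrates the logic. The JSON field names (`text`, `attachments`, `name`, `mimeType`, `data`), the base64 encoding, and reading "200KB" as 200 * 1024 bytes are all assumptions.

```python
import base64
import json

MAX_PAYLOAD_BYTES = 200 * 1024  # assumption: "200KB" read as 200 * 1024 bytes
MAX_ATTACHMENTS = 5             # limit stated in the commit message


def build_text_inject_payload(text, attachments):
    """Encode a text_inject message, raising ValueError if a limit is exceeded.

    attachments: list of (name, mime_type, raw_bytes) tuples (assumed shape).
    """
    if len(attachments) > MAX_ATTACHMENTS:
        raise ValueError(f"at most {MAX_ATTACHMENTS} attachments allowed")
    body = {
        "text": text,
        "attachments": [
            {
                "name": name,
                "mimeType": mime,
                "data": base64.b64encode(raw).decode("ascii"),
            }
            for name, mime, raw in attachments
        ],
    }
    payload = json.dumps(body).encode("utf-8")
    # Check the encoded size, since base64 inflates binary data by about a third.
    if len(payload) > MAX_PAYLOAD_BYTES:
        raise ValueError(f"payload is {len(payload)} bytes, over the limit")
    return payload
```

Measuring the final encoded bytes rather than the raw file sizes is the safer choice here, because the data channel limit applies to what is actually sent.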

## voice-agent (agent.py)
- Register data_received event listener after session.start()
- Filter for topic='text_inject', parse JSON payload
- Call llm.inject_text_message(text, attachments) and speak the reply via session.say()
- Use asyncio.ensure_future() wrapper for async handler (matches
  existing disconnect handler pattern for sync EventEmitter)

## AgentServiceLLM (agent_llm.py)
- New inject_text_message(text, attachments) method on AgentServiceLLM
- Reuses same _agent_session_id for conversation context continuity
- WS+HTTP streaming: connect, pre-subscribe, POST /tasks with
  attachments field, collect full text response, return string
- _injecting flag prevents concurrent _do_stream from clearing
  session ID on abort errors while inject is in progress
- Same systemPrompt/voiceMode/engineType as voice pipeline

No agent-service changes required — attachments already supported
end-to-end (JSONB storage → multimodal content blocks → Claude).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 05:38:04 -08:00
android feat: add TTS voice and style settings to Flutter app 2026-03-01 09:38:15 -08:00
assets feat: replace default Flutter icon with iAgent robot logo 2026-02-23 01:41:36 -08:00
ios/Runner/Assets.xcassets/AppIcon.appiconset feat: replace default Flutter icon with iAgent robot logo 2026-02-23 01:41:36 -08:00
lib feat: add mixed-mode input (text + images + files) during voice calls 2026-03-02 05:38:04 -08:00
test fix: translate all remaining English UI strings to Chinese and remove dead code 2026-02-23 02:07:57 -08:00
.gitignore chore: gitignore iOS auto-generated files 2026-02-23 05:10:54 -08:00
.metadata fix: commit complete Android project configuration files to fix cross-machine build failures 2026-02-22 16:17:18 -08:00
README.md chore: commit the default Flutter project README 2026-02-22 22:12:27 -08:00
analysis_options.yaml Initial commit: IT0 AI-powered server cluster operations platform 2026-02-08 22:54:37 -08:00
pubspec.lock feat: add TTS voice and style settings to Flutter app 2026-03-01 09:38:15 -08:00
pubspec.yaml feat: migrate voice call from WebSocket/PCM to LiveKit WebRTC 2026-02-28 08:55:33 -08:00

README.md

it0_app

A new Flutter project.

Getting Started

This project is a starting point for a Flutter application.

A few resources to get you started if this is your first Flutter project:

For help getting started with Flutter development, view the online documentation, which offers tutorials, samples, guidance on mobile development, and a full API reference.