Implements a two-level abort controller design to support real-time
interruption when the user speaks while the agent is still responding:
sessionAbortController (session-scoped)
- Created once when startSession() is called
- Fired only by terminateSession() (user hangs up)
- Propagated into each turn via addEventListener
turnAbort (per-turn, stored as handle.currentTurnAbort)
- Created fresh at the start of each executeTurn() call
- Stored on the VoiceSessionHandle so injectMessage() can abort it
- When a new inject arrives while a turn is running, injectMessage()
calls turnAbort.abort() BEFORE enqueuing the new message
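A minimal sketch of the two-level hierarchy described above (names like VoiceSessionHandle and beginTurn follow this description; details are illustrative, not the actual implementation):

```typescript
// Sketch of the two-level abort hierarchy: one session-scoped controller,
// one fresh controller per turn, with session abort propagated into the turn.
interface VoiceSessionHandle {
  sessionAbort: AbortController;            // fired only on hangup
  currentTurnAbort: AbortController | null; // fresh per turn
}

function startSession(): VoiceSessionHandle {
  return { sessionAbort: new AbortController(), currentTurnAbort: null };
}

function beginTurn(handle: VoiceSessionHandle): AbortController {
  const turnAbort = new AbortController();
  // Propagate the session-level abort into this turn via addEventListener.
  handle.sessionAbort.signal.addEventListener('abort', () => turnAbort.abort(), { once: true });
  handle.currentTurnAbort = turnAbort;
  return turnAbort;
}

function injectMessage(handle: VoiceSessionHandle): void {
  // Abort the running turn BEFORE enqueuing the new message.
  handle.currentTurnAbort?.abort();
  // ...enqueue the new message here...
}
```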
Interruption flow:
1. User speaks mid-response → LiveKit stops TTS playback (client-side)
2. STT utterance → POST voice/inject → injectMessage() fires
3. handle.currentTurnAbort.abort() called → sets aborted flag
4. for-await loop checks turnAbort.signal.aborted on next SDK event → break
5. catch block NOT reached (break ≠ exception) → no error event emitted
6. finally block saves partial text with a "[中断]" ("interrupted") suffix to history
7. New message dequeued → fresh executeTurn() starts immediately
Why no "Agent error" message plays to the user:
- break exits the for-await loop silently, not via exception
- The catch block's error-event emission is guarded by err?.name !== 'AbortError'
AND requires an actual exception; a plain break never enters catch
- Empty or partial responses are filtered by `if response:` in agent.py
Also update module-level JSDoc with full architecture explanation covering
the long-lived run loop design, two-level abort hierarchy, tenant context
injection pattern, and SDK session resume across turns.
Update agent.py module docstring to document voice session lifecycle and
interruption flow for future maintainers.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the per-turn POST /tasks approach for voice calls with a
long-lived agent run loop tied to the call lifecycle:
agent-service:
- Add AsyncQueue<T> utility for blocking message relay
- Add VoiceSessionManager: spawns one background run loop per voice call,
accepts injected messages, terminates cleanly on hangup
- Add VoiceSessionController with 3 endpoints:
POST /api/v1/agent/sessions/voice/start (call start)
POST /api/v1/agent/sessions/:id/voice/inject (each speech turn)
DELETE /api/v1/agent/sessions/:id/voice (user hung up)
- Register VoiceSessionManager + VoiceSessionController in agent.module.ts
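The AsyncQueue<T> itself is not shown in this log; a minimal promise-based sketch of a blocking message relay could look like this (illustrative, not the committed code):

```typescript
// Minimal blocking queue: dequeue() resolves immediately if an item is
// buffered, otherwise suspends until the next enqueue() hands one over.
class AsyncQueue<T> {
  private items: T[] = [];
  private waiters: Array<(value: T) => void> = [];

  enqueue(item: T): void {
    const waiter = this.waiters.shift();
    if (waiter) waiter(item);   // hand directly to a blocked consumer
    else this.items.push(item); // otherwise buffer for a later dequeue
  }

  dequeue(): Promise<T> {
    if (this.items.length > 0) return Promise.resolve(this.items.shift() as T);
    return new Promise<T>((resolve) => this.waiters.push(resolve));
  }
}
```

The run loop awaits dequeue() between turns, so injectMessage() reduces to an enqueue.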
voice-agent:
- AgentServiceLLM: add start_voice_session(), terminate_voice_session(),
inject_text_message() (voice/inject-aware), _do_inject_voice()
- AgentServiceLLMStream._run(): use voice/inject path when voice session
is active; fall back to per-task POST for text-chat / non-SDK engines
- entrypoint(): call start_voice_session() after session.start();
register _on_room_disconnect that calls terminate_voice_session()
so the agent is always killed when the user hangs up
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two issues fixed:
1. agent.controller.ts — on the FIRST task of each session, write title+voiceMode
into session.metadata so the client can display a meaningful conversation title:
  - Text sessions: metadata.title = first 40 chars of the user prompt
  - Voice sessions: metadata.title = '' and metadata.voiceMode = true
    (Flutter renders these as '语音对话 M/D HH:mm', i.e. 'Voice Call M/D HH:mm')
  A titleSet flag prevents overwriting the title on subsequent turns of the same session.
2. session.controller.ts — listSessions() now returns a DTO instead of the raw entity.
systemPrompt is an internal engine instruction and is explicitly excluded from the
response. The client receives { id, status, engineType, metadata, createdAt, updatedAt }.
(Background: voice sessions set systemPrompt to the voice-mode instruction string,
so before this fix every voice conversation displayed '你正在通过语音与用户实时对话。请…',
the Chinese voice-mode prompt, as its title in the chat history list.)
Title derivation priority (highest to lowest):
1. metadata.title — explicit title saved by backend on first task
2. metadata.voiceMode == true → '语音对话 M/D HH:mm' ('Voice Call M/D HH:mm')
3. Fallback → '对话 M/D HH:mm' ('Chat M/D HH:mm') based on session createdAt
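The priority order above can be sketched as follows (illustrative TypeScript; the real client code is Flutter/Dart, and the date formatter is injected here for testability):

```typescript
// Title derivation in priority order: explicit title, then voice fallback,
// then generic fallback based on createdAt.
interface SessionSummary {
  metadata?: { title?: string; voiceMode?: boolean };
  createdAt: Date;
}

function deriveTitle(s: SessionSummary, fmt: (d: Date) => string): string {
  if (s.metadata?.title) return s.metadata.title;                  // 1. explicit title
  if (s.metadata?.voiceMode) return `语音对话 ${fmt(s.createdAt)}`; // 2. voice fallback
  return `对话 ${fmt(s.createdAt)}`;                                // 3. generic fallback
}
```

Note that the backend's empty string title for voice sessions is falsy, so voice sessions fall through to rule 2 as intended.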
The billing-service tsconfig.json was missing the TypeScript path aliases
required for the workspace build (turbo builds shared packages first, then
resolves @it0/* via paths). Without these, nest build fails with
'Cannot find module @it0/database'.
Also disables overly strict checks (strictNullChecks, strictPropertyInitialization,
useUnknownInCatchVariables) to match the lenient settings used by other services.
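The added compilerOptions would look roughly like this in billing-service/tsconfig.json (the exact alias targets are assumptions based on the @it0/* naming and a typical monorepo layout):

```json
{
  "compilerOptions": {
    "baseUrl": ".",
    "paths": {
      "@it0/database": ["../../packages/shared/database/src"],
      "@it0/*": ["../../packages/shared/*/src"]
    },
    "strictNullChecks": false,
    "strictPropertyInitialization": false,
    "useUnknownInCatchVariables": false
  }
}
```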
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comprehensive fix of 124 TS errors across the billing-service:
Entity fixes:
- invoice.entity.ts: add InvoiceStatus/InvoiceCurrency const objects,
rename fields to match DB schema (subtotalCents, taxCents, totalCents,
amountDueCents), add OneToMany items relation
- invoice-item.entity.ts: add InvoiceItemType const object, add column
name mappings and currency field
- payment.entity.ts: add PaymentStatus const, rename amount→amountCents
with column name mapping, add paidAt field
- subscription.entity.ts: add SubscriptionStatus const object
- usage-aggregate.entity.ts: rename periodYear/Month→year/month to match
DB columns, add periodStart/periodEnd fields
- payment-method.entity.ts: add displayName, expiresAt, updatedAt fields
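The "const object" pattern used for InvoiceStatus and the other enums above is presumably the usual TypeScript as-const idiom, which makes one name usable as both a runtime value and a union type (the status values below are assumptions for illustration):

```typescript
// The as-const idiom: InvoiceStatus works as runtime values AND as a type.
const InvoiceStatus = {
  DRAFT: 'draft',
  OPEN: 'open',
  PAID: 'paid',
  VOID: 'void',
} as const;
type InvoiceStatus = (typeof InvoiceStatus)[keyof typeof InvoiceStatus];

// Value usage: invoice.status = InvoiceStatus.PAID
// Type usage:  function setStatus(s: InvoiceStatus) { ... }
```

Unlike a plain `type` union, the const object survives compilation, which is why the generate-monthly-invoice fix below can use these names as values.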
Port/Provider fixes:
- payment-provider.port.ts: make PaymentProviderType a const object (not
just a type), add PaymentSessionRequest alias, rename WebhookEvent with
correct field shape (type vs eventType), make providerPaymentId optional
- All 4 providers: replace PaymentSessionRequest→CreatePaymentParams,
fix amountCents→amount, remove sessionId from PaymentSession return,
add confirmPayment() stub, fix Stripe API version to '2023-10-16'
Use case fixes:
- aggregate-usage.use-case.ts: replace 'redis' with 'ioredis' (workspace
standard); rewrite using ioredis xreadgroup API
- change/check/generate use cases: fix Plan field names
(monthlyPriceCentsUsd, includedTokens, overageRateCentsPerMTokenUsd)
- generate-monthly-invoice: fix SubscriptionStatus/InvoiceCurrency as
values (now const objects)
- handle-payment-webhook: fix WebhookResult import, result.type usage,
payment.paidAt
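ioredis returns xreadgroup results as nested arrays; a sketch of consuming them (the group, consumer, and stream names are hypothetical, and the live call is shown only as a comment since it needs a Redis server):

```typescript
// ioredis xreadgroup replies have the shape:
//   [[streamKey, [[entryId, [field1, value1, field2, value2, ...]], ...]], ...]
// A typical call (requires a live Redis; names are illustrative):
//   const reply = await redis.xreadgroup(
//     'GROUP', 'billing', 'worker-1',
//     'COUNT', 10, 'BLOCK', 5000,
//     'STREAMS', 'usage-events', '>');

type StreamEntry = { id: string; fields: Record<string, string> };

function parseXReadGroupReply(reply: unknown): StreamEntry[] {
  const out: StreamEntry[] = [];
  for (const [, entries] of (reply ?? []) as [string, [string, string[]][]][]) {
    for (const [id, flat] of entries) {
      // Flatten the alternating field/value list into an object.
      const fields: Record<string, string> = {};
      for (let i = 0; i < flat.length; i += 2) fields[flat[i]] = flat[i + 1];
      out.push({ id, fields });
    }
  }
  return out;
}
```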
Controller/Repository fixes:
- plan.controller.ts, plan.repository.ts: fix Plan field names
- webhook.controller.ts: remove express import, use any for req type
- invoice-generator.service.ts: fix overageAmountCents→overageCentsUsd,
monthlyPriceCny→monthlyPriceFenCny, includedTokensPerMonth→includedTokens
Dependencies:
- billing-service/package.json: replace redis with ioredis dependency
- pnpm-lock.yaml: regenerated after ioredis addition
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Dockerfile.service: add COPY lines for billing-service/package.json in
  both the build and production stages so pnpm install includes its deps
  (the omission caused a 'node_modules missing' turbo build error)
- pnpm-lock.yaml: regenerated after running pnpm install to include all
billing-service dependencies (stripe, alipay-sdk, wechat-pay-v3, etc.)
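The added COPY lines would look roughly like this (the paths are assumptions based on a typical pnpm workspace Dockerfile layout):

```dockerfile
# Build stage: pnpm needs every workspace package.json present before
# `pnpm install`, or the billing-service deps are silently skipped.
COPY services/billing-service/package.json ./services/billing-service/

# Production stage: the same copy so the pruned install also resolves
# billing-service dependencies.
COPY services/billing-service/package.json ./services/billing-service/
```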
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- 005-create-billing-tables.sql: replace all `it0_shared.tenants` with
`public.tenants` and all `tenant_id VARCHAR(20)` with `tenant_id UUID`
to match the actual server DB schema (public schema, UUID primary key)
- packages/shared/testing src/test-utils.ts: add new quota fields
(maxServers, maxUsers, maxStandingOrders, maxAgentTokensPerMonth) to
TEST_TENANT mock to satisfy the extended TenantInfo interface, fixing
the @it0/testing TypeScript build error
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ensure /data/versions/android and /data/versions/ios directories are
created with correct appuser ownership during image build, fixing
EACCES permission error when version-service starts.
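The image-build fix presumably amounts to something like the following (the appuser name comes from the description above; the layout is an assumption):

```dockerfile
# Create the version directories at build time and hand ownership to the
# runtime user, so version-service can write without hitting EACCES.
RUN mkdir -p /data/versions/android /data/versions/ios \
    && chown -R appuser:appuser /data/versions
```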
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The entrypoint.sh expects dist/services/${SERVICE_NAME}/src/main, but
nest build with inline TypeORM config produces dist/main directly.
Using DatabaseModule from @it0/database forces tsc to emit the nested
path structure (since it references shared packages), matching the
entrypoint path convention used by all other services.
The service also gains SnakeNamingStrategy and autoLoadEntities from the shared module.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
voice-agent agent.py:
- Module docstring explains lk.agent.state lifecycle
(initializing → listening → thinking → speaking)
- Explains how RoomIO publishes state as participant attribute
- Documents BackgroundAudioPlayer with all available built-in clips
Flutter agent_call_page.dart:
- Documents _agentState field and all possible values
- Documents ParticipantAttributesChanged listener with UI mapping
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Import from livekit.agents.voice.background_audio submodule directly,
as it's not re-exported from livekit.agents.voice.__init__.py.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- voice-agent: enable BackgroundAudioPlayer with keyboard typing sound
during LLM thinking state (auto-plays when agent enters "thinking",
stops when "speaking" starts)
- Flutter: monitor lk.agent.state participant attribute from LiveKit
agent, show pulsing dots animation + "思考中..." text when thinking,
avatar border changes to warning color with pulsing glow ring
- Both call mode and chat mode headers show thinking state
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Detailed record of why livekit-plugins-speechmatics was removed:
- EXTERNAL: no FINAL_TRANSCRIPT (framework never sends FlushSentinel)
- ADAPTIVE: zero output (dual Silero VAD conflict)
- SMART_TURN: fragments Chinese speech into tiny pieces
- FIXED: finalize() async race condition with session teardown
All tested on 2026-03-03, none viable with LiveKit agents v1.4.4.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
SMART_TURN fragments continuous speech into tiny pieces, each triggering
an LLM request that aborts the previous one. FIXED mode waits for a
configurable silence duration (1.0s) before emitting FINAL_TRANSCRIPT
via the built-in END_OF_UTTERANCE handler.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Document all findings from the integration process directly in the
source code for future reference:
1. Language code mapping: Speechmatics uses ISO 639-3 "cmn" for
Mandarin, but LiveKit LanguageCode auto-normalizes it to "zh".
Must override stt._stt_options.language after construction.
2. Turn detection modes (critical):
- EXTERNAL: unusable — LiveKit never sends FlushSentinel, only
pushes silence frames, so FINAL_TRANSCRIPT never arrives
- ADAPTIVE: unusable — client-side Silero VAD conflicts with
LiveKit's own VAD, produces zero transcription output
- SMART_TURN: correct choice — server-side intelligent turn
detection, auto-emits FINAL_TRANSCRIPT, fully compatible
3. Speaker diarization: is_active flag distinguishes primary speaker
from TTS echo, solving the "speaker confusion" problem
4. Docker deployment: SPEECHMATICS_API_KEY in .env, watch for
COPY layer cache when rebuilding
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace EXTERNAL mode + monkey-patch hack with SMART_TURN mode.
SMART_TURN uses Speechmatics server-side turn detection that properly
emits AddSegment (FINAL_TRANSCRIPT) when the user finishes speaking.
No client-side finalize or debounce timer needed.
Ref: https://docs.speechmatics.com/integrations-and-sdks/livekit
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Speechmatics re-sends identical partial segments during silence, causing
the debounce timer to fire multiple times with the same text. Each
duplicate FINAL aborts the in-flight LLM request and restarts it.
Replace time-based cooldown with text comparison: skip finalization if
the segment text matches the last finalized text. Also skip starting
new timers when partial text hasn't changed from last finalized.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Reduce debounce delay from 700ms to 400ms for faster response
- Add 1.5s cooldown after emitting FINAL to prevent duplicate triggers
that cause LLM abort/retry cycles
- Enable speaker diarization (enable_diarization=True)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The LiveKit framework never sends FlushSentinel to the STT stream.
Instead it pushes silence frames and waits for FINAL_TRANSCRIPT events.
In EXTERNAL turn-detection mode, Speechmatics only emits partials.
New approach: each partial transcript restarts a 700ms debounce timer.
When partials stop (user stops speaking), the timer fires and promotes
the last partial to FINAL_TRANSCRIPT, unblocking the pipeline.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Trace _patched_process_audio lifecycle and FlushSentinel handling
to diagnose why final transcripts are not being promoted.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
VoiceAgentClient.finalize() schedules an async task chain that often
loses the race against session teardown. Instead, intercept partial
segments as they arrive, stash them, and synchronously emit them as
FINAL_TRANSCRIPT when FlushSentinel fires.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move the SpeechStream._process_audio patch from container runtime
into our own source code so it survives Docker rebuilds. The patch
adds client.finalize() on FlushSentinel so EXTERNAL mode produces
final transcripts when LiveKit's VAD detects end of speech.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
EXTERNAL mode produces partial transcripts but livekit-plugins-speechmatics
does not call finalize() when receiving a flush sentinel from the framework.
A runtime monkey-patch on the plugin's SpeechStream._process_audio adds the
missing finalize() call so final transcripts are generated.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Speechmatics handles end-of-utterance natively via its Voice Agent
API (ADAPTIVE mode). Use turn_detection="stt" on AgentSession so
LiveKit delegates turn boundaries to the STT engine instead of
conflicting with its own VAD-based turn detection.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ADAPTIVE mode enables a second client-side Silero VAD inside the
Speechmatics SDK that conflicts with LiveKit's own VAD pipeline,
causing no transcription to be returned. EXTERNAL mode delegates
turn detection to LiveKit.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
LiveKit's LanguageCode class normalizes ISO 639-3 codes to ISO 639-1
(cmn → zh), but Speechmatics API expects "cmn" not "zh". Override
the internal _stt_options.language after construction.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Backend APIs return arrays directly, not { data, total } wrappers.
Changed 21 interface declarations to type aliases matching actual
API response format.
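The shape change is the standard wrapper-to-array swap; for example (the Server fields are hypothetical):

```typescript
// Before: pages assumed a wrapped envelope that the backend never sent.
// interface ListServersResponse { data: Server[]; total: number }

// After: the backend returns the array directly, so a type alias matches reality.
interface Server { id: string; name: string } // fields are illustrative
type ListServersResponse = Server[];

const res: ListServersResponse = [{ id: 's1', name: 'web-01' }];
```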
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
All pages expected API responses in { data: [], total } format but
backend APIs return plain arrays. Changed data?.data ?? [] to data ?? []
across 22 page components.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When API returns 401, clear stored tokens and redirect to /login
instead of showing an error message.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The web-admin frontend was calling incorrect API paths that didn't match
the actual backend service routes through Kong gateway, causing all
requests to fail with 404 or route-mismatch errors.
URL corrections:
- servers: /api/v1/servers → /api/v1/inventory/servers
- runbooks: /api/v1/runbooks → /api/v1/ops/runbooks
- risk-rules: /api/v1/security/risk-rules → /api/v1/agent/risk-rules
- credentials: /api/v1/security/credentials → /api/v1/inventory/credentials
- roles: /api/v1/security/roles → /api/v1/auth/roles
- permissions: /api/v1/security/permissions → /api/v1/auth/permissions
- tenants: /api/v1/tenants → /api/v1/admin/tenants
- communication: /api/v1/communication → /api/v1/comm
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Problem:
- Text input area caused BOTTOM OVERFLOWED BY 135 PIXELS when keyboard opened
- Input bar overlapped with call control buttons
- Sent messages were not displayed on screen (only SnackBar feedback)
Solution — split into two distinct layouts:
1. Call Mode (default):
- Full-screen call UI: avatar, waveform, duration, large control buttons
- Keyboard button in controls toggles to chat mode
- No text input elements — clean voice-only interface
2. Chat Mode (tap keyboard button):
- Compact call header: green status dot + "iAgent" + duration + inline
mute/end/speaker/collapse controls
- Scrollable message list (Expanded widget — properly handles keyboard)
- User messages: right-aligned blue bubbles with attachment thumbnails
- Agent responses: left-aligned gray bubbles with robot avatar
- Input bar at bottom: attachment picker + text field + send button
Message display:
- User-sent text/attachments tracked in _messages list, shown as bubbles
- Agent responses sent back via LiveKit data channel (topic='text_reply')
from voice-agent → Flutter, displayed as assistant bubbles
- Auto-scroll to latest message
Voice-agent change (agent.py):
- After session.say(response), publish response text back to Flutter via
ctx.room.local_participant.publish_data() with topic='text_reply'
- Flutter listens for DataReceivedEvent to display agent responses
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>