Required by FastAPI for form/file upload parsing. Missing dependency
may cause import errors and container restart loops.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
TenantAwareRepository.getRepository() was calling createQueryRunner()
without ever releasing it, causing database connection pool exhaustion.
This caused ops-service (and eventually other services) to hang on
all API requests once the pool filled up.
Replaced getRepository() with withRepository() pattern that wraps
operations in try/finally to always release the QueryRunner.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Addresses reliability gaps in the real-time voice WebSocket connection
between Flutter client and Python voice-service backend.
Backend (voice-service):
- Heartbeat: new _heartbeat_sender coroutine sends JSON ping text frames
every 15s alongside the Pipecat pipeline; failed send = dead connection
- Session preservation: on WebSocket disconnect, sessions are now marked
"disconnected" with a timestamp instead of being deleted, allowing
reconnection within a configurable TTL (default 60s)
- Reconnect endpoint: POST /sessions/{id}/reconnect verifies the session
is alive and in "disconnected" state, returns fresh websocket_url
- Reconnect-aware WS handler: detects "disconnected" sessions, cancels
stale pipeline tasks, creates a new pipeline, sends "session.resumed"
- Background cleanup: asyncio loop every 30s removes sessions that have
been disconnected longer than session_ttl
- Structured event protocol: text frames = JSON control messages
(ping/pong/session.resumed/session.ended/error), binary = PCM audio
- New settings: session_ttl (60s), heartbeat_interval (15s),
heartbeat_timeout (45s)
Flutter (agent_call_page.dart):
- Heartbeat monitoring: tracks last server ping timestamp, triggers
reconnect if no ping received in 45s (3 missed intervals)
- Auto-reconnect: exponential backoff (1s→2s→4s→8s→16s), max 5 attempts;
calls /reconnect endpoint to verify session, rebuilds WebSocket,
resets audio buffer, restarts heartbeat
- Reconnecting UI: yellow warning banner "重新连接中... (N/5)" with
spinner overlay during reconnection attempts
- WebSocket data routing: _onWsData distinguishes String (JSON control)
from binary (audio) frames, handles ping/session.resumed/session.ended
- User-initiated disconnect guard: _userEndedCall flag prevents reconnect
attempts when user intentionally hangs up
- session_id field compatibility: supports session_id/sessionId/id
Flutter (pcm_player.dart):
- Jitter buffer: queues incoming PCM chunks, starts playback only after
accumulating 4800 bytes (150ms at 16kHz 16-bit mono) to smooth out
network timing variance
- reset() method: clears buffer on reconnect to discard stale audio
- Buffer underrun handling: re-enters buffering phase if queue empties
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
SDK blocks bypassPermissions when running as root for security.
Add non-root 'appuser' to Dockerfile.service and update volume
mounts to use /home/appuser/.claude paths.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- bypassPermissions blocked by SDK when running as root
- Switch to acceptEdits with canUseTool for programmatic control
- Mount .claude.json config file into container
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
In a Docker container without TTY, permissionMode 'default' blocks
waiting for interactive permission prompts. Switch to bypassPermissions
with canUseTool callback for programmatic risk-based access control.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
tsc with module=commonjs converts `await import()` to require(),
which breaks ESM-only packages. Use Function('return import()')
workaround to preserve native dynamic import at runtime.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Mount ~/.claude/ into agent-service container for OAuth token access
- Switch default engine to claude_agent_sdk
- Remove ANTHROPIC_API_KEY from env in subscription mode so SDK uses OAuth
- Keep API key mode for per-tenant billing
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Follow iConsulting pattern: set NODE_TLS_REJECT_UNAUTHORIZED=0 when
ANTHROPIC_BASE_URL is configured, enabling connection through the
self-signed proxy at 67.223.119.33.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Change AGENT_ENGINE_TYPE from claude_code_cli to claude_api in docker-compose
- Add ANTHROPIC_BASE_URL env var support to claude-api-engine
- Add ANTHROPIC_BASE_URL to agent-service environment in docker-compose
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add restart: unless-stopped to all 12 Docker services
- Add process.on(unhandledRejection/uncaughtException) to all 7 service main.ts
- Fix handleEventTrigger using tenantId UUID as schema name instead of slug lookup
- Wrap Redis event subscription callbacks in try/catch
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace traditional on-device speech_to_text with a modern pipeline:
- Record audio via `record` package with hardware noise suppression
- Apply GTCRN neural denoising (sherpa-onnx, ICASSP 2024, 48K params)
- Trim silence, POST to backend /voice/transcribe (faster-whisper)
Changes:
- Add /transcribe endpoint to voice-service for audio file upload
- Add SpeechEnhancer wrapper for sherpa-onnx GTCRN model (523KB)
- Rewrite chat_page.dart voice input: record → denoise → transcribe
- Keep NoiseReducer.trimSilence for silence removal only
- Upgrade record to v6.2.0, add sherpa_onnx, path_provider
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The previous approach split by semicolons then filtered statements starting
with '--', which incorrectly removed entire CREATE TABLE blocks that had
comment headers (e.g., '-- Agent Sessions\nCREATE TABLE...').
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Backend:
- Enhanced register endpoint to accept companyName for self-service
tenant creation with schema provisioning and admin user setup
- Added TenantInvite entity with token-based invitation system
- Added invite CRUD endpoints to TenantController (create/list/revoke)
- Added public endpoints for invite validation and acceptance
Frontend:
- Created registration page with optional organization name field
- Created invitation acceptance page at /invite/[token]
- Added invite management UI to tenant detail page
- Updated login page with link to registration
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement remaining backend controllers for all web admin menu pages:
- SettingsController: general, notification, theme, account, API keys
- RoleController: CRUD roles with permission assignment
- PermissionController: permission matrix for RBAC management
- MetricsController: server metrics overview and per-server data
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Map flat quota fields to nested quota object and add userCount field
to match the frontend's expected Tenant interface.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add UsersController to auth-service for user CRUD (GET/POST/PUT/DELETE /api/v1/auth/users)
- Add Kong route /api/v1/admin -> auth-service for tenant management
- Remove AuthGuard from TenantController (Kong handles JWT)
- Fix frontend agent-config API paths from /api/v1/agent/config to /api/v1/agent-config
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Kong validates the JWT but doesn't populate req.user on the backend.
The middleware now decodes the JWT payload to extract user info (id,
email, tenantId, roles) so RolesGuard can check role-based access.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Frontend alert-rules paths changed from /monitoring/alert-rules to
/monitor/alerts/rules to match backend routes
- Removed Kong ACL plugin on audit-routes (JWT auth is sufficient)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
TypeORM entities use camelCase properties (tenantId, passwordHash) but
database tables use snake_case columns (tenant_id, password_hash). The
naming strategy automatically converts between the two conventions.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The @it0/database package doesn't have @types/express, causing build
failures. Use any types for req/res/next parameters instead.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
All services using TenantAwareRepository require AsyncLocalStorage tenant
context to set the correct PostgreSQL search_path. The middleware reads
X-Tenant-Id from request headers and wraps the request with
TenantContextService.run(), using schema naming convention it0_t_{tenantId}.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Kong handles JWT validation at the gateway level. Service-level
AuthGuard('jwt') fails because services don't register a Passport
JWT strategy (only auth-service does). Removed from 17 controllers
across ops, inventory, monitor, comm, audit, and agent services.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace global JWT plugin with per-service JWT (skip auth-service)
to fix auth routes being blocked by global JWT in DB-less mode
- Fix UserRepository and ApiKeyRepository to use standard TypeORM
instead of TenantAwareRepository (users are global, not per-schema)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add kid claim to auth-service JWT for Kong validation
- Add Kong consumer with JWT credential (shared secret via env)
- Add agent-config route to Kong for /api/v1/agent-config
- Kong Dockerfile uses entrypoint script to inject JWT_SECRET at runtime
- Fix frontend login path (/auth/login → /api/v1/auth/login)
- Extract tenantId from JWT on login and store as current_tenant
- Add auth guard in admin layout (redirect to /login if no token)
- Pass JWT_SECRET env var to Kong container in docker-compose
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Update Kong CORS origins to allow it0.szaiai.com
- Update WebSocket URL to wss://it0api.szaiai.com
- Fix proxy route to read API_BASE_URL at request time
(was being inlined at build time by Next.js standalone)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace TenantAwareRepository with standard @InjectRepository
(TenantAwareRepository requires AsyncLocalStorage tenant context
middleware which agent-service does not have)
- Replace @TenantId() decorator with @Headers('x-tenant-id')
for direct HTTP header extraction
- Return defaults gracefully when no tenant is selected
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Agent-service does not have a registered Passport JWT strategy —
JWT validation is handled by Kong API gateway. The AuthGuard was
causing 500 "Unknown authentication strategy" errors on all
new controller endpoints.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement missing REST API endpoints that the web-admin frontend
pages were calling but had no backend support:
- GET/POST/PUT /api/v1/agent-config (engine, prompt, turns, budget, tools)
- GET/POST/PUT/DELETE /api/v1/agent/skills (CRUD for agent skills)
- GET/POST/PUT/DELETE /api/v1/agent/hooks (CRUD for hook scripts)
Each endpoint includes entity, repository, service, and controller
layers following the existing DDD + tenant-aware patterns.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Model downloads (Whisper, Kokoro, Silero VAD) are synchronous blocking
calls that prevent uvicorn from completing startup and responding to
healthchecks. Move all model loading to a daemon thread so the server
starts immediately.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Wrap model loading in try/except so server starts even if models fail
- Fix device env var mapping (unified 'device' field instead of 'whisper_device')
- Default Whisper model to 'base' instead of 'large-v3' (3GB) for CPU deployment
- Increase healthcheck start_period to 120s for model download time
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Dockerfile.service: fix entry point path (dist/services/{name}/src/main)
due to tsconfig paths widening rootDir during compilation
- Kong config: remove unsupported ws/wss protocols (WebSocket works
automatically over http/https in Kong 3.7)
- voice-service: fix pipecat import path for v0.0.30 API
(pipecat.transports.network.websocket_server with lowercase class names)
- voice-service: add openai dependency required by pipecat anthropic service
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
faster-whisper 1.0.0 depends on av==11.* which has no prebuilt wheels
and fails to compile. Version 1.2.1 uses av 12+ with prebuilt wheels.
Also removed unnecessary FFmpeg dev libraries from Dockerfile.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PyAV (av==11, dep of faster-whisper) requires pkg-config and
FFmpeg development headers to compile from source.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Server is on HK network, no need for China mirrors. Added
build-essential for compiling native Python packages (kokoro, etc).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
web-admin npm ci was timing out on the server. Added npmmirror.com
for npm and tsinghua mirror for pip to resolve network issues.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>