Commit Graph

76 Commits

Author SHA1 Message Date
hailin 67d5a13c0c fix: set compose project name to 'it0' for consistent image naming
Changes image names from docker-{service} to it0-{service}.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 02:57:42 -08:00
hailin 259838ae88 fix: set HOSTNAME=0.0.0.0 for Next.js standalone to bind all interfaces
Next.js standalone server binds to container hostname by default,
making it unreachable from 127.0.0.1 for healthchecks and from
Docker port forwarding.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 02:52:37 -08:00
hailin 83da374bbb fix: use 127.0.0.1 in web-admin healthcheck to avoid IPv6 resolution
Node.js 18 resolves 'localhost' to ::1 (IPv6) but Next.js standalone
only binds to 0.0.0.0 (IPv4), causing Connection Refused.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 02:49:51 -08:00
hailin a06b489a1e fix: load voice models in background thread to unblock startup
Model downloads (Whisper, Kokoro, Silero VAD) are synchronous blocking
calls that prevent uvicorn from completing startup and responding to
healthchecks. Move all model loading to a daemon thread so the server
starts immediately.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 00:26:06 -08:00
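
Note on a06b489a1e: the background-loading approach this commit describes can be sketched roughly as follows. This is a minimal illustration, not the service's actual code. The names `models`, `load_errors`, `_load_model`, and `load_models_in_background` are hypothetical, and the stand-in loader only sleeps where the real service would download Whisper, Kokoro, or Silero VAD.

```python
import threading
import time

# Hypothetical registries; the real service's model objects are not shown in the log.
models = {}
load_errors = {}


def _load_model(name):
    """Stand-in for a blocking model download/initialisation call."""
    time.sleep(0.01)
    return f"{name}-loaded"


def load_models_in_background(names, loader=_load_model):
    """Start a daemon thread that loads each model in turn and return the thread."""

    def worker():
        for name in names:
            try:
                models[name] = loader(name)
            except Exception as exc:
                # Record the failure instead of crashing, so the server stays up.
                load_errors[name] = exc

    thread = threading.Thread(target=worker, name="model-loader", daemon=True)
    thread.start()
    return thread
```

Calling `load_models_in_background(["whisper", "kokoro", "silero_vad"])` from a startup hook returns immediately, so the server can answer healthchecks while loading continues. Request handlers would then need to check whether a given entry in `models` is present before using it.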
hailin 3702fa3f52 fix: make voice-service startup graceful and fix device config
- Wrap model loading in try/except so server starts even if models fail
- Fix device env var mapping (unified 'device' field instead of 'whisper_device')
- Default Whisper model to 'base' instead of 'large-v3' (3GB) for CPU deployment
- Increase healthcheck start_period to 120s for model download time

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 00:20:12 -08:00
hailin d0447fb69f fix: use node/python HTTP healthchecks instead of wget
wget exits with an error on a 404 response, but the services are healthy
(they simply have no root endpoint). Use node http.get for NestJS services
(accepting any non-5xx response) and Python urllib for voice-service.

Also upgraded api-gateway depends_on to service_healthy.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 00:13:47 -08:00
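
Note on d0447fb69f: a healthcheck that accepts any non-5xx response, as this commit describes for voice-service, might look like the sketch below. The function names `status_is_healthy` and `is_healthy`, the default timeout, and the example URL are assumptions for illustration. The actual compose healthcheck command is not shown in the log.

```python
import urllib.error
import urllib.request


def status_is_healthy(code):
    """Treat any status below 500 as proof that the server is up."""
    return code < 500


def is_healthy(url, timeout=3.0):
    """Return True if the server at url answers with a non-5xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return status_is_healthy(resp.status)
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; a 404 still means the process is serving.
        return status_is_healthy(err.code)
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: not healthy.
        return False
```

A container healthcheck could call something like `is_healthy("http://127.0.0.1:<port>/")` and exit with a non-zero status when it returns False. The port placeholder is left open because the log does not state the voice-service port.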
hailin e7ae82e51d feat: add healthcheck to all services in docker-compose
NestJS services use wget to check API endpoints.
voice-service uses curl to check FastAPI /docs endpoint.
web-admin uses wget to check Next.js root.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 00:10:38 -08:00
hailin b620898bc8 fix: revert to node:18 (cached), enable crypto via NODE_OPTIONS
Docker Hub is unreachable from server, so node:20 can't be pulled.
Reverting to node:18-alpine (already cached) and using
--experimental-global-webcrypto to enable globalThis.crypto.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 19:17:23 -08:00
hailin bbb288025a fix: upgrade to Node.js 20 for globalThis.crypto support
crypto.randomUUID() is used throughout services but crypto is not
a global in Node.js 18. Node.js 20 provides globalThis.crypto.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 19:15:36 -08:00
hailin 39718a9a09 fix: resolve runtime errors for NestJS, Kong, and voice-service
- Dockerfile.service: fix entry point path (dist/services/{name}/src/main)
  due to tsconfig paths widening rootDir during compilation
- Kong config: remove unsupported ws/wss protocols (WebSocket works
  automatically over http/https in Kong 3.7)
- voice-service: fix pipecat import path for v0.0.30 API
  (pipecat.transports.network.websocket_server with lowercase class names)
- voice-service: add openai dependency required by pipecat anthropic service

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 19:00:03 -08:00
hailin 93c4a21f06 fix: upgrade faster-whisper to 1.2.1 to resolve av build failure
faster-whisper 1.0.0 depends on av==11.* which has no prebuilt wheels
and fails to compile. Version 1.2.1 uses av 12+ with prebuilt wheels.
Also removed unnecessary FFmpeg dev libraries from Dockerfile.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 16:40:04 -08:00
hailin 6deaf16365 fix: add pkg-config and FFmpeg dev libs for PyAV build
PyAV (av==11, dep of faster-whisper) requires pkg-config and
FFmpeg development headers to compile from source.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 05:20:37 -08:00
hailin c0b4f77de5 fix: remove China mirrors, add build-essential for voice-service
The server is on an HK network, so the China mirrors are not needed. Added
build-essential for compiling native Python packages (kokoro, etc.).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 05:11:39 -08:00
hailin 9a95cdc4a9 fix: update numpy to 1.26.4 for pipecat-ai compatibility
pipecat-ai==0.0.30 requires numpy~=1.26.4, conflicting with 1.26.0.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 05:09:01 -08:00
hailin da01571c1b fix: remove COPY public from web-admin Dockerfile
The public directory doesn't exist in the project, causing
Docker build to fail with "not found" error.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 05:03:53 -08:00
hailin b382e6e469 fix: add China registry mirrors for npm and pip in Dockerfiles
web-admin npm ci was timing out on the server. Added npmmirror.com
for npm and tsinghua mirror for pip to resolve network issues.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 04:59:09 -08:00
hailin 84b3e5ff7b fix: update pnpm-lock.yaml for @it0/testing dependency change
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 04:51:04 -08:00
hailin e7570a3710 fix: add missing @it0/common dependency to @it0/testing
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 04:48:49 -08:00
hailin f8bf230f14 fix: rename turbo.json pipeline to tasks for Turbo 2.x compatibility
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 04:44:25 -08:00
hailin ee1ee7b484 fix: remove non-existent scripts/ COPY from voice-service Dockerfile
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 04:39:02 -08:00
hailin 4db373b03f .
2026-02-19 20:37:19 +08:00
hailin e875cd49bb fix: resolve Kong image tag and port conflicts for shared server
- Change Kong base image from kong:3.7-alpine (non-existent) to kong:3.7
- Remap all host ports to avoid conflicts with existing iconsulting services:
  - Backend services: 13001-13008 (was 3001-3008)
  - Web admin: 13000 (was 3000)
  - API gateway: 18000/18001 (was 8000/8001)
  - PostgreSQL: 15432 (was 5432)
  - Redis: 16379 (was 6379)
- Add container_name with it0- prefix to all services
- Update deploy.sh health check ports to match new mappings

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 04:36:23 -08:00
hailin 9120f4927e fix: add Dockerfiles and fix docker-compose build configuration
- Add shared Dockerfile.service for all 7 NestJS microservices using
  multi-stage build with pnpm workspace support
- Add Dockerfile for web-admin (Next.js standalone output)
- Add .dockerignore files for root and web-admin
- Fix docker-compose.yml: use monorepo root as build context with
  SERVICE_NAME build arg instead of per-service Dockerfiles
- Fix postgres/redis missing network config (services couldn't reach them)
- Use .env variables for DB credentials instead of hardcoded values
- Add JWT_REFRESH_SECRET and REDIS_URL to services that were missing them
- Add DB init script volume mount for postgres
- Remove deprecated version: '3.8' from all compose files
- Add output: 'standalone' to next.config.js for optimized Docker builds

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 04:31:23 -08:00
hailin 8116f17fd0 docs: add comprehensive deployment guide
Add deployment-guide.md covering build, deployment, and operations
for the entire IT0 platform including all microservices, web admin,
Flutter app, and voice service.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 16:54:00 -08:00
hailin e761b65b6e feat: add deployment scripts with SSL support for production
Backend deploy script (deploy/docker/deploy.sh):
- install: auto-generate .env with secure secrets (JWT, DB passwords, vault keys)
- up/down/restart: manage all services (infra + app + gateway)
- build/build-no-cache: Docker image management
- status/health: health checks for all 9 services + infrastructure
- migrate: TypeORM migration commands (run/generate/revert/schema-sync)
- infra-*: standalone infrastructure management (PostgreSQL + Redis)
- voice-*: voice service with GPU support (docker-compose.voice.yml overlay)
- start-svc/stop-svc/rebuild-svc: individual service operations
- ssl-init: obtain Let's Encrypt certificates for both domains independently
- ssl-up/ssl-down: start/stop with Nginx SSL reverse proxy
- ssl-renew/ssl-status: certificate renewal and status checks

Web Admin deploy script (it0-web-admin/deploy.sh):
- build/start/stop/restart/logs/status/clean commands
- auto-generates Dockerfile (Next.js multi-stage standalone build)
- auto-generates docker-compose.yml
- configurable API domain (default: it0api.szaiai.com)

SSL / Nginx configuration:
- nginx.conf: reverse proxy for both domains with HTTP->HTTPS redirect
  - it0api.szaiai.com -> api-gateway:8000 (with WebSocket support)
  - it0.szaiai.com -> web-admin:3000 (with Next.js HMR support)
- nginx-init.conf: HTTP-only config for initial ACME challenge verification
- ssl-params.conf: TLS 1.2/1.3, HSTS, security headers (Mozilla Intermediate)
- docker-compose.ssl.yml: Nginx + Certbot overlay with auto-renewal (12h cycle)

Domain plan:
- https://it0api.szaiai.com — API endpoint (backend services)
- https://it0.szaiai.com — Web Admin dashboard (frontend)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 17:44:27 -08:00
hailin 00f8801d51 Initial commit: IT0 AI-powered server cluster operations platform
Full-stack monorepo with DDD + Clean Architecture:
- Backend: 7 NestJS microservices + 5 shared libraries (TypeScript)
- Mobile: Flutter app with Riverpod (Dart)
- Web Admin: Next.js dashboard with Zustand + React Query
- Voice: Python voice service (STT/TTS/VAD)
- Infra: Docker Compose, K8s manifests, Turborepo build

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 22:54:37 -08:00