Commit Graph

7 Commits

Author SHA1 Message Date
hailin 3cb9ebd407 fix: release QueryRunner connections to prevent pool exhaustion
TenantAwareRepository.getRepository() was calling createQueryRunner()
without ever releasing it, causing database connection pool exhaustion.
This caused ops-service (and eventually other services) to hang on
all API requests once the pool filled up.

Replaced getRepository() with withRepository() pattern that wraps
operations in try/finally to always release the QueryRunner.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 15:55:06 -08:00
hailin 9a1ecf10ec fix: add restart policy, global error handlers, and fix tenant schema bug
- Add restart: unless-stopped to all 12 Docker services
- Add process.on(unhandledRejection/uncaughtException) to all 7 service main.ts
- Fix handleEventTrigger using tenantId UUID as schema name instead of slug lookup
- Wrap Redis event subscription callbacks in try/catch

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 05:30:34 -08:00
hailin 1c291ce6c0 fix: Scheduler 用 slug 构建 tenant schema 名
tenant schema 是 it0_t_{slug}(如 it0_t_default),不是 it0_t_{uuid}。

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 05:03:04 -08:00
hailin 09274aa6af fix: Scheduler 查询 public.tenants 而非 it0_shared.tenants
数据库实际 schema 是 public.tenants,不是 it0_shared.tenants。

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 05:00:17 -08:00
hailin 840318f449 fix: Scheduler 缺少 tenant 上下文导致 ops-service 卡死
根因: @Cron 定时任务在 HTTP 请求上下文之外运行,
TenantAwareRepository 需要 AsyncLocalStorage 中的 tenant 信息,
每分钟抛 "Tenant context not initialized" 错误。

修复:
- scanCronOrders: 查 it0_shared.tenants 获取所有活跃租户,
  在 TenantContextService.run() 上下文中逐一执行
- handleEventTrigger: 从 Redis event 中提取 tenantId,
  同样包裹在 TenantContextService.run() 中
- 每个 tenant 循环加 try/catch 防止单个租户出错影响其他

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 04:55:52 -08:00
hailin 806113554b fix: remove AuthGuard('jwt') from all service controllers
Kong handles JWT validation at the gateway level. Service-level
AuthGuard('jwt') fails because services don't register a Passport
JWT strategy (only auth-service does). Removed from 17 controllers
across ops, inventory, monitor, comm, audit, and agent services.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 23:42:37 -08:00
hailin 00f8801d51 Initial commit: IT0 AI-powered server cluster operations platform
Full-stack monorepo with DDD + Clean Architecture:
- Backend: 7 NestJS microservices + 5 shared libraries (TypeScript)
- Mobile: Flutter app with Riverpod (Dart)
- Web Admin: Next.js dashboard with Zustand + React Query
- Voice: Python voice service (STT/TTS/VAD)
- Infra: Docker Compose, K8s manifests, Turborepo build

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 22:54:37 -08:00