Go to file
hailin 50dbb641a3 fix: comprehensive hardening of agent task cancel/inject/approve flows
6 rounds of systematic audit identified and fixed 14 bugs across
backend controller and Flutter client:

## Backend (agent.controller.ts)

Security & Tenant Isolation:
- Add @TenantId + ForbiddenException check to cancelTask, injectMessage,
  approveCommand — all 4 write endpoints now enforce tenant isolation
- Add tenantId check on session reuse in executeTask to prevent
  cross-tenant session hijacking

Architecture & Correctness:
- Extract shared runTaskStream() from inline fire-and-forget block,
  used by both executeTask and injectMessage to reduce duplication
- Use session.engineType (not getActiveEngine()) in cancelTask,
  injectMessage, approveCommand — fixes wrong-engine-cancel when
  global engine config is switched after task creation
- Add concurrent task prevention: executeTask checks for existing
  RUNNING task on same session and cancels it before starting new one
- Add runningTasks Map to track task promises, awaitTaskCleanup()
  helper with 3s timeout for inject to wait for partial text save
- captureSdkSessionId() captures SDK session ID into metadata
  without DB save (callers persist), preventing fire-and-forget race

Cancel/Reject Improvements:
- cancelTask: idempotent (returns early if already CANCELLED/COMPLETED),
  session stays 'active' (was 'cancelled'), emits cancelled WS event
- approveCommand reject: session stays 'active' (was 'cancelled'),
  now emits cancelled WS event so Flutter stream listeners clean up
- approveCommand approved: collect text events and save assistant
  response to conversation history on completion (was missing)

Minor:
- task.result! non-null assertion → task.result ?? 'Unknown error'
- Add findRunningBySessionId() to TaskRepository

## Flutter

API Contract Fix:
- approveCommand: route changed from /api/v1/ops/approvals/:id/approve
  to /api/v1/agent/tasks/:id/approve with {approved: true} body
- rejectCommand: route changed from /api/v1/ops/approvals/:id/reject
  to /api/v1/agent/tasks/:id/approve with {approved: false} body

Resource Management:
- ChatNotifier.dispose() now disconnects WebSocket to prevent
  connection leak when navigating away from chat

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 22:20:46 -08:00
deploy refactor: clean up agent SSH setup after fixing host-local routing 2026-02-26 18:11:44 -08:00
docs docs: add comprehensive deployment guide 2026-02-18 16:54:00 -08:00
it0-web-admin feat: complete tenant member management (CRUD + delete tenant) 2026-02-26 10:00:09 -08:00
it0_app fix: comprehensive hardening of agent task cancel/inject/approve flows 2026-02-27 22:20:46 -08:00
packages fix: comprehensive hardening of agent task cancel/inject/approve flows 2026-02-27 22:20:46 -08:00
.dockerignore fix: add Dockerfiles and fix docker-compose build configuration 2026-02-19 04:31:23 -08:00
.env.example Initial commit: IT0 AI-powered server cluster operations platform 2026-02-08 22:54:37 -08:00
.gitignore fix: 修复 .gitignore 误忽略 Flutter data/models/ 源码导致构建失败 2026-02-22 16:29:03 -08:00
Dockerfile.service refactor: clean up agent SSH setup after fixing host-local routing 2026-02-26 18:11:44 -08:00
README.md Initial commit: IT0 AI-powered server cluster operations platform 2026-02-08 22:54:37 -08:00
entrypoint.sh refactor: clean up agent SSH setup after fixing host-local routing 2026-02-26 18:11:44 -08:00
logo.svg feat: rename app from IT0 to iAgent (我智能体) 2026-02-22 06:39:40 -08:00
package.json Initial commit: IT0 AI-powered server cluster operations platform 2026-02-08 22:54:37 -08:00
pnpm-lock.yaml chore: upgrade claude-agent-sdk to ^0.2.52 2026-02-24 04:12:03 -08:00
pnpm-workspace.yaml Initial commit: IT0 AI-powered server cluster operations platform 2026-02-08 22:54:37 -08:00
tsconfig.base.json Initial commit: IT0 AI-powered server cluster operations platform 2026-02-08 22:54:37 -08:00
turbo.json fix: rename turbo.json pipeline to tasks for Turbo 2.x compatibility 2026-02-19 04:44:25 -08:00

README.md

IT0 — AI-Powered Server Cluster Operations Platform

Intelligent operations platform that combines AI agents with human oversight for managing server clusters.

Architecture

  • Backend: NestJS microservices (TypeScript) with DDD + Clean Architecture
  • Mobile: Flutter app with Riverpod state management
  • Web Admin: Next.js dashboard with Zustand + React Query
  • Voice: Python service for voice-based interaction (STT/TTS/VAD)

Services

Service Description
auth-service Authentication, RBAC, API key management
agent-service AI agent orchestration (Claude CLI + API)
inventory-service Server, cluster, credential management
monitor-service Metrics collection, alerting, health checks
ops-service Task execution, approvals, standing orders
comm-service Multi-channel notifications, escalation
audit-service Audit logging, compliance trail
voice-service Voice pipeline (Python)

Quick Start

# Backend
pnpm install
pnpm dev

# Flutter
cd it0_app && flutter pub get && flutter run

# Web Admin
cd it0-web-admin && pnpm install && pnpm dev

Tech Stack

  • Runtime: Node.js 20+, Dart 3.x, Python 3.11+
  • Database: PostgreSQL (schema-per-tenant)
  • Cache/Events: Redis Streams
  • AI: Anthropic Claude (CLI + API)
  • Build: pnpm workspaces + Turborepo