Commit Graph

2 Commits

Author SHA1 Message Date
hailin 1a8fc81549 fix: move limit_req_zone to http context (conf.d) 2026-03-06 04:49:27 -08:00
hailin c56ae7bb7a feat: add production inference API (FastAPI + Celery + Redis + NGINX)
- api/: FastAPI app with /v1/generate, /v1/jobs/{id}, /v1/videos/{id}
- api/tasks.py: Celery worker; each GPU gets its own worker process
- deploy/: systemd units (opensora-api, opensora-worker@), nginx.conf, setup.sh
- Architecture: NGINX → Gunicorn/FastAPI → Redis → 8× Celery workers (GPU 0-7)
- Each task runs as a torchrun --nproc_per_node=1 subprocess, fully isolated per GPU

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 04:40:10 -08:00
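
The per-GPU isolation described in commit c56ae7bb7a can be sketched roughly as follows. This is a minimal illustration, not the code in api/tasks.py: the function names, the script name infer.py, and the per-GPU master port offset are all hypothetical. The sketch assumes isolation is achieved by setting CUDA_VISIBLE_DEVICES so that each subprocess sees exactly one device, and it gives each launch its own rendezvous port on the assumption that concurrent torchrun processes on one host should not share one.

```python
import os
import subprocess

NUM_GPUS = 8            # from the commit message: GPU 0-7
BASE_MASTER_PORT = 29500  # assumption: offset per GPU to avoid port clashes


def build_torchrun_invocation(gpu_id, script, script_args=(), base_env=None):
    """Return (cmd, env) for a single-process torchrun pinned to one GPU.

    Hypothetical helper: the real worker may build its command differently.
    """
    if not 0 <= gpu_id < NUM_GPUS:
        raise ValueError(f"gpu_id must be in 0..{NUM_GPUS - 1}, got {gpu_id}")

    # Copy the environment so the parent worker process is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    # The child sees only this one device, which it addresses as cuda:0.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)

    cmd = [
        "torchrun",
        "--nproc_per_node=1",
        f"--master_port={BASE_MASTER_PORT + gpu_id}",
        script,
        *script_args,
    ]
    return cmd, env


def run_generation(gpu_id, script, script_args=(), timeout=None):
    """Launch the subprocess and raise CalledProcessError on a non-zero exit."""
    cmd, env = build_torchrun_invocation(gpu_id, script, script_args)
    return subprocess.run(cmd, env=env, check=True, timeout=timeout)
```

A Celery task bound to GPU 3 could then call run_generation(3, "infer.py", ["--prompt", "a cat"]). Running the model in a child process means a crash or out-of-memory error ends only that job, and GPU memory is released when the subprocess exits.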