hailin
|
c56ae7bb7a
|
feat: add production inference API (FastAPI + Celery + Redis + NGINX)
- api/: FastAPI app with /v1/generate, /v1/jobs/{id}, /v1/videos/{id}
- api/tasks.py: Celery worker; each GPU gets its own worker process
- deploy/: systemd units (opensora-api, opensora-worker@), nginx.conf, setup.sh
- Architecture: NGINX → Gunicorn/FastAPI → Redis → 8× Celery workers (GPU 0-7)
- Each task runs a torchrun --nproc_per_node=1 subprocess, fully isolated per GPU
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-03-06 04:40:10 -08:00 |
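Note on the per-GPU isolation described in the commit message above: the following is a minimal sketch of how a worker might launch one torchrun subprocess pinned to a single GPU. It is not the code in api/tasks.py. The helper names (build_gpu_env, build_torchrun_cmd, run_isolated), the script path passed in, and the per-GPU port offset are all hypothetical. It assumes isolation is done through CUDA_VISIBLE_DEVICES and that each concurrent torchrun needs its own --master_port to avoid collisions on one host.

```python
import os
import subprocess


def build_gpu_env(gpu_id, base_env=None):
    """Return a copy of the environment that exposes only one GPU.

    Assumption: isolation is achieved via CUDA_VISIBLE_DEVICES, so the
    child process sees the chosen device as cuda:0.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env


def build_torchrun_cmd(script, script_args, master_port):
    """Build a single-process torchrun command line."""
    return [
        "torchrun",
        "--nproc_per_node=1",
        f"--master_port={master_port}",
        script,
        *script_args,
    ]


def run_isolated(gpu_id, script, script_args, base_port=29500):
    """Run one inference job on one GPU and wait for it to finish.

    Hypothetical port scheme: base_port + gpu_id, so eight workers on
    GPUs 0-7 never share a rendezvous port.
    """
    cmd = build_torchrun_cmd(script, script_args, base_port + gpu_id)
    # check=True raises CalledProcessError if the subprocess fails,
    # which a task wrapper could turn into a failed job status.
    return subprocess.run(cmd, env=build_gpu_env(gpu_id), check=True)
```

In a setup like the one described, a Celery task bound to GPU 3 would call something like run_isolated(3, "path/to/inference_script.py", ["--prompt", "..."]), with the script path and arguments supplied by the caller.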