feat: enable OpenAI Realtime STT for streaming speech recognition

Switch from batch STT (gpt-4o-transcribe via /audio/transcriptions)
to the streaming Realtime API over WebSocket. This removes the ~2s
upload-and-process latency that each utterance incurred in batch mode.

Also updated the nginx proxy on 67.223.119.33 to support the WebSocket
upgrade for the /v1/realtime endpoint.
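
For reference, a minimal sketch of what the proxy has to pass through for
the WebSocket upgrade to succeed, assuming a standard RFC 6455 handshake.
The helper names (expected_accept, build_upgrade_request, is_upgrade_ok)
and the host used in the usage note are hypothetical and not part of this
repository; the sketch only builds and checks handshake text and opens no
connection.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for deriving Sec-WebSocket-Accept.
_WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value a server must send for this key."""
    digest = hashlib.sha1((sec_websocket_key + _WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_upgrade_request(host: str, path: str, key: str) -> str:
    """Build the HTTP/1.1 request that asks for a WebSocket upgrade."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",      # a proxy must forward this header
        "Connection: Upgrade",     # and this one, or the upgrade is lost
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines)


def is_upgrade_ok(response_head: str, key: str) -> bool:
    """Check that a response head is a valid 101 upgrade for the given key."""
    status_line, _, rest = response_head.partition("\r\n")
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or parts[1] != "101":
        return False
    headers = {}
    for line in rest.split("\r\n"):
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return (
        headers.get("upgrade", "").lower() == "websocket"
        and headers.get("sec-websocket-accept") == expected_accept(key)
    )
```

A request built with build_upgrade_request("proxy.example", "/v1/realtime", key)
should come back with a 101 status once the proxy forwards the Upgrade and
Connection headers; any other status fails is_upgrade_ok.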

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
hailin 2026-03-01 07:49:25 -08:00
parent e302891f16
commit ba83e433d3
1 changed file with 1 addition and 0 deletions

@@ -148,6 +148,7 @@ async def entrypoint(ctx: JobContext) -> None:
model=settings.openai_stt_model,
language=settings.whisper_language,
client=_oai_client,
+use_realtime=True,
)
else:
stt = LocalWhisperSTT(