Ant Afu tends to give long replies, which cause TTS queue delays.
Append "请用2-3句话简短回答" ("Please answer briefly in 2-3 sentences")
to reduce response length.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
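
A minimal sketch of the brevity hint described above. The helper name
`with_brevity_hint`, the newline separator, and the choice to append to the
user query (rather than to a system prompt) are assumptions made for
illustration; the commit only states which string is appended.

```python
# Suffix taken from the commit message: "Please answer briefly in 2-3 sentences".
BREVITY_SUFFIX = "请用2-3句话简短回答"


def with_brevity_hint(query: str, suffix: str = BREVITY_SUFFIX) -> str:
    """Return the query with the brevity suffix appended exactly once."""
    query = query.strip()
    if query.endswith(suffix):
        # Already carries the hint; do not append it a second time.
        return query
    # Assumption: a newline keeps the hint separate from the user's own text.
    return f"{query}\n{suffix}"
```

Making the helper idempotent means a retried request does not accumulate
repeated suffixes.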
The model outputs 44100 Hz audio, but the device expects 24000 Hz via
Opus. Without resampling, audio plays at the wrong speed, causing 29 s
delays between segments. Verified: synthesis plus resampling takes
0.38 s for 1.6 s of audio.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
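
The 44100 Hz to 24000 Hz conversion described above could look roughly like
this. It is a sketch using plain linear interpolation on mono float samples,
not necessarily the resampler the commit uses (the commit does not name one),
and it applies no anti-aliasing filter. The function name `resample_linear`
is hypothetical.

```python
from typing import List, Sequence


def resample_linear(samples: Sequence[float], src_rate: int, dst_rate: int) -> List[float]:
    """Resample mono samples from src_rate to dst_rate by linear interpolation."""
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    if src_rate == dst_rate or len(samples) == 0:
        return list(samples)

    # The output covers the same duration at the new rate.
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # source samples advanced per output sample
    last = len(samples) - 1

    out: List[float] = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), last)   # left neighbour, clamped to the buffer
        hi = min(lo + 1, last)     # right neighbour, clamped to the buffer
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out
```

For example, `resample_linear(pcm, 44100, 24000)` would turn one second of
model output (44100 samples) into the 24000 samples the device expects for
the same second. A band-limited resampler would be the better choice where
audio quality matters.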
Benchmark: short=0.37s, long=1.06s with 8 CPU threads.
GPU is not available in the pip build of sherpa-onnx; CPU is fast enough.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Filter "完成资料引用" ("finished citing sources") and other status text from Antaf responses
- Use int8 quantized model for faster TTS inference
- Add configurable num_threads for sherpa-onnx
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
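
A hedged sketch of the status-text filter from the first bullet above. Only
the phrase "完成资料引用" comes from the commit; the names `STATUS_PHRASES`
and `strip_status_text` and the plain substring-removal approach are
assumptions, and the "other status text" would need to be added to the tuple.

```python
from typing import Iterable

# Only this phrase is named in the commit; extend with other status strings.
STATUS_PHRASES = ("完成资料引用",)


def strip_status_text(reply: str, phrases: Iterable[str] = STATUS_PHRASES) -> str:
    """Remove known status phrases so they are never sent to TTS."""
    for phrase in phrases:
        reply = reply.replace(phrase, "")
    # Drop lines that became empty once the status text was removed.
    lines = [line.strip() for line in reply.splitlines()]
    return "\n".join(line for line in lines if line)
```

Filtering before synthesis keeps the status line out of the TTS queue
entirely, so it does not cost any inference time.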
Offline VITS TTS using sherpa-onnx, no network dependency.
Uses vits-melo-tts-zh_en model for Chinese/English.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>