- Filter "完成资料引用" ("reference citation complete") and other status text from Antaf responses
- Use int8 quantized model for faster TTS inference
- Add configurable num_threads for sherpa-onnx
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
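
A minimal sketch of the status-text filter from the first bullet above. Only the phrase "完成资料引用" comes from this commit; the `STATUS_PHRASES` list and the `strip_status_text` name are hypothetical, and the real filter may match differently.

```python
import re

# Hypothetical phrase list: only the first entry is named in the commit message.
STATUS_PHRASES = ["完成资料引用"]

# Escape each phrase so it is matched literally, then join into one pattern.
_STATUS_RE = re.compile("|".join(re.escape(p) for p in STATUS_PHRASES))


def strip_status_text(text: str) -> str:
    """Remove known status phrases and trim surrounding whitespace."""
    return _STATUS_RE.sub("", text).strip()
```

Keeping the phrases in one list means further status strings can be added without touching the filter logic.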
Offline VITS TTS using sherpa-onnx, no network dependency.
Uses vits-melo-tts-zh_en model for Chinese/English.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ASR wraps user speech as JSON {"content":"...", "language":"zh", "emotion":"..."}.
Extract only the content field instead of sending the raw JSON to the Antaf bridge.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
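
The content extraction described above could look roughly like this. The function name `extract_asr_content` is hypothetical, and falling back to the raw string when the input is not the expected JSON object is an assumption, not something the commit states.

```python
import json


def extract_asr_content(raw: str) -> str:
    """Return the "content" field of an ASR JSON payload, or raw unchanged."""
    try:
        payload = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        # Not JSON at all: treat the input as plain text.
        return raw
    if isinstance(payload, dict) and isinstance(payload.get("content"), str):
        return payload["content"]
    # Valid JSON but not the expected shape: pass it through untouched.
    return raw
```

The fallback keeps plain-text input working if ASR is ever configured to return unwrapped text.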
The xiaozhi server injects tool_call reminders and system prompts into the
dialogue as role=user messages. These were being picked up as the
"last user message" and sent to the Antaf bridge instead of the actual query.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
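
One way to skip the injected messages described above is to walk the dialogue backwards and ignore user messages that carry a known marker. This is a sketch under assumptions: the `INJECTED_MARKERS` strings and the dict-shaped messages are hypothetical, since the commit does not say how injected messages are recognized.

```python
from typing import Optional

# Hypothetical markers; the real strings depend on what the server injects.
INJECTED_MARKERS = ("[tool_call reminder]", "[system prompt]")


def last_real_user_message(dialogue: list[dict]) -> Optional[str]:
    """Return the newest role=user message that is not an injected one."""
    for message in reversed(dialogue):
        if message.get("role") != "user":
            continue
        content = message.get("content") or ""
        if not content.strip():
            continue  # skip empty user messages
        if any(marker in content for marker in INJECTED_MARKERS):
            continue  # skip injected reminders and prompts
        return content
    return None
```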
Ant Afu returns its internal reasoning mixed in with the actual response
text, which caused TTS to read the internal monologue aloud. Also fixes
text chunks being sent more than once.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
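
A rough sketch of the two fixes above: dropping reasoning text and suppressing repeated chunks. The `<think>...</think>` delimiter is purely an assumption for illustration, as the commit does not say how Ant Afu marks its reasoning. `clean_chunks` is a hypothetical name.

```python
import re

# Assumption: reasoning is wrapped in <think>...</think>. The real
# delimiter used by Ant Afu is not stated in the commit message.
_THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)


def clean_chunks(chunks: list[str]) -> list[str]:
    """Strip reasoning spans and drop chunks that were already emitted."""
    seen = set()
    cleaned = []
    for chunk in chunks:
        text = _THINK_RE.sub("", chunk).strip()
        if not text or text in seen:
            continue  # nothing left to speak, or already sent
        seen.add(text)
        cleaned.append(text)
    return cleaned
```

Note that a seen-set also drops a sentence the model legitimately repeats within one response; comparing only against the previous chunk would be a more conservative choice.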
Two approaches to replace self-hosted Qwen3-32B with Ant Afu AI:
- Plan A: Custom LLM Provider (text API via Frida HTTP Bridge)
- Plan B: Full voice passthrough (audio injection via voice bridge)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>