Commit Graph

11 Commits

Author SHA1 Message Date
hailin 83cdf3396d fix: use full onnx model with 8 threads for fast local TTS
Benchmark: short=0.37s, long=1.06s with 8 CPU threads.
GPU not available in pip sherpa-onnx, CPU is fast enough.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 22:53:59 -07:00
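The commit above quotes timings for a short and a long input but does not say how they were measured. A minimal timing sketch, assuming a best-of-N wall-clock measurement; `time_synthesis` and `fake_synthesize` are hypothetical names, and the stand-in synthesizer would need to be replaced by the real TTS call to reproduce the figures:

```python
import time
from typing import Callable, Dict


def time_synthesis(
    synthesize: Callable[[str], object],
    texts: Dict[str, str],
    repeats: int = 3,
) -> Dict[str, float]:
    """Return the best wall-clock time in seconds for each labelled text."""
    results: Dict[str, float] = {}
    for label, text in texts.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            synthesize(text)
            best = min(best, time.perf_counter() - start)
        results[label] = best
    return results


def fake_synthesize(text: str) -> list:
    """Hypothetical stand-in for the real TTS call; returns dummy samples."""
    return [0.0] * len(text)


if __name__ == "__main__":
    samples = {"short": "hello", "long": "hello world " * 20}
    for label, seconds in time_synthesis(fake_synthesize, samples).items():
        print(f"{label}={seconds:.2f}s")
```

Taking the best of several runs reduces noise from model warm-up and scheduling; whether the commit's numbers include warm-up is not stated.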
hailin c2727d7e08 fix: clean junk text from Antaf + use int8 TTS model for speed
- Filter "完成资料引用" ("finished citing sources") and other status text from Antaf responses
- Use int8 quantized model for faster TTS inference
- Add configurable num_threads for sherpa-onnx

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 22:52:02 -07:00
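The status-text filter in the commit above could look roughly like the following. This is a sketch only: `strip_status_text` and `STATUS_PHRASES` are hypothetical names, and the list holds just the one phrase the commit message names, since the "other status text" is not specified.

```python
from typing import Iterable

# Only the phrase quoted in the commit message; any others are unknown.
STATUS_PHRASES = ("完成资料引用",)


def strip_status_text(text: str, phrases: Iterable[str] = STATUS_PHRASES) -> str:
    """Remove known status phrases from a response before it reaches TTS."""
    for phrase in phrases:
        text = text.replace(phrase, "")
    return text.strip()
```

For example, `strip_status_text("完成资料引用今天天气晴朗。")` leaves only the sentence that should be spoken.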
hailin e5599d4f43 feat: add sherpa-onnx local TTS provider
Offline VITS TTS using sherpa-onnx, no network dependency.
Uses vits-melo-tts-zh_en model for Chinese/English.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 22:43:51 -07:00
hailin 12b4994ac0 fix: extract content from ASR JSON before sending to Antaf
ASR wraps user speech as JSON {"content":"...", "language":"zh", "emotion":"..."},
extract only the content field instead of sending raw JSON to Antaf bridge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 20:32:45 -07:00
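The extraction described in the commit above can be sketched as follows. The function name `extract_asr_content` is hypothetical, and falling back to the raw string when the input is not the expected JSON object is an assumption, not something the commit states.

```python
import json


def extract_asr_content(raw: str) -> str:
    """Return the "content" field of an ASR JSON payload, else the input unchanged."""
    try:
        payload = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        # Not JSON at all: treat the input as plain recognized text.
        return raw
    if isinstance(payload, dict) and isinstance(payload.get("content"), str):
        return payload["content"]
    # Valid JSON but not the expected shape: pass it through untouched.
    return raw
```

The fallback keeps the pipeline working if ASR is configured to emit plain text instead of the JSON wrapper.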
hailin cb9d430cfc fix: skip system-injected user messages, only send real user query to Antaf
The xiaozhi server injects tool_call reminders and system prompts as
role=user messages into dialogue. These were being picked up as the
"last user message" and sent to Antaf bridge instead of the actual query.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 20:27:37 -07:00
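One way to select the real user query, as the commit above describes, is to walk the dialogue backwards and skip injected entries. The sketch below is an assumption-heavy illustration: the commit does not say how injected messages are recognized, so `looks_injected` and its `"[system]"` prefix are purely hypothetical placeholders, as is the name `last_real_user_message`.

```python
from typing import Callable, Mapping, Optional, Sequence


def looks_injected(content: str) -> bool:
    """Hypothetical detector; the real markers are not given in the commit."""
    return content.startswith("[system]")


def last_real_user_message(
    dialogue: Sequence[Mapping[str, str]],
    is_injected: Callable[[str], bool] = looks_injected,
) -> Optional[str]:
    """Return the most recent role=user content that is not system-injected."""
    for message in reversed(dialogue):
        if message.get("role") != "user":
            continue
        content = message.get("content") or ""
        if content and not is_injected(content):
            return content
    return None
```

Passing the detector as a parameter keeps the selection logic independent of whatever marker the server actually uses.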
hailin f461e341ba fix: filter thinking content and deduplicate SSE chunks in AntafLLM
Ant Afu returns internal reasoning/thinking process mixed with actual
response text, causing TTS to read out internal monologue. Also fixes
duplicate text chunks being sent repeatedly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 20:05:42 -07:00
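The deduplication half of the commit above might be approached as below. This sketch assumes the duplicates arrive as identical consecutive chunks, which the commit does not confirm; `drop_repeated_chunks` is a hypothetical name. Filtering of thinking content is not covered because the commit does not describe how that content is marked.

```python
from typing import Iterable, Iterator, Optional


def drop_repeated_chunks(chunks: Iterable[str]) -> Iterator[str]:
    """Yield text chunks, skipping empty ones and immediate exact repeats."""
    previous: Optional[str] = None
    for chunk in chunks:
        if not chunk or chunk == previous:
            continue
        previous = chunk
        yield chunk
```

A limitation of this approach is that a legitimately repeated chunk (the same token twice in a row) would also be dropped; if the upstream stream instead resends cumulative text, a suffix-diffing strategy would be a better fit.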
hailin d399a21f23 feat: add AntafLLM provider for Ant Afu text API integration
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 12:27:49 -07:00
hailin 688a5e17b3 docs: add antaf integration plan for ESP32 device
Two approaches to replace self-hosted Qwen3-32B with Ant Afu AI:
- Plan A: Custom LLM Provider (text API via Frida HTTP Bridge)
- Plan B: Full voice passthrough (audio injection via voice bridge)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 12:26:31 -07:00
hailin ae260da3eb add frontend code 2026-04-05 12:05:11 -07:00
hailin 742389e965 add backend code 2026-04-05 19:01:15 +00:00
hailin ac9061f06d first commit 2026-04-05 18:55:20 +00:00