taixf

Commit Graph

Author	SHA1	Message	Date
hailin	bbdb59cc05	debug: add audio frame counters to relay for troubleshooting Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 05:22:52 -07:00
hailin	ec17c085b2	fix: setup voice before recv_loop, drain responses, check ws state Root cause: recv_loop started before open_voice completed, bridge connection died during UI transition. Now setup completes first. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 05:13:54 -07:00
hailin	216f2fe6a0	feat: add voice relay (Plan B) - ESP32 audio passthrough to Antaf - voice_bridge_v7.js: audio injection support (type=3 frames) - relay.py: WebSocket↔TCP bridge with Opus↔PCM + resampling - test_inject.py: injection verification script - Injection verified: 1454 frames stable, no crash Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 05:05:24 -07:00
hailin	b70c1dd071	feat: append concise reply hint to Antaf queries Ant Afu tends to give long replies, which causes TTS queue delays. Append "请用2-3句话简短回答" ("Please answer briefly in 2-3 sentences") to reduce response length. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 04:30:32 -07:00
hailin	5679622996	fix: resample TTS audio from 44100Hz to 24000Hz for device compatibility Model outputs 44100Hz but device expects 24000Hz via Opus. Without resampling, audio plays at wrong speed causing 29s delays between segments. Verified: synthesis+resample takes 0.38s for 1.6s audio. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 23:11:48 -07:00
hailin	9b2b875c2b	fix: run TTS synthesis in thread pool to avoid blocking event loop Also add size check for int8 model to skip LFS pointer files. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 23:05:44 -07:00
hailin	83cdf3396d	fix: use full onnx model with 8 threads for fast local TTS Benchmark: short=0.37s, long=1.06s with 8 CPU threads. GPU not available in pip sherpa-onnx, CPU is fast enough. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 22:53:59 -07:00
hailin	c2727d7e08	fix: clean junk text from Antaf + use int8 TTS model for speed - Filter "完成资料引用" ("reference citation complete") and other status text from Antaf responses - Use int8 quantized model for faster TTS inference - Add configurable num_threads for sherpa-onnx Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 22:52:02 -07:00
hailin	e5599d4f43	feat: add sherpa-onnx local TTS provider Offline VITS TTS using sherpa-onnx, no network dependency. Uses vits-melo-tts-zh_en model for Chinese/English. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 22:43:51 -07:00
hailin	12b4994ac0	fix: extract content from ASR JSON before sending to Antaf ASR wraps user speech as JSON {"content":"...", "language":"zh", "emotion":"..."}, extract only the content field instead of sending raw JSON to Antaf bridge. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 20:32:45 -07:00
hailin	cb9d430cfc	fix: skip system-injected user messages, only send real user query to Antaf The xiaozhi server injects tool_call reminders and system prompts as role=user messages into dialogue. These were being picked up as the "last user message" and sent to Antaf bridge instead of the actual query. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 20:27:37 -07:00
hailin	f461e341ba	fix: filter thinking content and deduplicate SSE chunks in AntafLLM Ant Afu returns internal reasoning/thinking process mixed with actual response text, causing TTS to read out internal monologue. Also fixes duplicate text chunks being sent repeatedly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 20:05:42 -07:00
hailin	d399a21f23	feat: add AntafLLM provider for Ant Afu text API integration Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 12:27:49 -07:00
hailin	688a5e17b3	docs: add antaf integration plan for ESP32 device Two approaches to replace self-hosted Qwen3-32B with Ant Afu AI: - Plan A: Custom LLM Provider (text API via Frida HTTP Bridge) - Plan B: Full voice passthrough (audio injection via voice bridge) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 12:26:31 -07:00
hailin	ae260da3eb	add frontend code	2026-04-05 12:05:11 -07:00
hailin	742389e965	add backend code	2026-04-05 19:01:15 +00:00
hailin	ac9061f06d	first commit	2026-04-05 18:55:20 +00:00
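
Commit 12b4994ac0 describes extracting only the content field from the ASR JSON envelope before the text is forwarded to the Antaf bridge. A minimal Python sketch of that step follows, assuming the envelope has the shape quoted in the commit message. The function name extract_asr_content and the fall-back-to-raw-text behaviour are illustrative assumptions, not the repository's actual code.

```python
import json


def extract_asr_content(raw: str) -> str:
    """Return the "content" field of an ASR JSON envelope.

    Hypothetical helper: if `raw` is not a JSON object with a string
    "content" field, the input is returned unchanged so that plain-text
    transcripts still pass through.
    """
    try:
        payload = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        # Not JSON at all (or not a string): treat it as plain text.
        return raw
    if isinstance(payload, dict) and isinstance(payload.get("content"), str):
        return payload["content"]
    # Valid JSON but not the expected envelope: leave it untouched.
    return raw
```

Falling back to the raw input keeps the helper safe to call on every user message, whether or not the ASR stage wrapped it.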

17 Commits (All Branches)