ASR wraps user speech as JSON: {"content":"...", "language":"zh", "emotion":"..."}. Extract only the content field instead of sending the raw JSON to the Antaf bridge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
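
Illustrative sketch of the extraction described above, not the code from this change. The function name `extract_asr_content` is hypothetical, and the fallback of passing the input through unchanged when it is not a JSON object with a string "content" field is an assumption; the message does not say how malformed payloads are handled.

```python
import json


def extract_asr_content(raw: str) -> str:
    """Return the "content" field of an ASR JSON payload.

    Hypothetical helper. Assumption: anything that is not a JSON object
    with a string "content" field is returned unchanged.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        # Not JSON at all: treat the input as plain text.
        return raw

    # json.loads may also yield a list, number, or bare string.
    if isinstance(payload, dict):
        content = payload.get("content")
        if isinstance(content, str):
            return content
    return raw


if __name__ == "__main__":
    sample = '{"content": "hello", "language": "zh", "emotion": "neutral"}'
    # Only the spoken text would be forwarded, not the JSON wrapper.
    print(extract_asr_content(sample))
```

The pass-through fallback keeps plain-text input working if some ASR results arrive unwrapped; a stricter variant could reject such input instead.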