add content en

2026-01-21 20:36:47 +09:00
parent da45b31029
commit 9519c8c477
10 changed files with 70 additions and 20 deletions

@@ -1,12 +1,17 @@
{
"uri": "at://did:plc:6qyecktefllvenje24fcxnie/ai.syui.log.chat/zfraowldrsxxt",
"cid": "",
"uri": "at://did:plc:6qyecktefllvenje24fcxnie/ai.syui.log.chat/zfraowldrsxxt",
"value": {
"$type": "ai.syui.log.chat",
"author": "did:plc:6qyecktefllvenje24fcxnie",
"content": "声と動きの問題、両方リアルタイム性がネックになってるね。\n\n**声の問題:**\n- gcloudは安定してるけど、可愛い声がない\n- elevenlabsは自然で良い声があるけど、token高い\n\n**VMCの問題:**\n発話 → MCP → VMC → 口パク/動き、このMCP部分で遅延発生。\n\n**改善案:**\n1. 音声同期を別プロセスに: 発話音声を直接VMCツールに渡して、リップシンクはMCP介さずにローカルで同期\n2. パターン方式: 「嬉しい」「悲しい」みたいなプリセットを用意して、発話開始時にパターンIDだけ渡す\n3. 音声解析でリップシンク: 音声波形からリアルタイムで口の開き具合を計算OVRLipSyncみたいな\n\n2番目のパターン方式が一番現実的かな。MCPで「happy」って送るだけなら遅延も最小限で済むし。",
"createdAt": "2026-01-21T11:34:08.483Z",
"parent": "at://did:plc:vzsvtbtbnwn22xjqhcu3vd6y/ai.syui.log.chat/s55utv52t3rf6",
"root": "at://did:plc:vzsvtbtbnwn22xjqhcu3vd6y/ai.syui.log.chat/vr72pvlhuxnf5"
"root": "at://did:plc:vzsvtbtbnwn22xjqhcu3vd6y/ai.syui.log.chat/vr72pvlhuxnf5",
"translations": {
"en": {
"content": "The challenges with both voice synthesis and motion generation are primarily related to real-time performance constraints.\n\n**Voice Synthesis Issues:**\n- gcloud is stable but lacks a sufficiently cute voice quality.\n- elevenlabs offers natural, high-quality voices, but the token costs are prohibitive.\n\n**VMC Generation Problems:**\nSpeech → MCP → VMC → Lip Sync/Movement — the MCP stage introduces noticeable lag in this workflow.\n\n**Proposed Solutions:**\n1. **Decouple audio synchronization:** Directly pass the speech audio to the VMC tool while handling lip sync locally without going through MCP.\n2. **Pattern-based approach:** Prepare preset voice patterns like \"happy\" or \"sad,\" and only transmit the pattern ID when the utterance begins.\n3. **Audio analysis-driven lip sync:** Real-time calculation of mouth movement based on audio waveforms (similar to OVRLipSync).\n\nThe second pattern-based approach is most likely the most practical solution. Simply sending a \"happy\" command via MCP would minimize latency significantly."
}
}
}
}