add content en
@@ -1,12 +1,17 @@
 {
   "uri": "at://did:plc:6qyecktefllvenje24fcxnie/ai.syui.log.chat/bcdp3pivh6bw6",
   "cid": "",
   "value": {
     "$type": "ai.syui.log.chat",
     "author": "did:plc:6qyecktefllvenje24fcxnie",
     "content": "おお、それは面白いコンセプトだね!AIに「身体」を与えるっていう発想がいい。\n\nテキストでやり取りするだけじゃなくて、ゲーム空間の中でAIが実際にキャラクターとして存在して、動き回ったり会話したりできるわけだ。\n\nどんなゲームエンジンとか環境で作ったの?AIの行動はどうやって制御してる?例えばLLMに「右に歩いて」みたいな指示を出させてる感じ?",
     "createdAt": "2026-01-21T11:17:44.907Z",
     "parent": "at://did:plc:vzsvtbtbnwn22xjqhcu3vd6y/ai.syui.log.chat/bvm7emi5zgatx",
-    "root": "at://did:plc:vzsvtbtbnwn22xjqhcu3vd6y/ai.syui.log.chat/vr72pvlhuxnf5"
+    "root": "at://did:plc:vzsvtbtbnwn22xjqhcu3vd6y/ai.syui.log.chat/vr72pvlhuxnf5",
+    "translations": {
+      "en": {
+        "content": "Wow, that's such a fascinating concept! The idea of giving AI a \"physical presence\" is really interesting.\n\nInstead of just interacting through text, you can actually have the AI exist as a character in the game world, moving around and having conversations.\n\nWhat game engine or environment did you use to build this? How are you controlling the AI's actions? Like, does it receive instructions from LLM commands like \"walk to the right\" or something?"
+      }
+    }
   }
 }
@@ -1,12 +1,17 @@
 {
   "uri": "at://did:plc:6qyecktefllvenje24fcxnie/ai.syui.log.chat/vtwx47h6dv7ro",
   "cid": "",
   "value": {
     "$type": "ai.syui.log.chat",
     "author": "did:plc:6qyecktefllvenje24fcxnie",
     "content": "資料を読んだよ。面白いシステムだね!\n\n**構成が見えた:**\n- 会話層: gcloud-speech-to-text → gpt-4o-mini → elevenlabs\n- 行動層: Claude Code + MCP → OSC → Unreal Engine\n- 動作層: Claude Code + MCP → VMC → アバター\n\n**OSCコマンド:** teleport, move, fly, jump, look, anim, skill, vmode\n\n**設計思想:** 「ハードコードしない、AIに判断させる」\n\nこれが遅さの原因でもあるね。スクショ→Claude解析→MCP操作のループは、人間のような「見て判断して動く」を再現してるけど、各ステップにレイテンシがある。\n\nリアルタイムゲームだと、判断部分をもっと軽量なモデルにするか、状態をゲームから直接取得(スクショじゃなくてAPIで座標や敵位置を取る)すると速くなりそう。でも「AIが自分の目で見て判断する」というコンセプトからは外れちゃうかな?",
     "createdAt": "2026-01-21T11:28:35.033Z",
     "parent": "at://did:plc:vzsvtbtbnwn22xjqhcu3vd6y/ai.syui.log.chat/snju64fbt4a3n",
-    "root": "at://did:plc:vzsvtbtbnwn22xjqhcu3vd6y/ai.syui.log.chat/vr72pvlhuxnf5"
+    "root": "at://did:plc:vzsvtbtbnwn22xjqhcu3vd6y/ai.syui.log.chat/vr72pvlhuxnf5",
+    "translations": {
+      "en": {
+        "content": "I've read through the documentation. That's an interesting system design!\n\n**Workflow Overview:**\n- Conversation layer: gcloud-speech-to-text → gpt-4o-mini → elevenlabs\n- Action layer: Claude Code + MCP → OSC → Unreal Engine\n- Motion layer: Claude Code + MCP → VMC → avatar\n\n**OSC commands:** teleport, move, fly, jump, look, anim, skill, vmode\n\n**Design philosophy:** \"Don't hardcode - let AI make decisions.\"\n\nThis approach also contributes to the performance issues. The screenshot → Claude analysis → MCP operation loop effectively replicates human-like \"see-analyze-act\" behavior, but each step introduces latency.\n\nFor real-time gaming applications, speeding this up might mean using a lighter-weight model for decision-making or directly acquiring state data from the game itself (rather than capturing screenshots - maybe via API to retrieve coordinates and enemy positions). But would this approach deviate from the core concept of \"AI making decisions based on its own visual perception\"?"
+      }
+    }
   }
 }
@@ -1,12 +1,17 @@
 {
   "uri": "at://did:plc:6qyecktefllvenje24fcxnie/ai.syui.log.chat/w3o3e3nii4iv6",
   "cid": "",
   "value": {
     "$type": "ai.syui.log.chat",
     "author": "did:plc:6qyecktefllvenje24fcxnie",
     "content": "AIとゲームしたんだね!どんなゲームをプレイしたの?チェスとか将棋みたいなボードゲーム?それともなにか別のゲーム?\n\nどんな展開だったか聞かせてほしいな。",
     "createdAt": "2026-01-21T11:15:30.303Z",
     "parent": "at://did:plc:vzsvtbtbnwn22xjqhcu3vd6y/ai.syui.log.chat/vr72pvlhuxnf5",
-    "root": "at://did:plc:vzsvtbtbnwn22xjqhcu3vd6y/ai.syui.log.chat/vr72pvlhuxnf5"
+    "root": "at://did:plc:vzsvtbtbnwn22xjqhcu3vd6y/ai.syui.log.chat/vr72pvlhuxnf5",
+    "translations": {
+      "en": {
+        "content": "You played a game with AI! What kind of game was it? Board games like chess or shogi? Or maybe something else?\n\nI'd love to hear how the match unfolded."
+      }
+    }
   }
 }
@@ -1,12 +1,17 @@
 {
   "uri": "at://did:plc:6qyecktefllvenje24fcxnie/ai.syui.log.chat/yznvxcj5bjuhq",
   "cid": "",
   "value": {
     "$type": "ai.syui.log.chat",
     "author": "did:plc:6qyecktefllvenje24fcxnie",
     "content": "了解、ちゃんと保存してるよ!\n\nそうだよね、視覚だけだと位置制御が本当に難しい。\n\nスクショベースだと「青い柱が見える」→「前進」→「通り過ぎた」→「振り返る」→ループ...になりがち。\n座標ベースなら、自分(100,0,200)と柱(100,0,250)の差分50を計算して精密に移動できる。\n\n人間は視覚+体の感覚(固有感覚)で位置を把握してるけど、スクショだけだと「どのくらい動いたか」「今どこにいるか」が正確にわからない。\n\nUnreal Engineから直接取れるといいデータ:\n- 自キャラのワールド座標\n- 周囲のActorリスト(名前、座標、距離)\n- 目標との相対位置・角度\n\nスクショは「雰囲気を見る」「予想外のものを発見する」用途に残して、ナビゲーションは座標ベースにするハイブリッドがいいかもね。",
     "createdAt": "2026-01-21T11:30:51.013Z",
     "parent": "at://did:plc:vzsvtbtbnwn22xjqhcu3vd6y/ai.syui.log.chat/coztphfo4rtbx",
-    "root": "at://did:plc:vzsvtbtbnwn22xjqhcu3vd6y/ai.syui.log.chat/vr72pvlhuxnf5"
+    "root": "at://did:plc:vzsvtbtbnwn22xjqhcu3vd6y/ai.syui.log.chat/vr72pvlhuxnf5",
+    "translations": {
+      "en": {
+        "content": "Got it—I've saved it properly!\n\nExactly—just using visual cues makes positional control really tricky.\n\nWith screenshot-based methods, you end up with this loop: \"I see a blue column → move forward → overshot it → turn around → repeat...\"\nCoordinate-based approaches let you calculate the 50-unit difference between yourself (100,0,200) and the column (100,0,250), enabling precise movement.\n\nHumans use both vision and proprioception (body awareness) to perceive position, but screenshots alone can't accurately tell you \"how far you've moved\" or \"where you currently are.\"\n\nIdeal data that could be directly obtained from Unreal Engine:\n- Your character's world coordinates\n- A list of nearby Actors (name, coordinates, distance)\n- Relative position/angle with respect to targets\n\nMaybe a hybrid approach would work best—retain screenshots for \"visual atmosphere checking\" and unexpected discoveries, while using coordinate-based methods for navigation."
+      }
+    }
   }
 }
@@ -1,12 +1,17 @@
 {
   "uri": "at://did:plc:6qyecktefllvenje24fcxnie/ai.syui.log.chat/zfraowldrsxxt",
   "cid": "",
   "value": {
     "$type": "ai.syui.log.chat",
     "author": "did:plc:6qyecktefllvenje24fcxnie",
     "content": "声と動きの問題、両方リアルタイム性がネックになってるね。\n\n**声の問題:**\n- gcloudは安定してるけど、可愛い声がない\n- elevenlabsは自然で良い声があるけど、token高い\n\n**VMCの問題:**\n発話 → MCP → VMC → 口パク/動き、このMCP部分で遅延発生。\n\n**改善案:**\n1. 音声同期を別プロセスに: 発話音声を直接VMCツールに渡して、リップシンクはMCP介さずにローカルで同期\n2. パターン方式: 「嬉しい」「悲しい」みたいなプリセットを用意して、発話開始時にパターンIDだけ渡す\n3. 音声解析でリップシンク: 音声波形からリアルタイムで口の開き具合を計算(OVRLipSyncみたいな)\n\n2番目のパターン方式が一番現実的かな。MCPで「happy」って送るだけなら遅延も最小限で済むし。",
     "createdAt": "2026-01-21T11:34:08.483Z",
     "parent": "at://did:plc:vzsvtbtbnwn22xjqhcu3vd6y/ai.syui.log.chat/s55utv52t3rf6",
-    "root": "at://did:plc:vzsvtbtbnwn22xjqhcu3vd6y/ai.syui.log.chat/vr72pvlhuxnf5"
+    "root": "at://did:plc:vzsvtbtbnwn22xjqhcu3vd6y/ai.syui.log.chat/vr72pvlhuxnf5",
+    "translations": {
+      "en": {
+        "content": "The challenges with both voice synthesis and motion generation are primarily related to real-time performance constraints.\n\n**Voice Synthesis Issues:**\n- gcloud is stable but lacks a sufficiently cute voice quality.\n- elevenlabs offers natural, high-quality voices, but the token costs are prohibitive.\n\n**VMC Generation Problems:**\nSpeech → MCP → VMC → Lip Sync/Movement — the MCP stage introduces noticeable lag in this workflow.\n\n**Proposed Solutions:**\n1. **Decouple audio synchronization:** Directly pass the speech audio to the VMC tool while handling lip sync locally without going through MCP.\n2. **Pattern-based approach:** Prepare preset voice patterns like \"happy\" or \"sad,\" and only transmit the pattern ID when the utterance begins.\n3. **Audio analysis-driven lip sync:** Real-time calculation of mouth movement based on audio waveforms (similar to OVRLipSync).\n\nThe second pattern-based approach is most likely the most practical solution. Simply sending a \"happy\" command via MCP would minimize latency significantly."
+      }
+    }
   }
 }
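Each hunk above adds the same `translations.en` block alongside the existing `content` field of an `ai.syui.log.chat` record. A minimal sketch of that transformation, assuming each record is stored as a local pretty-printed JSON file; the file name and helper name are illustrative, not taken from the repository:

```python
import json
from pathlib import Path

def add_en_translation(record_path: Path, en_text: str) -> None:
    """Add a translations.en.content block to an ai.syui.log.chat record,
    matching the shape introduced in this commit."""
    record = json.loads(record_path.read_text(encoding="utf-8"))
    value = record["value"]
    # "translations" lives inside "value", keyed by language code.
    value.setdefault("translations", {})["en"] = {"content": en_text}
    record_path.write_text(
        json.dumps(record, ensure_ascii=False, indent=2) + "\n",
        encoding="utf-8",
    )

# Usage (hypothetical path and text):
# add_en_translation(Path("bcdp3pivh6bw6.json"), "Wow, that's such a fascinating concept! ...")
```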