ai/ai
syui 2025-03-25 14:27:34 +09:00
commit f528ca6359
Signed by: syui
GPG Key ID: 5417CFEBAD92DF56
9 changed files with 526 additions and 0 deletions

1
.gitignore vendored Normal file

@ -0,0 +1 @@
**.DS_Store

16
README.md Normal file

@ -0,0 +1,16 @@
# <img src="./icon/ai.png" width="30"> ai `ai`
`ai` is an AI model.
The aim is to incorporate it into `aibot` and `aios`.
|name|full|code|repo|
|---|---|---|---|
|ai|ai ai|aiai|https://git.syui.ai/ai/ai|
|os|ai os|aios|https://git.syui.ai/ai/os|
|bot|ai bot|aibot|https://git.syui.ai/ai/bot|
|at|ai|ai.syu.is|https://git.syui.ai/ai/at|
```sh
$ ollama run syui/ai "hello"
```

479
docs/ja.md Normal file

@ -0,0 +1,479 @@
# ai
The AI model `ai` is intended to be incorporated into `aibot` and `aios`.
|name|full|code|repo|
|---|---|---|---|
|ai|ai ai|aiai|https://git.syui.ai/ai/ai|
|os|ai os|aios|https://git.syui.ai/ai/os|
|bot|ai bot|aibot|https://git.syui.ai/ai/bot|
|at|ai|ai.syu.is|https://git.syui.ai/ai/at|
```sh
$ ollama run syui/ai "hello"
```
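## Training
The model is trained on stories and uses a specific vocabulary. For example, it refers to itself as "Ai" (アイ).
> アイね、回答するの ("Ai is going to answer")
## What it can do
In response to requests sent to `aibot`, it generates images and videos with `comfyui` and answers with an LLM.
From the web, requests are executed through atproto.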
```sh
[web]aiat --> [server]aios --> [at]aibot --> [ai]aiai
```
## What it uses
- https://github.com/ollama/ollama
- https://github.com/n8n-io/n8n
- https://github.com/comfyanonymous/comfyui
- https://github.com/NVIDIA/cosmos
- https://github.com/stability-ai/stablediffusion
- https://github.com/unslothai/unsloth
- https://github.com/ml-explore/mlx-examples
- https://github.com/ggml-org/llama.cpp
```json
{
"model": [ "gemma3", "deepseek-r1" ],
"tag": [ "ollama", "LoRA", "unsloth", "open-webui", "n8n" ]
}
```
## ollama
```sh
# mac
$ brew install ollama
$ brew services restart ollama
# windows
$ winget install ollama.ollama
$ ollama serve
$ ollama pull gemma3:1b
$ ollama run gemma3:1b "hello"
```
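Besides the CLI, a running `ollama serve` can be called over HTTP, which is closer to how a bot would use it. The following is a minimal sketch, assuming the default local address `http://localhost:11434` and the `/api/generate` endpoint; the helper names `build_request` and `generate` are made up for this example.

```python
import json
import urllib.request

# Assumption: default address of a local `ollama serve`.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "gemma3:1b") -> urllib.request.Request:
    """Build a non-streaming generate request for the Ollama HTTP API."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "gemma3:1b") -> str:
    """Send the prompt to a running Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as res:
        return json.loads(res.read())["response"]
```

With the server running and the model pulled, `print(generate("hello"))` should behave like the `ollama run gemma3:1b "hello"` command above.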
## n8n
```sh
# https://github.com/n8n-io/n8n/
$ docker volume create n8n_data
$ docker run -it --rm --name n8n -p 5678:5678 -v n8n_data:/home/node/.n8n docker.n8n.io/n8nio/n8n
```
## webui
```sh
$ winget install python.python.3.12
$ python --version
$ python -m venv webui
$ cd webui
$ .\Scripts\activate
$ pip install open-webui
$ open-webui serve
http://localhost:8080
```
## LoRA
To run LoRA (fine-tuning) on Apple silicon, use `mlx_lm`.
```sh
$ brew install --cask anaconda
$ brew info anaconda
$ cd /opt/homebrew/Caskroom/anaconda/*
$ ./Anaconda3*.sh
```
Accept the access terms for `google/gemma-3-1b-it` on Hugging Face beforehand.
- https://huggingface.co/google/gemma-3-1b-it
```sh
$ pip install -U "huggingface_hub[cli]"
# https://huggingface.co/settings/tokens
# Repositories permissions:Read access to contents of selected repos
$ huggingface-cli login
```
```sh
$ conda create -n finetuning python=3.12
$ conda activate finetuning
$ pip install mlx-lm
$ echo "{ \"model\": \"https://huggingface.co/google/gemma-3-1b-it\", \"data\": \"https://github.com/ml-explore/mlx-examples/tree/main/lora/data\" }"|jq .
$ git clone https://github.com/ml-explore/mlx-examples
$ model=google/gemma-3-1b-it
$ data=mlx-examples/lora/data
$ mlx_lm.lora --train --model $model --data $data --batch-size 3
$ ls adapters
adapter_config.json
adapters.safetensors
```
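To train on your own stories instead of the sample data, the `--data` directory needs its own JSONL files. The sketch below assumes the layout of the sample data in `mlx-examples/lora/data` (`train.jsonl` and `valid.jsonl`, one `{"text": ...}` object per line); the function name `write_lora_dataset` and the 10% validation split are choices made for this example only.

```python
import json
import random
from pathlib import Path


def write_lora_dataset(texts, out_dir, valid_ratio=0.1, seed=0):
    """Split texts into train.jsonl / valid.jsonl, one {"text": ...} object per line."""
    items = list(texts)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    n_valid = max(1, int(len(items) * valid_ratio))
    splits = {"valid": items[:n_valid], "train": items[n_valid:]}
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, rows in splits.items():
        with open(out / f"{name}.jsonl", "w", encoding="utf-8") as f:
            for text in rows:
                # ensure_ascii=False keeps Japanese text readable in the file
                f.write(json.dumps({"text": text}, ensure_ascii=False) + "\n")
    return {name: len(rows) for name, rows in splits.items()}
```

The resulting directory can then be passed as `--data` to `mlx_lm.lora` in place of `mlx-examples/lora/data`.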
## unsloth
To run LoRA (fine-tuning) on Windows, use `unsloth`.
The following versions are the most stable.
```sh
$ nvidia-smi
$ nvcc --version
# https://github.com/unslothai/notebooks/blob/main/unsloth_windows.ps1
cuda: 12.4
python: 3.11
```
It is also possible to use torch with cuda:12.8, which allows python:3.12. However, I ran into problems installing unsloth with that setup.
```sh
$ winget install --scope machine nvidia.cuda --version 12.4.1
$ winget install curl.curl
```
```sh
# https://docs.unsloth.ai/get-started/installing-+-updating/windows-installation
$ curl -sLO https://raw.githubusercontent.com/unslothai/notebooks/refs/heads/main/unsloth_windows.ps1
$ powershell.exe -ExecutionPolicy Bypass -File .\unsloth_windows.ps1
$ vim custom.py
```
The steps above use unsloth from pwsh, but using WSL is the better option.
```py
# https://docs.unsloth.ai/get-started/fine-tuning-guide
from unsloth import FastModel
import torch
fourbit_models = [
    # 4bit dynamic quants for superior accuracy and low memory use
    # https://docs.unsloth.ai/basics/tutorial-how-to-run-and-fine-tune-gemma-3
    # https://huggingface.co/unsloth/gemma-3-4b-it
    "unsloth/gemma-3-1b-it-unsloth-bnb-4bit",
    "unsloth/gemma-3-4b-it-unsloth-bnb-4bit",
    "unsloth/gemma-3-12b-it-unsloth-bnb-4bit",
    "unsloth/gemma-3-27b-it-unsloth-bnb-4bit",
    # Other popular models!
    "unsloth/Llama-3.1-8B",
    "unsloth/Llama-3.2-3B",
    "unsloth/Llama-3.3-70B",
    "unsloth/mistral-7b-instruct-v0.3",
    "unsloth/Phi-4",
] # More models at https://huggingface.co/unsloth
model, tokenizer = FastModel.from_pretrained(
    model_name = "unsloth/gemma-3-4b-it",
    max_seq_length = 2048, # Choose any for long context!
    load_in_4bit = True, # 4 bit quantization to reduce memory
    load_in_8bit = False, # [NEW!] A bit more accurate, uses 2x memory
    full_finetuning = False, # [NEW!] We have full finetuning now!
    # token = "hf_...", # use one if using gated models
)
model = FastModel.get_peft_model(
    model,
    finetune_vision_layers = False, # Turn off for just text!
    finetune_language_layers = True, # Should leave on!
    finetune_attention_modules = True, # Attention good for GRPO
    finetune_mlp_modules = True, # Should leave on always!
    r = 8, # Larger = higher accuracy, but might overfit
    lora_alpha = 8, # Recommended alpha == r at least
    lora_dropout = 0,
    bias = "none",
    random_state = 3407,
)
```
## comfyui
https://github.com/comfyanonymous/comfyui
- https://github.com/ltdrdata/comfyui-manager
- https://github.com/ltdrdata/comfyui-impact-pack
The developers' Matrix room is `comfyui_space:matrix.org`.
https://app.element.io/#/room/#comfyui_space:matrix.org
### install
It is better to install by building from a `git clone`.
```sh
# https://docs.comfy.org/installation/manual_install#nvidia:install-nightly
$ winget install python.python.3.12
$ git clone https://github.com/comfyanonymous/comfyui
$ cd comfyui
$ python -m venv venv
$ Set-ExecutionPolicy RemoteSigned -Scope Process
$ venv\Scripts\activate
$ pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
$ python -m pip install --upgrade pip
$ pip install -r requirements.txt
$ python main.py
http://localhost:8188
```
In comfyui you build things out of nodes, and the resulting graph is apparently called a `workflow`. Workflows are saved as JSON and can be imported easily.
https://comfyanonymous.github.io/ComfyUI_examples/
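Because a workflow is just JSON, it can also be queued from a script instead of the browser. This is a rough sketch, assuming a local comfyui at `http://localhost:8188`, its `POST /prompt` endpoint, and a workflow that was exported in API format (the regular saved workflow JSON uses a different layout and is not accepted as-is). The helper names are made up for this example.

```python
import json
import urllib.request

# Assumption: comfyui started with `python main.py` on the default port.
COMFYUI_URL = "http://localhost:8188/prompt"


def build_queue_request(workflow: dict, url: str = COMFYUI_URL) -> urllib.request.Request:
    """Wrap an API-format workflow in the body expected by POST /prompt."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(workflow: dict) -> dict:
    """Queue the workflow on a running comfyui and return the decoded JSON reply."""
    with urllib.request.urlopen(build_queue_request(workflow)) as res:
        return json.loads(res.read())
```

A typical call would load the exported file with `json.load` and pass the resulting dict to `queue_workflow`.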
The basic directory structure is as follows.
```sh
./comfyui
├── main.py
├── custom_nodes # put plugins here
│   └── comfyui-manager
└── models
    └── checkpoints # put models here
        └── model.safetensors
```
1. Download models through [comfyui-manager](https://github.com/ltdrdata/comfyui-manager) or from [civitai.com](https://civitai.com/models).
2. For workflows, there are [examples](https://github.com/aimpowerment/comfyui-workflows). comfyui also integrates with [openart.ai](https://openart.ai/workflows/all).
3. For prompts, see [majinai.art](https://majinai.art/ja/).
For example, load `text-to-image.json` as a workflow, change the model and prompt, and generate an image.
```sh
# https://docs.comfy.org/get_started/first_generation
# import this json as a workflow
$ curl -sLO https://raw.githubusercontent.com/Comfy-Org/docs/refs/heads/main/public/text-to-image.json
```
I used `1920x1080 (1080p)`. Part of the prompt was generated by ollama.
<img src="https://git.syui.ai/ai/ai/raw/branch/main/repos/comfyui/output/ComfyUI_00001_.png" width="600px">
```json
{
"models": {
"checkpoints": "hsUltrahdCG_IIIEpic.safetensors",
"hsUltrahdCG_IIIEpic": "https://civitai.com/api/download/models/1456463?type=Model&format=SafeTensor"
},
"prompt": {
"positive": "(portrait,portrait1.5, masterpiece:1.2, simple dress, full body, blonde hair, Round eyes, look at viewer, best quality:1.5, front view:1.10), (little child girl:1.5, stumpy child:1.3, smirk:1.10), narrow eyes:1.5, flat breasts:1.6, flat chest:1.6, flat hip:1.5, <lora:AsianEyesEra:0.7>, AsianEyesEra:0.8, cinematic shadows:1.6, mouth close:1.5, cobblestone road, wearing strings:1.5, little child:1.8, little girl:1.8, long hair, Angel halo, White Dress, full body, Eye color is blue",
"negative": "breasts pubic hair, score_6, score_5, score_4, source_pony, (worst quality:1.8), (low quality:1.8), (normal quality:1.8), lowres, bad anatomy, bad hands, signature, watermarks, ugly, imperfect eyes, skewed eyes, unnatural face, unnatural body, error, extra limb, missing limbs, bad-artist, depth of field, girls, 2girls, many girls, plural girls, mutiple girls, brown hair, black hair"
}
}
```
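The prompt generated by ollama is merged via `show text -> string function -> [text]clip text encode(prompt)` and connected to `positive`. It can be used by appending it to the prompt above, but it is shown separately here for explanation.
Because it is generated by gemma, the output contains the question and the answer separated by a line break. Process this as needed to match the output of the model used with ollama. Here the full text is included as-is.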
```json
{
"prompt": {
"positive": "You describe pictures. I will give you a brief description of the picture. You will describe the picture in intricate detail. Describe clothing, pose, expression, setting, lighting, and any other details you can think of. Use long descriptive sentences.You describe pictures. I will give you a brief description of the picture. You reply with comma separated keywords that describe the picture. Describe clothing, pose, expression, setting, and any other details you can think of. Use comma separated keywords. Do not use sentences. Use brevity. , cherry blossom, sakura tree, pink blossoms, petals falling, serene atmosphere, traditional kimono, elegant pose, soft lighting, dappled sunlight, peaceful scene, Japanese garden, tranquil setting, detailed foliage, pink hues, gentle breeze, traditional architecture, serene, calm, beautiful, springtime, delicate, romantic"
}
}
```
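The processing step mentioned above could look like the following sketch. It is not part of the workflow in this repository: `clean_keywords` is a hypothetical helper that drops the echoed instruction text, treats line breaks as separators, and removes empty or repeated keywords.

```python
def clean_keywords(raw: str, instruction: str = "") -> str:
    """Turn raw LLM output into a deduplicated, comma-separated keyword string."""
    # Remove the echoed instruction (the "question" part) if one is given.
    text = raw.replace(instruction, " ") if instruction else raw
    seen = set()
    keywords = []
    # Line breaks between question and answer are treated like commas.
    for part in text.replace("\n", ",").split(","):
        word = part.strip().strip(".")
        key = word.lower()
        if word and key not in seen:
            seen.add(key)
            keywords.append(word)
    return ", ".join(keywords)
```

Passing the text of `prepend_tags` as `instruction` leaves only the keyword part, which can then be appended to the `positive` prompt.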
### model
```json
{
"models": {
"checkpoints": {
"sd_xl_base_1.0.safetensors": "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0.safetensors",
"sd_xl_refiner_1.0.safetensors": "https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/blob/main/sd_xl_refiner_1.0.safetensors",
"v1-5-pruned.ckpt": "https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/v1-5-pruned.ckpt",
"moeFussionV1.4.0_RICZ_vz.safetensors": "https://huggingface.co/JosefJilek/moeFussion/blob/main/moeFussionV1.4.0_RICZ_vz.safetensors",
"hsUltrahdCG_IIIEpic": "https://civitai.com/api/download/models/1456463?type=Model&format=SafeTensor"
},
"vae": {
"wan_2.1_vae.safetensors": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors"
},
"diffusion_models": {
"wan2.1_i2v_720p_14B_fp8_scaled.safetensors": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/diffusion_models/wan2.1_i2v_720p_14B_fp8_scaled.safetensors"
},
"text_encoders": {
"umt5_xxl_fp8_e4m3fn_scaled.safetensors": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors"
},
"clip_vision": {
"clip_vision_h.safetensors": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors"
},
"loras": {
"StudioGhibli.Redmond-StdGBRRedmAF-StudioGhibli.safetensors": "https://huggingface.co/artificialguybr/StudioGhibli.Redmond-V2/blob/main/StudioGhibli.Redmond-StdGBRRedmAF-StudioGhibli.safetensors",
"Yoji_Shinkawa_Art_Style_Flux.safetensors": "https://civitai.com/api/download/models/912623?type=Model&format=SafeTensor"
},
"upscalers": {},
"other_models": {}
}
}
```
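Several URLs in this list are Hugging Face `/blob/` page links, which return an HTML page rather than the file; the matching `/resolve/` URL is the direct download. The sketch below turns the list into download targets under `models/<category>/`. `direct_url` and `plan_downloads` are illustrative names, and the `models` root is an assumption based on the directory layout described earlier.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def direct_url(url: str) -> str:
    """Rewrite a Hugging Face `/blob/` page URL to its `/resolve/` download URL."""
    parts = urlsplit(url)
    if parts.netloc == "huggingface.co" and "/blob/" in parts.path:
        return url.replace("/blob/", "/resolve/", 1)
    return url  # other hosts (e.g. civitai) are left unchanged


def plan_downloads(manifest: dict, root: str = "models") -> list:
    """Return (download_url, destination_path) pairs for every manifest entry."""
    plan = []
    for category, files in manifest["models"].items():
        for filename, url in files.items():
            dest = PurePosixPath(root) / category / filename
            plan.append((direct_url(url), str(dest)))
    return plan
```

Each pair can then be fed to `curl -L -o <destination> <url>` or any other downloader.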
### If problems occur with torch (cuda:12.8)
If you run into problems with `cuda:12.8`, refer to the following.
No problems occurred in my environment.
```sh
$ cd ComfyUI/.venv/Scripts/
$ ./python.exe -V
python:3.12
```
The nightly version of `torch` supports `cuda:12.8`.
https://pytorch.org/get-started/locally/
```sh
$ pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
```
However, conflicts can occur with packages such as `torchaudio`. If they do, pin the versions explicitly. Note that this does not fully resolve the compatibility issues.
```sh
$ ./python.exe -m pip uninstall torch torchvision torchaudio -y
$ ./python.exe -m pip install --pre torch==2.7.0.dev20250306+cu128 torchvision==0.22.0.dev20250306+cu128 torchaudio==2.6.0.dev20250306+cu128 --index-url https://download.pytorch.org/whl/nightly/cu128
```
Using wheel files appears to be more stable.
```sh
# https://huggingface.co/w-e-w/torch-2.6.0-cu128.nv
$ ./python.exe -m pip install torch-2.x.x+cu128-cp312-cp312-win_amd64.whl
$ ./python.exe -m pip install torchvision-x.x.x+cu128-cp312-cp312-win_amd64.whl
$ ./python.exe -m pip install torchaudio-x.x.x+cu128-cp312-cp312-win_amd64.whl
$ ./python.exe -c "import torch; print(torch.cuda.is_available()); print(torch.__version__); print(torch.cuda.get_arch_list())"
```
### comfyui + ollama
This generates prompts automatically, using a local LLM model served by ollama.
- https://github.com/stavsap/comfyui-ollama
- https://github.com/pythongosssss/ComfyUI-Custom-Scripts
The `show text` custom node requires `ComfyUI-Custom-Scripts`.
https://github.com/aimpowerment/comfyui-workflows/blob/main/ollama-txt2img-workflow.json
Put the following text into the node's settings.
https://github.com/ScreamingHawk/comfyui-ollama-prompt-encode/blob/main/nodes/OllamaPromptGenerator.py
```json
{
"ollama-clip-prompt-encode": {
"prepend_tags": "You describe pictures. I will give you a brief description of the picture. You reply with comma separated keywords that describe the picture. Describe clothing, pose, expression, setting, and any other details you can think of. Use comma separated keywords. Do not use sentences. Use brevity.",
"text": "Under the cherry sakura tree"
}
}
```
### Video generation with comfyui
#### cosmos
Video generation using NVIDIA's cosmos.
https://comfyanonymous.github.io/ComfyUI_examples/cosmos/
```json
{
"models": {
"text_encoders": {
"url": "https://huggingface.co/comfyanonymous/cosmos_1.0_text_encoder_and_VAE_ComfyUI/blob/main/text_encoders/oldt5_xxl_fp8_e4m3fn_scaled.safetensors"
},
"vae": {
"url": "https://huggingface.co/comfyanonymous/cosmos_1.0_text_encoder_and_VAE_ComfyUI/blob/main/vae/cosmos_cv8x8x8_1.0.safetensors"
},
"diffusion_models": {
"url": "https://huggingface.co/mcmonkey/cosmos-1.0/blob/main/Cosmos-1_0-Diffusion-7B-Video2World.safetensors"
}
}
}
```
```sh
$ curl -sLO https://comfyanonymous.github.io/ComfyUI_examples/cosmos/image_to_video_cosmos_7B.json
```
#### wan2.1
`wan2.1` produced better videos than `cosmos`.
https://blog.comfy.org/p/wan21-video-model-native-support
```json
{
"models": {
"diffusion_models": {
"url": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/diffusion_models/wan2.1_i2v_720p_14B_fp8_scaled.safetensors"
},
"text_encoders": {
"url": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors"
},
"clip_vision": {
"url": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors"
},
"vae": {
"url": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors"
}
}
}
```
```sh
$ curl -sLO "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/example%20workflows_Wan2.1/image_to_video_wan_720p_example.json"
```
I used `1280x720 (720p)`.
<img src="https://git.syui.ai/ai/ai/raw/branch/main/repos/comfyui/output/ComfyUI_00001_.webp" width="300px">
```json
{
"prompt": {
"positive": "A girl walks in a beautiful and dreamy place. The wind is blowing. She looks up at the sky and sees the clouds slowly moving by, Cherry blossoms falling like snow",
"negative": "Overexposure, static, blurred details, subtitles, paintings, pictures, still, overall gray, worst quality, low quality, JPEG compression residue, ugly, mutilated, redundant fingers, poorly painted hands, poorly painted faces, deformed, disfigured, deformed limbs, fused fingers, cluttered background, three legs, a lot of people in the background, upside down"
}
}
```
### Using the same face with instantID
With [cubiq/ComfyUI_InstantID](https://github.com/cubiq/ComfyUI_InstantID), the same character (face) can be used across generated images.
Combining it with [Gourieff/ReActor](https://github.com/Gourieff/ComfyUI-ReActor) can improve accuracy further.
- ComfyUI_InstantID
- InstantID/ip-adapter
- InstantID/ControlNet
```sh
$ python.exe -m pip install onnxruntime
```
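Installing `instantID` is quite troublesome. It has many dependencies, and errors are likely. The `model(checkpoints)` must also be one that supports `SDXL`.
I was able to get `InstantID_depth.json` and `InstantID_basic.json` (below) working, but I do not fully understand exactly what is required.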
```sh
$ curl -sLO https://raw.githubusercontent.com/cubiq/ComfyUI_InstantID/refs/heads/main/examples/InstantID_depth.json
$ curl -sLO https://raw.githubusercontent.com/cubiq/ComfyUI_InstantID/refs/heads/main/examples/InstantID_basic.json
```
With instantID, the same image is generated unless the seed is randomized. For that reason, I expose the seed pin and connect it to a random source. (I do not know what comfyui calls a node's pin.)
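When a workflow is queued from a script rather than the UI, the seed can be randomized in the JSON before sending. The following sketch assumes an API-format workflow in which literal seeds are stored as integers under `inputs.seed` (or `inputs.noise_seed` for advanced samplers) and linked inputs are stored as lists; `randomize_seeds` is a hypothetical helper, not a comfyui function.

```python
import copy
import random


def randomize_seeds(workflow: dict, rng=None) -> dict:
    """Return a copy of an API-format workflow with every literal seed re-rolled."""
    rng = rng or random.Random()
    result = copy.deepcopy(workflow)  # leave the caller's workflow untouched
    for node in result.values():
        inputs = node.get("inputs", {})
        for name in ("seed", "noise_seed"):
            # Linked inputs are lists such as ["4", 0]; only plain integers are replaced.
            if isinstance(inputs.get(name), int):
                inputs[name] = rng.randrange(2**32)
    return result
```

Calling it once per request gives each generation a different seed while the rest of the workflow stays identical.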

BIN
icon/ai.png Normal file

Binary file not shown. Size: 140 KiB

1
prompt.txt Normal file

@ -0,0 +1 @@


@ -0,0 +1,29 @@
{
"models": {
"checkpoints": {
"sd_xl_base_1.0.safetensors": "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0.safetensors",
"sd_xl_refiner_1.0.safetensors": "https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/blob/main/sd_xl_refiner_1.0.safetensors",
"v1-5-pruned.ckpt": "https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/v1-5-pruned.ckpt",
"moeFussionV1.4.0_RICZ_vz.safetensors": "https://huggingface.co/JosefJilek/moeFussion/blob/main/moeFussionV1.4.0_RICZ_vz.safetensors",
"hsUltrahdCG_IIIEpic": "https://civitai.com/api/download/models/1456463?type=Model&format=SafeTensor"
},
"vae": {
"wan_2.1_vae.safetensors": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors"
},
"diffusion_models": {
"wan2.1_i2v_720p_14B_fp8_scaled.safetensors": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/diffusion_models/wan2.1_i2v_720p_14B_fp8_scaled.safetensors"
},
"text_encoders": {
"umt5_xxl_fp8_e4m3fn_scaled.safetensors": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors"
},
"clip_vision": {
"clip_vision_h.safetensors": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors"
},
"loras": {
"StudioGhibli.Redmond-StdGBRRedmAF-StudioGhibli.safetensors": "https://huggingface.co/artificialguybr/StudioGhibli.Redmond-V2/blob/main/StudioGhibli.Redmond-StdGBRRedmAF-StudioGhibli.safetensors",
"Yoji_Shinkawa_Art_Style_Flux.safetensors": "https://civitai.com/api/download/models/912623?type=Model&format=SafeTensor"
},
"upscalers": {},
"other_models": {}
}
}

Binary file not shown. Size: 2.4 MiB

Binary file not shown. Size: 1.8 MiB