ai/ai
syui 2025-03-25 01:38:21 +09:00
commit 36178bc6d8
Signed by: syui
GPG Key ID: 5417CFEBAD92DF56
7 changed files with 451 additions and 0 deletions

16
README.md Normal file

@@ -0,0 +1,16 @@
# <img src="./icon/ai.png" width="30"> ai `ai`
AI model `ai`
The aim is to incorporate it into `aibot` and `aios`.
|name|full|code|repo|
|---|---|---|---|
|ai|ai ai|aiai|https://git.syui.ai/ai/ai|
|os|ai os|aios|https://git.syui.ai/ai/os|
|bot|ai bot|aibot|https://git.syui.ai/ai/bot|
|at|ai|ai.syu.is|https://git.syui.ai/ai/at|
```sh
$ ollama run syui/ai "hello"
```

403
docs/ja.md Normal file

@@ -0,0 +1,403 @@
# ai
The AI model `ai` is intended to be incorporated into `aibot` and `aios`.
|name|full|code|repo|
|---|---|---|---|
|ai|ai ai|aiai|https://git.syui.ai/ai/ai|
|os|ai os|aios|https://git.syui.ai/ai/os|
|bot|ai bot|aibot|https://git.syui.ai/ai/bot|
|at|ai|ai.syu.is|https://git.syui.ai/ai/at|
```sh
$ ollama run syui/ai "hello"
```
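## Training
The model is trained on stories and uses a specific vocabulary. For example, it refers to itself as "アイ" (Ai).
> "アイね、回答するの" (roughly: "Ai is going to answer, you know")
## What it can do
Basically, in response to requests sent to `aibot`, it generates images and videos with `comfyui` and answers with an LLM.
From the web, it is run via atproto.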
```sh
[web]aiat --> [server]aios --> [at]aibot --> [ai]aiai
```
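
The flow above can be read as a chain of hops that ends at one of two backends. The following is only a rough sketch of that idea: the `Request` type, the `kind` values, and the `route` and `trace` functions are hypothetical names made up for illustration and do not come from the `aibot` or `aios` repos.

```python
# Hypothetical sketch of the request flow; none of these names come from the repos.
from dataclasses import dataclass


@dataclass
class Request:
    kind: str  # assumed categories: "image", "video", or "text"
    body: str


def route(request: Request) -> str:
    """Return the backend that would handle the request."""
    if request.kind in ("image", "video"):
        return "comfyui"  # image and video generation
    return "llm"  # everything else is answered by the LLM


def trace(request: Request) -> list[str]:
    """List the hops from the diagram, ending at the chosen backend."""
    return ["aiat", "aios", "aibot", "aiai", route(request)]
```

For instance, `trace(Request(kind="image", body="a cat"))` ends in `"comfyui"`, while a `"text"` request ends in `"llm"`.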
## What it uses
- https://github.com/ollama/ollama
- https://github.com/n8n-io/n8n
- https://github.com/comfyanonymous/comfyui
- https://github.com/NVIDIA/cosmos
- https://github.com/stability-ai/stablediffusion
- https://github.com/unslothai/unsloth
- https://github.com/ml-explore/mlx-examples
- https://github.com/ggml-org/llama.cpp
```json
{
  "model": [ "gemma3", "deepseek-r1" ],
  "tag": [ "ollama", "LoRA", "unsloth", "open-webui", "n8n" ]
}
```
## ollama
```sh
# mac
$ brew install ollama
$ brew services restart ollama
# windows
$ winget install ollama.ollama
$ ollama serve
$ ollama pull gemma3:1b
$ ollama run gemma3:1b "hello"
```
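
Once `ollama serve` is running, the same request can also be made over ollama's HTTP API instead of the CLI. The sketch below assumes the default address `http://localhost:11434` and the `gemma3:1b` model pulled above; the helper names `build_request` and `generate` are my own.

```python
import json
import urllib.request

# Default local address of the ollama server (assumed unchanged).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "gemma3:1b") -> urllib.request.Request:
    """Build a non-streaming POST request for /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "gemma3:1b") -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt, model)) as res:
        return json.loads(res.read())["response"]


# Usage, with the server running:
# print(generate("hello"))
```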
## n8n
```sh
# https://github.com/n8n-io/n8n/
$ docker volume create n8n_data
$ docker run -it --rm --name n8n -p 5678:5678 -v n8n_data:/home/node/.n8n docker.n8n.io/n8nio/n8n
```
## webui
```sh
$ winget install python.python.3.12
$ python --version
$ python -m venv webui
$ cd webui
$ .\Scripts\activate
$ pip install open-webui
$ open-webui serve
http://localhost:8080
```
## LoRA
To do LoRA (fine-tuning) on Apple silicon, use `mlx_lm`.
```sh
$ brew install --cask anaconda
$ brew info anaconda
$ cd /opt/homebrew/Caskroom/anaconda/*
$ ./Anaconda3*.sh
```
Accept the terms for `google/gemma-3-1b-it` beforehand.
- https://huggingface.co/google/gemma-3-1b-it
```sh
$ pip install -U "huggingface_hub[cli]"
# https://huggingface.co/settings/tokens
# Repositories permissions:Read access to contents of selected repos
$ huggingface-cli login
```
```sh
$ conda create -n finetuning python=3.12
$ conda activate finetuning
$ pip install mlx-lm
$ echo "{ \"model\": \"https://huggingface.co/google/gemma-3-1b-it\", \"data\": \"https://github.com/ml-explore/mlx-examples/tree/main/lora/data\" }"|jq .
$ git clone https://github.com/ml-explore/mlx-examples
$ model=google/gemma-3-1b-it
$ data=mlx-examples/lora/data
$ mlx_lm.lora --train --model $model --data $data --batch-size 3
$ ls adapters
adapter_config.json
adapters.safetensors
```
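
To train on your own stories instead of the sample data, the `--data` directory needs `train.jsonl` and `valid.jsonl`. The sketch below assumes the plain `{"text": ...}` line format, which is one of the formats `mlx_lm` accepts; the function name `write_dataset` and the split ratio are my own choices.

```python
import json
from pathlib import Path


def write_dataset(samples: list[str], out_dir: str, valid_ratio: float = 0.1) -> tuple[int, int]:
    """Write train.jsonl and valid.jsonl with one {"text": ...} object per line.

    Returns the number of (train, valid) rows written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Keep at least one sample for validation.
    n_valid = max(1, int(len(samples) * valid_ratio))
    splits = {"valid": samples[:n_valid], "train": samples[n_valid:]}
    for name, rows in splits.items():
        with open(out / f"{name}.jsonl", "w", encoding="utf-8") as f:
            for text in rows:
                # ensure_ascii=False keeps Japanese text readable in the file.
                f.write(json.dumps({"text": text}, ensure_ascii=False) + "\n")
    return len(splits["train"]), len(splits["valid"])
```

The resulting directory can then be passed as `--data` in place of `mlx-examples/lora/data`.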
## unsloth
To do LoRA (fine-tuning) on Windows, use `unsloth`.
The most stable versions are the following.
```sh
$ nvidia-smi
$ nvcc --version
# https://github.com/unslothai/notebooks/blob/main/unsloth_windows.ps1
cuda: 12.4
python: 3.11
```
It is also possible to use torch with cuda:12.8, which allows python:3.12. However, problems occurred when installing unsloth.
```sh
$ winget install --scope machine nvidia.cuda --version 12.4.1
$ winget install curl.curl
```
```sh
# https://docs.unsloth.ai/get-started/installing-+-updating/windows-installation
$ curl -sLO https://raw.githubusercontent.com/unslothai/notebooks/refs/heads/main/unsloth_windows.ps1
$ powershell.exe -ExecutionPolicy Bypass -File .\unsloth_windows.ps1
$ vim custom.py
```
The above is how to use unsloth from pwsh, but it is better to use WSL.
```py
# https://docs.unsloth.ai/get-started/fine-tuning-guide
from unsloth import FastModel
import torch
fourbit_models = [
    # 4bit dynamic quants for superior accuracy and low memory use
    # https://docs.unsloth.ai/basics/tutorial-how-to-run-and-fine-tune-gemma-3
    # https://huggingface.co/unsloth/gemma-3-4b-it
    "unsloth/gemma-3-1b-it-unsloth-bnb-4bit",
    "unsloth/gemma-3-4b-it-unsloth-bnb-4bit",
    "unsloth/gemma-3-12b-it-unsloth-bnb-4bit",
    "unsloth/gemma-3-27b-it-unsloth-bnb-4bit",
    # Other popular models!
    "unsloth/Llama-3.1-8B",
    "unsloth/Llama-3.2-3B",
    "unsloth/Llama-3.3-70B",
    "unsloth/mistral-7b-instruct-v0.3",
    "unsloth/Phi-4",
] # More models at https://huggingface.co/unsloth
model, tokenizer = FastModel.from_pretrained(
    model_name = "unsloth/gemma-3-4b-it",
    max_seq_length = 2048, # Choose any for long context!
    load_in_4bit = True, # 4 bit quantization to reduce memory
    load_in_8bit = False, # [NEW!] A bit more accurate, uses 2x memory
    full_finetuning = False, # [NEW!] We have full finetuning now!
    # token = "hf_...", # use one if using gated models
)
model = FastModel.get_peft_model(
    model,
    finetune_vision_layers = False, # Turn off for just text!
    finetune_language_layers = True, # Should leave on!
    finetune_attention_modules = True, # Attention good for GRPO
    finetune_mlp_modules = True, # Should leave on always!
    r = 8, # Larger = higher accuracy, but might overfit
    lora_alpha = 8, # Recommended alpha == r at least
    lora_dropout = 0,
    bias = "none",
    random_state = 3407,
)
```
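
To get a feel for what `r` and `lora_alpha` do, here is a small sketch of the standard LoRA arithmetic: the update is scaled by `lora_alpha / r`, and each adapted weight matrix of shape `d_in x d_out` adds `r * (d_in + d_out)` trainable parameters. The 2048 x 2048 layer size in the usage note is an arbitrary example, not a value taken from gemma-3, and this assumes the plain LoRA scaling rather than a variant such as rsLoRA.

```python
def lora_scale(r: int, lora_alpha: int) -> float:
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return lora_alpha / r


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added by one adapter: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


def trainable_fraction(d_in: int, d_out: int, r: int) -> float:
    """Adapter parameters relative to the full d_in x d_out weight matrix."""
    return lora_param_count(d_in, d_out, r) / (d_in * d_out)
```

With `r = 8` and `lora_alpha = 8` the scale is 1.0, and for a 2048 x 2048 layer the adapter has 32768 parameters, under 1% of the full matrix.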
## comfyui
https://github.com/comfyanonymous/comfyui
- https://github.com/ltdrdata/comfyui-manager
- https://github.com/ltdrdata/comfyui-impact-pack
The developers' Matrix room is at `comfyui_space:matrix.org`.
https://app.element.io/#/room/#comfyui_space:matrix.org
### install
It is better to install by building from a `git clone`.
```sh
# https://docs.comfy.org/installation/manual_install#nvidia:install-nightly
$ winget install python.python.3.12
$ git clone https://github.com/comfyanonymous/comfyui
$ cd comfyui
$ python -m venv venv
$ Set-ExecutionPolicy RemoteSigned -Scope Process
$ venv\Scripts\activate
$ pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
$ python -m pip install --upgrade pip
$ pip install -r requirements.txt
$ python main.py
http://localhost:8188
```
In comfyui you build things out of nodes, and the result is apparently called a `workflow`. It is saved as JSON and can be imported easily.
https://comfyanonymous.github.io/ComfyUI_examples/
The basic structure is as follows.
```sh
./comfyui
├── main.py
├── custom_nodes # put plugins here
│   └── comfyui-manager
└── models
    └── checkpoints # put models here
        └── model.safetensors
```
1. Models are best downloaded via [comfyui-manager](https://github.com/ltdrdata/comfyui-manager) or from [civitai.com](https://civitai.com/models).
2. For workflows, there are [examples](https://github.com/aimpowerment/comfyui-workflows). You can also integrate with [openart.ai](https://openart.ai/workflows/all).
3. For prompts, see [majinai.art](https://majinai.art/ja/).
For example, load `text-to-image.json` as a workflow, then change the model and prompt and try generating an image.
```sh
# https://docs.comfy.org/get_started/first_generation
$ curl -sLO https://raw.githubusercontent.com/Comfy-Org/docs/refs/heads/main/public/text-to-image.json
```
The settings used are as follows. This JSON is not meant to be imported (it is not a workflow).
```json
{
  "models": {
    "checkpoints": "SD2.1/wd-illusion-fp16.safetensors",
    "vae": "SD2.1/kl-f8-anime2.ckpt"
  },
  "prompt": {
    "positive": "(little girl, head, gold hair, long, violet eyes :1.1), 1girl, solo, gold hair, long hair, galaxy background, looking at viewer, (depth of field, blurry, blurry background, bokeh:1.4), white dress, angel halo",
    "negative": "nsfw, (worst quality, low quality:1.4)"
  }
}
```
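
Rewriting the model and prompt by hand can also be scripted. The sketch below assumes the workflow has been exported in comfyui's API format (a dict of node id to `class_type` and `inputs`) and that the `KSampler` node's `positive` and `negative` inputs link directly to `CLIPTextEncode` nodes. The function name `apply_settings` is hypothetical; the settings dict has the same shape as the JSON above.

```python
import copy


def apply_settings(workflow: dict, settings: dict) -> dict:
    """Return a copy of an API-format workflow with model and prompts replaced."""
    wf = copy.deepcopy(workflow)
    for node in wf.values():
        if node["class_type"] == "CheckpointLoaderSimple":
            node["inputs"]["ckpt_name"] = settings["models"]["checkpoints"]
        elif node["class_type"] == "VAELoader":
            node["inputs"]["vae_name"] = settings["models"]["vae"]
        elif node["class_type"] == "KSampler":
            # positive/negative are links of the form [node_id, output_index];
            # this assumes they point straight at the text-encoding nodes.
            for key in ("positive", "negative"):
                target = node["inputs"][key][0]
                wf[target]["inputs"]["text"] = settings["prompt"][key]
    return wf
```

The original workflow dict is left untouched, so the same template can be reused with different settings.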
<img src="https://git.syui.ai/ai/ai/raw/branch/main/repos/comfyui/output/ComfyUI_00001_.png" width="300px">
### If problems occur with torch (cuda:12.8)
If problems occur with `cuda:12.8`, refer to the following.
In my environment, no problems occurred.
```sh
$ cd ComfyUI/.venv/Scripts/
$ ./python.exe -V
python:3.12
```
The nightly version of `torch` supports `cuda:12.8`.
https://pytorch.org/get-started/locally/
```sh
$ pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
```
However, conflicts can occur with packages such as `torchaudio`. If a conflict occurs, pin the versions. Even so, this does not fully resolve the compatibility issues.
```sh
$ ./python.exe -m pip uninstall torch torchvision torchaudio -y
$ ./python.exe -m pip install --pre torch==2.7.0.dev20250306+cu128 torchvision==0.22.0.dev20250306+cu128 torchaudio==2.6.0.dev20250306+cu128 --index-url https://download.pytorch.org/whl/nightly/cu128
```
Using wheel files seems to be more stable.
```sh
# https://huggingface.co/w-e-w/torch-2.6.0-cu128.nv
$ ./python.exe -m pip install torch-2.x.x+cu128-cp312-cp312-win_amd64.whl
$ ./python.exe -m pip install torchvision-x.x.x+cu128-cp312-cp312-win_amd64.whl
$ ./python.exe -m pip install torchaudio-x.x.x+cu128-cp312-cp312-win_amd64.whl
$ ./python.exe -c "import torch; print(torch.cuda.is_available()); print(torch.__version__); print(torch.cuda.get_arch_list())"
```
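
A quick way to spot a mismatch is to compare the CUDA tag at the end of each package's version string. This is a minimal sketch that only parses strings of the form shown above (for example `2.7.0.dev20250306+cu128`); the function names are my own, and it does not import `torch` itself.

```python
import re
from typing import Optional


def cuda_tag(version: str) -> Optional[str]:
    """Return the CUDA tag (e.g. 'cu128') from a version string, or None."""
    match = re.search(r"\+(cu\d+)$", version)
    return match.group(1) if match else None


def same_build(*versions: str) -> bool:
    """True when every version string carries the same CUDA tag."""
    tags = {cuda_tag(v) for v in versions}
    return len(tags) == 1 and None not in tags
```

In practice the strings would come from `torch.__version__`, `torchvision.__version__`, and `torchaudio.__version__`.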
### comfyui + ollama
Prompts are generated automatically, using a local LLM model served by ollama.
- https://github.com/stavsap/comfyui-ollama
- https://github.com/pythongosssss/ComfyUI-Custom-Scripts
To use the `show text` custom node, `ComfyUI-Custom-Scripts` is required.
https://github.com/aimpowerment/comfyui-workflows/blob/main/ollama-txt2img-workflow.json
Put the following text into the settings field of the node.
> https://github.com/ScreamingHawk/comfyui-ollama-prompt-encode/blob/main/nodes/OllamaPromptGenerator.py
>
> You describe pictures. I will give you a brief description of the picture. You reply with comma separated keywords that describe the picture. Describe clothing, pose, expression, setting, and any other details you can think of. Use comma separated keywords. Do not use sentences. Use brevity.
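
The instruction above asks the LLM for comma-separated keywords, but a reply may still contain line breaks, trailing periods, or repeated words. The following is a small sketch of how such a reply could be tidied before it is used as a prompt; `to_keywords` and `to_prompt` are hypothetical helpers, not part of `comfyui-ollama`.

```python
def to_keywords(reply: str) -> list[str]:
    """Split a reply into trimmed, de-duplicated keywords, preserving order."""
    seen = set()
    keywords = []
    # Treat line breaks as separators too, in case the model ignores the format.
    for part in reply.replace("\n", ",").split(","):
        word = part.strip().strip(".")
        if word and word.lower() not in seen:
            seen.add(word.lower())
            keywords.append(word)
    return keywords


def to_prompt(reply: str) -> str:
    """Join the cleaned keywords back into a single prompt string."""
    return ", ".join(to_keywords(reply))
```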
### Video generation with comfyui
#### cosmos
Video generation using NVIDIA's cosmos.
https://comfyanonymous.github.io/ComfyUI_examples/cosmos/
```json
{
  "models": {
    "text_encoders": {
      "url": "https://huggingface.co/comfyanonymous/cosmos_1.0_text_encoder_and_VAE_ComfyUI/blob/main/text_encoders/oldt5_xxl_fp8_e4m3fn_scaled.safetensors"
    },
    "vae": {
      "url": "https://huggingface.co/comfyanonymous/cosmos_1.0_text_encoder_and_VAE_ComfyUI/blob/main/vae/cosmos_cv8x8x8_1.0.safetensors"
    },
    "diffusion_models": {
      "url": "https://huggingface.co/mcmonkey/cosmos-1.0/blob/main/Cosmos-1_0-Diffusion-7B-Video2World.safetensors"
    }
  }
}
```
```sh
$ curl -sLO https://comfyanonymous.github.io/ComfyUI_examples/cosmos/image_to_video_cosmos_7B.json
```
#### wan2.1
https://blog.comfy.org/p/wan21-video-model-native-support
```json
{
  "models": {
    "diffusion_models": {
      "url": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/diffusion_models/wan2.1_i2v_720p_14B_fp8_scaled.safetensors"
    },
    "text_encoders": {
      "url": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors"
    },
    "clip_vision": {
      "url": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors"
    },
    "vae": {
      "url": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors"
    }
  }
}
```
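
Note that the model lists mix `/blob/` and `/resolve/` URLs. On Hugging Face, `/blob/` is the web page for a file and `/resolve/` is the file itself, so `/blob/` links need rewriting before a download. The sketch below does that rewrite and also works out a destination path, assuming each JSON key (`vae`, `text_encoders`, and so on) corresponds to a directory of the same name under `comfyui/models`. The function names are my own.

```python
from urllib.parse import urlparse


def download_url(url: str) -> str:
    """Rewrite a Hugging Face /blob/ page URL to its /resolve/ file URL."""
    if urlparse(url).netloc == "huggingface.co":
        return url.replace("/blob/", "/resolve/", 1)
    return url  # other hosts are left unchanged


def target_path(category: str, url: str, root: str = "comfyui/models") -> str:
    """Place the file under <root>/<category>/<file name>."""
    name = urlparse(url).path.rsplit("/", 1)[-1]
    return f"{root}/{category}/{name}"
```

The pair `(download_url(url), target_path(category, url))` can then be handed to `curl -L -o` or any other downloader.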
```sh
$ curl -sLO "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/example%20workflows_Wan2.1/image_to_video_wan_720p_example.json"
```
<img src="https://git.syui.ai/ai/ai/raw/branch/main/repos/comfyui/output/ComfyUI_00001_.webp" width="300px">
### Using the same face with ReActor + InstantID
https://github.com/cubiq/ComfyUI_InstantID
- ComfyUI_InstantID
- InstantID/ip-adapter
- InstantID/ControlNet
```sh
$ curl -sLO https://raw.githubusercontent.com/cubiq/ComfyUI_InstantID/refs/heads/main/examples/InstantID_depth.json
```
```sh
$ python.exe -m pip install onnxruntime
```

icon/ai.png Normal file
Binary file not shown. Size: 140 KiB

@@ -0,0 +1,32 @@
{
"models": {
"diffusion_models": {
"wan2.1_i2v_720p_14B_fp8_scaled.safetensors": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/diffusion_models/wan2.1_i2v_720p_14B_fp8_scaled.safetensors",
"Cosmos-1_0-Diffusion-7B-Text2World.safetensors": "https://huggingface.co/mcmonkey/cosmos-1.0/blob/main/Cosmos-1_0-Diffusion-7B-Text2World.safetensors",
"Cosmos-1_0-Diffusion-7B-Video2World.safetensors": "https://huggingface.co/mcmonkey/cosmos-1.0/blob/main/Cosmos-1_0-Diffusion-7B-Video2World.safetensors"
},
"text_encoders": {
"umt5_xxl_fp8_e4m3fn_scaled.safetensors": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors",
"oldt5_xxl_fp8_e4m3fn_scaled.safetensors": "https://huggingface.co/comfyanonymous/cosmos_1.0_text_encoder_and_VAE_ComfyUI/blob/main/text_encoders/oldt5_xxl_fp8_e4m3fn_scaled.safetensors"
},
"clip_vision": {
"clip_vision_h.safetensors": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors"
},
"vae": {
"wan_2.1_vae.safetensors": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors",
"cosmos_cv8x8x8_1.0.safetensors": "https://huggingface.co/comfyanonymous/cosmos_1.0_text_encoder_and_VAE_ComfyUI/blob/main/vae/cosmos_cv8x8x8_1.0.safetensors"
},
"checkpoints": {
"sd_xl_base_1.0.safetensors": "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0.safetensors",
"sd_xl_refiner_1.0.safetensors": "https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/blob/main/sd_xl_refiner_1.0.safetensors",
"v1-5-pruned.ckpt2": "https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/v1-5-pruned.ckpt",
"moeFussionV1.4.0_RICZ_vz.safetensors": "https://huggingface.co/JosefJilek/moeFussion/blob/main/moeFussionV1.4.0_RICZ_vz.safetensors"
},
"loras": {
"StudioGhibli.Redmond-StdGBRRedmAF-StudioGhibli.safetensors": "https://huggingface.co/artificialguybr/StudioGhibli.Redmond-V2/blob/main/StudioGhibli.Redmond-StdGBRRedmAF-StudioGhibli.safetensors",
"Yoji_Shinkawa_Art_Style_Flux.safetensors": "https://civitai.com/api/download/models/912623?type=Model&format=SafeTensor"
},
"upscalers": {},
"other_models": {}
}
}

Binary file not shown. Size: 454 KiB

Binary file not shown. Size: 3.1 MiB