feat: add Arty configuration and Claude Code documentation
- Add arty.yml for repository management with environment profiles (prod/dev/minimal)
- Add CLAUDE.md with comprehensive architecture and usage documentation
- Add comfyui_models.yaml for ComfyUI model configuration
- Include deployment scripts for model linking and dependency installation
- Document all git repositories (ComfyUI + 10 custom nodes)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
CLAUDE.md (new file, 411 lines)
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Overview

This is a lightweight, process-based AI model orchestrator designed for RunPod GPU instances (specifically RTX 4090 with 24GB VRAM). It manages sequential loading of multiple large AI models on a single GPU, providing OpenAI-compatible API endpoints for text, image, and audio generation.

**Key Design Philosophy:**

- **Sequential model loading** - Only one model is active at a time, so each model fits within GPU memory constraints
- **Process-based architecture** - Uses Python subprocesses instead of Docker-in-Docker for RunPod compatibility
- **Automatic model switching** - The orchestrator detects request types and switches models on demand
- **OpenAI-compatible APIs** - Works seamlessly with the LiteLLM proxy and other AI tools

## Architecture

### Core Components

1. **Orchestrator** (`model-orchestrator/orchestrator_subprocess.py`)
   - FastAPI proxy server listening on port 9000
   - Manages model lifecycle via Python subprocesses
   - Routes requests to the appropriate model service
   - Handles sequential model loading/unloading

2. **Model Registry** (`model-orchestrator/models.yaml`)
   - YAML configuration defining available models
   - Specifies: type, framework, service script, port, VRAM requirements, startup time
   - Easy to extend with new models

3. **Model Services** (`models/*/`)
   - Individual Python servers running specific AI models
   - vLLM for text generation (Qwen 2.5 7B, Llama 3.1 8B)
   - ComfyUI for image/video/audio generation (FLUX, SDXL, CogVideoX, MusicGen)

4. **Ansible Provisioning** (`playbook.yml`)
   - Complete infrastructure-as-code setup
   - Installs dependencies, downloads models, configures services
   - Supports selective installation via tags

### Why Process-Based Instead of Docker?

The subprocess implementation (`orchestrator_subprocess.py`) is preferred over the Docker version (`orchestrator.py`) because:

- RunPod instances already run in containers, and Docker-in-Docker adds complexity
- Faster model startup (direct Python process spawning)
- Simpler debugging (single process tree)
- Reduced overhead (no container management layer)

**Note:** Always use `orchestrator_subprocess.py` for RunPod deployments.
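The subprocess lifecycle described above can be sketched roughly as follows. This is a minimal illustration, not the actual orchestrator code; the class and method names are hypothetical:

```python
import subprocess
import time


class ModelProcess:
    """Minimal sketch of a subprocess-managed model service."""

    def __init__(self, cmd, startup_time_seconds=120):
        self.cmd = cmd
        self.startup_time_seconds = startup_time_seconds
        self.proc = None

    def start(self, is_healthy):
        # Spawn the service and poll a caller-supplied health check
        # until it passes or the startup budget is exhausted.
        self.proc = subprocess.Popen(self.cmd)
        deadline = time.monotonic() + self.startup_time_seconds
        while time.monotonic() < deadline:
            if is_healthy():
                return True
            time.sleep(0.1)
        self.stop()
        return False

    def stop(self):
        # Terminate gracefully, then kill if the process lingers.
        if self.proc and self.proc.poll() is None:
            self.proc.terminate()
            try:
                self.proc.wait(timeout=10)
            except subprocess.TimeoutExpired:
                self.proc.kill()
        self.proc = None
```

Because only one `ModelProcess` is ever started at a time, VRAM never has to hold two models during a switch.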

## Common Commands

### Repository Management with Arty

This project uses Arty for repository and deployment management. See `arty.yml` for the full configuration.

```bash
# Clone all repositories (fresh deployment)
arty sync --env prod     # Production: essential nodes only
arty sync --env dev      # Development: all nodes, including optional ones
arty sync --env minimal  # Minimal: just orchestrator + ComfyUI base

# Run deployment scripts
arty run setup/full           # Show setup instructions
arty run models/link-comfyui  # Link downloaded models to ComfyUI
arty run deps/comfyui-nodes   # Install custom node dependencies
arty run services/start       # Start orchestrator
arty run services/stop        # Stop all services

# Health checks
arty run health/orchestrator  # Check orchestrator
arty run health/comfyui       # Check ComfyUI
arty run check/gpu            # nvidia-smi
arty run check/models         # Show cache size
```

### Initial Setup

```bash
# 1. Clone repositories with Arty (fresh RunPod instance)
arty sync --env prod

# 2. Configure environment
cd /workspace/ai
cp .env.example .env
# Edit .env and set HF_TOKEN=your_huggingface_token

# 3. Full deployment with Ansible
ansible-playbook playbook.yml

# 3b. Alternative: essential ComfyUI setup only (faster, ~80GB instead of ~137GB)
ansible-playbook playbook.yml --tags comfyui-essential

# 4. Link models to ComfyUI
arty run models/link-comfyui

# 5. Install custom node dependencies
arty run deps/comfyui-nodes

# Optional: selective installation (base system + Python + vLLM models only)
ansible-playbook playbook.yml --tags base,python,dependencies
```
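The `models/link-comfyui` step symlinks HuggingFace cache entries into ComfyUI's model directories. A rough Python equivalent of what one such link does (an illustrative sketch; the real commands live in `arty.yml`):

```python
from pathlib import Path


def link_model(cache_entry: str, target_dir: str, link_name: str) -> Path:
    """Idempotently symlink a cached model into a ComfyUI model directory."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    link = target / link_name
    if link.is_symlink() or link.exists():
        link.unlink()  # replace a stale link so re-runs stay idempotent
    link.symlink_to(cache_entry)
    return link
```

Symlinking (rather than copying) means the ~23GB FLUX weights exist only once on disk, in `/workspace/huggingface_cache`.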

### Service Management

```bash
# Start orchestrator (runs in foreground)
bash scripts/start-all.sh
# Or directly:
python3 model-orchestrator/orchestrator_subprocess.py

# Stop all services
bash scripts/stop-all.sh

# Stop orchestrator only
pkill -f orchestrator_subprocess.py

# Stop a specific model service
pkill -f "models/vllm/server.py"
```

### Testing

```bash
# Health check
curl http://localhost:9000/health

# List available models
curl http://localhost:9000/v1/models

# Test text generation (streaming)
curl -s -N -X POST http://localhost:9000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "qwen-2.5-7b",
    "messages": [{"role": "user", "content": "Count to 5"}],
    "max_tokens": 50,
    "stream": true
  }'

# Test image generation
curl -X POST http://localhost:9000/v1/images/generations \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "flux-schnell",
    "prompt": "A serene mountain landscape at sunset",
    "size": "1024x1024"
  }'
```

### Ansible Tags Reference

**System Setup:**

- `base` - Base system packages
- `python` - Python environment setup
- `dependencies` - Install Python packages

**Model Installation:**

- `models` - Download vLLM/FLUX/MusicGen models (legacy)
- `comfyui` - Install ComfyUI base
- `comfyui-essential` - Quick setup (ComfyUI + essential models only, ~80GB)
- `comfyui-models-image` - Image generation models (FLUX, SDXL, SD3.5)
- `comfyui-models-video` - Video generation models (CogVideoX, SVD)
- `comfyui-models-audio` - Audio generation models (MusicGen variants)
- `comfyui-models-support` - CLIP, IP-Adapter, ControlNet models
- `comfyui-models-all` - All ComfyUI models (~137GB)
- `comfyui-nodes` - Install essential custom nodes

**Infrastructure:**

- `tailscale` - Install Tailscale VPN client
- `systemd` - Configure systemd services (tagged `never` - not for RunPod)
- `validate` - Health checks (tagged `never` - run explicitly)

### Adding New Models

1. **Add a model definition to `model-orchestrator/models.yaml`:**

```yaml
llama-3.1-8b:
  type: text
  framework: vllm
  service_script: models/vllm/server_llama.py
  port: 8001
  vram_gb: 17
  startup_time_seconds: 120
  endpoint: /v1/chat/completions
  description: "Llama 3.1 8B Instruct"
```

2. **Create the service script** (`models/vllm/server_llama.py`). A simple, reliable approach is to delegate to vLLM's OpenAI-compatible server module:

```python
import os
import subprocess

# Launch vLLM's OpenAI-compatible API server for this model.
model = "meta-llama/Llama-3.1-8B-Instruct"
port = os.getenv("PORT", "8001")
subprocess.run([
    "python3", "-m", "vllm.entrypoints.openai.api_server",
    "--model", model,
    "--port", port,
])
```

3. **Download the model** (handled by the Ansible playbook, or manually via the HuggingFace CLI)

4. **Restart the orchestrator:**

```bash
bash scripts/stop-all.sh && bash scripts/start-all.sh
```

## Key Implementation Details

### Model Switching Logic

The orchestrator automatically switches models based on:

- **Endpoint path** - `/v1/chat/completions` → text models, `/v1/images/generations` → image models
- **Model name in request** - Matched against the model registry
- **Sequential loading** - Stops the current model before starting a new one to conserve VRAM

See `orchestrator_subprocess.py:64-100` for the process management implementation.
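In outline, the routing decision looks something like this (a simplified sketch, not the actual implementation; function and table names are illustrative):

```python
# Maps endpoint paths to the model type they serve.
ENDPOINT_TYPES = {
    "/v1/chat/completions": "text",
    "/v1/images/generations": "image",
}


def pick_model(path: str, requested: str, registry: dict) -> str:
    """Resolve which registry entry should serve a request."""
    # 1. An exact model-name match wins.
    if requested in registry:
        return requested
    # 2. Otherwise fall back to the first registered model of the
    #    type implied by the endpoint path.
    wanted = ENDPOINT_TYPES.get(path)
    for name, spec in registry.items():
        if spec.get("type") == wanted:
            return name
    raise KeyError(f"no model available for {path!r}")
```

Once the target model is resolved, the orchestrator stops the currently running service (if it is a different model) before spawning the new one.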

### Model Registry Structure

Each model in `models.yaml` requires:

- `type` - text, image, or audio
- `framework` - vllm, openedai-images, audiocraft, comfyui
- `service_script` - Relative path to a Python/shell script
- `port` - Service port (8000+)
- `vram_gb` - GPU memory requirement
- `startup_time_seconds` - Maximum health-check timeout
- `endpoint` - API endpoint path
- `description` - Human-readable description
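A small validation pass over the registry catches missing fields, and models that cannot fit in 24GB of VRAM, before the orchestrator ever tries to start them. This is a hedged sketch; the actual orchestrator may validate differently:

```python
REQUIRED_FIELDS = {
    "type", "framework", "service_script", "port",
    "vram_gb", "startup_time_seconds", "endpoint", "description",
}


def validate_registry(registry: dict, gpu_vram_gb: int = 24) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for name, spec in registry.items():
        missing = REQUIRED_FIELDS - set(spec)
        if missing:
            problems.append(f"{name}: missing fields {sorted(missing)}")
        if spec.get("vram_gb", 0) > gpu_vram_gb:
            problems.append(
                f"{name}: needs {spec['vram_gb']}GB VRAM, GPU has {gpu_vram_gb}GB"
            )
    return problems
```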

### Environment Variables

Set in the `.env` file:

- `HF_TOKEN` - **Required** - HuggingFace API token for model downloads
- `GPU_TAILSCALE_IP` - Optional - Tailscale IP for VPN access

Models are cached in:

- `/workspace/huggingface_cache` - HuggingFace models
- `/workspace/models` - Other model files
- `/workspace/ComfyUI/models` - ComfyUI model directory structure

### Integration with LiteLLM

For unified API management through the LiteLLM proxy:

**LiteLLM configuration (`litellm-config.yaml` on the VPS):**

```yaml
model_list:
  - model_name: qwen-2.5-7b
    litellm_params:
      model: hosted_vllm/openai/qwen-2.5-7b  # Use the hosted_vllm prefix!
      api_base: http://100.121.199.88:9000/v1  # Tailscale VPN IP
      api_key: dummy
      stream: true
      timeout: 600
```

**Critical:** Use the `hosted_vllm/openai/` prefix for vLLM models to enable proper streaming support. The wrong prefix causes empty delta chunks.
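When debugging the empty-delta symptom, it helps to inspect the raw server-sent events yourself. A minimal parser for OpenAI-style streaming chunks (illustrative only; real responses may include extra fields):

```python
import json


def collect_stream_text(sse_lines) -> str:
    """Assemble the full completion text from OpenAI-style SSE lines."""
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip comments, blank keep-alive lines, etc.
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        parts.append(delta.get("content") or "")
    return "".join(parts)
```

If this returns an empty string for a stream that curl shows producing chunks, the deltas themselves are empty, which points at the LiteLLM model prefix rather than the orchestrator.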

### ComfyUI Installation

ComfyUI provides advanced image/video/audio generation capabilities.

**Directory structure created:**

```
/workspace/ComfyUI/
├── models/
│   ├── checkpoints/     # FLUX, SDXL, SD3 models
│   ├── clip_vision/     # CLIP vision models
│   ├── video_models/    # CogVideoX, SVD
│   └── audio_models/    # MusicGen
└── custom_nodes/        # Extension nodes
```

**Essential custom nodes installed:**

- ComfyUI-Manager - Model/node management GUI
- ComfyUI-VideoHelperSuite - Video operations
- ComfyUI-AnimateDiff-Evolved - Video generation
- ComfyUI_IPAdapter_plus - Style transfer
- ComfyUI-Impact-Pack - Automatic face enhancement
- comfyui-sound-lab - Audio generation

**VRAM requirements on a 24GB GPU:**

- FLUX Schnell FP16: 23GB (leaves 1GB)
- SDXL Base: 12GB
- CogVideoX-5B: 12GB (with optimizations)
- MusicGen Medium: 8GB

See `COMFYUI_MODELS.md` for the detailed model catalog and usage examples.
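The companion `comfyui_models.yaml` in this commit marks each model as essential or optional; a short helper can sum up what the essential set costs in disk space. A sketch assuming that file's schema (the `model_categories`, `repo_id`, `size_gb`, and `essential` fields come from the config itself):

```python
def essential_models(config: dict):
    """Yield (repo_id, size_gb) for every essential model in the config."""
    for models in config.get("model_categories", {}).values():
        for entry in models:
            if entry.get("essential"):
                yield entry["repo_id"], entry["size_gb"]


# Tiny inline sample mirroring the YAML structure.
sample = {
    "model_categories": {
        "image_models": [
            {"repo_id": "black-forest-labs/FLUX.1-schnell",
             "size_gb": 23, "essential": True},
            {"repo_id": "stabilityai/stable-diffusion-3.5-large",
             "size_gb": 18, "essential": False},
        ],
    },
}
total_gb = sum(size for _, size in essential_models(sample))
```

Running the same pass over the full config is how the ~80GB "essential" figure versus the ~137GB "all models" figure can be verified.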

## Deployment Workflow

### RunPod Deployment (Current Setup)

1. **Clone the repository:**
   ```bash
   cd /workspace
   git clone <repo-url> ai
   cd ai
   ```

2. **Configure the environment:**
   ```bash
   cp .env.example .env
   # Edit .env, set HF_TOKEN
   ```

3. **Run Ansible provisioning:**
   ```bash
   ansible-playbook playbook.yml
   # Or selectively: --tags base,python,comfyui-essential
   ```

4. **Start services:**
   ```bash
   bash scripts/start-all.sh
   ```

5. **Verify:**
   ```bash
   curl http://localhost:9000/health
   ```
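Because the first model load can take a minute or two, a retry loop is more reliable for the verification step than a single curl. An illustrative helper (timeout values are arbitrary, and the `probe` parameter exists only so the loop can be exercised without a live server):

```python
import time
import urllib.error
import urllib.request


def wait_until_healthy(url: str, timeout_s: float = 120.0,
                       interval_s: float = 2.0, probe=None) -> bool:
    """Poll `url` (or a custom probe callable) until it succeeds or time runs out."""
    if probe is None:
        def probe():
            try:
                with urllib.request.urlopen(url, timeout=5) as resp:
                    return resp.status == 200
            except (urllib.error.URLError, OSError):
                return False
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(interval_s)
    return False
```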

### Tailscale VPN Integration

To connect the RunPod GPU to the VPS infrastructure:

```bash
# On the RunPod instance
curl -fsSL https://tailscale.com/install.sh | sh
tailscaled --tun=userspace-networking --socks5-server=localhost:1055 &
tailscale up --advertise-tags=tag:gpu
tailscale ip -4  # Get the IP for the LiteLLM config
```

Benefits: secure tunnel, no public exposure, low latency.

## Project Structure

```
runpod/
├── model-orchestrator/
│   ├── orchestrator_subprocess.py  # Main orchestrator (USE THIS)
│   ├── orchestrator.py             # Docker-based version (legacy)
│   ├── models.yaml                 # Model registry
│   └── requirements.txt
├── models/
│   ├── vllm/
│   │   ├── server.py               # vLLM text generation service
│   │   └── requirements.txt
│   └── comfyui/
│       ├── start.sh                # ComfyUI startup script
│       └── requirements.txt
├── scripts/
│   ├── start-all.sh                # Start orchestrator
│   └── stop-all.sh                 # Stop all services
├── arty.yml                        # Arty repository manager config
├── playbook.yml                    # Ansible provisioning playbook
├── inventory.yml                   # Ansible inventory (localhost)
├── ansible.cfg                     # Ansible configuration
├── .env.example                    # Environment variables template
├── CLAUDE.md                       # This file
├── COMFYUI_MODELS.md               # ComfyUI models catalog
├── MODELS_LINKED.md                # Model linkage documentation
├── comfyui_models.yaml             # ComfyUI model configuration
└── README.md                       # User documentation
```

## Troubleshooting

### Model fails to start

- Check VRAM: `nvidia-smi`
- Verify model weights downloaded: `ls -lh /workspace/huggingface_cache`
- Check for port conflicts: `lsof -i :9000`
- Test the model directly: `python3 models/vllm/server.py`

### Streaming returns empty deltas

- Use the correct LiteLLM model prefix: `hosted_vllm/openai/model-name`
- Set `stream: true` in the LiteLLM config
- Verify the orchestrator proxies streaming correctly

### HuggingFace download errors

- Check the token: `echo $HF_TOKEN`
- Set it in `.env`: `HF_TOKEN=your_token_here`
- Re-run Ansible: `ansible-playbook playbook.yml --tags dependencies`

### Out of storage space

- Check disk usage: `df -h /workspace`
- Use the essential tags: `--tags comfyui-essential` (~80GB vs ~137GB)
- Clear the cache: `rm -rf /workspace/huggingface_cache` (warning: this deletes all downloaded models, which must then be re-downloaded)

### Orchestrator not responding

- Check the process: `ps aux | grep orchestrator`
- View logs: check the terminal output where the orchestrator was started
- Restart: `bash scripts/stop-all.sh && bash scripts/start-all.sh`

## Performance Notes

- **Model switching time:** 30-120 seconds (depends on model size)
- **Text generation:** ~20-40 tokens/second (Qwen 2.5 7B on RTX 4090)
- **Image generation:** 4-5 seconds per image (FLUX Schnell)
- **Music generation:** 60-90 seconds for 30s of audio (MusicGen Medium)

## Important Conventions

- **Always use `orchestrator_subprocess.py`** - Not the Docker version
- **Sequential loading only** - One model active at a time for 24GB VRAM
- **Models downloaded by Ansible** - Use playbook tags, not manual downloads
- **Services run as processes** - Not systemd (RunPod containers don't support it)
- **Environment managed via `.env`** - Required: `HF_TOKEN`
- **Port 9000 for the orchestrator** - Model services use 8000+

arty.yml (new file, 212 lines)

name: "RunPod AI Model Orchestrator"
version: "2.0.0"
description: "Process-based AI model orchestrator for RunPod GPU instances with ComfyUI integration"
author: "valknar@pivoine.art"
license: "MIT"

# Git repositories to clone for a fresh RunPod deployment
references:
  # ComfyUI base installation
  - url: https://github.com/comfyanonymous/ComfyUI.git
    into: /workspace/ComfyUI
    description: "ComfyUI - Node-based interface for image/video/audio generation"

  # ComfyUI essential custom nodes
  - url: https://github.com/ltdrdata/ComfyUI-Manager.git
    into: /workspace/ComfyUI/custom_nodes/ComfyUI-Manager
    description: "ComfyUI Manager - Install/manage custom nodes and models"
    essential: true

  - url: https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite.git
    into: /workspace/ComfyUI/custom_nodes/ComfyUI-VideoHelperSuite
    description: "Video operations and processing"
    essential: true

  - url: https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved.git
    into: /workspace/ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved
    description: "AnimateDiff for video generation"
    essential: true

  - url: https://github.com/cubiq/ComfyUI_IPAdapter_plus.git
    into: /workspace/ComfyUI/custom_nodes/ComfyUI_IPAdapter_plus
    description: "IP-Adapter for style transfer"
    essential: true

  - url: https://github.com/ltdrdata/ComfyUI-Impact-Pack.git
    into: /workspace/ComfyUI/custom_nodes/ComfyUI-Impact-Pack
    description: "Auto face enhancement and detailer"
    essential: true

  # ComfyUI optional custom nodes
  - url: https://github.com/kijai/ComfyUI-CogVideoXWrapper.git
    into: /workspace/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper
    description: "CogVideoX integration for text-to-video"
    essential: false

  - url: https://github.com/ltdrdata/ComfyUI-Inspire-Pack.git
    into: /workspace/ComfyUI/custom_nodes/ComfyUI-Inspire-Pack
    description: "Additional inspiration tools"
    essential: false

  - url: https://github.com/Kosinkadink/ComfyUI-Advanced-ControlNet.git
    into: /workspace/ComfyUI/custom_nodes/ComfyUI-Advanced-ControlNet
    description: "Advanced ControlNet features"
    essential: false

  - url: https://github.com/MrForExample/ComfyUI-3D-Pack.git
    into: /workspace/ComfyUI/custom_nodes/ComfyUI-3D-Pack
    description: "3D asset generation"
    essential: false

  - url: https://github.com/MixLabPro/comfyui-sound-lab.git
    into: /workspace/ComfyUI/custom_nodes/comfyui-sound-lab
    description: "MusicGen and Stable Audio integration"
    essential: false

# Environment profiles for selective repository management
envs:
  # Production: only essential components
  prod:
    - /workspace/ai
    - /workspace/ComfyUI
    - /workspace/ComfyUI/custom_nodes/ComfyUI-Manager
    - /workspace/ComfyUI/custom_nodes/ComfyUI-VideoHelperSuite
    - /workspace/ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved
    - /workspace/ComfyUI/custom_nodes/ComfyUI_IPAdapter_plus
    - /workspace/ComfyUI/custom_nodes/ComfyUI-Impact-Pack

  # Development: all repositories, including optional nodes
  dev:
    - /workspace/ai
    - /workspace/ComfyUI
    - /workspace/ComfyUI/custom_nodes/ComfyUI-Manager
    - /workspace/ComfyUI/custom_nodes/ComfyUI-VideoHelperSuite
    - /workspace/ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved
    - /workspace/ComfyUI/custom_nodes/ComfyUI_IPAdapter_plus
    - /workspace/ComfyUI/custom_nodes/ComfyUI-Impact-Pack
    - /workspace/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper
    - /workspace/ComfyUI/custom_nodes/ComfyUI-Inspire-Pack
    - /workspace/ComfyUI/custom_nodes/ComfyUI-Advanced-ControlNet
    - /workspace/ComfyUI/custom_nodes/ComfyUI-3D-Pack
    - /workspace/ComfyUI/custom_nodes/comfyui-sound-lab

  # Minimal: only orchestrator and ComfyUI base
  minimal:
    - /workspace/ai
    - /workspace/ComfyUI
    - /workspace/ComfyUI/custom_nodes/ComfyUI-Manager

# Deployment scripts for RunPod instances
scripts:
  # Initial setup
  setup/full: |
    cd /workspace/ai
    cp .env.example .env
    echo "Edit .env and set HF_TOKEN, then run: ansible-playbook playbook.yml"

  setup/essential: |
    cd /workspace/ai
    cp .env.example .env
    echo "Edit .env and set HF_TOKEN, then run: ansible-playbook playbook.yml --tags comfyui-essential"

  # Model linking (run after models are downloaded)
  models/link-comfyui: |
    cd /workspace/ComfyUI/models/diffusers
    ln -sf /workspace/huggingface_cache/models--black-forest-labs--FLUX.1-schnell FLUX.1-schnell
    ln -sf /workspace/huggingface_cache/models--black-forest-labs--FLUX.1-dev FLUX.1-dev
    ln -sf /workspace/huggingface_cache/models--stabilityai--stable-diffusion-xl-base-1.0 stable-diffusion-xl-base-1.0
    ln -sf /workspace/huggingface_cache/models--stabilityai--stable-diffusion-xl-refiner-1.0 stable-diffusion-xl-refiner-1.0
    ln -sf /workspace/huggingface_cache/models--stabilityai--stable-diffusion-3.5-large stable-diffusion-3.5-large
    cd /workspace/ComfyUI/models/clip_vision
    ln -sf /workspace/huggingface_cache/models--openai--clip-vit-large-patch14 clip-vit-large-patch14
    ln -sf /workspace/huggingface_cache/models--laion--CLIP-ViT-bigG-14-laion2B-39B-b160k CLIP-ViT-bigG-14
    ln -sf /workspace/huggingface_cache/models--google--siglip-so400m-patch14-384 siglip-so400m-patch14-384
    cd /workspace/ComfyUI/models/diffusion_models
    ln -sf /workspace/huggingface_cache/models--THUDM--CogVideoX-5b CogVideoX-5b
    ln -sf /workspace/huggingface_cache/models--stabilityai--stable-video-diffusion-img2vid stable-video-diffusion-img2vid
    ln -sf /workspace/huggingface_cache/models--stabilityai--stable-video-diffusion-img2vid-xt stable-video-diffusion-img2vid-xt
    echo "Models linked to ComfyUI"

  # Service management
  services/start: bash /workspace/ai/scripts/start-all.sh
  services/stop: bash /workspace/ai/scripts/stop-all.sh
  services/restart: bash /workspace/ai/scripts/stop-all.sh && bash /workspace/ai/scripts/start-all.sh

  # Dependency installation
  deps/comfyui-nodes: |
    pip3 install -r /workspace/ComfyUI/custom_nodes/ComfyUI-Manager/requirements.txt
    pip3 install -r /workspace/ComfyUI/custom_nodes/ComfyUI-VideoHelperSuite/requirements.txt
    pip3 install 'numpy<2.0.0' --force-reinstall
    echo "Custom node dependencies installed"

  # Ansible provisioning shortcuts
  ansible/base: cd /workspace/ai && ansible-playbook playbook.yml --tags base,python,dependencies
  ansible/vllm: cd /workspace/ai && ansible-playbook playbook.yml --tags models
  ansible/comfyui: cd /workspace/ai && ansible-playbook playbook.yml --tags comfyui,comfyui-essential
  ansible/comfyui-all: cd /workspace/ai && ansible-playbook playbook.yml --tags comfyui,comfyui-models-all,comfyui-nodes
  ansible/full: cd /workspace/ai && ansible-playbook playbook.yml

  # Health checks
  health/orchestrator: curl http://localhost:9000/health
  health/comfyui: curl http://localhost:8188
  health/vllm: curl http://localhost:8000/health

  # System checks
  check/gpu: nvidia-smi
  check/disk: df -h /workspace
  check/models: du -sh /workspace/huggingface_cache
  check/cache: find /workspace/huggingface_cache -maxdepth 1 -type d -name 'models--*'

# Deployment notes
notes: |
  RunPod AI Model Orchestrator - Quick Start

  1. Fresh deployment:
     - Clone repositories: arty sync --env prod
     - Configure environment: cd /workspace/ai && cp .env.example .env
     - Set HF_TOKEN in the .env file
     - Run Ansible: ansible-playbook playbook.yml --tags comfyui-essential
     - Link models: arty run models/link-comfyui
     - Install node deps: arty run deps/comfyui-nodes
     - Start services: arty run services/start

  2. Model downloads:
     - Essential (~80GB): ansible-playbook playbook.yml --tags comfyui-essential
     - All models (~137GB): ansible-playbook playbook.yml --tags comfyui-models-all

  3. Service management:
     - Start: arty run services/start
     - Stop: arty run services/stop
     - Restart: arty run services/restart

  4. Health checks:
     - Orchestrator: arty run health/orchestrator
     - ComfyUI: arty run health/comfyui
     - vLLM: arty run health/vllm

  5. Environment profiles:
     - Production (essential only): arty sync --env prod
     - Development (all nodes): arty sync --env dev
     - Minimal (orchestrator + ComfyUI only): arty sync --env minimal

  6. Important files:
     - Configuration: /workspace/ai/playbook.yml
     - Model registry: /workspace/ai/model-orchestrator/models.yaml
     - Environment: /workspace/ai/.env
     - Services: /workspace/ai/scripts/*.sh

  7. Ports:
     - Orchestrator: 9000
     - ComfyUI: 8188
     - vLLM: 8000+

  8. Storage:
     - Models cache: /workspace/huggingface_cache (~401GB)
     - ComfyUI models: /workspace/ComfyUI/models (symlinks to the cache)
     - Project: /workspace/ai

  For detailed documentation, see:
  - /workspace/ai/README.md
  - /workspace/ai/CLAUDE.md
  - /workspace/ai/COMFYUI_MODELS.md
  - /workspace/ai/MODELS_LINKED.md

comfyui_models.yaml (new file, 268 lines)

# ============================================================================
# ComfyUI Model Configuration
# ============================================================================
#
# This configuration file defines all available ComfyUI models for download.
# Models are organized by category: image, video, audio, and support models.
#
# Each model entry contains:
#   - repo_id: HuggingFace repository identifier
#   - description: Human-readable description
#   - size_gb: Approximate size in gigabytes
#   - essential: Whether this is an essential model (true/false)
#   - category: Model category (image/video/audio/support)
#
# ============================================================================

# Global settings
settings:
  cache_dir: /workspace/huggingface_cache
  parallel_downloads: 1
  retry_attempts: 3
  timeout_seconds: 3600

# Model categories
model_categories:
  # ==========================================================================
  # IMAGE GENERATION MODELS
  # ==========================================================================
  image_models:
    - repo_id: black-forest-labs/FLUX.1-schnell
      description: FLUX.1 Schnell - Fast 4-step inference
      size_gb: 23
      essential: true
      category: image
      format: fp16
      vram_gb: 23
      notes: Industry-leading image generation quality

    - repo_id: black-forest-labs/FLUX.1-dev
      description: FLUX.1 Dev - Balanced quality/speed
      size_gb: 23
      essential: false
      category: image
      format: fp16
      vram_gb: 23
      notes: Development version with enhanced features

    - repo_id: stabilityai/stable-diffusion-xl-base-1.0
      description: SDXL Base 1.0 - Industry standard
      size_gb: 7
      essential: true
      category: image
      format: fp16
      vram_gb: 12
      notes: Most widely used Stable Diffusion model

    - repo_id: stabilityai/stable-diffusion-xl-refiner-1.0
      description: SDXL Refiner 1.0 - Enhances base output
      size_gb: 6
      essential: false
      category: image
      format: fp16
      vram_gb: 12
      notes: Use after SDXL base for improved details

    - repo_id: stabilityai/stable-diffusion-3.5-large
      description: SD 3.5 Large - Latest Stability AI
      size_gb: 18
      essential: false
      category: image
      format: fp16
      vram_gb: 20
      notes: Newest generation Stable Diffusion

  # ==========================================================================
  # VIDEO GENERATION MODELS
  # ==========================================================================
  video_models:
    - repo_id: THUDM/CogVideoX-5b
      description: CogVideoX-5B - Professional text-to-video
      size_gb: 20
      essential: true
      category: video
      format: fp16
      vram_gb: 20
      frames: 49
      resolution: 720p
      notes: State-of-the-art text-to-video generation

    - repo_id: stabilityai/stable-video-diffusion-img2vid
      description: SVD - 14 frame image-to-video
      size_gb: 8
      essential: true
      category: video
      format: fp16
      vram_gb: 20
      frames: 14
      resolution: 576x1024
      notes: Convert images to short video clips

    - repo_id: stabilityai/stable-video-diffusion-img2vid-xt
      description: SVD-XT - 25 frame image-to-video
      size_gb: 8
      essential: false
      category: video
      format: fp16
      vram_gb: 20
      frames: 25
      resolution: 576x1024
      notes: Extended frame count version

  # ==========================================================================
  # AUDIO GENERATION MODELS
  # ==========================================================================
  audio_models:
    - repo_id: facebook/musicgen-small
      description: MusicGen Small - Fast generation
      size_gb: 3
      essential: false
      category: audio
      format: fp32
      vram_gb: 4
      duration_seconds: 30
      notes: Fastest music generation, lower quality

    - repo_id: facebook/musicgen-medium
      description: MusicGen Medium - Balanced quality
      size_gb: 11
      essential: true
      category: audio
      format: fp32
      vram_gb: 8
      duration_seconds: 30
      notes: Best balance of speed and quality

    - repo_id: facebook/musicgen-large
      description: MusicGen Large - Highest quality
      size_gb: 22
      essential: false
      category: audio
      format: fp32
      vram_gb: 16
      duration_seconds: 30
      notes: Best quality, slower generation
|
||||||
|
|
||||||
|
# ==========================================================================
|
||||||
|
# SUPPORT MODELS (CLIP, IP-Adapter, etc.)
|
||||||
|
# ==========================================================================
|
||||||
|
support_models:
|
||||||
|
- repo_id: openai/clip-vit-large-patch14
|
||||||
|
description: CLIP H - For SD 1.5 IP-Adapter
|
||||||
|
size_gb: 2
|
||||||
|
essential: true
|
||||||
|
category: support
|
||||||
|
format: fp32
|
||||||
|
vram_gb: 2
|
||||||
|
notes: Text-image understanding model
|
||||||
|
|
||||||
|
- repo_id: laion/CLIP-ViT-bigG-14-laion2B-39B-b160k
|
||||||
|
description: CLIP G - For SDXL IP-Adapter
|
||||||
|
size_gb: 7
|
||||||
|
essential: true
|
||||||
|
category: support
|
||||||
|
format: fp32
|
||||||
|
vram_gb: 4
|
||||||
|
notes: Larger CLIP model for SDXL
|
||||||
|
|
||||||
|
- repo_id: google/siglip-so400m-patch14-384
|
||||||
|
description: SigLIP - For FLUX models
|
||||||
|
size_gb: 2
|
||||||
|
essential: true
|
||||||
|
category: support
|
||||||
|
format: fp32
|
||||||
|
vram_gb: 2
|
||||||
|
notes: Advanced image-text alignment
|
||||||
|
|
||||||
|
# ============================================================================
|
||||||
|
# STORAGE & VRAM SUMMARIES
|
||||||
|
# ============================================================================
|
||||||
|
|
||||||
|
storage_requirements:
|
||||||
|
essential_only:
|
||||||
|
image: 30 # FLUX Schnell + SDXL Base
|
||||||
|
video: 28 # CogVideoX + SVD
|
||||||
|
audio: 11 # MusicGen Medium
|
||||||
|
support: 11 # All 3 CLIP models
|
||||||
|
total: 80 # Total essential storage
|
||||||
|
|
||||||
|
all_models:
|
||||||
|
image: 54 # All image models
|
||||||
|
video: 36 # All video models
|
||||||
|
audio: 36 # All audio models
|
||||||
|
support: 11 # All support models
|
||||||
|
total: 137 # Total with optional models
|
||||||
|
|
||||||
|
vram_requirements:
|
||||||
|
# For 24GB GPU (RTX 4090)
|
||||||
|
simultaneous_loadable:
|
||||||
|
- name: Image Focus - FLUX FP16
|
||||||
|
models: [FLUX.1 Schnell]
|
||||||
|
vram_used: 23
|
||||||
|
remaining: 1
|
||||||
|
|
||||||
|
- name: Image Focus - FLUX FP8 + SDXL
|
||||||
|
models: [FLUX.1 Schnell FP8, SDXL Base]
|
||||||
|
vram_used: 24
|
||||||
|
remaining: 0
|
||||||
|
|
||||||
|
- name: Video Generation
|
||||||
|
models: [CogVideoX-5B optimized, SDXL]
|
||||||
|
vram_used: 24
|
||||||
|
remaining: 0
|
||||||
|
|
||||||
|
- name: Multi-Modal
|
||||||
|
models: [SDXL, MusicGen Medium]
|
||||||
|
vram_used: 20
|
||||||
|
remaining: 4
|
||||||
|
|
||||||
|
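Before switching models, the orchestrator (or a pre-flight script) could verify that a planned combination fits the 24 GB budget. A minimal sketch, assuming the combo VRAM figures above; the `fits` helper and the ~12 GB FP8 FLUX estimate (implied by the 24 GB combo total) are illustrative, not part of this config:

```python
# Hypothetical VRAM pre-check mirroring the simultaneous_loadable combos above.
# Per-model FP16 figures come from this file; the FP8 FLUX figure is assumed.
GPU_VRAM_GB = 24

def fits(vram_list, budget=GPU_VRAM_GB):
    """Return (total_used, remaining) or raise if the combo exceeds the budget."""
    total = sum(vram_list)
    if total > budget:
        raise ValueError(f"combo needs {total} GB but only {budget} GB available")
    return total, budget - total

print(fits([23]))      # FLUX.1 Schnell FP16 -> (23, 1)
print(fits([12, 12]))  # FLUX FP8 + SDXL     -> (24, 0)
print(fits([12, 8]))   # SDXL + MusicGen Medium -> (20, 4)
```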
# ============================================================================
# INSTALLATION PROFILES
# ============================================================================

installation_profiles:
  minimal:
    description: Minimal setup for testing
    categories: [support_models]
    storage_gb: 11
    estimated_time: 5-10 minutes

  essential:
    description: Essential models only (~80 GB)
    categories: [image_models, video_models, audio_models, support_models]
    essential_only: true
    storage_gb: 80
    estimated_time: 1-2 hours

  image_focused:
    description: All image generation models
    categories: [image_models, support_models]
    storage_gb: 65
    estimated_time: 45-90 minutes

  video_focused:
    description: Essential video and image generation models
    categories: [video_models, image_models, support_models]
    essential_only: true
    storage_gb: 69
    estimated_time: 1-2 hours

  complete:
    description: All models (including optional)
    categories: [image_models, video_models, audio_models, support_models]
    storage_gb: 137
    estimated_time: 2-4 hours
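A setup script might also confirm free disk space before pulling one of these profiles (the `complete` profile needs 137 GB per the summary above). A stdlib-only sketch; the path and headroom values are assumptions, not part of this config:

```python
import shutil

def enough_space(path=".", needed_gb=137, headroom_gb=10):
    """True if the volume holding `path` has room for the download plus headroom."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return free_gb >= needed_gb + headroom_gb

# Example: check the current volume before fetching the "complete" profile.
print(enough_space(needed_gb=137))
```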
# ============================================================================
# METADATA
# ============================================================================

metadata:
  version: 1.0.0
  last_updated: 2025-11-21
  compatible_with:
    - ComfyUI >= 0.1.0
    - Python >= 3.10
    - HuggingFace Hub >= 0.20.0
  maintainer: Valknar
  repository: https://github.com/yourusername/runpod
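The profiles above can be resolved against the model lists to produce a concrete download set. A hypothetical sketch of what a deployment script might do; the model entries are copied from this file, but `resolve` itself is illustrative, and a real script would hand the resulting repo IDs to something like `huggingface_hub.snapshot_download`:

```python
# Hypothetical profile resolver; model data copied from this config
# (image and support categories shown for brevity).
MODELS = {
    "image_models": [
        ("black-forest-labs/FLUX.1-schnell", 23, True),
        ("black-forest-labs/FLUX.1-dev", 23, False),
        ("stabilityai/stable-diffusion-xl-base-1.0", 7, True),
        ("stabilityai/stable-diffusion-xl-refiner-1.0", 6, False),
        ("stabilityai/stable-diffusion-3.5-large", 18, False),
    ],
    "support_models": [
        ("openai/clip-vit-large-patch14", 2, True),
        ("laion/CLIP-ViT-bigG-14-laion2B-39B-b160k", 7, True),
        ("google/siglip-so400m-patch14-384", 2, True),
    ],
}

def resolve(categories, essential_only=False):
    """Return ([repo_ids], total_gb) for a profile's category list."""
    repos, total = [], 0
    for cat in categories:
        for repo_id, size_gb, essential in MODELS.get(cat, []):
            if essential_only and not essential:
                continue
            repos.append(repo_id)
            total += size_gb
    return repos, total

# "minimal" profile: support models only -> 11 GB, matching storage_gb above.
print(resolve(["support_models"])[1])
```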