# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Overview
RunPod AI Orchestrator - A Docker-based deployment template for GPU instances on RunPod. Orchestrates AI services (ComfyUI, vLLM, AudioCraft) using Supervisor for process management, with optional Tailscale VPN integration.
## Architecture
### Service Orchestration
- **Supervisor** manages all services via `supervisord.conf`
- Services run in isolated Python venvs under `services/<name>/venv/`
- Logs written to `.logs/` directory
- Each service has its own `requirements.txt`
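
The layout above could translate into a `supervisord.conf` program block roughly like the following. This is a minimal sketch only; the program name, paths, and option values are assumptions, not the repository's actual config:

```ini
; Hypothetical program block (sketch; actual paths and options may differ)
[program:comfyui]
command=%(here)s/services/comfyui/venv/bin/python main.py --listen 0.0.0.0 --port 8188
directory=%(here)s/services/comfyui
autostart=true
autorestart=true
stdout_logfile=%(here)s/.logs/comfyui.log
stderr_logfile=%(here)s/.logs/comfyui.err.log
```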
### Services (managed by Supervisor)
| Service | Port | Description | Auto-start |
|---------|------|-------------|------------|
| ComfyUI | 8188 | Node-based image/video/audio generation | Yes |
| WebDAV Sync | - | Uploads ComfyUI outputs to HiDrive | Yes |
| AudioCraft | - | Music generation | Yes |
| vLLM Llama | 8001 | Llama 3.1 8B language model | No |
| vLLM BGE | 8002 | BGE embedding model | No |
### Model Management
- Models defined in YAML configs: `models/models_civitai.yaml`, `models/models_huggingface.yaml`
- Downloaded to `.cache/` and symlinked to `services/comfyui/models/`
- Uses external scripts: `artifact_civitai_download.sh`, `artifact_huggingface_download.sh`
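
A hypothetical entry in `models/models_huggingface.yaml` might look like the following; the schema shown here is an assumption for illustration, so consult the actual file for the real keys:

```yaml
# Hypothetical schema (sketch only; the real YAML keys may differ)
models:
  - repo: stabilityai/stable-diffusion-xl-base-1.0
    files:
      - sd_xl_base_1.0.safetensors
    # cached under .cache/ and later symlinked into services/comfyui/models/<target>/
    target: checkpoints
```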
### Repository Management (Arty)
- `arty.yml` defines git repos to clone (ComfyUI + custom nodes)
- Repos cloned to `services/comfyui/` and `services/comfyui/custom_nodes/`
- Run `arty sync` to clone/update all dependencies
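
A rough sketch of what an `arty.yml` repository entry could look like (the keys and the custom-node example are assumptions about Arty's schema, not copied from the actual file):

```yaml
# Hypothetical arty.yml fragment (sketch; Arty's real schema may differ)
repos:
  - url: https://github.com/comfyanonymous/ComfyUI.git
    path: services/comfyui
  - url: https://github.com/ltdrdata/ComfyUI-Manager.git
    path: services/comfyui/custom_nodes/ComfyUI-Manager
```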
## Common Commands
### Full Setup (on RunPod)
```bash
arty setup # Complete setup: deps, tailscale, services, comfyui, models, supervisor
```
### Supervisor Control
```bash
arty supervisor/start # Start supervisord
arty supervisor/stop # Stop all services
arty supervisor/status # Check service status
arty supervisor/restart # Restart all services
supervisorctl -c supervisord.conf status # Direct status check
supervisorctl -c supervisord.conf start comfyui # Start specific service
supervisorctl -c supervisord.conf tail -f comfyui # Follow logs
```
### Model Management
```bash
arty models/download # Download models from Civitai/HuggingFace
arty models/link # Symlink cached models to ComfyUI
```
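Conceptually, the link step does something along these lines (an illustrative sketch, not the actual script; the `checkpoints` subfolder is an assumption):

```bash
# Hypothetical illustration of models/link (not the actual arty task)
mkdir -p services/comfyui/models/checkpoints
for f in .cache/checkpoints/*.safetensors; do
  ln -sf "$(realpath "$f")" "services/comfyui/models/checkpoints/$(basename "$f")"
done
```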
### Setup Components
```bash
arty deps # Clone git references
arty setup/tailscale # Configure Tailscale VPN
arty setup/services # Create venvs for all services
arty setup/comfyui # Install ComfyUI and custom node dependencies
```
### Docker Build (CI runs on Gitea)
```bash
docker build -t runpod-ai-orchestrator .
```
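For a quick local smoke test of the image (normally RunPod launches the container itself), something like the following should work, assuming an NVIDIA container runtime and a populated `.env`:

```bash
# Hypothetical local run (sketch; RunPod templates normally start the container)
docker run --rm -it --gpus all \
  --env-file .env \
  -p 8188:8188 -p 9001:9001 \
  runpod-ai-orchestrator
```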
## Environment Variables
Required in `.env` (or via the RunPod template); an example layout is sketched after this list:
- `HF_TOKEN` - HuggingFace API token
- `TAILSCALE_AUTHKEY` - Tailscale auth key (optional)
- `CIVITAI_API_KEY` - Civitai API key for model downloads
- `WEBDAV_URL`, `WEBDAV_USERNAME`, `WEBDAV_PASSWORD`, `WEBDAV_REMOTE_PATH` - WebDAV sync config
- `PUBLIC_KEY` - SSH public key for RunPod access
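
Put together, a local `.env` might look like this (all values are placeholders):

```bash
# Example .env with placeholder values (never commit real secrets)
HF_TOKEN=hf_xxxxxxxxxxxxxxxx
# optional; only needed if the container should join your tailnet
TAILSCALE_AUTHKEY=tskey-auth-xxxxxxxx
CIVITAI_API_KEY=xxxxxxxxxxxxxxxx
WEBDAV_URL=https://webdav.example.com
WEBDAV_USERNAME=your-user
WEBDAV_PASSWORD=your-password
WEBDAV_REMOTE_PATH=/comfyui-outputs
PUBLIC_KEY="ssh-ed25519 AAAA... user@host"
```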
## File Structure
```
├── Dockerfile                       # Minimal base image (PyTorch + CUDA)
├── start.sh                         # Container entrypoint
├── supervisord.conf                 # Process manager config
├── arty.yml                         # Git repos + setup scripts
├── models/
│   ├── models_civitai.yaml          # Civitai model definitions
│   └── models_huggingface.yaml      # HuggingFace model definitions
├── services/
│   ├── comfyui/                     # ComfyUI + custom_nodes (cloned by arty)
│   ├── audiocraft/                  # AudioCraft Studio (cloned by arty)
│   ├── vllm/                        # vLLM configs
│   └── webdav-sync/                 # Output sync service
└── .gitea/workflows/                # CI/CD for Docker builds
```
## Important Notes
- **Network Volume**: On RunPod, `/workspace` is the persistent network volume. The orchestrator repo is cloned to `/workspace/orchestrator`.
- **Service Ports**: ComfyUI (8188), Supervisor Web UI (9001), vLLM Llama (8001), vLLM BGE (8002)
- **vLLM services** are disabled by default (`autostart=false`) to conserve GPU memory
- **Custom nodes** have their dependencies installed into the ComfyUI venv during setup
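
The custom-node install step mentioned above boils down to something like this (a sketch, not the actual setup script; it assumes the ComfyUI venv already exists):

```bash
# Hypothetical illustration of the custom-node dependency install (not the actual arty task)
source services/comfyui/venv/bin/activate
for req in services/comfyui/custom_nodes/*/requirements.txt; do
  pip install -r "$req"
done
```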