# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Overview
RunPod AI Orchestrator - A Docker-based deployment template for GPU instances on RunPod. Orchestrates AI services (ComfyUI, vLLM, AudioCraft) using Supervisor for process management, with optional Tailscale VPN integration.
## Architecture
### Service Orchestration

- **Supervisor** manages all services via `supervisord.conf` (see the launch sketch after this list)
- Services run in isolated Python venvs under `services/<name>/venv/`
- Logs are written to the `.logs/` directory
- Each service has its own `requirements.txt`
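
A minimal sketch of that launch pattern, assuming ComfyUI's standard `main.py` entry point and flags (the authoritative command lines live in `supervisord.conf`):

```bash
# Roughly what Supervisor does for a managed service: run it from its own
# venv and append output to the shared .logs/ directory.
source services/comfyui/venv/bin/activate
python services/comfyui/main.py --listen 0.0.0.0 --port 8188 \
  >> .logs/comfyui.log 2>&1
```
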
### Services (managed by Supervisor)

| Service | Port | Description | Auto-start |
|---------|------|-------------|------------|
| ComfyUI | 8188 | Node-based image/video/audio generation | Yes |
| WebDAV Sync | - | Uploads ComfyUI outputs to HiDrive | Yes |
| AudioCraft | - | Music generation | Yes |
| vLLM Llama | 8001 | Llama 3.1 8B language model | No |
| vLLM BGE | 8002 | BGE embedding model | No |
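
A quick way to confirm a service is actually reachable once Supervisor reports it running (endpoints below are assumptions based on the ports above; vLLM serves an OpenAI-compatible API):

```bash
# ComfyUI web UI
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8188/
# vLLM Llama (only responds after the service has been started manually)
curl -s http://localhost:8001/v1/models
```
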
### Model Management

- Models defined in YAML configs: `models/models_civitai.yaml`, `models/models_huggingface.yaml`
- Downloaded to `.cache/` and symlinked to `services/comfyui/models/` (see the sketch after this list)
- Uses external scripts: `artifact_civitai_download.sh`, `artifact_huggingface_download.sh`
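
A minimal sketch of the cache-and-symlink pattern, roughly what `arty models/link` automates (the file name and target subdirectory are hypothetical):

```bash
# Models are downloaded once into the persistent .cache/ and exposed to
# ComfyUI via symlinks, so redeploys reuse the cache instead of re-downloading.
mkdir -p services/comfyui/models/checkpoints
ln -sf "$PWD/.cache/example-model.safetensors" \
       services/comfyui/models/checkpoints/example-model.safetensors
```
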
### Repository Management (Arty)

- `arty.yml` defines git repos to clone (ComfyUI + custom nodes)
- Repos cloned to `services/comfyui/` and `services/comfyui/custom_nodes/`
- Run `arty sync` to clone/update all dependencies (example below)
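
Typical workflow after editing `arty.yml`:

```bash
arty sync                            # clone or update every repo defined in arty.yml
ls services/comfyui/custom_nodes/    # confirm the custom nodes were checked out
```
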
## Common Commands
### Full Setup (on RunPod)

```bash
arty setup   # Complete setup: deps, tailscale, services, comfyui, models, supervisor
```

### Supervisor Control

```bash
arty supervisor/start                              # Start supervisord
arty supervisor/stop                               # Stop all services
arty supervisor/status                             # Check service status
arty supervisor/restart                            # Restart all services
supervisorctl -c supervisord.conf status           # Direct status check
supervisorctl -c supervisord.conf start comfyui    # Start specific service
supervisorctl -c supervisord.conf tail -f comfyui  # Follow logs
```
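
Because the vLLM services do not auto-start, bring one up on demand the same way. The program name `vllm-llama` below is an assumption; check `supervisorctl ... status` for the names actually defined in `supervisord.conf`:

```bash
# Start a non-autostart service and follow its log (substitute the real program name)
supervisorctl -c supervisord.conf start vllm-llama
supervisorctl -c supervisord.conf tail -f vllm-llama
```
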
### Model Management

```bash
arty models/download   # Download models from Civitai/HuggingFace
arty models/link       # Symlink cached models to ComfyUI
```

### Setup Components

```bash
arty deps              # Clone git references
arty setup/tailscale   # Configure Tailscale VPN
arty setup/services    # Create venvs for all services
arty setup/comfyui     # Install ComfyUI and custom node dependencies
```

### Docker Build (CI runs on Gitea)

```bash
docker build -t runpod-ai-orchestrator .
```
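
For a local smoke test of the built image (the flags below are assumptions; on RunPod the template supplies the GPU, ports, and environment):

```bash
# Requires the NVIDIA Container Toolkit for --gpus; .env as described below
docker run --rm --gpus all --env-file .env -p 8188:8188 runpod-ai-orchestrator
```
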
## Environment Variables
Required in `.env` (or RunPod template):

- `HF_TOKEN` - HuggingFace API token
- `TAILSCALE_AUTHKEY` - Tailscale auth key (optional)
- `CIVITAI_API_KEY` - Civitai API key for model downloads
- `WEBDAV_URL`, `WEBDAV_USERNAME`, `WEBDAV_PASSWORD`, `WEBDAV_REMOTE_PATH` - WebDAV sync config
- `PUBLIC_KEY` - SSH public key for RunPod access
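
A sketch of a `.env` file with placeholder values (variable names from the list above; every value is illustrative):

```bash
# .env - all values are placeholders
HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxx
TAILSCALE_AUTHKEY=tskey-auth-xxxxxxxx        # optional
CIVITAI_API_KEY=xxxxxxxxxxxxxxxxxxxx
WEBDAV_URL=https://example.com/webdav
WEBDAV_USERNAME=user
WEBDAV_PASSWORD=change-me
WEBDAV_REMOTE_PATH=/comfyui-outputs
PUBLIC_KEY="ssh-ed25519 AAAA... user@host"
```
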
## File Structure
```
├── Dockerfile                    # Minimal base image (PyTorch + CUDA)
├── start.sh                      # Container entrypoint
├── supervisord.conf              # Process manager config
├── arty.yml                      # Git repos + setup scripts
├── models/
│   ├── models_civitai.yaml       # Civitai model definitions
│   └── models_huggingface.yaml   # HuggingFace model definitions
├── services/
│   ├── comfyui/                  # ComfyUI + custom_nodes (cloned by arty)
│   ├── audiocraft/               # AudioCraft Studio (cloned by arty)
│   ├── vllm/                     # vLLM configs
│   └── webdav-sync/              # Output sync service
└── .gitea/workflows/             # CI/CD for Docker builds
```

## Important Notes
- **Network Volume**: On RunPod, `/workspace` is the persistent network volume. The orchestrator repo is cloned to `/workspace/orchestrator`.
- **Service Ports**: ComfyUI (8188), Supervisor Web UI (9001), vLLM Llama (8001), vLLM BGE (8002)
- **vLLM services** are disabled by default (`autostart=false`) to conserve GPU memory
- **Custom nodes** have their dependencies installed into the ComfyUI venv during setup