# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Overview

RunPod AI Orchestrator - A Docker-based deployment template for GPU instances on RunPod. Orchestrates AI services (ComfyUI, vLLM, AudioCraft) using Supervisor for process management, with optional Tailscale VPN integration.

## Architecture
### Service Orchestration

- **Supervisor** manages all services via `supervisord.conf` (see the sketch after this list)
- Services run in isolated Python venvs under `services/<name>/venv/`
- Logs are written to the `.logs/` directory
- Each service has its own `requirements.txt`
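A minimal sketch of what a single program entry in `supervisord.conf` could look like. The program name, paths, and flags here are illustrative assumptions, not a copy of the repository's actual config:

```ini
; Hypothetical entry -- the real names, paths, and options live in supervisord.conf
[program:comfyui]
command=%(here)s/services/comfyui/venv/bin/python main.py --listen 0.0.0.0 --port 8188
directory=%(here)s/services/comfyui
autostart=true
autorestart=true
stdout_logfile=%(here)s/.logs/comfyui.log
redirect_stderr=true
```

Each service gets a block along these lines; Supervisor starts the ones marked `autostart=true` and writes their logs under `.logs/`.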
### Services (managed by Supervisor)

| Service | Port | Description | Auto-start |
|---------|------|-------------|------------|
| ComfyUI | 8188 | Node-based image/video/audio generation | Yes |
| WebDAV Sync | - | Uploads ComfyUI outputs to HiDrive | Yes |
| AudioCraft | - | Music generation | Yes |
| vLLM Llama | 8001 | Llama 3.1 8B language model | No |
| vLLM BGE | 8002 | BGE embedding model | No |
### Model Management

- Models defined in YAML configs: `models/models_civitai.yaml`, `models/models_huggingface.yaml`
- Downloaded to `.cache/` and symlinked to `services/comfyui/models/` (see the sketch after this list)
- Uses external scripts: `artifact_civitai_download.sh`, `artifact_huggingface_download.sh`
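Roughly how the cache-and-symlink scheme above plays out; the model filename and the `checkpoints/` subfolder are illustrative assumptions:

```bash
# Illustrative only -- real filenames and target folders come from the YAML configs
# 1. The download scripts place the file in the cache:
#      .cache/some-model.safetensors
# 2. `arty models/link` then symlinks it into ComfyUI's model tree, roughly:
ln -s "$(pwd)/.cache/some-model.safetensors" \
      services/comfyui/models/checkpoints/some-model.safetensors
```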
### Repository Management (Arty)

- `arty.yml` defines git repos to clone (ComfyUI + custom nodes)
- Repos are cloned to `services/comfyui/` and `services/comfyui/custom_nodes/`
- Run `arty sync` to clone/update all dependencies (roughly the sketch after this list)
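In plain git terms, `arty sync` amounts to something like the following for the ComfyUI entry; the exact repo list and any custom-node URLs come from `arty.yml`:

```bash
# Rough equivalent of `arty sync` for one entry -- the real list lives in arty.yml
git clone https://github.com/comfyanonymous/ComfyUI.git services/comfyui 2>/dev/null \
  || git -C services/comfyui pull --ff-only

# Custom nodes are handled the same way, landing under services/comfyui/custom_nodes/<node>/
```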
## Common Commands

### Full Setup (on RunPod)

```bash
arty setup    # Complete setup: deps, tailscale, services, comfyui, models, supervisor
```
### Supervisor Control

```bash
arty supervisor/start      # Start supervisord
arty supervisor/stop       # Stop all services
arty supervisor/status     # Check service status
arty supervisor/restart    # Restart all services

supervisorctl -c supervisord.conf status             # Direct status check
supervisorctl -c supervisord.conf start comfyui      # Start specific service
supervisorctl -c supervisord.conf tail -f comfyui    # Follow logs
```
### Model Management

```bash
arty models/download    # Download models from Civitai/HuggingFace
arty models/link        # Symlink cached models to ComfyUI
```
### Setup Components

```bash
arty deps               # Clone git references
arty setup/tailscale    # Configure Tailscale VPN
arty setup/services     # Create venvs for all services
arty setup/comfyui      # Install ComfyUI and custom node dependencies
```
### Docker Build (CI runs on Gitea)

```bash
docker build -t runpod-ai-orchestrator .
```
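For a local smoke test outside RunPod, something along these lines should work; the `--env-file` and port mappings are assumptions based on the variables and ports documented in this file:

```bash
# Local test run -- on RunPod the container is launched from the pod template instead
docker run --gpus all --env-file .env \
  -p 8188:8188 -p 9001:9001 -p 8001:8001 -p 8002:8002 \
  runpod-ai-orchestrator
```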
## Environment Variables

Set in `.env` (or the RunPod template); a placeholder skeleton follows the list:

- `HF_TOKEN` - HuggingFace API token
- `TAILSCALE_AUTHKEY` - Tailscale auth key (optional)
- `CIVITAI_API_KEY` - Civitai API key for model downloads
- `WEBDAV_URL`, `WEBDAV_USERNAME`, `WEBDAV_PASSWORD`, `WEBDAV_REMOTE_PATH` - WebDAV sync config
- `PUBLIC_KEY` - SSH public key for RunPod access
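A `.env` skeleton with the variable names from the list above; every value is a placeholder:

```bash
# .env -- placeholders only, replace with real credentials
HF_TOKEN=hf_xxxxxxxxxxxxxxxx
TAILSCALE_AUTHKEY=tskey-xxxxxxxx              # optional
CIVITAI_API_KEY=xxxxxxxxxxxxxxxx
WEBDAV_URL=https://<hidrive-webdav-endpoint>
WEBDAV_USERNAME=<user>
WEBDAV_PASSWORD=<password>
WEBDAV_REMOTE_PATH=/<remote-folder>
PUBLIC_KEY="ssh-ed25519 AAAA... user@host"
```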
## File Structure

```
├── Dockerfile                    # Minimal base image (PyTorch + CUDA)
├── start.sh                      # Container entrypoint
├── supervisord.conf              # Process manager config
├── arty.yml                      # Git repos + setup scripts
├── models/
│   ├── models_civitai.yaml       # Civitai model definitions
│   └── models_huggingface.yaml   # HuggingFace model definitions
├── services/
│   ├── comfyui/                  # ComfyUI + custom_nodes (cloned by arty)
│   ├── audiocraft/               # AudioCraft Studio (cloned by arty)
│   ├── vllm/                     # vLLM configs
│   └── webdav-sync/              # Output sync service
└── .gitea/workflows/             # CI/CD for Docker builds
```
## Important Notes

- **Network Volume**: On RunPod, `/workspace` is the persistent network volume. The orchestrator repo is cloned to `/workspace/orchestrator`.
- **Service Ports**: ComfyUI (8188), Supervisor Web UI (9001), vLLM Llama (8001), vLLM BGE (8002)
- **vLLM services** are disabled by default (`autostart=false`) to conserve GPU memory; see the snippet after this list for starting one on demand
- **Custom nodes** have their dependencies installed into the ComfyUI venv during setup
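Starting a vLLM service on demand would look roughly like this; the program name `vllm-llama` is a guess, so check `supervisorctl -c supervisord.conf status` (or `supervisord.conf` itself) for the real one:

```bash
# Program name is an assumption -- confirm it via `supervisorctl -c supervisord.conf status`
supervisorctl -c supervisord.conf start vllm-llama
```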