# RunPod Template Setup Guide
This guide explains how to deploy the AI Orchestrator (ComfyUI + vLLM) on RunPod using a custom Docker template and network volume.
## Architecture Overview
The deployment uses a **two-tier strategy**:
1. **Docker Image** (software layer) - Contains system packages, Supervisor, Tailscale
2. **Network Volume** (data layer) - Contains models, ComfyUI installation, venvs, configuration
This approach allows fast pod deployment (~2-3 minutes) while keeping all large files (models, ~80-200GB) on a persistent network volume.
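After the initial setup in Step 4, the volume ends up holding roughly the following layout (illustrative; exact contents depend on which playbook tags you run):
```
/workspace/
├── ai/                   # This repository: playbook, scripts, .env
├── ComfyUI/              # ComfyUI install + custom_nodes
├── models/               # Downloaded model weights
├── huggingface_cache/    # HF cache used by vLLM/transformers
├── logs/                 # supervisord and per-service logs
└── supervisord.conf      # Supervisor configuration
```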
## Prerequisites
- RunPod account with credits
- Docker Hub account (for hosting the template image)
- HuggingFace account with API token (for model downloads)
- Tailscale account with auth key (optional, for VPN access)
## Step 1: Build and Push Docker Image
### Option A: Automated Build (Recommended)
The repository includes a Gitea workflow that automatically builds and pushes the Docker image to your Gitea container registry when you push to the `main` branch or create a version tag.
1. **Configure Gitea Secret:**
- Go to your Gitea repository → Settings → Secrets
- Add `REGISTRY_TOKEN` = your Gitea access token with registry permissions
- (The workflow automatically uses your Gitea username via `gitea.actor`)
2. **Trigger Build:**
```bash
# Push to main branch
git push origin main
# Or create a version tag
git tag v1.0.0
git push origin v1.0.0
```
3. **Monitor Build:**
- Go to Actions tab in Gitea
- Wait for build to complete (~5-10 minutes)
- Note the Docker image name: `dev.pivoine.art/valknar/runpod-ai-orchestrator:latest`
### Option B: Manual Build
If you prefer to build manually:
```bash
# From the repository root
cd /path/to/runpod
# Build the image
docker build -t dev.pivoine.art/valknar/runpod-ai-orchestrator:latest .
# Login to your Gitea registry
docker login dev.pivoine.art
# Push to Gitea registry
docker push dev.pivoine.art/valknar/runpod-ai-orchestrator:latest
```
## Step 2: Create Network Volume
Network volumes persist your models and data across pod restarts and rebuilds.
1. **Go to RunPod Dashboard → Storage → Network Volumes**
2. **Click "New Network Volume"**
3. **Configure:**
- **Name**: `ai-orchestrator-models`
- **Size**: `200GB` (adjust based on your needs)
- Essential models only: ~80GB
- All models: ~137-200GB
- **Datacenter**: Choose the one closest to you (volumes are tied to a single datacenter)
4. **Click "Create Volume"**
5. **Note the Volume ID** (e.g., `vol-abc123def456`) for pod deployment
### Storage Requirements
| Configuration | Size | Models Included |
|--------------|------|-----------------|
| Essential | ~80GB | FLUX Schnell, 1-2 SDXL checkpoints, MusicGen Medium |
| Complete | ~137GB | All image/video/audio models from playbook |
| Full + vLLM | ~200GB | Complete + Qwen 2.5 7B + Llama 3.1 8B |
## Step 3: Create RunPod Template
1. **Go to RunPod Dashboard → Templates**
2. **Click "New Template"**
3. **Configure Template Settings:**
**Container Configuration:**
- **Template Name**: `AI Orchestrator (ComfyUI + vLLM)`
- **Template Type**: Docker
- **Container Image**: `dev.pivoine.art/valknar/runpod-ai-orchestrator:latest`
- **Container Disk**: `50GB` (for system and temp files)
- **Docker Command**: Leave empty (uses default `/start.sh`)
**Volume Configuration:**
- **Volume Mount Path**: `/workspace`
- **Attach to Network Volume**: Select your volume ID from Step 2
**Port Configuration:**
- **Expose HTTP Ports**: `8188, 9000, 9001`
- `8188` - ComfyUI web interface
- `9000` - Model orchestrator API
- `9001` - Supervisor web UI
- **Expose TCP Ports**: `22` (SSH access)
**Environment Variables:**
```
HF_TOKEN=your_huggingface_token_here
TAILSCALE_AUTHKEY=tskey-auth-your_tailscale_authkey_here
SUPERVISOR_BACKEND_HOST=localhost
SUPERVISOR_BACKEND_PORT=9001
```
**Advanced Settings:**
- **Start Jupyter**: No
- **Start SSH**: Yes (handled by base image)
4. **Click "Save Template"**
## Step 4: First Deployment (Initial Setup)
The first time you deploy, you need to set up the network volume with models and configuration.
### 4.1 Deploy Pod
1. **Go to RunPod Dashboard → Pods**
2. **Click "Deploy"** or "GPU Pods"
3. **Select your custom template**: `AI Orchestrator (ComfyUI + vLLM)`
4. **Configure GPU:**
- **GPU Type**: RTX 4090 (24GB VRAM) or higher
- **Network Volume**: Select your volume from Step 2
- **On-Demand vs Spot**: Choose based on budget
5. **Click "Deploy"**
### 4.2 SSH into Pod
```bash
# Get pod SSH command from RunPod dashboard
ssh root@<pod-ip> -p <port> -i ~/.ssh/id_ed25519
# Or use RunPod web terminal
```
### 4.3 Initial Setup on Network Volume
```bash
# 1. Clone the repository to /workspace/ai
cd /workspace
git clone https://github.com/your-username/runpod.git ai
cd ai
# 2. Create .env file with your credentials
cp .env.example .env
nano .env
# Edit and add:
# HF_TOKEN=your_huggingface_token
# TAILSCALE_AUTHKEY=tskey-auth-your_key
# GPU_TAILSCALE_IP=<will be set automatically>
# 3. Download essential models (this takes 30-60 minutes)
ansible-playbook playbook.yml --tags comfyui-essential
# OR download all models (1-2 hours)
ansible-playbook playbook.yml --tags comfyui-models-all
# 4. Link models to ComfyUI
bash scripts/link-comfyui-models.sh
# OR if arty is available
arty run models/link-comfyui
# 5. Install ComfyUI custom nodes dependencies
cd /workspace/ComfyUI/custom_nodes/ComfyUI-Manager
pip install -r requirements.txt
cd /workspace/ai
# 6. Restart the container to apply all changes
exit
# Go to RunPod dashboard → Stop pod → Start pod
```
### 4.4 Verify Services
After restart, SSH back in and check:
```bash
# Check supervisor status
supervisorctl -c /workspace/supervisord.conf status
# Expected output:
# comfyui RUNNING pid 123, uptime 0:01:00
# (orchestrator is disabled by default - enable for vLLM)
# Test ComfyUI
curl -I http://localhost:8188
# Test Supervisor web UI
curl -I http://localhost:9001
```
## Step 5: Subsequent Deployments
After initial setup, deploying new pods is quick (2-3 minutes):
1. **Deploy pod** with same template + network volume
2. **Wait for services to start** (~1-2 minutes)
3. **Access services:**
- ComfyUI: `http://<pod-ip>:8188`
- Supervisor: `http://<pod-ip>:9001`
**All models, configuration, and data persist on the network volume!**
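As a quick sanity check after each deployment, a small smoke-test script along these lines (illustrative, not part of the repository) confirms the services are reachable:
```bash
#!/bin/bash
# smoke-test.sh - illustrative health check for a freshly started pod.
# Checks ComfyUI (8188) and the Supervisor UI (9001); add 9000 if the
# orchestrator is enabled. Pass a host argument to test remotely.
HOST="${1:-localhost}"

for port in 8188 9001; do
  # Without -f, curl exits 0 as soon as it gets any HTTP response
  if curl -s -m 5 -o /dev/null "http://${HOST}:${port}"; then
    echo "OK:   port ${port} is responding"
  else
    echo "DOWN: port ${port} is not responding yet"
  fi
done
```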
## Step 6: Access Services
### Via Direct IP (HTTP)
Get pod IP and ports from RunPod dashboard:
```
ComfyUI: http://<pod-ip>:8188
Supervisor UI: http://<pod-ip>:9001
Orchestrator API: http://<pod-ip>:9000
SSH: ssh root@<pod-ip> -p <port>
```
### Via Tailscale VPN (Recommended)
If you configured `TAILSCALE_AUTHKEY`, the pod automatically joins your Tailscale network:
1. **Get Tailscale IP:**
```bash
ssh root@<pod-ip> -p <port>
tailscale ip -4
# Example output: 100.114.60.40
```
2. **Access via Tailscale:**
```
ComfyUI: http://<tailscale-ip>:8188
Supervisor: http://<tailscale-ip>:9001
Orchestrator: http://<tailscale-ip>:9000
SSH: ssh root@<tailscale-ip>
```
3. **Update LiteLLM config** on your VPS with the Tailscale IP
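For reference, a LiteLLM `config.yaml` entry pointing at the orchestrator might look roughly like this (the model name and Tailscale IP are placeholders taken from the examples in this guide; adjust to your setup):
```yaml
model_list:
  - model_name: qwen-2.5-7b
    litellm_params:
      model: openai/qwen-2.5-7b               # route as an OpenAI-compatible backend
      api_base: http://100.114.60.40:9000/v1  # pod's Tailscale IP + orchestrator port
      api_key: "none"                         # sketch assumes the orchestrator has no auth
```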
## Service Management
### Start/Stop Services
```bash
# Start all services
supervisorctl -c /workspace/supervisord.conf start all
# Stop all services
supervisorctl -c /workspace/supervisord.conf stop all
# Restart specific service
supervisorctl -c /workspace/supervisord.conf restart comfyui
# View status
supervisorctl -c /workspace/supervisord.conf status
```
### Enable vLLM Models (Text Generation)
By default, only ComfyUI runs (to save VRAM). To enable vLLM:
1. **Stop ComfyUI** (frees up VRAM):
```bash
supervisorctl -c /workspace/supervisord.conf stop comfyui
```
2. **Start orchestrator** (manages vLLM models):
```bash
supervisorctl -c /workspace/supervisord.conf start orchestrator
```
3. **Test text generation:**
```bash
curl -X POST http://localhost:9000/v1/chat/completions \
-H 'Content-Type: application/json' \
-d '{"model":"qwen-2.5-7b","messages":[{"role":"user","content":"Hello"}]}'
```
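The orchestrator speaks the OpenAI chat completions format, so a successful reply should look roughly like this (abridged; field values will differ):
```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "qwen-2.5-7b",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Hello! How can I help you?" },
      "finish_reason": "stop"
    }
  ]
}
```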
### Switch Back to ComfyUI
```bash
# Stop orchestrator (stops all vLLM models)
supervisorctl -c /workspace/supervisord.conf stop orchestrator
# Start ComfyUI
supervisorctl -c /workspace/supervisord.conf start comfyui
```
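Since only one GPU-heavy service fits at a time, a small helper script (hypothetical, not part of the repository) wrapping the commands above makes the switch less error-prone:
```bash
#!/bin/bash
# switch-mode.sh - hypothetical helper wrapping the commands above.
# Usage: ./switch-mode.sh comfyui|vllm
CONF=/workspace/supervisord.conf

case "${1:-}" in
  comfyui)
    # "stop" on an already-stopped service just reports an error; safe to ignore
    supervisorctl -c "$CONF" stop orchestrator || true
    supervisorctl -c "$CONF" start comfyui
    ;;
  vllm)
    supervisorctl -c "$CONF" stop comfyui || true
    supervisorctl -c "$CONF" start orchestrator
    ;;
  *)
    echo "Usage: $0 comfyui|vllm" >&2
    exit 1
    ;;
esac

supervisorctl -c "$CONF" status
```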
## Updating the Template
When you make changes to code or configuration:
### Update Docker Image
```bash
# 1. Make changes to Dockerfile or start.sh
# 2. Push to repository
git add .
git commit -m "Update template configuration"
git push origin main
# 3. Gitea workflow auto-builds new image
# 4. Terminate old pod and deploy new one with updated image
```
### Update Network Volume Data
```bash
# SSH into running pod
ssh root@<pod-ip> -p <port>
# Update repository
cd /workspace/ai
git pull
# Re-run Ansible if needed
ansible-playbook playbook.yml --tags <specific-tag>
# Restart services
supervisorctl -c /workspace/supervisord.conf restart all
```
## Troubleshooting
### Pod fails to start
**Check logs:**
```bash
# Via SSH
cat /workspace/logs/supervisord.log
cat /workspace/logs/comfyui.err.log
# Via RunPod web terminal
tail -f /workspace/logs/*.log
```
**Common issues:**
- Missing `.env` file → Create `/workspace/ai/.env` with required vars
- Supervisor config not found → Ensure `/workspace/ai/supervisord.conf` exists
- Port conflicts → Check if services are already running
### Tailscale not connecting
**Check Tailscale status:**
```bash
tailscale status
tailscale ip -4
```
**Common issues:**
- Missing or invalid `TAILSCALE_AUTHKEY` in `.env`
- Auth key expired → Generate new key in Tailscale admin
- Firewall blocking → RunPod should allow Tailscale by default
### Services not starting
**Check Supervisor:**
```bash
supervisorctl -c /workspace/supervisord.conf status
supervisorctl -c /workspace/supervisord.conf tail -f comfyui
```
**Common issues:**
- venv broken → Re-run `scripts/bootstrap-venvs.sh`
- Models not downloaded → Run Ansible playbook again
- Python version mismatch → Rebuild venvs
### Out of VRAM
**Check GPU memory:**
```bash
nvidia-smi
```
**RTX 4090 (24GB) capacity:**
- ComfyUI (FLUX Schnell): ~23GB (cannot run alongside vLLM)
- vLLM (Qwen 2.5 7B): ~14GB
- vLLM (Llama 3.1 8B): ~17GB
**Solution:** Only run one service at a time (see Service Management section)
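A compact way to keep an eye on VRAM while switching services (standard `nvidia-smi` query flags):
```bash
# Refresh used/total VRAM every 2 seconds (Ctrl+C to stop)
watch -n 2 nvidia-smi --query-gpu=memory.used,memory.total --format=csv
```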
### Network volume full
**Check disk usage:**
```bash
df -h /workspace
du -sh /workspace/*
```
**Clean up:**
```bash
# Remove old HuggingFace cache
rm -rf /workspace/huggingface_cache
# Re-download essential models only
cd /workspace/ai
ansible-playbook playbook.yml --tags comfyui-essential
```
## Cost Optimization
### Spot vs On-Demand
- **Spot instances**: ~70% cheaper, can be interrupted
- **On-Demand**: More expensive, guaranteed availability
**Recommendation:** Use spot for development, on-demand for production
### Network Volume Pricing
- First 1TB: $0.07/GB/month
- Beyond 1TB: $0.05/GB/month
**200GB volume cost:** ~$14/month
### Pod Auto-Stop
Configure auto-stop in RunPod pod settings to save costs when idle:
- Stop after 15 minutes idle
- Stop after 1 hour idle
- Manual stop only
## Advanced Configuration
### Custom Environment Variables
Add to template or pod environment variables:
```bash
# Model cache locations
HF_HOME=/workspace/huggingface_cache
TRANSFORMERS_CACHE=/workspace/huggingface_cache
# ComfyUI settings
COMFYUI_PORT=8188
COMFYUI_LISTEN=0.0.0.0
# Orchestrator settings
ORCHESTRATOR_PORT=9000
# GPU settings
CUDA_VISIBLE_DEVICES=0
```
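Inside a running pod, a quick way to confirm these variables took effect (the grep pattern is just illustrative):
```bash
# List the variables set above (adjust the pattern to taste)
env | grep -E '^(HF_|TRANSFORMERS_|COMFYUI_|ORCHESTRATOR_|CUDA_)'
```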
### Multiple Network Volumes
You can attach multiple network volumes for organization:
1. **Models volume** - `/workspace/models` (read-only, shared)
2. **Data volume** - `/workspace/data` (read-write, per-project)
### Custom Startup Script
Override `/start.sh` behavior by creating `/workspace/custom-start.sh`:
```bash
#!/bin/bash
# Custom startup commands run first: if /start.sh blocks (for example by
# running supervisord in the foreground), anything placed after it never executes.
echo "Running custom initialization..."
# Hand off to the default startup
source /start.sh
```
## References
- [RunPod Documentation](https://docs.runpod.io/)
- [RunPod Templates Overview](https://docs.runpod.io/pods/templates/overview)
- [Network Volumes Guide](https://docs.runpod.io/storage/network-volumes)
- [ComfyUI Documentation](https://github.com/comfyanonymous/ComfyUI)
- [Supervisor Documentation](http://supervisord.org/)
- [Tailscale Documentation](https://tailscale.com/kb/)
## Support
For issues or questions:
- Check troubleshooting section above
- Review `/workspace/logs/` files
- Check RunPod community forums
- Open issue in project repository