Compare commits
79 Commits
8de88d96ac...main
| SHA1 | Author | Date | |
|---|---|---|---|
| ea210c73dd | |||
| c4fd23855b | |||
| 3cc9db6632 | |||
| 4851ca10f0 | |||
| c55f41408a | |||
| 7bca766247 | |||
| 120bf7c385 | |||
| 35e0f232f9 | |||
| cd9256f09c | |||
| c9e3a5cc4f | |||
| b2b444fb98 | |||
| dcc29d20f0 | |||
| 19ad30e8c4 | |||
| 99e39ee6e6 | |||
| ed83e64727 | |||
| 18e6741596 | |||
| c83b77ebdb | |||
| 6568dd10b5 | |||
| a6e4540e84 | |||
| dbdf33e78e | |||
| 74f618bcbb | |||
| 0c7fe219f7 | |||
| 6d0a15a969 | |||
| f4dd7c7d9d | |||
| 608b5ba793 | |||
| 2e45252793 | |||
| 20ba9952a1 | |||
| 69869ec3fb | |||
| cc270c8539 | |||
| 8bdcde4b90 | |||
| 2ab43e8fd3 | |||
| 5d232c7d9b | |||
| cef233b678 | |||
| b63ddbffbd | |||
| d57a1241d2 | |||
| ef0309838c | |||
| 071a74a996 | |||
| 74b3748b23 | |||
| 87216ab26a | |||
| 9e2b19e7f6 | |||
| a80c6b931b | |||
| 64c02228d8 | |||
| 55d9bef18a | |||
| 7fc945e179 | |||
| 94ab4ae6dd | |||
| 779e76974d | |||
| f3f32c163f | |||
| e00e959543 | |||
| 0fd2eacad1 | |||
| bf402adb25 | |||
| ae1c349b55 | |||
| 66d8c82e47 | |||
| ea81634ef3 | |||
| 25bd020b93 | |||
| 904f7d3c2e | |||
| 9a964cff3c | |||
| 0999e5d29f | |||
| ec903c16c2 | |||
| 155016da97 | |||
| c81f312e9e | |||
| fe0cf487ee | |||
| 81d4058c5d | |||
| 4a575bc0da | |||
| 01a345979b | |||
| c58b5d36ba | |||
| 62fcf832da | |||
| dfde1df72f | |||
| 42a68bc0b5 | |||
| 699c8537b0 | |||
| ed4d537499 | |||
| 103bbbad51 | |||
| 92a7436716 | |||
| 6aea9d018e | |||
| e2e0927291 | |||
| a5ed2be933 | |||
| d5e37dbd3f | |||
| abcebd1d9b | |||
| 3ed3e68271 | |||
| bb3dabcba7 |
CLAUDE.md
@@ -25,7 +25,7 @@ Root `compose.yaml` uses Docker Compose's `include` directive to orchestrate mul
- **kit**: Unified toolkit with Vert file converter and miniPaint image editor (path-routed)
- **jelly**: Jellyfin media server with hardware transcoding
- **drop**: PairDrop peer-to-peer file sharing
- **ai**: AI infrastructure with Open WebUI, Crawl4AI, and pgvector (PostgreSQL)
- **ai**: AI infrastructure with Open WebUI, ComfyUI proxy, Crawl4AI, and pgvector (PostgreSQL)
- **asciinema**: Terminal recording and sharing platform (PostgreSQL)
- **restic**: Backrest backup system with restic backend
- **netdata**: Real-time infrastructure monitoring
@@ -451,11 +451,13 @@ AI infrastructure with Open WebUI, Crawl4AI, and dedicated PostgreSQL with pgvec
- User signup enabled
- Data persisted in `ai_webui_data` volume

- **crawl4ai**: Crawl4AI web scraping service (internal API, no public access)
- Optimized web scraper for LLM content preparation
- Internal API on port 11235 (not exposed via Traefik)
- Designed for integration with Open WebUI and n8n workflows
- Data persisted in `ai_crawl4ai_data` volume
- **comfyui**: ComfyUI reverse proxy exposed at `comfy.ai.pivoine.art:80`
- Nginx-based proxy to ComfyUI running on RunPod GPU server
- Node-based UI for Flux.1 Schnell image generation workflows
- Proxies to RunPod via Tailscale VPN (100.121.199.88:8188)
- Protected by Authelia SSO authentication
- WebSocket support for real-time updates
- Stateless architecture (no data persistence on VPS)

**Configuration**:
- **Claude Integration**: Uses Anthropic API with OpenAI-compatible endpoint
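
For illustration (not taken from the repository docs), any OpenAI-compatible client can reach this through the LiteLLM proxy listed in the service tree below; the model alias and API key variable here are assumptions:

```bash
# Hypothetical smoke test via the LiteLLM proxy; model alias and key variable are assumptions
curl https://llm.ai.pivoine.art/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LITELLM_API_KEY" \
  -d '{
    "model": "claude-3-5-sonnet",
    "messages": [{"role": "user", "content": "ping"}]
  }'
```
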
@@ -476,11 +478,71 @@ AI infrastructure with Open WebUI, Crawl4AI, and dedicated PostgreSQL with pgvec
4. Use web search feature for current information
5. Integrate with n8n workflows for automation

**Flux Image Generation** (`functions/flux_image_gen.py`):
Open WebUI function for generating images via Flux.1 Schnell on RunPod GPU:
- Manifold function adds "Flux.1 Schnell (4-5s)" model to Open WebUI
- Routes requests through LiteLLM → Orchestrator → RunPod Flux
- Generates 1024x1024 images in 4-5 seconds
- Returns images as base64-encoded markdown
- Configuration via Valves (API base, timeout, default size)
- **Automatically loaded via Docker volume mount** (`./functions:/app/backend/data/functions:ro`)

**Deployment**:
- Function file tracked in `ai/functions/` directory
- Automatically available after `pnpm arty up -d ai_webui`
- No manual import required - infrastructure as code

See `ai/FLUX_SETUP.md` for detailed setup instructions and troubleshooting.

**ComfyUI Image Generation**:
ComfyUI provides a professional node-based interface for creating Flux image generation workflows:

**Architecture**:
```
User → Traefik (VPS) → Authelia SSO → ComfyUI Proxy (nginx) → Tailscale → ComfyUI (RunPod:8188) → Flux Model (GPU)
```

**Access**:
1. Navigate to https://comfy.ai.pivoine.art
2. Authenticate via Authelia SSO
3. Create node-based workflows in ComfyUI interface
4. Use Flux.1 Schnell model from HuggingFace cache at `/workspace/ComfyUI/models/huggingface_cache`

**RunPod Setup** (via Ansible):
ComfyUI is installed on RunPod using the Ansible playbook at `/home/valknar/Projects/runpod/playbook.yml`:
- Clone ComfyUI from https://github.com/comfyanonymous/ComfyUI
- Install dependencies from `models/comfyui/requirements.txt`
- Create model directory structure (checkpoints, unet, vae, loras, clip, controlnet)
- Symlink Flux model from HuggingFace cache
- Start service via `models/comfyui/start.sh` on port 8188

**To deploy ComfyUI on RunPod**:
```bash
# Run Ansible playbook with comfyui tag
ssh -p 16186 root@213.173.110.150
cd /workspace/ai
ansible-playbook playbook.yml --tags comfyui --skip-tags always

# Start ComfyUI service
bash models/comfyui/start.sh &
```

**Proxy Configuration**:
The VPS runs an nginx proxy (`ai/comfyui-nginx.conf`) that:
- Listens on port 80 inside container
- Forwards to RunPod via Tailscale (100.121.199.88:8188)
- Supports WebSocket upgrades for real-time updates
- Handles large file uploads (100M limit)
- Uses extended timeouts for long-running generations (300s)

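To make that concrete, here is a minimal sketch of the kind of nginx directives involved; it is illustrative only, and the authoritative file is `ai/comfyui-nginx.conf` in the repository:

```
# Illustrative sketch only - see ai/comfyui-nginx.conf for the real file
server {
    listen 80;
    client_max_body_size 100M;                 # large workflow/image uploads

    location / {
        proxy_pass http://100.121.199.88:8188; # ComfyUI on RunPod via Tailscale

        # WebSocket upgrades for real-time queue/progress updates
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;

        # Long-running generations
        proxy_read_timeout 300s;
        proxy_send_timeout 300s;
    }
}
```
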
**Note**: ComfyUI runs directly on RunPod GPU server, not in a container. All data is stored on RunPod's `/workspace` volume.

**Integration Points**:
- **n8n**: Workflow automation with AI tasks (scraping, RAG ingestion, webhooks)
- **Mattermost**: Can send AI-generated notifications via webhooks
- **Crawl4AI**: Internal API for advanced web scraping
- **Claude API**: Primary LLM provider via Anthropic
- **Flux via RunPod**: Image generation through orchestrator (GPU server) or ComfyUI

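As a hedged sketch of the Mattermost integration mentioned above, posting a notification to an incoming webhook looks roughly like this (the hostname and webhook ID are placeholders, not values from this repository):

```bash
# Placeholder URL - create an incoming webhook under Mattermost > Integrations
curl -X POST https://mattermost.example.org/hooks/XXXXXXXXXXXX \
  -H "Content-Type: application/json" \
  -d '{"text": "AI pipeline finished: RAG ingestion completed."}'
```
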
**Future Enhancements**:
- GPU server integration (IONOS A10 planned)
@@ -659,7 +721,7 @@ Backrest backup system with restic backend:
- Retention: 7 daily, 4 weekly, 3 monthly

16. **ai-backup** (3 AM daily)
- Paths: `/volumes/ai_postgres_data`, `/volumes/ai_webui_data`, `/volumes/ai_crawl4ai_data`
- Paths: `/volumes/ai_postgres_data`, `/volumes/ai_webui_data`
- Retention: 7 daily, 4 weekly, 6 monthly, 2 yearly

17. **asciinema-backup** (11 AM daily)
@@ -670,8 +732,7 @@ Backrest backup system with restic backend:
All Docker volumes are mounted read-only to `/volumes/` with prefixed names (e.g., `backup_core_postgres_data`) to avoid naming conflicts with other compose stacks.

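As an illustration of that mount pattern, a compose snippet for a single volume might look roughly like this (a sketch only; the service name and volume names are examples, not the actual restic compose file):

```yaml
# Sketch: mount another stack's named volume read-only under a prefixed path
services:
  backrest:
    volumes:
      - core_postgres_data:/volumes/backup_core_postgres_data:ro

volumes:
  core_postgres_data:
    external: true          # volume is created and owned by the core stack
    name: core_postgres_data
```
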
**Configuration Management**:
- `config.json` template in repository defines all backup plans
- On first run, copy config into volume: `docker cp restic/config.json restic_app:/config/config.json`
- `core/backrest/config.json` in repository defines all backup plans (bind-mounted to container)
- Config version must be `4` for Backrest 1.10.1 compatibility
- Backrest manages auth automatically (username: `valknar`, password set via web UI on first access)

@@ -709,7 +770,7 @@ Each service uses named volumes prefixed with project name:
- `vault_data`: Vaultwarden password vault (SQLite database)
- `joplin_data`: Joplin note-taking data
- `jelly_config`: Jellyfin media server configuration
- `ai_postgres_data`, `ai_webui_data`, `ai_crawl4ai_data`: AI stack databases and application data
- `ai_postgres_data`, `ai_webui_data`: AI stack databases and application data
- `netdata_config`: Netdata monitoring configuration
- `restic_data`, `restic_config`, `restic_cache`, `restic_tmp`: Backrest backup system
- `proxy_letsencrypt_data`: SSL certificates

@@ -406,11 +406,10 @@ THE FALCON (falcon_network)
│ ├─ vaultwarden [vault.pivoine.art] → Password Manager
│ └─ tandoor [tandoor.pivoine.art] → Recipe Manager
│
├─ 🤖 AI STACK (5 services)
├─ 🤖 AI STACK (4 services)
│ ├─ ai_postgres [Internal] → pgvector Database
│ ├─ webui [ai.pivoine.art] → Open WebUI (Claude)
│ ├─ litellm [llm.ai.pivoine.art] → API Proxy
│ ├─ crawl4ai [Internal:11235] → Web Scraper
│ └─ facefusion [facefusion.ai.pivoine.art] → Face AI
│
├─ 🛡️ NET STACK (4 services)
@@ -435,7 +434,7 @@ THE FALCON (falcon_network)
├─ Core: postgres_data, redis_data, backrest_*
├─ Sexy: directus_uploads, directus_bundle
├─ Util: pairdrop_*, joplin_data, linkwarden_*, mattermost_*, vaultwarden_data, tandoor_*
├─ AI: ai_postgres_data, ai_webui_data, ai_crawl4ai_data, facefusion_*
├─ AI: ai_postgres_data, ai_webui_data, facefusion_*
├─ Net: letsencrypt_data, netdata_*
├─ Media: jelly_config, jelly_cache, filestash_data
└─ Dev: gitea_*, coolify_data, n8n_data, asciinema_data

@@ -1,430 +0,0 @@
|
||||
# Docker & NVIDIA Container Toolkit Setup
|
||||
|
||||
## Day 5: Docker Configuration on GPU Server
|
||||
|
||||
This guide sets up Docker with GPU support on your RunPod server.
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Install Docker
|
||||
|
||||
### Quick Install (Recommended)
|
||||
|
||||
```bash
|
||||
# SSH into GPU server
|
||||
ssh gpu-pivoine
|
||||
|
||||
# Download and run Docker install script
|
||||
curl -fsSL https://get.docker.com -o get-docker.sh
|
||||
sh get-docker.sh
|
||||
|
||||
# Verify installation
|
||||
docker --version
|
||||
docker compose version
|
||||
```
|
||||
|
||||
Expected output:
|
||||
```
|
||||
Docker version 24.0.7, build afdd53b
|
||||
Docker Compose version v2.23.0
|
||||
```
|
||||
|
||||
### Manual Install (Alternative)
|
||||
|
||||
```bash
|
||||
# Add Docker's official GPG key
|
||||
apt-get update
|
||||
apt-get install -y ca-certificates curl gnupg
|
||||
install -m 0755 -d /etc/apt/keyrings
|
||||
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
|
||||
chmod a+r /etc/apt/keyrings/docker.gpg
|
||||
|
||||
# Add repository
|
||||
echo \
|
||||
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
|
||||
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
|
||||
tee /etc/apt/sources.list.d/docker.list > /dev/null
|
||||
|
||||
# Install Docker
|
||||
apt-get update
|
||||
apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
|
||||
|
||||
# Start Docker
|
||||
systemctl enable docker
|
||||
systemctl start docker
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Install NVIDIA Container Toolkit
|
||||
|
||||
This enables Docker containers to use the GPU.
|
||||
|
||||
```bash
|
||||
# Add NVIDIA repository
|
||||
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
|
||||
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
|
||||
gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
|
||||
|
||||
curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
|
||||
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
|
||||
tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
|
||||
|
||||
# Install toolkit
|
||||
apt-get update
|
||||
apt-get install -y nvidia-container-toolkit
|
||||
|
||||
# Configure Docker to use NVIDIA runtime
|
||||
nvidia-ctk runtime configure --runtime=docker
|
||||
|
||||
# Restart Docker
|
||||
systemctl restart docker
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Test GPU Access in Docker
|
||||
|
||||
### Test 1: Basic CUDA Container
|
||||
|
||||
```bash
|
||||
docker run --rm --runtime=nvidia --gpus all \
|
||||
nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
|
||||
```
|
||||
|
||||
Expected output: Same as `nvidia-smi` output showing your RTX 4090.
|
||||
|
||||
### Test 2: PyTorch Container
|
||||
|
||||
```bash
|
||||
docker run --rm --runtime=nvidia --gpus all \
|
||||
pytorch/pytorch:2.3.0-cuda12.1-cudnn8-runtime \
|
||||
python -c "import torch; print('CUDA:', torch.cuda.is_available(), 'Device:', torch.cuda.get_device_name(0))"
|
||||
```
|
||||
|
||||
Expected output:
|
||||
```
|
||||
CUDA: True Device: NVIDIA GeForce RTX 4090
|
||||
```
|
||||
|
||||
### Test 3: Multi-GPU Query (if you have multiple GPUs)
|
||||
|
||||
```bash
|
||||
docker run --rm --runtime=nvidia --gpus all \
|
||||
nvidia/cuda:12.1.0-base-ubuntu22.04 \
|
||||
bash -c "echo 'GPU Count:' && nvidia-smi --list-gpus"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 4: Configure Docker Compose with GPU Support
|
||||
|
||||
Docker Compose needs to know about NVIDIA runtime.
|
||||
|
||||
### Create daemon.json
|
||||
|
||||
```bash
|
||||
cat > /etc/docker/daemon.json << 'EOF'
|
||||
{
|
||||
"runtimes": {
|
||||
"nvidia": {
|
||||
"path": "nvidia-container-runtime",
|
||||
"runtimeArgs": []
|
||||
}
|
||||
},
|
||||
"default-runtime": "nvidia",
|
||||
"log-driver": "json-file",
|
||||
"log-opts": {
|
||||
"max-size": "10m",
|
||||
"max-file": "3"
|
||||
}
|
||||
}
|
||||
EOF
|
||||
|
||||
# Restart Docker
|
||||
systemctl restart docker
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 5: Create GPU Project Structure
|
||||
|
||||
```bash
|
||||
cd /workspace
|
||||
|
||||
# Create directory structure
|
||||
mkdir -p gpu-stack/{vllm,comfyui,training,jupyter,monitoring}
|
||||
cd gpu-stack
|
||||
|
||||
# Create .env file
|
||||
cat > .env << 'EOF'
|
||||
# GPU Stack Environment Variables
|
||||
|
||||
# Timezone
|
||||
TIMEZONE=Europe/Berlin
|
||||
|
||||
# VPN Network
|
||||
VPS_IP=10.8.0.1
|
||||
GPU_IP=10.8.0.2
|
||||
|
||||
# Model Storage
|
||||
MODELS_PATH=/workspace/models
|
||||
|
||||
# Hugging Face (optional, for private models)
|
||||
HF_TOKEN=
|
||||
|
||||
# PostgreSQL (on VPS)
|
||||
DB_HOST=10.8.0.1
|
||||
DB_PORT=5432
|
||||
DB_USER=valknar
|
||||
DB_PASSWORD=ragnarok98
|
||||
DB_NAME=openwebui
|
||||
|
||||
# Weights & Biases (optional, for training logging)
|
||||
WANDB_API_KEY=
|
||||
EOF
|
||||
|
||||
chmod 600 .env
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 6: Test Full Stack (Quick Smoke Test)
|
||||
|
||||
Let's deploy a minimal vLLM container to verify everything works:
|
||||
|
||||
```bash
|
||||
cd /workspace/gpu-stack
|
||||
|
||||
# Create test compose file
|
||||
cat > test-compose.yaml << 'EOF'
|
||||
services:
|
||||
test-vllm:
|
||||
image: vllm/vllm-openai:latest
|
||||
container_name: test_vllm
|
||||
runtime: nvidia
|
||||
environment:
|
||||
NVIDIA_VISIBLE_DEVICES: all
|
||||
command:
|
||||
- --model
|
||||
- facebook/opt-125m # Tiny model for testing
|
||||
- --host
|
||||
- 0.0.0.0
|
||||
- --port
|
||||
- 8000
|
||||
ports:
|
||||
- "8000:8000"
|
||||
deploy:
|
||||
resources:
|
||||
reservations:
|
||||
devices:
|
||||
- driver: nvidia
|
||||
count: 1
|
||||
capabilities: [gpu]
|
||||
EOF
|
||||
|
||||
# Start test
|
||||
docker compose -f test-compose.yaml up -d
|
||||
|
||||
# Wait 30 seconds for model download
|
||||
sleep 30
|
||||
|
||||
# Check logs
|
||||
docker compose -f test-compose.yaml logs
|
||||
|
||||
# Test inference
|
||||
curl http://localhost:8000/v1/completions \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{
|
||||
"model": "facebook/opt-125m",
|
||||
"prompt": "Hello, my name is",
|
||||
"max_tokens": 10
|
||||
}'
|
||||
```
|
||||
|
||||
Expected output (JSON response with generated text).
|
||||
|
||||
**Clean up test:**
|
||||
```bash
|
||||
docker compose -f test-compose.yaml down
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 7: Install Additional Tools
|
||||
|
||||
```bash
|
||||
# Python tools
|
||||
apt install -y python3-pip python3-venv
|
||||
|
||||
# Monitoring tools
|
||||
apt install -y htop nvtop iotop
|
||||
|
||||
# Network tools
|
||||
apt install -y iperf3 tcpdump
|
||||
|
||||
# Development tools
|
||||
apt install -y build-essential
|
||||
|
||||
# Git LFS (for large model files)
|
||||
apt install -y git-lfs
|
||||
git lfs install
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 8: Configure Automatic Updates (Optional)
|
||||
|
||||
```bash
|
||||
# Install unattended-upgrades
|
||||
apt install -y unattended-upgrades
|
||||
|
||||
# Configure
|
||||
dpkg-reconfigure -plow unattended-upgrades
|
||||
|
||||
# Enable automatic security updates
|
||||
cat > /etc/apt/apt.conf.d/50unattended-upgrades << 'EOF'
|
||||
Unattended-Upgrade::Allowed-Origins {
|
||||
"${distro_id}:${distro_codename}-security";
|
||||
};
|
||||
Unattended-Upgrade::Automatic-Reboot "false";
|
||||
Unattended-Upgrade::Remove-Unused-Dependencies "true";
|
||||
EOF
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Docker can't access GPU
|
||||
|
||||
**Problem:** `docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]`
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
# Verify NVIDIA runtime is configured
|
||||
docker info | grep -i runtime
|
||||
|
||||
# Should show nvidia in runtimes list
|
||||
# If not, reinstall nvidia-container-toolkit
|
||||
|
||||
# Check daemon.json
|
||||
cat /etc/docker/daemon.json
|
||||
|
||||
# Restart Docker
|
||||
systemctl restart docker
|
||||
```
|
||||
|
||||
### Permission denied on docker commands
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
# Add your user to docker group (if not root)
|
||||
usermod -aG docker $USER
|
||||
|
||||
# Or always use sudo
|
||||
sudo docker ...
|
||||
```
|
||||
|
||||
### Out of disk space
|
||||
|
||||
**Check usage:**
|
||||
```bash
|
||||
df -h
|
||||
du -sh /var/lib/docker
|
||||
docker system df
|
||||
```
|
||||
|
||||
**Clean up:**
|
||||
```bash
|
||||
# Remove unused images
|
||||
docker image prune -a
|
||||
|
||||
# Remove unused volumes
|
||||
docker volume prune
|
||||
|
||||
# Full cleanup
|
||||
docker system prune -a --volumes
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
Before deploying the full stack:
|
||||
|
||||
- [ ] Docker installed and running
|
||||
- [ ] `docker --version` shows 24.x or newer
|
||||
- [ ] `docker compose version` works
|
||||
- [ ] NVIDIA Container Toolkit installed
|
||||
- [ ] `docker run --gpus all nvidia/cuda:12.1.0-base nvidia-smi` works
|
||||
- [ ] PyTorch container can see GPU
|
||||
- [ ] Test vLLM deployment successful
|
||||
- [ ] /workspace directory structure created
|
||||
- [ ] .env file configured with VPN IPs
|
||||
- [ ] Additional tools installed (nvtop, htop, etc.)
|
||||
|
||||
---
|
||||
|
||||
## Performance Monitoring Commands
|
||||
|
||||
**GPU Monitoring:**
|
||||
```bash
|
||||
# Real-time GPU stats
|
||||
watch -n 1 nvidia-smi
|
||||
|
||||
# Or with nvtop (prettier)
|
||||
nvtop
|
||||
|
||||
# GPU memory usage
|
||||
nvidia-smi --query-gpu=memory.used,memory.total --format=csv
|
||||
```
|
||||
|
||||
**Docker Stats:**
|
||||
```bash
|
||||
# Container resource usage
|
||||
docker stats
|
||||
|
||||
# Specific container
|
||||
docker stats vllm --no-stream
|
||||
```
|
||||
|
||||
**System Resources:**
|
||||
```bash
|
||||
# Overall system
|
||||
htop
|
||||
|
||||
# I/O stats
|
||||
iotop
|
||||
|
||||
# Network
|
||||
iftop
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Next: Deploy Production Stack
|
||||
|
||||
Now you're ready to deploy the full GPU stack with vLLM, ComfyUI, and training tools.
|
||||
|
||||
**Proceed to:** Deploying the production docker-compose.yaml
|
||||
|
||||
**Save your progress:**
|
||||
|
||||
```bash
|
||||
cat >> /workspace/SERVER_INFO.md << 'EOF'
|
||||
|
||||
## Docker Configuration
|
||||
- Docker Version: [docker --version]
|
||||
- NVIDIA Runtime: Enabled
|
||||
- GPU Access in Containers: ✓
|
||||
- Test vLLM Deployment: Successful
|
||||
- Directory: /workspace/gpu-stack
|
||||
|
||||
## Tools Installed
|
||||
- nvtop: GPU monitoring
|
||||
- htop: System monitoring
|
||||
- Docker Compose: v2.x
|
||||
- Git LFS: Large file support
|
||||
EOF
|
||||
```
|
||||
@@ -1,173 +0,0 @@
|
||||
# GPU Server Deployment Log
|
||||
|
||||
## Current Deployment (2025-11-21)
|
||||
|
||||
### Infrastructure
|
||||
- **Provider**: RunPod (Spot Instance)
|
||||
- **GPU**: NVIDIA RTX 4090 24GB
|
||||
- **Disk**: 50GB local SSD (expanded from 20GB)
|
||||
- **Network Volume**: 922TB at `/workspace`
|
||||
- **Region**: Europe
|
||||
- **Cost**: ~$0.50/hour (~$360/month if running 24/7)
|
||||
|
||||
### Network Configuration
|
||||
- **VPN**: Tailscale (replaces WireGuard due to RunPod UDP restrictions)
|
||||
- **GPU Server Tailscale IP**: 100.100.108.13
|
||||
- **VPS Tailscale IP**: (get with `tailscale ip -4` on VPS)
|
||||
|
||||
### SSH Access
|
||||
```
|
||||
Host gpu-pivoine
|
||||
HostName 213.173.102.232
|
||||
Port 29695
|
||||
User root
|
||||
IdentityFile ~/.ssh/id_ed25519
|
||||
```
|
||||
|
||||
**Note**: RunPod Spot instances can be terminated and restarted with new ports/IPs. Update SSH config accordingly.
|
||||
|
||||
### Software Stack
|
||||
- **Python**: 3.11.10
|
||||
- **vLLM**: 0.6.4.post1 (installed with pip)
|
||||
- **PyTorch**: 2.5.1 with CUDA 12.4
|
||||
- **Tailscale**: Installed via official script
|
||||
|
||||
### vLLM Deployment
|
||||
|
||||
**Custom Server**: `ai/simple_vllm_server.py`
- Uses `AsyncLLMEngine` directly to bypass multiprocessing issues
- OpenAI-compatible API endpoints:
  - `GET /v1/models` - List available models
  - `POST /v1/completions` - Text completion
  - `POST /v1/chat/completions` - Chat completion
- Default model: Qwen/Qwen2.5-7B-Instruct
- Cache directory: `/workspace/huggingface_cache`

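Since the server mirrors the OpenAI schema, a quick smoke test of the chat endpoint from the GPU server itself can look like this (illustrative; adjust the model name if the server is started with a different one):

```bash
# Assumes the custom server is listening on localhost:8000 as described above
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    "max_tokens": 32
  }'
```
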
**Deployment Command**:
|
||||
```bash
|
||||
# Copy server script to GPU server
|
||||
scp ai/simple_vllm_server.py gpu-pivoine:/workspace/
|
||||
|
||||
# Start server
|
||||
ssh gpu-pivoine "cd /workspace && nohup python3 simple_vllm_server.py > vllm.log 2>&1 &"
|
||||
|
||||
# Check status
|
||||
ssh gpu-pivoine "curl http://localhost:8000/v1/models"
|
||||
```
|
||||
|
||||
**Server Configuration** (environment variables):
|
||||
- `VLLM_HOST`: 0.0.0.0 (default)
|
||||
- `VLLM_PORT`: 8000 (default)
|
||||
|
||||
### Model Configuration
|
||||
- **Model**: Qwen/Qwen2.5-7B-Instruct (no auth required)
|
||||
- **Context Length**: 4096 tokens
|
||||
- **GPU Memory**: 85% utilization
|
||||
- **Tensor Parallel**: 1 (single GPU)
|
||||
|
||||
### Known Issues & Solutions
|
||||
|
||||
#### Issue 1: vLLM Multiprocessing Errors
|
||||
**Problem**: Default vLLM v1 engine fails with ZMQ/CUDA multiprocessing errors on RunPod.
|
||||
**Solution**: Custom `AsyncLLMEngine` FastAPI server bypasses multiprocessing layer entirely.
|
||||
|
||||
#### Issue 2: Disk Space (Solved)
|
||||
**Problem**: Original 20GB disk filled up with Hugging Face cache.
|
||||
**Solution**: Expanded to 50GB and use `/workspace` for model cache.
|
||||
|
||||
#### Issue 3: Gated Models
|
||||
**Problem**: Llama models require Hugging Face authentication.
|
||||
**Solution**: Use Qwen 2.5 7B Instruct (no auth required) or set `HF_TOKEN` environment variable.
|
||||
|
||||
#### Issue 4: Spot Instance Volatility
|
||||
**Problem**: RunPod Spot instances can be terminated anytime.
|
||||
**Solution**: Accept as trade-off for cost savings. Document SSH details for quick reconnection.
|
||||
|
||||
### Monitoring
|
||||
|
||||
**Check vLLM logs**:
|
||||
```bash
|
||||
ssh gpu-pivoine "tail -f /workspace/vllm.log"
|
||||
```
|
||||
|
||||
**Check GPU usage**:
|
||||
```bash
|
||||
ssh gpu-pivoine "nvidia-smi"
|
||||
```
|
||||
|
||||
**Check Tailscale status**:
|
||||
```bash
|
||||
ssh gpu-pivoine "tailscale status"
|
||||
```
|
||||
|
||||
**Test API locally (on GPU server)**:
|
||||
```bash
|
||||
ssh gpu-pivoine "curl http://localhost:8000/v1/models"
|
||||
```
|
||||
|
||||
**Test API via Tailscale (from VPS)**:
|
||||
```bash
|
||||
curl http://100.100.108.13:8000/v1/models
|
||||
```
|
||||
|
||||
### LiteLLM Integration

Update VPS LiteLLM config at `ai/litellm-config-gpu.yaml`:

```yaml
# Replace old WireGuard IP (10.8.0.2) with Tailscale IP
- model_name: qwen-2.5-7b
  litellm_params:
    model: openai/qwen-2.5-7b
    api_base: http://100.121.199.88:8000/v1  # Tailscale IP
    api_key: dummy
    rpm: 1000
    tpm: 100000
```

Restart LiteLLM:
```bash
arty restart litellm
```

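After the restart, an end-to-end check from the VPS can go through LiteLLM itself (a sketch: it assumes LiteLLM listens on port 4000 as noted in the network reference and that a master key is configured; drop the header if no key is set):

```bash
# Assumes LiteLLM on the VPS at port 4000 and the qwen-2.5-7b alias from the config above
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -d '{
    "model": "qwen-2.5-7b",
    "messages": [{"role": "user", "content": "Reply with OK if you can hear me."}]
  }'
```
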
### Troubleshooting
|
||||
|
||||
**Server not responding**:
|
||||
1. Check if process is running: `pgrep -f simple_vllm_server`
|
||||
2. Check logs: `tail -100 /workspace/vllm.log`
|
||||
3. Check GPU availability: `nvidia-smi`
|
||||
4. Restart server: `pkill -f simple_vllm_server && python3 /workspace/simple_vllm_server.py &`
|
||||
|
||||
**Tailscale not connected**:
|
||||
1. Check status: `tailscale status`
|
||||
2. Check daemon: `ps aux | grep tailscaled`
|
||||
3. Restart: `tailscale down && tailscale up`
|
||||
|
||||
**Model download failing**:
|
||||
1. Check disk space: `df -h`
|
||||
2. Check cache directory: `ls -lah /workspace/huggingface_cache`
|
||||
3. Clear cache if needed: `rm -rf /workspace/huggingface_cache/*`
|
||||
|
||||
### Next Steps
|
||||
1. ✅ Deploy vLLM with Qwen 2.5 7B
|
||||
2. ⏳ Test API endpoints locally and via Tailscale
|
||||
3. ⏳ Update VPS LiteLLM configuration
|
||||
4. ⏳ Test end-to-end: Open WebUI → LiteLLM → vLLM
|
||||
5. ⏹️ Monitor performance and costs
|
||||
6. ⏹️ Consider adding more models (Mistral, DeepSeek Coder)
|
||||
7. ⏹️ Set up auto-stop for idle periods to save costs
|
||||
|
||||
### Cost Optimization Ideas
|
||||
1. **Auto-stop**: Configure RunPod to auto-stop after 30 minutes idle
|
||||
2. **Spot Instances**: Already using Spot for 50% cost reduction
|
||||
3. **Scheduled Operation**: Run only during business hours (8 hours/day = $120/month)
|
||||
4. **Smaller Models**: Use Mistral 7B or quantized models for lighter workloads
|
||||
5. **Pay-as-you-go**: Manually start/stop pod as needed
|
||||
|
||||
### Performance Benchmarks
|
||||
*To be measured after deployment*
|
||||
|
||||
Expected (based on RTX 4090):
|
||||
- Qwen 2.5 7B: 50-80 tokens/second
|
||||
- Context processing: ~2-3 seconds for 1000 tokens
|
||||
- First token latency: ~200-300ms
|
||||
File diff suppressed because it is too large
@@ -1,444 +0,0 @@
|
||||
# GPU-Enhanced AI Stack - Implementation Guide
|
||||
|
||||
Welcome to your GPU expansion setup! This directory contains everything you need to deploy a production-ready GPU server for LLM hosting, image generation, and model training.
|
||||
|
||||
## 📚 Documentation Files
|
||||
|
||||
### Planning & Architecture
|
||||
- **`GPU_EXPANSION_PLAN.md`** - Complete 70-page plan with provider comparison, architecture, and roadmap
|
||||
- **`README_GPU_SETUP.md`** - This file
|
||||
|
||||
### Step-by-Step Setup Guides
|
||||
1. **`SETUP_GUIDE.md`** - Day 1-2: RunPod account & GPU server deployment
|
||||
2. **`WIREGUARD_SETUP.md`** - Day 3-4: VPN connection between VPS and GPU server
|
||||
3. **`DOCKER_GPU_SETUP.md`** - Day 5: Docker + NVIDIA Container Toolkit configuration
|
||||
|
||||
### Configuration Files
|
||||
- **`gpu-server-compose.yaml`** - Production Docker Compose for GPU server
|
||||
- **`litellm-config-gpu.yaml`** - Updated LiteLLM config with self-hosted models
|
||||
- **`deploy-gpu-stack.sh`** - Automated deployment script
|
||||
|
||||
---
|
||||
|
||||
## 🚀 Quick Start (Week 1 Checklist)
|
||||
|
||||
### Day 1-2: RunPod & GPU Server ✓
|
||||
- [ ] Create RunPod account at https://www.runpod.io/
|
||||
- [ ] Add billing method ($50 initial credit recommended)
|
||||
- [ ] Deploy RTX 4090 pod with PyTorch template
|
||||
- [ ] Configure 500GB network volume
|
||||
- [ ] Verify SSH access
|
||||
- [ ] Test GPU with `nvidia-smi`
|
||||
- [ ] **Guide:** `SETUP_GUIDE.md`
|
||||
|
||||
### Day 3-4: Network Configuration ✓
|
||||
- [ ] Install Tailscale on VPS
|
||||
- [ ] Install Tailscale on GPU server
|
||||
- [ ] Authenticate both devices
|
||||
- [ ] Test VPN connectivity
|
||||
- [ ] Configure firewall rules
|
||||
- [ ] Verify VPS can reach GPU server
|
||||
- [ ] **Guide:** `TAILSCALE_SETUP.md`
|
||||
|
||||
### Day 5: Docker & GPU Setup ✓
|
||||
- [ ] Install Docker on GPU server
|
||||
- [ ] Install NVIDIA Container Toolkit
|
||||
- [ ] Test GPU access in containers
|
||||
- [ ] Create /workspace/gpu-stack directory
|
||||
- [ ] Copy configuration files
|
||||
- [ ] **Guide:** `DOCKER_GPU_SETUP.md`
|
||||
|
||||
### Day 6-7: Deploy Services ✓
|
||||
- [ ] Copy `gpu-server-compose.yaml` to GPU server
|
||||
- [ ] Edit `.env` with your settings
|
||||
- [ ] Run `./deploy-gpu-stack.sh`
|
||||
- [ ] Wait for vLLM to load model (~5 minutes)
|
||||
- [ ] Test vLLM: `curl http://localhost:8000/v1/models`
|
||||
- [ ] Access ComfyUI: `http://[tailscale-ip]:8188`
|
||||
- [ ] **Script:** `deploy-gpu-stack.sh`
|
||||
|
||||
---
|
||||
|
||||
## 📦 Services Included
|
||||
|
||||
### vLLM (http://[tailscale-ip]:8000)
|
||||
**Purpose:** High-performance LLM inference
|
||||
**Default Model:** Llama 3.1 8B Instruct
|
||||
**Performance:** 50-80 tokens/second on RTX 4090
|
||||
**Use for:** General chat, Q&A, code generation, summarization
|
||||
|
||||
**Switch models:**
|
||||
Edit `gpu-server-compose.yaml`, change `--model` parameter, restart:
|
||||
```bash
|
||||
docker compose restart vllm
|
||||
```
|
||||
|
||||
### ComfyUI (http://[tailscale-ip]:8188)
|
||||
**Purpose:** Advanced Stable Diffusion interface
|
||||
**Features:** FLUX, SDXL, ControlNet, LoRA
|
||||
**Use for:** Image generation, img2img, inpainting
|
||||
|
||||
**Download models:**
|
||||
Access web UI → ComfyUI Manager → Install Models
|
||||
|
||||
### JupyterLab (http://[tailscale-ip]:8888)
|
||||
**Purpose:** Interactive development environment
|
||||
**Token:** `pivoine-ai-2025` (change in `.env`)
|
||||
**Use for:** Research, experimentation, custom training scripts
|
||||
|
||||
### Axolotl (Training - on-demand)
|
||||
**Purpose:** LLM fine-tuning framework
|
||||
**Start:** `docker compose --profile training up -d axolotl`
|
||||
**Use for:** LoRA training, full fine-tuning, RLHF
|
||||
|
||||
### Netdata (http://[tailscale-ip]:19999)
|
||||
**Purpose:** System & GPU monitoring
|
||||
**Features:** Real-time metrics, GPU utilization, memory usage
|
||||
**Use for:** Performance monitoring, troubleshooting
|
||||
|
||||
---
|
||||
|
||||
## 🔧 Configuration
|
||||
|
||||
### Environment Variables (.env)
|
||||
|
||||
```bash
|
||||
# VPN Network (Tailscale)
|
||||
VPS_IP=100.x.x.x # Your VPS Tailscale IP (get with: tailscale ip -4)
|
||||
GPU_IP=100.x.x.x # GPU server Tailscale IP (get with: tailscale ip -4)
|
||||
|
||||
# Model Storage
|
||||
MODELS_PATH=/workspace/models
|
||||
|
||||
# Hugging Face Token (for gated models like Llama)
|
||||
HF_TOKEN=hf_xxxxxxxxxxxxx
|
||||
|
||||
# Weights & Biases (for training logging)
|
||||
WANDB_API_KEY=
|
||||
|
||||
# JupyterLab Access
|
||||
JUPYTER_TOKEN=pivoine-ai-2025
|
||||
|
||||
# PostgreSQL (on VPS)
|
||||
DB_HOST=100.x.x.x # Your VPS Tailscale IP
|
||||
DB_PORT=5432
|
||||
DB_USER=valknar
|
||||
DB_PASSWORD=ragnarok98
|
||||
DB_NAME=openwebui
|
||||
```
|
||||
|
||||
### Updating LiteLLM on VPS
|
||||
|
||||
After GPU server is running, update your VPS LiteLLM config:
|
||||
|
||||
```bash
|
||||
# On VPS
|
||||
cd ~/Projects/docker-compose/ai
|
||||
|
||||
# Backup current config
|
||||
cp litellm-config.yaml litellm-config.yaml.backup
|
||||
|
||||
# Copy new config with GPU models
|
||||
cp litellm-config-gpu.yaml litellm-config.yaml
|
||||
|
||||
# Restart LiteLLM
|
||||
arty restart litellm
|
||||
```
|
||||
|
||||
Now Open WebUI will have access to both Claude (API) and Llama (self-hosted)!
|
||||
|
||||
---
|
||||
|
||||
## 💰 Cost Management
|
||||
|
||||
### Current Costs (24/7 Operation)
|
||||
- **GPU Server:** RTX 4090 @ $0.50/hour = $360/month
|
||||
- **Storage:** 500GB network volume = $50/month
|
||||
- **Total:** **$410/month**
|
||||
|
||||
### Cost-Saving Options
|
||||
|
||||
**1. Pay-as-you-go (8 hours/day)**
|
||||
- GPU: $0.50 × 8 × 30 = $120/month
|
||||
- Storage: $50/month
|
||||
- **Total: $170/month**
|
||||
|
||||
**2. Auto-stop idle pods**
|
||||
RunPod can auto-stop after X minutes idle:
|
||||
- Dashboard → Pod Settings → Auto-stop after 30 minutes
|
||||
|
||||
**3. Use smaller models**
|
||||
- Mistral 7B instead of Llama 8B: Faster, cheaper GPU
|
||||
- Quantized models: 4-bit = 1/4 the VRAM
|
||||
|
||||
**4. Batch image generation**
|
||||
- Generate multiple images at once
|
||||
- Use scheduled jobs (cron) during off-peak hours
|
||||
|
||||
### Cost Tracking
|
||||
|
||||
**Check GPU usage:**
|
||||
```bash
|
||||
# On RunPod dashboard
|
||||
Billing → Usage History
|
||||
|
||||
# See hourly costs, total spent
|
||||
```
|
||||
|
||||
**Check API vs GPU savings:**
|
||||
```bash
|
||||
# On VPS, check LiteLLM logs
|
||||
docker logs ai_litellm | grep "model="
|
||||
|
||||
# Count requests to llama-3.1-8b vs claude-*
|
||||
```
|
||||
|
||||
**Expected savings:**
|
||||
- 80% of requests → self-hosted = $0 cost
|
||||
- 20% of requests → Claude = API cost
|
||||
- Break-even if currently spending >$500/month on APIs
|
||||
|
||||
---
|
||||
|
||||
## 🔍 Monitoring & Troubleshooting
|
||||
|
||||
### Check Service Status
|
||||
|
||||
```bash
|
||||
# On GPU server
|
||||
cd /workspace/gpu-stack
|
||||
|
||||
# View all services
|
||||
docker compose ps
|
||||
|
||||
# Check specific service logs
|
||||
docker compose logs -f vllm
|
||||
docker compose logs -f comfyui
|
||||
docker compose logs -f jupyter
|
||||
|
||||
# Check GPU usage
|
||||
nvidia-smi
|
||||
# or prettier:
|
||||
nvtop
|
||||
```
|
||||
|
||||
### Common Issues
|
||||
|
||||
**vLLM not loading model:**
|
||||
```bash
|
||||
# Check logs
|
||||
docker compose logs vllm
|
||||
|
||||
# Common causes:
|
||||
# - Model download in progress (wait 5-10 minutes)
|
||||
# - Out of VRAM (try smaller model)
|
||||
# - Missing HF_TOKEN (for gated models like Llama)
|
||||
```
|
||||
|
||||
**ComfyUI slow/crashing:**
|
||||
```bash
|
||||
# Check GPU memory
|
||||
nvidia-smi
|
||||
|
||||
# If VRAM full:
|
||||
# - Close vLLM temporarily
|
||||
# - Use smaller models
|
||||
# - Reduce batch size in ComfyUI
|
||||
```
|
||||
|
||||
**Can't access from VPS:**
|
||||
```bash
|
||||
# Test VPN
|
||||
ping [tailscale-ip]
|
||||
|
||||
# If fails:
|
||||
# - Check Tailscale status: tailscale status
|
||||
# - Restart Tailscale: tailscale down && tailscale up
|
||||
# - Check firewall: ufw status
|
||||
```
|
||||
|
||||
**Docker can't see GPU:**
|
||||
```bash
|
||||
# Test GPU access
|
||||
docker run --rm --runtime=nvidia --gpus all nvidia/cuda:12.1.0-base nvidia-smi
|
||||
|
||||
# If fails:
|
||||
# - Check NVIDIA driver: nvidia-smi
|
||||
# - Check nvidia-docker: nvidia-ctk --version
|
||||
# - Restart Docker: systemctl restart docker
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 📊 Performance Benchmarks
|
||||
|
||||
### Expected Performance (RTX 4090)
|
||||
|
||||
**LLM Inference (vLLM):**
|
||||
- Llama 3.1 8B: 50-80 tokens/second
|
||||
- Qwen 2.5 14B: 30-50 tokens/second
|
||||
- Batch size 32: ~1500 tokens/second
|
||||
|
||||
**Image Generation (ComfyUI):**
|
||||
- SDXL (1024×1024): ~4-6 seconds
|
||||
- FLUX (1024×1024): ~8-12 seconds
|
||||
- SD 1.5 (512×512): ~1-2 seconds
|
||||
|
||||
**Training (Axolotl):**
|
||||
- LoRA fine-tuning (8B model): ~3-5 hours for 3 epochs
|
||||
- Full fine-tuning: Not recommended on 24GB VRAM
|
||||
|
||||
---
|
||||
|
||||
## 🔐 Security Best Practices
|
||||
|
||||
### Network Security
|
||||
✅ All services behind Tailscale VPN (end-to-end encrypted)
|
||||
✅ No public exposure (except RunPod's SSH)
|
||||
✅ Firewall configured (no additional ports needed)
|
||||
|
||||
### Access Control
|
||||
✅ JupyterLab password-protected
|
||||
✅ ComfyUI accessible via VPN only
|
||||
✅ vLLM internal API (no auth needed)
|
||||
|
||||
### SSH Security
|
||||
```bash
|
||||
# On GPU server, harden SSH
|
||||
nano /etc/ssh/sshd_config
|
||||
|
||||
# Set:
|
||||
PermitRootLogin prohibit-password
|
||||
PasswordAuthentication no
|
||||
PubkeyAuthentication yes
|
||||
|
||||
systemctl restart sshd
|
||||
```
|
||||
|
||||
### Regular Updates
|
||||
```bash
|
||||
# Weekly updates
|
||||
apt update && apt upgrade -y
|
||||
|
||||
# Update Docker images
|
||||
docker compose pull
|
||||
docker compose up -d
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 📈 Scaling Up
|
||||
|
||||
### When to Add More GPUs
|
||||
|
||||
**Current limitations (1× RTX 4090):**
|
||||
- Can run ONE of these at a time:
|
||||
- 8B LLM at full speed
|
||||
- 14B LLM at moderate speed
|
||||
- SDXL image generation
|
||||
- Training job
|
||||
|
||||
**Add 2nd GPU if:**
|
||||
- You want LLM + image gen simultaneously
|
||||
- Training + inference at same time
|
||||
- Multiple users with high demand
|
||||
|
||||
**Multi-GPU options:**
|
||||
- 2× RTX 4090: Run vLLM + ComfyUI separately ($720/month)
|
||||
- 1× A100 40GB: Larger models (70B with quantization) ($1,080/month)
|
||||
- Mix: RTX 4090 (inference) + A100 (training) (~$1,300/month)
|
||||
|
||||
### Deploying Larger Models
|
||||
|
||||
**70B models (need 2× A100 or 4× RTX 4090):**
|
||||
```yaml
|
||||
# In gpu-server-compose.yaml
|
||||
vllm:
|
||||
command:
|
||||
- --model
|
||||
- meta-llama/Meta-Llama-3.1-70B-Instruct
|
||||
- --tensor-parallel-size
|
||||
- "2" # Split across 2 GPUs
|
||||
deploy:
|
||||
resources:
|
||||
reservations:
|
||||
devices:
|
||||
- driver: nvidia
|
||||
count: 2 # Use 2 GPUs
|
||||
capabilities: [gpu]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🎯 Next Steps (Week 2+)
|
||||
|
||||
### Week 2: LLM Production Deployment
|
||||
- [ ] Test Llama 3.1 8B performance
|
||||
- [ ] Download additional models (Qwen, Mistral)
|
||||
- [ ] Configure model routing in LiteLLM
|
||||
- [ ] Set up usage monitoring
|
||||
- [ ] Benchmark tokens/second for each model
|
||||
|
||||
### Week 3: Image Generation
|
||||
- [ ] Download FLUX and SDXL models
|
||||
- [ ] Install ComfyUI Manager
|
||||
- [ ] Download ControlNet models
|
||||
- [ ] Create sample workflows
|
||||
- [ ] Test API integration with Open WebUI
|
||||
|
||||
### Week 4: Training Infrastructure
|
||||
- [ ] Prepare a sample dataset
|
||||
- [ ] Test LoRA fine-tuning with Axolotl
|
||||
- [ ] Set up Weights & Biases logging
|
||||
- [ ] Create training documentation
|
||||
- [ ] Benchmark training speed
|
||||
|
||||
---
|
||||
|
||||
## 🆘 Getting Help
|
||||
|
||||
### Resources
|
||||
- **RunPod Docs:** https://docs.runpod.io/
|
||||
- **vLLM Docs:** https://docs.vllm.ai/
|
||||
- **ComfyUI Wiki:** https://github.com/comfyanonymous/ComfyUI/wiki
|
||||
- **Axolotl Docs:** https://github.com/OpenAccess-AI-Collective/axolotl
|
||||
|
||||
### Community
|
||||
- **RunPod Discord:** https://discord.gg/runpod
|
||||
- **vLLM Discord:** https://discord.gg/vllm
|
||||
- **r/LocalLLaMA:** https://reddit.com/r/LocalLLaMA
|
||||
|
||||
### Support
|
||||
If you encounter issues:
|
||||
1. Check logs: `docker compose logs -f [service]`
|
||||
2. Check GPU: `nvidia-smi`
|
||||
3. Check VPN: `wg show`
|
||||
4. Restart service: `docker compose restart [service]`
|
||||
5. Full restart: `docker compose down && docker compose up -d`
|
||||
|
||||
---
|
||||
|
||||
## ✅ Success Criteria
|
||||
|
||||
You're ready to proceed when:
|
||||
- [ ] GPU server responds to `ping [tailscale-ip]` from VPS
|
||||
- [ ] vLLM returns models: `curl http://[tailscale-ip]:8000/v1/models`
|
||||
- [ ] ComfyUI web interface loads: `http://[tailscale-ip]:8188`
|
||||
- [ ] JupyterLab accessible with token
|
||||
- [ ] Netdata shows GPU metrics
|
||||
- [ ] Open WebUI shows both Claude and Llama models
|
||||
|
||||
**Total setup time:** 4-6 hours (if following guides sequentially)
|
||||
|
||||
---
|
||||
|
||||
## 🎉 You're All Set!
|
||||
|
||||
Your GPU-enhanced AI stack is ready. You now have:
|
||||
- ✅ Self-hosted LLM inference (saves $$$)
|
||||
- ✅ Advanced image generation (FLUX, SDXL)
|
||||
- ✅ Model training capabilities (LoRA, fine-tuning)
|
||||
- ✅ Secure VPN connection
|
||||
- ✅ Full monitoring and logging
|
||||
|
||||
Enjoy building with your new AI infrastructure! 🚀
|
||||
@@ -1,261 +0,0 @@
|
||||
# GPU Server Setup Guide - Week 1
|
||||
|
||||
## Day 1-2: RunPod Account & GPU Server
|
||||
|
||||
### Step 1: Create RunPod Account
|
||||
|
||||
1. **Go to RunPod**: https://www.runpod.io/
|
||||
2. **Sign up** with email or GitHub
|
||||
3. **Add billing method**:
|
||||
- Credit card required
|
||||
- No charges until you deploy a pod
|
||||
- Recommended: Add $50 initial credit
|
||||
|
||||
4. **Verify email** and complete account setup
|
||||
|
||||
### Step 2: Deploy Your First GPU Pod
|
||||
|
||||
#### 2.1 Navigate to Pods
|
||||
|
||||
1. Click **"Deploy"** in top menu
|
||||
2. Select **"GPU Pods"**
|
||||
|
||||
#### 2.2 Choose GPU Type
|
||||
|
||||
**Recommended: RTX 4090**
|
||||
- 24GB VRAM
|
||||
- ~$0.50/hour
|
||||
- Perfect for LLMs up to 14B params
|
||||
- Great for SDXL/FLUX
|
||||
|
||||
**Filter options:**
|
||||
- GPU Type: RTX 4090
|
||||
- GPU Count: 1
|
||||
- Sort by: Price (lowest first)
|
||||
- Region: Europe (lower latency to Germany)
|
||||
|
||||
#### 2.3 Select Template
|
||||
|
||||
Choose: **"RunPod PyTorch"** template
|
||||
- Includes: CUDA, PyTorch, Python
|
||||
- Pre-configured for GPU workloads
|
||||
- Docker pre-installed
|
||||
|
||||
**Alternative**: "Ubuntu 22.04 with CUDA 12.1" (more control)
|
||||
|
||||
#### 2.4 Configure Pod
|
||||
|
||||
**Container Settings:**
|
||||
- **Container Disk**: 50GB (temporary, auto-included)
|
||||
- **Expose Ports**:
|
||||
- Add: 22 (SSH)
|
||||
- Add: 8000 (vLLM)
|
||||
- Add: 8188 (ComfyUI)
|
||||
- Add: 8888 (JupyterLab)
|
||||
|
||||
**Volume Settings:**
|
||||
- Click **"+ Network Volume"**
|
||||
- **Name**: `gpu-models-storage`
|
||||
- **Size**: 500GB
|
||||
- **Region**: Same as pod
|
||||
- **Cost**: ~$50/month
|
||||
|
||||
**Environment Variables:**
|
||||
- Add later (not needed for initial setup)
|
||||
|
||||
#### 2.5 Deploy Pod
|
||||
|
||||
1. Review configuration
|
||||
2. Click **"Deploy On-Demand"** (not Spot for reliability)
|
||||
3. Wait 2-3 minutes for deployment
|
||||
|
||||
**Expected cost:**
|
||||
- GPU: $0.50/hour = $360/month (24/7)
|
||||
- Storage: $50/month
|
||||
- **Total: $410/month**
|
||||
|
||||
### Step 3: Access Your GPU Server
|
||||
|
||||
#### 3.1 Get Connection Info
|
||||
|
||||
Once deployed, you'll see:
|
||||
- **Pod ID**: e.g., `abc123def456`
|
||||
- **SSH Command**: `ssh root@<pod-id>.runpod.io -p 12345`
|
||||
- **Public IP**: May not be directly accessible (use SSH)
|
||||
|
||||
#### 3.2 SSH Access
|
||||
|
||||
RunPod automatically generates SSH keys for you:
|
||||
|
||||
```bash
|
||||
# Copy the SSH command from RunPod dashboard
|
||||
ssh root@abc123def456.runpod.io -p 12345
|
||||
|
||||
# First time: Accept fingerprint
|
||||
# You should now be in the GPU server!
|
||||
```
|
||||
|
||||
**Verify GPU:**
|
||||
```bash
|
||||
nvidia-smi
|
||||
```
|
||||
|
||||
Expected output:
|
||||
```
|
||||
+-----------------------------------------------------------------------------+
|
||||
| NVIDIA-SMI 535.xx Driver Version: 535.xx CUDA Version: 12.1 |
|
||||
|-------------------------------+----------------------+----------------------+
|
||||
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
|
||||
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|
||||
|===============================+======================+======================|
|
||||
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 Off | N/A |
|
||||
| 30% 45C P0 50W / 450W | 0MiB / 24564MiB | 0% Default |
|
||||
+-------------------------------+----------------------+----------------------+
|
||||
```
|
||||
|
||||
### Step 4: Initial Server Configuration
|
||||
|
||||
#### 4.1 Update System
|
||||
|
||||
```bash
|
||||
# Update package lists
|
||||
apt update
|
||||
|
||||
# Upgrade existing packages
|
||||
apt upgrade -y
|
||||
|
||||
# Install essential tools
|
||||
apt install -y \
|
||||
vim \
|
||||
htop \
|
||||
tmux \
|
||||
curl \
|
||||
wget \
|
||||
git \
|
||||
net-tools \
|
||||
iptables-persistent
|
||||
```
|
||||
|
||||
#### 4.2 Set Timezone
|
||||
|
||||
```bash
|
||||
timedatectl set-timezone Europe/Berlin
|
||||
date # Verify
|
||||
```
|
||||
|
||||
#### 4.3 Create Working Directory
|
||||
|
||||
```bash
|
||||
# Create workspace
|
||||
mkdir -p /workspace/{models,configs,data,scripts}
|
||||
|
||||
# Check network volume mount
|
||||
ls -la /workspace
|
||||
# Should show your 500GB volume
|
||||
```
|
||||
|
||||
#### 4.4 Configure SSH (Optional but Recommended)
|
||||
|
||||
**Generate your own SSH key on your local machine:**
|
||||
|
||||
```bash
|
||||
# On your local machine (not GPU server)
|
||||
ssh-keygen -t ed25519 -C "gpu-server-pivoine" -f ~/.ssh/gpu_pivoine
|
||||
|
||||
# Copy public key to GPU server
|
||||
ssh-copy-id -i ~/.ssh/gpu_pivoine.pub root@abc123def456.runpod.io -p 12345
|
||||
```
|
||||
|
||||
**Add to your local ~/.ssh/config:**
|
||||
|
||||
```bash
|
||||
Host gpu-pivoine
|
||||
HostName abc123def456.runpod.io
|
||||
Port 12345
|
||||
User root
|
||||
IdentityFile ~/.ssh/gpu_pivoine
|
||||
```
|
||||
|
||||
Now you can connect with: `ssh gpu-pivoine`
|
||||
|
||||
### Step 5: Verify GPU Access
|
||||
|
||||
Run this test:
|
||||
|
||||
```bash
|
||||
# Test CUDA
|
||||
python3 -c "import torch; print('CUDA available:', torch.cuda.is_available()); print('GPU count:', torch.cuda.device_count())"
|
||||
```
|
||||
|
||||
Expected output:
|
||||
```
|
||||
CUDA available: True
|
||||
GPU count: 1
|
||||
```
|
||||
|
||||
### Troubleshooting
|
||||
|
||||
**Problem: Can't connect via SSH**
|
||||
- Check pod is running (not stopped)
|
||||
- Verify port number in SSH command
|
||||
- Try web terminal in RunPod dashboard
|
||||
|
||||
**Problem: GPU not detected**
|
||||
- Run `nvidia-smi`
|
||||
- Check RunPod selected correct GPU type
|
||||
- Restart pod if needed
|
||||
|
||||
**Problem: Network volume not mounted**
|
||||
- Check RunPod dashboard → Volume tab
|
||||
- Verify volume is attached to pod
|
||||
- Try: `df -h` to see mounts
|
||||
|
||||
### Next Steps
|
||||
|
||||
Once SSH access works and GPU is verified:
|
||||
✅ Proceed to **Day 3-4: Network Configuration (Tailscale VPN)**
|
||||
|
||||
### Save Important Info
|
||||
|
||||
Create a file to track your setup:
|
||||
|
||||
```bash
|
||||
# On GPU server
|
||||
cat > /workspace/SERVER_INFO.md << 'EOF'
|
||||
# GPU Server Information
|
||||
|
||||
## Connection
|
||||
- SSH: ssh root@abc123def456.runpod.io -p 12345
|
||||
- Pod ID: abc123def456
|
||||
- Region: [YOUR_REGION]
|
||||
|
||||
## Hardware
|
||||
- GPU: RTX 4090 24GB
|
||||
- CPU: [Check with: lscpu]
|
||||
- RAM: [Check with: free -h]
|
||||
- Storage: 500GB network volume at /workspace
|
||||
|
||||
## Costs
|
||||
- GPU: $0.50/hour
|
||||
- Storage: $50/month
|
||||
- Total: ~$410/month (24/7)
|
||||
|
||||
## Deployed: [DATE]
|
||||
EOF
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Checkpoint ✓
|
||||
|
||||
Before moving to Day 3, verify:
|
||||
- [ ] RunPod account created and billing added
|
||||
- [ ] RTX 4090 pod deployed successfully
|
||||
- [ ] 500GB network volume attached
|
||||
- [ ] SSH access working
|
||||
- [ ] `nvidia-smi` shows GPU
|
||||
- [ ] `torch.cuda.is_available()` returns True
|
||||
- [ ] Timezone set to Europe/Berlin
|
||||
- [ ] Essential tools installed
|
||||
|
||||
**Ready for Tailscale setup? Let's go!**
|
||||
@@ -1,417 +0,0 @@
|
||||
# Tailscale VPN Setup - Better Alternative to WireGuard
|
||||
|
||||
## Why Tailscale?
|
||||
|
||||
RunPod doesn't support UDP ports, which blocks WireGuard. Tailscale solves this by:
|
||||
- ✅ Works over HTTPS (TCP) - no UDP needed
|
||||
- ✅ Zero configuration - automatic setup
|
||||
- ✅ Free for personal use
|
||||
- ✅ Built on WireGuard (same security)
|
||||
- ✅ Automatic NAT traversal
|
||||
- ✅ Peer-to-peer when possible (low latency)
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Create Tailscale Account
|
||||
|
||||
1. Go to: https://tailscale.com/
|
||||
2. Click **"Get Started"**
|
||||
3. Sign up with **GitHub** or **Google** (easiest)
|
||||
4. You'll be redirected to the Tailscale admin console
|
||||
|
||||
**No credit card required!** Free tier is perfect for our use case.
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Install Tailscale on VPS
|
||||
|
||||
**SSH into your VPS:**
|
||||
|
||||
```bash
|
||||
ssh root@vps
|
||||
```
|
||||
|
||||
**Install Tailscale:**
|
||||
|
||||
```bash
|
||||
# Download and run install script
|
||||
curl -fsSL https://tailscale.com/install.sh | sh
|
||||
|
||||
# Start Tailscale
|
||||
tailscale up
|
||||
|
||||
# You'll see a URL like:
|
||||
# https://login.tailscale.com/a/xxxxxxxxxx
|
||||
```
|
||||
|
||||
**Authenticate:**
|
||||
1. Copy the URL and open in browser
|
||||
2. Click **"Connect"** to authorize the device
|
||||
3. Name it: `pivoine-vps`
|
||||
|
||||
**Check status:**
|
||||
```bash
|
||||
tailscale status
|
||||
```
|
||||
|
||||
You should see your VPS listed with an IP like `100.x.x.x`
|
||||
|
||||
**Save your VPS Tailscale IP:**
|
||||
```bash
|
||||
tailscale ip -4
|
||||
# Example output: 100.101.102.103
|
||||
```
|
||||
|
||||
**Write this down - you'll need it!**
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Install Tailscale on GPU Server
|
||||
|
||||
**SSH into your RunPod GPU server:**
|
||||
|
||||
```bash
|
||||
ssh root@abc123def456-12345678.runpod.io -p 12345
|
||||
```
|
||||
|
||||
**Install Tailscale:**
|
||||
|
||||
```bash
|
||||
# Download and run install script
|
||||
curl -fsSL https://tailscale.com/install.sh | sh
|
||||
|
||||
# Start Tailscale
|
||||
tailscale up --advertise-tags=tag:gpu
|
||||
|
||||
# You'll see another URL
|
||||
```
|
||||
|
||||
**Authenticate:**
|
||||
1. Copy the URL and open in browser
|
||||
2. Click **"Connect"**
|
||||
3. Name it: `gpu-runpod`
|
||||
|
||||
**Check status:**
|
||||
```bash
|
||||
tailscale status
|
||||
```
|
||||
|
||||
You should now see BOTH devices:
|
||||
- `pivoine-vps` - 100.x.x.x
|
||||
- `gpu-runpod` - 100.x.x.x
|
||||
|
||||
**Save your GPU server Tailscale IP:**
|
||||
```bash
|
||||
tailscale ip -4
|
||||
# Example output: 100.104.105.106
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 4: Test Connectivity
|
||||
|
||||
**From VPS, ping GPU server:**
|
||||
|
||||
```bash
|
||||
# SSH into VPS
|
||||
ssh root@vps
|
||||
|
||||
# Ping GPU server (use its Tailscale IP)
|
||||
ping 100.104.105.106 -c 4
|
||||
```
|
||||
|
||||
Expected output:
|
||||
```
|
||||
PING 100.104.105.106 (100.104.105.106) 56(84) bytes of data.
|
||||
64 bytes from 100.104.105.106: icmp_seq=1 ttl=64 time=15.3 ms
|
||||
64 bytes from 100.104.105.106: icmp_seq=2 ttl=64 time=14.8 ms
|
||||
...
|
||||
```
|
||||
|
||||
**From GPU server, ping VPS:**
|
||||
|
||||
```bash
|
||||
# SSH into GPU server
|
||||
ssh root@abc123def456-12345678.runpod.io -p 12345
|
||||
|
||||
# Ping VPS (use its Tailscale IP)
|
||||
ping 100.101.102.103 -c 4
|
||||
```
|
||||
|
||||
**Both should work!** ✅
|
||||
|
||||
---
|
||||
|
||||
## Step 5: Update Configuration Files
|
||||
|
||||
Now update the IP addresses in your configs to use Tailscale IPs.
|
||||
|
||||
### On GPU Server (.env file)
|
||||
|
||||
**Edit your .env file:**
|
||||
|
||||
```bash
|
||||
# On GPU server
|
||||
cd /workspace/gpu-stack
|
||||
|
||||
nano .env
|
||||
```
|
||||
|
||||
**Update these lines:**
|
||||
```bash
|
||||
# VPN Network (use your actual Tailscale IPs)
|
||||
VPS_IP=100.101.102.103 # Your VPS Tailscale IP
|
||||
GPU_IP=100.104.105.106 # Your GPU Tailscale IP
|
||||
|
||||
# PostgreSQL (on VPS)
|
||||
DB_HOST=100.101.102.103 # Your VPS Tailscale IP
|
||||
DB_PORT=5432
|
||||
```
|
||||
|
||||
Save and exit (Ctrl+X, Y, Enter)
|
||||
|
||||
### On VPS (LiteLLM config)
|
||||
|
||||
**Edit your LiteLLM config:**
|
||||
|
||||
```bash
|
||||
# On VPS
|
||||
ssh root@vps
|
||||
cd ~/Projects/docker-compose/ai
|
||||
|
||||
nano litellm-config-gpu.yaml
|
||||
```
|
||||
|
||||
**Update the GPU server IP:**
|
||||
|
||||
```yaml
|
||||
# Find this section and update IP:
|
||||
- model_name: llama-3.1-8b
|
||||
litellm_params:
|
||||
model: openai/meta-llama/Meta-Llama-3.1-8B-Instruct
|
||||
api_base: http://100.104.105.106:8000/v1 # Use GPU Tailscale IP
|
||||
api_key: dummy
|
||||
```
|
||||
|
||||
Save and exit.
|
||||
|
||||
---
|
||||
|
||||
## Step 6: Verify PostgreSQL Access
|
||||
|
||||
**From GPU server, test database connection:**
|
||||
|
||||
```bash
|
||||
# Install PostgreSQL client
|
||||
apt install -y postgresql-client
|
||||
|
||||
# Test connection (use your VPS Tailscale IP)
|
||||
psql -h 100.101.102.103 -U valknar -d openwebui -c "SELECT 1;"
|
||||
```
|
||||
|
||||
**If this fails, allow Tailscale network on VPS PostgreSQL:**
|
||||
|
||||
```bash
|
||||
# On VPS
|
||||
ssh root@vps
|
||||
|
||||
# Check if postgres allows Tailscale network
|
||||
docker exec core_postgres cat /var/lib/postgresql/data/pg_hba.conf | grep 100
|
||||
|
||||
# If not present, add it:
|
||||
docker exec -it core_postgres bash
|
||||
|
||||
# Inside container:
|
||||
echo "host all all 100.0.0.0/8 scram-sha-256" >> /var/lib/postgresql/data/pg_hba.conf
|
||||
|
||||
# Restart postgres
|
||||
exit
|
||||
docker restart core_postgres
|
||||
```
|
||||
|
||||
Try connecting again - should work now!
|
||||
|
||||
---
|
||||
|
||||
## Tailscale Management
|
||||
|
||||
### View Connected Devices
|
||||
|
||||
**Web dashboard:**
|
||||
https://login.tailscale.com/admin/machines
|
||||
|
||||
You'll see all your devices with their Tailscale IPs.
|
||||
|
||||
**Command line:**
|
||||
```bash
|
||||
tailscale status
|
||||
```
|
||||
|
||||
### Disconnect/Reconnect
|
||||
|
||||
```bash
|
||||
# Stop Tailscale
|
||||
tailscale down
|
||||
|
||||
# Start Tailscale
|
||||
tailscale up
|
||||
```

### Remove Device

From web dashboard:
1. Click on device
2. Click "..." menu
3. Select "Disable" or "Delete"

---

## Advantages Over WireGuard

✅ **Works anywhere** - No inbound UDP port forwarding needed
✅ **Auto-reconnect** - Survives network changes
✅ **Multiple devices** - Easy to add laptop, phone, etc.
✅ **NAT traversal** - Direct peer-to-peer when possible
✅ **Access Control** - Manage from web dashboard
✅ **Monitoring** - See connection status in real-time

---

## Security Notes

🔒 **Tailscale is secure:**
- End-to-end encrypted (WireGuard)
- Zero-trust architecture
- Tailscale's coordination servers never see your traffic
- Only authenticated devices can connect

🔒 **Access control:**
- Only devices you authorize can join
- Revoke access anytime from dashboard
- Set ACLs for fine-grained control

---

## Network Reference (Updated)

**Old (WireGuard):**
- VPS: `10.8.0.1`
- GPU: `10.8.0.2`

**New (Tailscale):**
- VPS: `100.101.102.103` (example - use your actual IP)
- GPU: `100.104.105.106` (example - use your actual IP)

**All services now accessible via Tailscale:**

**From VPS to GPU:**
- vLLM: `http://100.104.105.106:8000`
- ComfyUI: `http://100.104.105.106:8188`
- JupyterLab: `http://100.104.105.106:8888`
- Netdata: `http://100.104.105.106:19999`

**From GPU to VPS:**
- PostgreSQL: `100.101.102.103:5432`
- Redis: `100.101.102.103:6379`
- LiteLLM: `http://100.101.102.103:4000`
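
Ping only proves the host is up; to confirm the service ports themselves are reachable over Tailscale, a quick sweep from the GPU server helps. A sketch using the example VPS IP above, assuming `nc` (netcat) is installed:

```bash
# Check TCP reachability of VPS services over Tailscale.
VPS=100.101.102.103   # replace with your actual VPS Tailscale IP
for port in 5432 6379 4000; do
  nc -zv -w 3 "$VPS" "$port"
done
```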

---

## Troubleshooting

### Can't ping between devices

**Check Tailscale status:**
```bash
tailscale status
```

Both devices should show "active" or "online".

**Check connectivity:**
```bash
tailscale ping 100.104.105.106
```

**Restart Tailscale:**
```bash
tailscale down && tailscale up
```

### PostgreSQL connection refused

**Check if postgres is listening on all interfaces:**
```bash
# On VPS
docker exec core_postgres cat /var/lib/postgresql/data/postgresql.conf | grep listen_addresses
```

Should show: `listen_addresses = '*'`
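
If it does not, you can set it without editing the file by hand: `ALTER SYSTEM` writes the override to `postgresql.auto.conf`, and `listen_addresses` only takes effect after a restart. A sketch with the same container, assuming `valknar` has superuser rights:

```bash
docker exec core_postgres psql -U valknar -d postgres \
  -c "ALTER SYSTEM SET listen_addresses = '*';"
docker restart core_postgres
```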

**Check pg_hba.conf allows Tailscale network:**
```bash
docker exec core_postgres cat /var/lib/postgresql/data/pg_hba.conf | grep 100
```

Should have line:
```
host all all 100.0.0.0/8 scram-sha-256
```

### Device not showing in network

**Re-authenticate:**
```bash
tailscale logout
tailscale up
# Click the new URL to re-authenticate
```

---

## Verification Checklist

Before proceeding:
- [ ] Tailscale account created
- [ ] Tailscale installed on VPS
- [ ] Tailscale installed on GPU server
- [ ] Both devices visible in `tailscale status`
- [ ] VPS can ping GPU server (via Tailscale IP)
- [ ] GPU server can ping VPS (via Tailscale IP)
- [ ] PostgreSQL accessible from GPU server
- [ ] .env file updated with Tailscale IPs
- [ ] LiteLLM config updated with GPU Tailscale IP

---

## Next Steps

✅ **Network configured!** Proceed to Docker & GPU setup:

```bash
cat /home/valknar/Projects/docker-compose/ai/DOCKER_GPU_SETUP.md
```

**Your Tailscale IPs (save these!):**
- VPS: `__________________` (from `tailscale ip -4` on VPS)
- GPU: `__________________` (from `tailscale ip -4` on GPU server)

---

## Bonus: Add Your Local Machine

Want to access GPU server from your laptop?

```bash
# On your local machine
curl -fsSL https://tailscale.com/install.sh | sh
tailscale up

# Now you can SSH directly via Tailscale:
ssh root@100.104.105.106

# Or access ComfyUI in browser:
# http://100.104.105.106:8188
```
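
If MagicDNS is enabled in your tailnet, device names can be used instead of the 100.x addresses. The hostname below is only an example and depends on how the GPU machine is named in your Tailscale dashboard:

```bash
# Example only - substitute the machine name shown in your Tailscale dashboard.
ssh root@runpod-ai-orchestrator

# ComfyUI via the MagicDNS name:
# http://runpod-ai-orchestrator:8188
```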

No more port forwarding needed! 🎉

@@ -1,393 +0,0 @@

# WireGuard VPN Setup - Connecting GPU Server to VPS

## Day 3-4: Network Configuration

This guide connects your RunPod GPU server to your VPS via WireGuard VPN, enabling secure, low-latency communication.

### Architecture

```
┌─────────────────────────────┐ ┌──────────────────────────────┐
│ VPS (pivoine.art) │ │ GPU Server (RunPod) │
│ 10.8.0.1 (WireGuard) │◄───────►│ 10.8.0.2 (WireGuard) │
├─────────────────────────────┤ ├──────────────────────────────┤
│ - LiteLLM Proxy │ │ - vLLM (10.8.0.2:8000) │
│ - Open WebUI │ │ - ComfyUI (10.8.0.2:8188) │
│ - PostgreSQL │ │ - Training │
└─────────────────────────────┘ └──────────────────────────────┘
```

### Prerequisites

- ✅ VPS with root access
- ✅ GPU server with root access
- ✅ Both servers have public IPs

---

## Method 1: Using Existing wg-easy (Recommended)

You already have `wg-easy` running on your VPS. Let's use it!

### Step 1: Access wg-easy Dashboard

**On your local machine:**

1. Open browser: https://vpn.pivoine.art (or whatever your wg-easy URL is)
2. Login with admin password

**Don't have wg-easy set up? Skip to Method 2.**

### Step 2: Create GPU Server Client

1. In wg-easy dashboard, click **"+ New Client"**
2. **Name**: `gpu-server-runpod`
3. Click **"Create"**
4. **Download** configuration file (or copy QR code data)

You'll get a file like: `gpu-server-runpod.conf`

### Step 3: Install WireGuard on GPU Server

**SSH into GPU server:**

```bash
ssh gpu-pivoine  # or your SSH command

# Install WireGuard
apt update
apt install -y wireguard wireguard-tools
```

### Step 4: Configure WireGuard on GPU Server

**Upload the config file:**

```bash
# On your local machine, copy the config to GPU server
scp gpu-server-runpod.conf gpu-pivoine:/etc/wireguard/wg0.conf

# Or manually create it on GPU server:
nano /etc/wireguard/wg0.conf
# Paste the configuration from wg-easy
```

**Example config (yours will be different):**
```ini
[Interface]
PrivateKey = <PRIVATE_KEY_FROM_WG_EASY>
Address = 10.8.0.2/24
DNS = 10.8.0.1

[Peer]
PublicKey = <VPS_PUBLIC_KEY_FROM_WG_EASY>
PresharedKey = <PRESHARED_KEY>
AllowedIPs = 10.8.0.0/24
Endpoint = <VPS_PUBLIC_IP>:51820
PersistentKeepalive = 25
```

### Step 5: Start WireGuard

```bash
# Enable IP forwarding
echo "net.ipv4.ip_forward=1" >> /etc/sysctl.conf
sysctl -p

# Set permissions
chmod 600 /etc/wireguard/wg0.conf

# Start WireGuard
systemctl enable wg-quick@wg0
systemctl start wg-quick@wg0

# Check status
systemctl status wg-quick@wg0
wg show
```

Expected output:
```
interface: wg0
  public key: <GPU_SERVER_PUBLIC_KEY>
  private key: (hidden)
  listening port: 51820

peer: <VPS_PUBLIC_KEY>
  endpoint: <VPS_IP>:51820
  allowed ips: 10.8.0.0/24
  latest handshake: 1 second ago
  transfer: 1.2 KiB received, 892 B sent
  persistent keepalive: every 25 seconds
```

### Step 6: Test Connectivity

**From GPU server, ping VPS:**

```bash
ping 10.8.0.1 -c 4
```

Expected output:
```
PING 10.8.0.1 (10.8.0.1) 56(84) bytes of data.
64 bytes from 10.8.0.1: icmp_seq=1 ttl=64 time=25.3 ms
64 bytes from 10.8.0.1: icmp_seq=2 ttl=64 time=24.8 ms
...
```

**From VPS, ping GPU server:**

```bash
ssh root@vps
ping 10.8.0.2 -c 4
```

**Test PostgreSQL access from GPU server:**

```bash
# On GPU server
apt install -y postgresql-client

# Try connecting to VPS postgres
psql -h 10.8.0.1 -U valknar -d openwebui -c "SELECT 1;"
# Should work if postgres allows 10.8.0.0/24
```

---

## Method 2: Manual WireGuard Setup (If no wg-easy)

### Step 1: Install WireGuard on Both Servers

**On VPS:**
```bash
ssh root@vps
apt update
apt install -y wireguard wireguard-tools
```

**On GPU Server:**
```bash
ssh gpu-pivoine
apt update
apt install -y wireguard wireguard-tools
```

### Step 2: Generate Keys

**On VPS:**
```bash
cd /etc/wireguard
umask 077
wg genkey | tee vps-private.key | wg pubkey > vps-public.key
```

**On GPU Server:**
```bash
cd /etc/wireguard
umask 077
wg genkey | tee gpu-private.key | wg pubkey > gpu-public.key
```

### Step 3: Create Config on VPS

**On VPS (`/etc/wireguard/wg0.conf`):**

```bash
cat > /etc/wireguard/wg0.conf << 'EOF'
[Interface]
PrivateKey = <VPS_PRIVATE_KEY>
Address = 10.8.0.1/24
ListenPort = 51820
SaveConfig = false

# GPU Server Peer
[Peer]
PublicKey = <GPU_PUBLIC_KEY>
AllowedIPs = 10.8.0.2/32
PersistentKeepalive = 25
EOF
```

Replace `<VPS_PRIVATE_KEY>` with the contents of `vps-private.key`, and `<GPU_PUBLIC_KEY>` with the contents of the GPU server's `gpu-public.key`.
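
To avoid copy/paste mistakes with the keys, the placeholders can be spliced in directly from the key files. A sketch, run on the VPS in `/etc/wireguard`, assuming the file names from Step 2 (the GPU public key still has to be pasted by hand, since it lives on the other machine):

```bash
cd /etc/wireguard
# Insert the VPS private key generated in Step 2.
sed -i "s|<VPS_PRIVATE_KEY>|$(cat vps-private.key)|" wg0.conf
# Paste the GPU server's public key (print it on the GPU server with: cat gpu-public.key).
sed -i "s|<GPU_PUBLIC_KEY>|PASTE_GPU_PUBLIC_KEY_HERE|" wg0.conf
```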

### Step 4: Create Config on GPU Server

**On GPU Server (`/etc/wireguard/wg0.conf`):**

```bash
cat > /etc/wireguard/wg0.conf << 'EOF'
[Interface]
PrivateKey = <GPU_PRIVATE_KEY>
Address = 10.8.0.2/24

[Peer]
PublicKey = <VPS_PUBLIC_KEY>
AllowedIPs = 10.8.0.0/24
Endpoint = <VPS_PUBLIC_IP>:51820
PersistentKeepalive = 25
EOF
```

Replace:
- `<GPU_PRIVATE_KEY>` with contents of `gpu-private.key`
- `<VPS_PUBLIC_KEY>` with contents from VPS's `vps-public.key`
- `<VPS_PUBLIC_IP>` with your VPS's public IP address

### Step 5: Start WireGuard on Both

**On VPS:**
```bash
# Enable IP forwarding
echo "net.ipv4.ip_forward=1" >> /etc/sysctl.conf
sysctl -p

# Start WireGuard
chmod 600 /etc/wireguard/wg0.conf
systemctl enable wg-quick@wg0
systemctl start wg-quick@wg0
```

**On GPU Server:**
```bash
# Enable IP forwarding
echo "net.ipv4.ip_forward=1" >> /etc/sysctl.conf
sysctl -p

# Start WireGuard
chmod 600 /etc/wireguard/wg0.conf
systemctl enable wg-quick@wg0
systemctl start wg-quick@wg0
```

### Step 6: Configure Firewall

**On VPS:**
```bash
# Allow WireGuard port
ufw allow 51820/udp
ufw reload

# Or with iptables
iptables -A INPUT -p udp --dport 51820 -j ACCEPT
iptables-save > /etc/iptables/rules.v4
```

**On GPU Server (RunPod):**
```bash
# Allow WireGuard
ufw allow 51820/udp
ufw reload
```

### Step 7: Test Connection

Same as Method 1 Step 6.

---

## Troubleshooting

### No handshake

**Check:**
```bash
wg show
```

If "latest handshake" shows "never":
1. Verify public keys are correct (easy to swap them!)
2. Check firewall allows UDP 51820
3. Verify endpoint IP is correct
4. Check `systemctl status wg-quick@wg0` for errors
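
If the keys and endpoint look right, confirm that handshake packets are actually arriving. A sketch, run on the VPS, assuming `tcpdump` is installed and the default WireGuard port used in this guide:

```bash
# Watch for incoming WireGuard traffic on UDP 51820.
tcpdump -ni any udp port 51820
# Then, in another terminal on the GPU server, trigger a handshake attempt:
#   ping 10.8.0.1 -c 1
```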

### Can ping but can't access services

**On VPS, check PostgreSQL allows 10.8.0.0/24:**

```bash
# Edit postgresql.conf
nano /var/lib/postgresql/data/postgresql.conf
# Add or modify:
listen_addresses = '*'

# Edit pg_hba.conf
nano /var/lib/postgresql/data/pg_hba.conf
# Add:
host all all 10.8.0.0/24 scram-sha-256

# Restart
docker restart core_postgres
```

### WireGuard won't start

```bash
# Check logs
journalctl -u wg-quick@wg0 -n 50

# Common issues:
# - Wrong permissions: chmod 600 /etc/wireguard/wg0.conf
# - Invalid keys: regenerate with wg genkey
# - Port already in use: lsof -i :51820
```

---

## Verification Checklist

Before proceeding to Day 5:

- [ ] WireGuard installed on both VPS and GPU server
- [ ] VPN tunnel established (wg show shows handshake)
- [ ] GPU server can ping VPS (10.8.0.1)
- [ ] VPS can ping GPU server (10.8.0.2)
- [ ] Firewall allows WireGuard (UDP 51820)
- [ ] PostgreSQL accessible from GPU server
- [ ] WireGuard starts on boot (systemctl enable)

---

## Network Reference

**VPN IPs:**
- VPS: `10.8.0.1`
- GPU Server: `10.8.0.2`

**Service Access from GPU Server:**
- PostgreSQL: `postgresql://valknar:password@10.8.0.1:5432/dbname`
- Redis: `10.8.0.1:6379`
- LiteLLM: `http://10.8.0.1:4000`
- Mailpit: `10.8.0.1:1025`

**Service Access from VPS:**
- vLLM: `http://10.8.0.2:8000`
- ComfyUI: `http://10.8.0.2:8188`
- JupyterLab: `http://10.8.0.2:8888`

---

## Next: Docker & GPU Setup

Once VPN is working, proceed to **Day 5: Docker & NVIDIA Container Toolkit Setup**.

**Save connection info:**

```bash
# On GPU server
cat >> /workspace/SERVER_INFO.md << 'EOF'

## VPN Configuration
- VPN IP: 10.8.0.2
- VPS VPN IP: 10.8.0.1
- WireGuard Status: Active
- Latest Handshake: [Check with: wg show]

## Network Access
- Can reach VPS services: ✓
- VPS can reach GPU services: ✓
EOF
```

290
ai/compose.yaml
@@ -15,7 +15,7 @@ services:
|
||||
- ai_postgres_data:/var/lib/postgresql/data
|
||||
- ./postgres/init:/docker-entrypoint-initdb.d
|
||||
healthcheck:
|
||||
test: ['CMD-SHELL', 'pg_isready -U ${AI_DB_USER}']
|
||||
test: ["CMD-SHELL", "pg_isready -U ${AI_DB_USER}"]
|
||||
interval: 30s
|
||||
timeout: 10s
|
||||
retries: 3
|
||||
@@ -38,6 +38,10 @@ services:
|
||||
OPENAI_API_BASE_URLS: http://litellm:4000
|
||||
OPENAI_API_KEYS: ${AI_LITELLM_API_KEY}
|
||||
|
||||
# Disable Ollama (we only use LiteLLM)
|
||||
ENABLE_OLLAMA_API: false
|
||||
OLLAMA_BASE_URLS: ""
|
||||
|
||||
# WebUI configuration
|
||||
WEBUI_NAME: ${AI_WEBUI_NAME:-Pivoine AI}
|
||||
WEBUI_URL: https://${AI_TRAEFIK_HOST}
|
||||
@@ -62,102 +66,87 @@ services:
|
||||
|
||||
volumes:
|
||||
- ai_webui_data:/app/backend/data
|
||||
- ./functions:/app/backend/data/functions:ro
|
||||
depends_on:
|
||||
- ai_postgres
|
||||
- litellm
|
||||
networks:
|
||||
- compose_network
|
||||
labels:
|
||||
- 'traefik.enable=${AI_TRAEFIK_ENABLED}'
|
||||
- "traefik.enable=${AI_TRAEFIK_ENABLED}"
|
||||
# HTTP to HTTPS redirect
|
||||
- 'traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-redirect-web-secure.redirectscheme.scheme=https'
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-web.middlewares=${AI_COMPOSE_PROJECT_NAME}-redirect-web-secure'
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-web.rule=Host(`${AI_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-web.entrypoints=web'
|
||||
- "traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-redirect-web-secure.redirectscheme.scheme=https"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-web.middlewares=${AI_COMPOSE_PROJECT_NAME}-redirect-web-secure"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-web.rule=Host(`${AI_TRAEFIK_HOST}`)"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-web.entrypoints=web"
|
||||
# HTTPS router
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-web-secure.rule=Host(`${AI_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-web-secure.tls.certresolver=resolver'
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-web-secure.entrypoints=web-secure'
|
||||
- 'traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-web-secure-compress.compress=true'
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-web-secure.middlewares=${AI_COMPOSE_PROJECT_NAME}-web-secure-compress,security-headers@file'
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-web-secure.rule=Host(`${AI_TRAEFIK_HOST}`)"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-web-secure.tls.certresolver=resolver"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-web-secure.entrypoints=web-secure"
|
||||
- "traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-web-secure-compress.compress=true"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-web-secure.middlewares=${AI_COMPOSE_PROJECT_NAME}-web-secure-compress,security-headers@file"
|
||||
# Service
|
||||
- 'traefik.http.services.${AI_COMPOSE_PROJECT_NAME}-web-secure.loadbalancer.server.port=8080'
|
||||
- 'traefik.docker.network=${NETWORK_NAME}'
|
||||
- "traefik.http.services.${AI_COMPOSE_PROJECT_NAME}-web-secure.loadbalancer.server.port=8080"
|
||||
- "traefik.docker.network=${NETWORK_NAME}"
|
||||
# Watchtower
|
||||
- 'com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}'
|
||||
- "com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}"
|
||||
|
||||
# LiteLLM - Proxy to convert Anthropic API to OpenAI-compatible format
|
||||
litellm:
|
||||
image: ghcr.io/berriai/litellm:main-latest
|
||||
container_name: ${AI_COMPOSE_PROJECT_NAME}_litellm
|
||||
restart: unless-stopped
|
||||
dns:
|
||||
- 100.100.100.100
|
||||
- 8.8.8.8
|
||||
environment:
|
||||
TZ: ${TIMEZONE:-Europe/Berlin}
|
||||
ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}
|
||||
LITELLM_MASTER_KEY: ${AI_LITELLM_API_KEY}
|
||||
DATABASE_URL: postgresql://${AI_DB_USER}:${AI_DB_PASSWORD}@ai_postgres:5432/litellm
|
||||
LITELLM_DROP_PARAMS: 'true'
|
||||
NO_DOCS: 'true'
|
||||
NO_REDOC: 'true'
|
||||
GPU_VLLM_LLAMA_URL: ${GPU_VLLM_LLAMA_URL}
|
||||
GPU_VLLM_BGE_URL: ${GPU_VLLM_BGE_URL}
|
||||
# LITELLM_DROP_PARAMS: 'true' # DISABLED: Was breaking streaming
|
||||
NO_DOCS: "true"
|
||||
NO_REDOC: "true"
|
||||
# Performance optimizations
|
||||
LITELLM_LOG: 'ERROR' # Only log errors
|
||||
LITELLM_MODE: 'PRODUCTION' # Production mode for better performance
|
||||
LITELLM_LOG: "DEBUG" # Enable detailed logging for debugging streaming issues
|
||||
LITELLM_MODE: "PRODUCTION" # Production mode for better performance
|
||||
volumes:
|
||||
- ./litellm-config.yaml:/app/litellm-config.yaml:ro
|
||||
command:
|
||||
[
|
||||
'--config',
|
||||
'/app/litellm-config.yaml',
|
||||
'--host',
|
||||
'0.0.0.0',
|
||||
'--port',
|
||||
'4000',
|
||||
'--drop_params'
|
||||
"--config",
|
||||
"/app/litellm-config.yaml",
|
||||
"--host",
|
||||
"0.0.0.0",
|
||||
"--port",
|
||||
"4000",
|
||||
]
|
||||
depends_on:
|
||||
- ai_postgres
|
||||
networks:
|
||||
- compose_network
|
||||
healthcheck:
|
||||
disable: true
|
||||
labels:
|
||||
- 'traefik.enable=${AI_TRAEFIK_ENABLED}'
|
||||
# HTTP to HTTPS redirect
|
||||
- 'traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-litellm-redirect-web-secure.redirectscheme.scheme=https'
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-litellm-web.middlewares=${AI_COMPOSE_PROJECT_NAME}-litellm-redirect-web-secure'
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-litellm-web.rule=Host(`${AI_LITELLM_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-litellm-web.entrypoints=web'
|
||||
# HTTPS router
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-litellm-web-secure.rule=Host(`${AI_LITELLM_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-litellm-web-secure.tls.certresolver=resolver'
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-litellm-web-secure.entrypoints=web-secure'
|
||||
- 'traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-litellm-web-secure-compress.compress=true'
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-litellm-web-secure.middlewares=${AI_COMPOSE_PROJECT_NAME}-litellm-web-secure-compress,security-headers@file'
|
||||
# Service
|
||||
- 'traefik.http.services.${AI_COMPOSE_PROJECT_NAME}-litellm-web-secure.loadbalancer.server.port=4000'
|
||||
- 'traefik.docker.network=${NETWORK_NAME}'
|
||||
# Watchtower
|
||||
- 'com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}'
|
||||
|
||||
# Crawl4AI - Web scraping for LLMs (internal API, no public access)
|
||||
crawl4ai:
|
||||
image: ${AI_CRAWL4AI_IMAGE:-unclecode/crawl4ai:latest}
|
||||
container_name: ${AI_COMPOSE_PROJECT_NAME}_crawl4ai
|
||||
restart: unless-stopped
|
||||
environment:
|
||||
TZ: ${TIMEZONE:-Europe/Berlin}
|
||||
# API configuration
|
||||
PORT: ${AI_CRAWL4AI_PORT:-11235}
|
||||
volumes:
|
||||
- ai_crawl4ai_data:/app/.crawl4ai
|
||||
networks:
|
||||
- compose_network
|
||||
labels:
|
||||
# No Traefik exposure - internal only
|
||||
- 'traefik.enable=false'
|
||||
- "traefik.enable=${AI_TRAEFIK_ENABLED}"
|
||||
# HTTP to HTTPS redirect
|
||||
- "traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-litellm-redirect-web-secure.redirectscheme.scheme=https"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-litellm-web.middlewares=${AI_COMPOSE_PROJECT_NAME}-litellm-redirect-web-secure"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-litellm-web.rule=Host(`${AI_LITELLM_TRAEFIK_HOST}`)"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-litellm-web.entrypoints=web"
|
||||
# HTTPS router
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-litellm-web-secure.rule=Host(`${AI_LITELLM_TRAEFIK_HOST}`)"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-litellm-web-secure.tls.certresolver=resolver"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-litellm-web-secure.entrypoints=web-secure"
|
||||
- "traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-litellm-web-secure-compress.compress=true"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-litellm-web-secure.middlewares=${AI_COMPOSE_PROJECT_NAME}-litellm-web-secure-compress,security-headers@file"
|
||||
# Service
|
||||
- "traefik.http.services.${AI_COMPOSE_PROJECT_NAME}-litellm-web-secure.loadbalancer.server.port=4000"
|
||||
- "traefik.docker.network=${NETWORK_NAME}"
|
||||
# Watchtower
|
||||
- 'com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}'
|
||||
|
||||
- "com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}"
|
||||
# Facefusion - AI face swapping and enhancement
|
||||
facefusion:
|
||||
build:
|
||||
@@ -167,7 +156,7 @@ services:
|
||||
container_name: ${AI_COMPOSE_PROJECT_NAME}_facefusion
|
||||
restart: unless-stopped
|
||||
tty: true
|
||||
command: ['python', '-u', 'facefusion.py', 'run']
|
||||
command: ["python", "-u", "facefusion.py", "run"]
|
||||
environment:
|
||||
TZ: ${TIMEZONE:-Europe/Berlin}
|
||||
GRADIO_SERVER_NAME: "0.0.0.0"
|
||||
@@ -177,30 +166,175 @@ services:
|
||||
networks:
|
||||
- compose_network
|
||||
labels:
|
||||
- 'traefik.enable=${AI_FACEFUSION_TRAEFIK_ENABLED}'
|
||||
- "traefik.enable=${AI_FACEFUSION_TRAEFIK_ENABLED}"
|
||||
# HTTP to HTTPS redirect
|
||||
- 'traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-facefusion-redirect-web-secure.redirectscheme.scheme=https'
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-facefusion-web.middlewares=${AI_COMPOSE_PROJECT_NAME}-facefusion-redirect-web-secure'
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-facefusion-web.rule=Host(`${AI_FACEFUSION_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-facefusion-web.entrypoints=web'
|
||||
- "traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-facefusion-redirect-web-secure.redirectscheme.scheme=https"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-facefusion-web.middlewares=${AI_COMPOSE_PROJECT_NAME}-facefusion-redirect-web-secure"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-facefusion-web.rule=Host(`${AI_FACEFUSION_TRAEFIK_HOST}`)"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-facefusion-web.entrypoints=web"
|
||||
# HTTPS router with Authelia
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-facefusion-web-secure.rule=Host(`${AI_FACEFUSION_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-facefusion-web-secure.tls.certresolver=resolver'
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-facefusion-web-secure.entrypoints=web-secure'
|
||||
- 'traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-facefusion-web-secure-compress.compress=true'
|
||||
- 'traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-facefusion-web-secure.middlewares=${AI_COMPOSE_PROJECT_NAME}-facefusion-web-secure-compress,net-authelia,security-headers@file'
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-facefusion-web-secure.rule=Host(`${AI_FACEFUSION_TRAEFIK_HOST}`)"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-facefusion-web-secure.tls.certresolver=resolver"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-facefusion-web-secure.entrypoints=web-secure"
|
||||
- "traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-facefusion-web-secure-compress.compress=true"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-facefusion-web-secure.middlewares=${AI_COMPOSE_PROJECT_NAME}-facefusion-web-secure-compress,net-authelia,security-headers@file"
|
||||
# Service
|
||||
- 'traefik.http.services.${AI_COMPOSE_PROJECT_NAME}-facefusion-web-secure.loadbalancer.server.port=7860'
|
||||
- 'traefik.docker.network=${NETWORK_NAME}'
|
||||
- "traefik.http.services.${AI_COMPOSE_PROJECT_NAME}-facefusion-web-secure.loadbalancer.server.port=7860"
|
||||
- "traefik.docker.network=${NETWORK_NAME}"
|
||||
# Watchtower - disabled for custom local image
|
||||
- 'com.centurylinklabs.watchtower.enable=false'
|
||||
- "com.centurylinklabs.watchtower.enable=false"
|
||||
|
||||
# ComfyUI - Node-based UI for Flux image generation (proxies to RunPod GPU)
|
||||
comfyui:
|
||||
image: nginx:alpine
|
||||
container_name: ${AI_COMPOSE_PROJECT_NAME}_comfyui
|
||||
restart: unless-stopped
|
||||
dns:
|
||||
- 100.100.100.100
|
||||
- 8.8.8.8
|
||||
environment:
|
||||
TZ: ${TIMEZONE:-Europe/Berlin}
|
||||
GPU_SERVICE_HOST: ${GPU_TAILSCALE_HOST:-runpod-ai-orchestrator}
|
||||
GPU_SERVICE_PORT: ${COMFYUI_BACKEND_PORT:-8188}
|
||||
volumes:
|
||||
- ./nginx.conf.template:/etc/nginx/nginx.conf.template:ro
|
||||
command: /bin/sh -c "envsubst '$${GPU_SERVICE_HOST},$${GPU_SERVICE_PORT}' < /etc/nginx/nginx.conf.template > /etc/nginx/nginx.conf && exec nginx -g 'daemon off;'"
|
||||
networks:
|
||||
- compose_network
|
||||
labels:
|
||||
- "traefik.enable=${AI_COMFYUI_TRAEFIK_ENABLED:-true}"
|
||||
# HTTP to HTTPS redirect
|
||||
- "traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-comfyui-redirect-web-secure.redirectscheme.scheme=https"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-comfyui-web.middlewares=${AI_COMPOSE_PROJECT_NAME}-comfyui-redirect-web-secure"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-comfyui-web.rule=Host(`${AI_COMFYUI_TRAEFIK_HOST:-comfy.ai.pivoine.art}`)"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-comfyui-web.entrypoints=web"
|
||||
# HTTPS router with Authelia SSO
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-comfyui-web-secure.rule=Host(`${AI_COMFYUI_TRAEFIK_HOST:-comfy.ai.pivoine.art}`)"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-comfyui-web-secure.tls.certresolver=resolver"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-comfyui-web-secure.entrypoints=web-secure"
|
||||
- "traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-comfyui-web-secure-compress.compress=true"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-comfyui-web-secure.middlewares=${AI_COMPOSE_PROJECT_NAME}-comfyui-web-secure-compress,net-authelia,security-headers@file"
|
||||
# Service
|
||||
- "traefik.http.services.${AI_COMPOSE_PROJECT_NAME}-comfyui-web-secure.loadbalancer.server.port=80"
|
||||
- "traefik.docker.network=${NETWORK_NAME}"
|
||||
# Watchtower
|
||||
- "com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}"
|
||||
|
||||
audiocraft:
|
||||
image: nginx:alpine
|
||||
container_name: ${AI_COMPOSE_PROJECT_NAME}_audiocraft
|
||||
restart: unless-stopped
|
||||
dns:
|
||||
- 100.100.100.100
|
||||
- 8.8.8.8
|
||||
environment:
|
||||
TZ: ${TIMEZONE:-Europe/Berlin}
|
||||
GPU_SERVICE_HOST: ${GPU_TAILSCALE_HOST:-runpod-ai-orchestrator}
|
||||
GPU_SERVICE_PORT: ${AUDIOCRAFT_BACKEND_PORT:-7860}
|
||||
volumes:
|
||||
- ./nginx.conf.template:/etc/nginx/nginx.conf.template:ro
|
||||
command: /bin/sh -c "envsubst '$${GPU_SERVICE_HOST},$${GPU_SERVICE_PORT}' < /etc/nginx/nginx.conf.template > /etc/nginx/nginx.conf && exec nginx -g 'daemon off;'"
|
||||
networks:
|
||||
- compose_network
|
||||
labels:
|
||||
- "traefik.enable=${AI_AUDIOCRAFT_TRAEFIK_ENABLED:-true}"
|
||||
# HTTP to HTTPS redirect
|
||||
- "traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-audiocraft-redirect-web-secure.redirectscheme.scheme=https"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-audiocraft-web.middlewares=${AI_COMPOSE_PROJECT_NAME}-audiocraft-redirect-web-secure"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-audiocraft-web.rule=Host(`${AI_AUDIOCRAFT_TRAEFIK_HOST:-audiocraft.ai.pivoine.art}`)"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-audiocraft-web.entrypoints=web"
|
||||
# HTTPS router with Authelia SSO
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-audiocraft-web-secure.rule=Host(`${AI_AUDIOCRAFT_TRAEFIK_HOST:-audiocraft.ai.pivoine.art}`)"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-audiocraft-web-secure.tls.certresolver=resolver"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-audiocraft-web-secure.entrypoints=web-secure"
|
||||
- "traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-audiocraft-web-secure-compress.compress=true"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-audiocraft-web-secure.middlewares=${AI_COMPOSE_PROJECT_NAME}-audiocraft-web-secure-compress,net-authelia,security-headers@file"
|
||||
# Service
|
||||
- "traefik.http.services.${AI_COMPOSE_PROJECT_NAME}-audiocraft-web-secure.loadbalancer.server.port=80"
|
||||
- "traefik.docker.network=${NETWORK_NAME}"
|
||||
# Watchtower
|
||||
- "com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}"
|
||||
|
||||
upscale:
|
||||
image: nginx:alpine
|
||||
container_name: ${AI_COMPOSE_PROJECT_NAME}_upscale
|
||||
restart: unless-stopped
|
||||
dns:
|
||||
- 100.100.100.100
|
||||
- 8.8.8.8
|
||||
environment:
|
||||
TZ: ${TIMEZONE:-Europe/Berlin}
|
||||
GPU_SERVICE_HOST: ${GPU_TAILSCALE_HOST:-runpod-ai-orchestrator}
|
||||
GPU_SERVICE_PORT: ${UPSCALE_BACKEND_PORT:-8080}
|
||||
volumes:
|
||||
- ./nginx.conf.template:/etc/nginx/nginx.conf.template:ro
|
||||
command: /bin/sh -c "envsubst '$${GPU_SERVICE_HOST},$${GPU_SERVICE_PORT}' < /etc/nginx/nginx.conf.template > /etc/nginx/nginx.conf && exec nginx -g 'daemon off;'"
|
||||
networks:
|
||||
- compose_network
|
||||
labels:
|
||||
- "traefik.enable=${AI_UPSCALE_TRAEFIK_ENABLED:-true}"
|
||||
# HTTP to HTTPS redirect
|
||||
- "traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-upscale-redirect-web-secure.redirectscheme.scheme=https"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-upscale-web.middlewares=${AI_COMPOSE_PROJECT_NAME}-upscale-redirect-web-secure"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-upscale-web.rule=Host(`${AI_UPSCALE_TRAEFIK_HOST:-upscale.ai.pivoine.art}`)"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-upscale-web.entrypoints=web"
|
||||
# HTTPS router with Authelia SSO
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-upscale-web-secure.rule=Host(`${AI_UPSCALE_TRAEFIK_HOST:-upscale.ai.pivoine.art}`)"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-upscale-web-secure.tls.certresolver=resolver"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-upscale-web-secure.entrypoints=web-secure"
|
||||
- "traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-upscale-web-secure-compress.compress=true"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-upscale-web-secure.middlewares=${AI_COMPOSE_PROJECT_NAME}-upscale-web-secure-compress,net-authelia,security-headers@file"
|
||||
# Service
|
||||
- "traefik.http.services.${AI_COMPOSE_PROJECT_NAME}-upscale-web-secure.loadbalancer.server.port=80"
|
||||
- "traefik.docker.network=${NETWORK_NAME}"
|
||||
# Watchtower
|
||||
- "com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}"
|
||||
|
||||
# Supervisor UI - Modern web interface for RunPod process management
|
||||
supervisor:
|
||||
image: dev.pivoine.art/valknar/supervisor-ui:latest
|
||||
container_name: ${AI_COMPOSE_PROJECT_NAME}_supervisor_ui
|
||||
restart: unless-stopped
|
||||
dns:
|
||||
- 100.100.100.100
|
||||
- 8.8.8.8
|
||||
environment:
|
||||
TZ: ${TIMEZONE:-Europe/Berlin}
|
||||
NODE_ENV: production
|
||||
# Connect to RunPod Supervisor via Tailscale (host Tailscale provides DNS)
|
||||
SUPERVISOR_HOST: ${GPU_TAILSCALE_HOST:-runpod-ai-orchestrator}
|
||||
SUPERVISOR_PORT: ${SUPERVISOR_BACKEND_PORT:-9001}
|
||||
# No auth needed - Supervisor has auth disabled (protected by Authelia)
|
||||
networks:
|
||||
- compose_network
|
||||
labels:
|
||||
- "traefik.enable=${AI_SUPERVISOR_TRAEFIK_ENABLED:-true}"
|
||||
# HTTP to HTTPS redirect
|
||||
- "traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-supervisor-redirect-web-secure.redirectscheme.scheme=https"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-supervisor-web.middlewares=${AI_COMPOSE_PROJECT_NAME}-supervisor-redirect-web-secure"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-supervisor-web.rule=Host(`${AI_SUPERVISOR_TRAEFIK_HOST:-supervisor.ai.pivoine.art}`)"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-supervisor-web.entrypoints=web"
|
||||
# HTTPS router with Authelia SSO
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-supervisor-web-secure.rule=Host(`${AI_SUPERVISOR_TRAEFIK_HOST:-supervisor.ai.pivoine.art}`)"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-supervisor-web-secure.tls.certresolver=resolver"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-supervisor-web-secure.entrypoints=web-secure"
|
||||
- "traefik.http.middlewares.${AI_COMPOSE_PROJECT_NAME}-supervisor-web-secure-compress.compress=true"
|
||||
- "traefik.http.routers.${AI_COMPOSE_PROJECT_NAME}-supervisor-web-secure.middlewares=${AI_COMPOSE_PROJECT_NAME}-supervisor-web-secure-compress,net-authelia,security-headers@file"
|
||||
# Service (port 3000 for Next.js app)
|
||||
- "traefik.http.services.${AI_COMPOSE_PROJECT_NAME}-supervisor-web-secure.loadbalancer.server.port=3000"
|
||||
- "traefik.docker.network=${NETWORK_NAME}"
|
||||
# Watchtower
|
||||
- "com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}"
|
||||
|
||||
volumes:
|
||||
ai_postgres_data:
|
||||
name: ${AI_COMPOSE_PROJECT_NAME}_postgres_data
|
||||
ai_webui_data:
|
||||
name: ${AI_COMPOSE_PROJECT_NAME}_webui_data
|
||||
ai_crawl4ai_data:
|
||||
name: ${AI_COMPOSE_PROJECT_NAME}_crawl4ai_data
|
||||
ai_facefusion_data:
|
||||
name: ${AI_COMPOSE_PROJECT_NAME}_facefusion_data
|
||||
|
||||
networks:
|
||||
compose_network:
|
||||
name: ${NETWORK_NAME}
|
||||
external: true
|
||||
|
||||
@@ -1,229 +0,0 @@
|
||||
#!/bin/bash
|
||||
# GPU Stack Deployment Script
|
||||
# Run this on the GPU server after SSH access is established
|
||||
|
||||
set -e # Exit on error
|
||||
|
||||
echo "=================================="
|
||||
echo "GPU Stack Deployment Script"
|
||||
echo "=================================="
|
||||
echo ""
|
||||
|
||||
# Colors for output
|
||||
RED='\033[0;31m'
|
||||
GREEN='\033[0;32m'
|
||||
YELLOW='\033[1;33m'
|
||||
NC='\033[0m' # No Color
|
||||
|
||||
# Functions
|
||||
print_success() {
|
||||
echo -e "${GREEN}✓ $1${NC}"
|
||||
}
|
||||
|
||||
print_error() {
|
||||
echo -e "${RED}✗ $1${NC}"
|
||||
}
|
||||
|
||||
print_info() {
|
||||
echo -e "${YELLOW}→ $1${NC}"
|
||||
}
|
||||
|
||||
# Check if running as root
|
||||
if [[ $EUID -ne 0 ]]; then
|
||||
print_error "This script must be run as root (use sudo)"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# Step 1: Check prerequisites
|
||||
print_info "Checking prerequisites..."
|
||||
|
||||
if ! command -v docker &> /dev/null; then
|
||||
print_error "Docker is not installed. Please run DOCKER_GPU_SETUP.md first."
|
||||
exit 1
|
||||
fi
|
||||
print_success "Docker installed"
|
||||
|
||||
if ! command -v nvidia-smi &> /dev/null; then
|
||||
print_error "nvidia-smi not found. Is this a GPU server?"
|
||||
exit 1
|
||||
fi
|
||||
print_success "NVIDIA GPU detected"
|
||||
|
||||
if ! docker run --rm --runtime=nvidia --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi &> /dev/null; then
|
||||
print_error "Docker cannot access GPU. Please configure NVIDIA Container Toolkit."
|
||||
exit 1
|
||||
fi
|
||||
print_success "Docker GPU access working"
|
||||
|
||||
# Step 2: Create directory structure
|
||||
print_info "Creating directory structure..."
|
||||
|
||||
mkdir -p /workspace/gpu-stack/{vllm,comfyui,training/{configs,data,output},notebooks,monitoring}
|
||||
cd /workspace/gpu-stack
|
||||
|
||||
print_success "Directory structure created"
|
||||
|
||||
# Step 3: Create .env file
|
||||
if [ ! -f .env ]; then
|
||||
print_info "Creating .env file..."
|
||||
|
||||
cat > .env << 'EOF'
|
||||
# GPU Stack Environment Variables
|
||||
|
||||
# Timezone
|
||||
TIMEZONE=Europe/Berlin
|
||||
|
||||
# VPN Network
|
||||
VPS_IP=10.8.0.1
|
||||
GPU_IP=10.8.0.2
|
||||
|
||||
# Model Storage (network volume)
|
||||
MODELS_PATH=/workspace/models
|
||||
|
||||
# Hugging Face Token (optional, for gated models like Llama)
|
||||
# Get from: https://huggingface.co/settings/tokens
|
||||
HF_TOKEN=
|
||||
|
||||
# Weights & Biases (optional, for training logging)
|
||||
# Get from: https://wandb.ai/authorize
|
||||
WANDB_API_KEY=
|
||||
|
||||
# JupyterLab Access Token
|
||||
JUPYTER_TOKEN=pivoine-ai-2025
|
||||
|
||||
# PostgreSQL (on VPS)
|
||||
DB_HOST=10.8.0.1
|
||||
DB_PORT=5432
|
||||
DB_USER=valknar
|
||||
DB_PASSWORD=ragnarok98
|
||||
DB_NAME=openwebui
|
||||
EOF
|
||||
|
||||
chmod 600 .env
|
||||
print_success ".env file created (please edit with your tokens)"
|
||||
else
|
||||
print_success ".env file already exists"
|
||||
fi
|
||||
|
||||
# Step 4: Download docker-compose.yaml
|
||||
print_info "Downloading docker-compose.yaml..."
|
||||
|
||||
# In production, this would be copied from the repo
|
||||
# For now, assume it's already in the current directory
|
||||
if [ ! -f docker-compose.yaml ]; then
|
||||
print_error "docker-compose.yaml not found. Please copy gpu-server-compose.yaml to docker-compose.yaml"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
print_success "docker-compose.yaml found"
|
||||
|
||||
# Step 5: Pre-download models (optional but recommended)
|
||||
print_info "Do you want to pre-download models? (y/n)"
|
||||
read -r response
|
||||
|
||||
if [[ "$response" =~ ^[Yy]$ ]]; then
|
||||
print_info "Downloading Llama 3.1 8B Instruct (this will take a while)..."
|
||||
|
||||
mkdir -p /workspace/models
|
||||
|
||||
# Use huggingface-cli to download
|
||||
pip install -q huggingface-hub
|
||||
|
||||
huggingface-cli download \
|
||||
meta-llama/Meta-Llama-3.1-8B-Instruct \
|
||||
--local-dir /workspace/models/Meta-Llama-3.1-8B-Instruct \
|
||||
--local-dir-use-symlinks False || print_error "Model download failed (may need HF_TOKEN)"
|
||||
|
||||
print_success "Model downloaded to /workspace/models"
|
||||
fi
|
||||
|
||||
# Step 6: Start services
|
||||
print_info "Starting GPU stack services..."
|
||||
|
||||
docker compose up -d vllm comfyui jupyter netdata
|
||||
|
||||
print_success "Services starting (this may take a few minutes)..."
|
||||
|
||||
# Step 7: Wait for services
|
||||
print_info "Waiting for services to be ready..."
|
||||
|
||||
sleep 10
|
||||
|
||||
# Check service health
|
||||
print_info "Checking service status..."
|
||||
|
||||
if docker ps | grep -q gpu_vllm; then
|
||||
print_success "vLLM container running"
|
||||
else
|
||||
print_error "vLLM container not running"
|
||||
fi
|
||||
|
||||
if docker ps | grep -q gpu_comfyui; then
|
||||
print_success "ComfyUI container running"
|
||||
else
|
||||
print_error "ComfyUI container not running"
|
||||
fi
|
||||
|
||||
if docker ps | grep -q gpu_jupyter; then
|
||||
print_success "JupyterLab container running"
|
||||
else
|
||||
print_error "JupyterLab container not running"
|
||||
fi
|
||||
|
||||
if docker ps | grep -q gpu_netdata; then
|
||||
print_success "Netdata container running"
|
||||
else
|
||||
print_error "Netdata container not running"
|
||||
fi
|
||||
|
||||
# Step 8: Display access information
|
||||
echo ""
|
||||
echo "=================================="
|
||||
echo "Deployment Complete!"
|
||||
echo "=================================="
|
||||
echo ""
|
||||
echo "Services accessible via VPN (from VPS):"
|
||||
echo " - vLLM API: http://10.8.0.2:8000"
|
||||
echo " - ComfyUI: http://10.8.0.2:8188"
|
||||
echo " - JupyterLab: http://10.8.0.2:8888 (token: pivoine-ai-2025)"
|
||||
echo " - Netdata: http://10.8.0.2:19999"
|
||||
echo ""
|
||||
echo "Local access (from GPU server):"
|
||||
echo " - vLLM API: http://localhost:8000"
|
||||
echo " - ComfyUI: http://localhost:8188"
|
||||
echo " - JupyterLab: http://localhost:8888"
|
||||
echo " - Netdata: http://localhost:19999"
|
||||
echo ""
|
||||
echo "Useful commands:"
|
||||
echo " - View logs: docker compose logs -f"
|
||||
echo " - Check status: docker compose ps"
|
||||
echo " - Stop all: docker compose down"
|
||||
echo " - Restart service: docker compose restart vllm"
|
||||
echo " - Start training: docker compose --profile training up -d axolotl"
|
||||
echo ""
|
||||
echo "Next steps:"
|
||||
echo " 1. Wait for vLLM to load model (check logs: docker compose logs -f vllm)"
|
||||
echo " 2. Test vLLM: curl http://localhost:8000/v1/models"
|
||||
echo " 3. Configure LiteLLM on VPS to use http://10.8.0.2:8000"
|
||||
echo " 4. Download ComfyUI models via web interface"
|
||||
echo ""
|
||||
|
||||
# Step 9: Create helpful aliases
|
||||
print_info "Creating helpful aliases..."
|
||||
|
||||
cat >> ~/.bashrc << 'EOF'
|
||||
|
||||
# GPU Stack Aliases
|
||||
alias gpu-logs='cd /workspace/gpu-stack && docker compose logs -f'
|
||||
alias gpu-ps='cd /workspace/gpu-stack && docker compose ps'
|
||||
alias gpu-restart='cd /workspace/gpu-stack && docker compose restart'
|
||||
alias gpu-down='cd /workspace/gpu-stack && docker compose down'
|
||||
alias gpu-up='cd /workspace/gpu-stack && docker compose up -d'
|
||||
alias gpu-stats='watch -n 1 nvidia-smi'
|
||||
alias gpu-top='nvtop'
|
||||
EOF
|
||||
|
||||
print_success "Aliases added to ~/.bashrc (reload with: source ~/.bashrc)"
|
||||
|
||||
echo ""
|
||||
print_success "All done! 🚀"
|
||||
@@ -1,237 +0,0 @@
|
||||
# GPU Server Docker Compose Configuration
|
||||
# Deploy on RunPod GPU server (10.8.0.2)
|
||||
# Services accessible from VPS (10.8.0.1) via WireGuard VPN
|
||||
|
||||
version: '3.8'
|
||||
|
||||
services:
|
||||
# =============================================================================
|
||||
# vLLM - High-performance LLM Inference Server
|
||||
# =============================================================================
|
||||
vllm:
|
||||
image: vllm/vllm-openai:latest
|
||||
container_name: gpu_vllm
|
||||
restart: unless-stopped
|
||||
runtime: nvidia
|
||||
environment:
|
||||
NVIDIA_VISIBLE_DEVICES: all
|
||||
CUDA_VISIBLE_DEVICES: "0"
|
||||
HF_TOKEN: ${HF_TOKEN:-}
|
||||
volumes:
|
||||
- ${MODELS_PATH:-/workspace/models}:/root/.cache/huggingface
|
||||
command:
|
||||
- --model
|
||||
- meta-llama/Meta-Llama-3.1-8B-Instruct # Change model here
|
||||
- --host
|
||||
- 0.0.0.0
|
||||
- --port
|
||||
- 8000
|
||||
- --tensor-parallel-size
|
||||
- "1"
|
||||
- --gpu-memory-utilization
|
||||
- "0.85" # Leave 15% for other tasks
|
||||
- --max-model-len
|
||||
- "8192"
|
||||
- --dtype
|
||||
- auto
|
||||
- --trust-remote-code
|
||||
ports:
|
||||
- "8000:8000"
|
||||
healthcheck:
|
||||
test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
|
||||
interval: 30s
|
||||
timeout: 10s
|
||||
retries: 3
|
||||
start_period: 120s # Model loading takes time
|
||||
deploy:
|
||||
resources:
|
||||
reservations:
|
||||
devices:
|
||||
- driver: nvidia
|
||||
count: 1
|
||||
capabilities: [gpu]
|
||||
labels:
|
||||
- "service=vllm"
|
||||
- "stack=gpu-ai"
|
||||
|
||||
# =============================================================================
|
||||
# ComfyUI - Advanced Stable Diffusion Interface
|
||||
# =============================================================================
|
||||
comfyui:
|
||||
image: ghcr.io/ai-dock/comfyui:latest
|
||||
container_name: gpu_comfyui
|
||||
restart: unless-stopped
|
||||
runtime: nvidia
|
||||
environment:
|
||||
NVIDIA_VISIBLE_DEVICES: all
|
||||
TZ: ${TIMEZONE:-Europe/Berlin}
|
||||
# ComfyUI auto-installs custom nodes on first run
|
||||
COMFYUI_FLAGS: "--listen 0.0.0.0 --port 8188"
|
||||
volumes:
|
||||
- comfyui_data:/data
|
||||
- ${MODELS_PATH:-/workspace/models}/comfyui:/opt/ComfyUI/models
|
||||
- comfyui_output:/opt/ComfyUI/output
|
||||
- comfyui_input:/opt/ComfyUI/input
|
||||
ports:
|
||||
- "8188:8188"
|
||||
healthcheck:
|
||||
test: ["CMD", "curl", "-f", "http://localhost:8188/"]
|
||||
interval: 30s
|
||||
timeout: 10s
|
||||
retries: 3
|
||||
start_period: 60s
|
||||
deploy:
|
||||
resources:
|
||||
reservations:
|
||||
devices:
|
||||
- driver: nvidia
|
||||
count: 1
|
||||
capabilities: [gpu]
|
||||
labels:
|
||||
- "service=comfyui"
|
||||
- "stack=gpu-ai"
|
||||
|
||||
# =============================================================================
|
||||
# Axolotl - LLM Fine-tuning Framework
|
||||
# =============================================================================
|
||||
# Note: This service uses "profiles" - only starts when explicitly requested
|
||||
# Start with: docker compose --profile training up -d axolotl
|
||||
axolotl:
|
||||
image: winglian/axolotl:main-py3.11-cu121-2.2.2
|
||||
container_name: gpu_training
|
||||
runtime: nvidia
|
||||
volumes:
|
||||
- ./training/configs:/workspace/configs
|
||||
- ./training/data:/workspace/data
|
||||
- ./training/output:/workspace/output
|
||||
- ${MODELS_PATH:-/workspace/models}:/workspace/models
|
||||
- training_cache:/root/.cache
|
||||
environment:
|
||||
NVIDIA_VISIBLE_DEVICES: all
|
||||
WANDB_API_KEY: ${WANDB_API_KEY:-}
|
||||
HF_TOKEN: ${HF_TOKEN:-}
|
||||
working_dir: /workspace
|
||||
# Default command - override when running specific training
|
||||
command: sleep infinity
|
||||
deploy:
|
||||
resources:
|
||||
reservations:
|
||||
devices:
|
||||
- driver: nvidia
|
||||
count: 1
|
||||
capabilities: [gpu]
|
||||
profiles:
|
||||
- training
|
||||
labels:
|
||||
- "service=axolotl"
|
||||
- "stack=gpu-ai"
|
||||
|
||||
# =============================================================================
|
||||
# JupyterLab - Interactive Development Environment
|
||||
# =============================================================================
|
||||
jupyter:
|
||||
image: pytorch/pytorch:2.3.0-cuda12.1-cudnn8-devel
|
||||
container_name: gpu_jupyter
|
||||
restart: unless-stopped
|
||||
runtime: nvidia
|
||||
volumes:
|
||||
- ./notebooks:/workspace/notebooks
|
||||
- ${MODELS_PATH:-/workspace/models}:/workspace/models
|
||||
- jupyter_cache:/root/.cache
|
||||
ports:
|
||||
- "8888:8888"
|
||||
environment:
|
||||
NVIDIA_VISIBLE_DEVICES: all
|
||||
JUPYTER_ENABLE_LAB: "yes"
|
||||
JUPYTER_TOKEN: ${JUPYTER_TOKEN:-pivoine-ai-2025}
|
||||
HF_TOKEN: ${HF_TOKEN:-}
|
||||
command: |
|
||||
bash -c "
|
||||
pip install --quiet jupyterlab transformers datasets accelerate bitsandbytes peft trl sentencepiece protobuf &&
|
||||
jupyter lab --ip=0.0.0.0 --port=8888 --allow-root --no-browser --NotebookApp.token='${JUPYTER_TOKEN:-pivoine-ai-2025}'
|
||||
"
|
||||
healthcheck:
|
||||
test: ["CMD", "curl", "-f", "http://localhost:8888/"]
|
||||
interval: 30s
|
||||
timeout: 10s
|
||||
retries: 3
|
||||
start_period: 60s
|
||||
deploy:
|
||||
resources:
|
||||
reservations:
|
||||
devices:
|
||||
- driver: nvidia
|
||||
count: 1
|
||||
capabilities: [gpu]
|
||||
labels:
|
||||
- "service=jupyter"
|
||||
- "stack=gpu-ai"
|
||||
|
||||
# =============================================================================
|
||||
# Netdata - System & GPU Monitoring
|
||||
# =============================================================================
|
||||
netdata:
|
||||
image: netdata/netdata:latest
|
||||
container_name: gpu_netdata
|
||||
restart: unless-stopped
|
||||
runtime: nvidia
|
||||
hostname: gpu-runpod
|
||||
cap_add:
|
||||
- SYS_PTRACE
|
||||
- SYS_ADMIN
|
||||
security_opt:
|
||||
- apparmor:unconfined
|
||||
environment:
|
||||
NVIDIA_VISIBLE_DEVICES: all
|
||||
TZ: ${TIMEZONE:-Europe/Berlin}
|
||||
volumes:
|
||||
- /sys:/host/sys:ro
|
||||
- /proc:/host/proc:ro
|
||||
- /var/run/docker.sock:/var/run/docker.sock:ro
|
||||
- /etc/os-release:/host/etc/os-release:ro
|
||||
- netdata_config:/etc/netdata
|
||||
- netdata_cache:/var/cache/netdata
|
||||
- netdata_lib:/var/lib/netdata
|
||||
ports:
|
||||
- "19999:19999"
|
||||
labels:
|
||||
- "service=netdata"
|
||||
- "stack=gpu-ai"
|
||||
|
||||
# =============================================================================
|
||||
# Volumes
|
||||
# =============================================================================
|
||||
volumes:
|
||||
# ComfyUI data
|
||||
comfyui_data:
|
||||
driver: local
|
||||
comfyui_output:
|
||||
driver: local
|
||||
comfyui_input:
|
||||
driver: local
|
||||
|
||||
# Training data
|
||||
training_cache:
|
||||
driver: local
|
||||
|
||||
# Jupyter data
|
||||
jupyter_cache:
|
||||
driver: local
|
||||
|
||||
# Netdata data
|
||||
netdata_config:
|
||||
driver: local
|
||||
netdata_cache:
|
||||
driver: local
|
||||
netdata_lib:
|
||||
driver: local
|
||||
|
||||
# =============================================================================
|
||||
# Networks
|
||||
# =============================================================================
|
||||
networks:
|
||||
default:
|
||||
driver: bridge
|
||||
ipam:
|
||||
config:
|
||||
- subnet: 172.25.0.0/24
|
||||
@@ -1,199 +0,0 @@
|
||||
# LiteLLM Configuration with GPU Server Integration
|
||||
# This config includes both Anthropic Claude (API) and self-hosted models (vLLM on GPU server)
|
||||
|
||||
model_list:
|
||||
# =============================================================================
|
||||
# Anthropic Claude Models (API-based, for complex reasoning)
|
||||
# =============================================================================
|
||||
|
||||
- model_name: claude-sonnet-4
|
||||
litellm_params:
|
||||
model: anthropic/claude-sonnet-4-20250514
|
||||
api_key: os.environ/ANTHROPIC_API_KEY
|
||||
|
||||
- model_name: claude-sonnet-4.5
|
||||
litellm_params:
|
||||
model: anthropic/claude-sonnet-4-5-20250929
|
||||
api_key: os.environ/ANTHROPIC_API_KEY
|
||||
|
||||
- model_name: claude-3-5-sonnet
|
||||
litellm_params:
|
||||
model: anthropic/claude-3-5-sonnet-20241022
|
||||
api_key: os.environ/ANTHROPIC_API_KEY
|
||||
|
||||
- model_name: claude-3-opus
|
||||
litellm_params:
|
||||
model: anthropic/claude-3-opus-20240229
|
||||
api_key: os.environ/ANTHROPIC_API_KEY
|
||||
|
||||
- model_name: claude-3-haiku
|
||||
litellm_params:
|
||||
model: anthropic/claude-3-haiku-20240307
|
||||
api_key: os.environ/ANTHROPIC_API_KEY
|
||||
|
||||
# =============================================================================
|
||||
# Self-Hosted Models (vLLM on GPU server via WireGuard VPN)
|
||||
# =============================================================================
|
||||
|
||||
# Llama 3.1 8B Instruct - Fast, general-purpose, good for routine tasks
|
||||
- model_name: llama-3.1-8b
|
||||
litellm_params:
|
||||
model: openai/meta-llama/Meta-Llama-3.1-8B-Instruct
|
||||
api_base: http://10.8.0.2:8000/v1
|
||||
api_key: dummy # vLLM doesn't require auth
|
||||
rpm: 1000 # Rate limit: requests per minute
|
||||
tpm: 100000 # Rate limit: tokens per minute
|
||||
|
||||
# Alternative models (uncomment and configure on GPU server as needed)
|
||||
|
||||
# Qwen 2.5 14B Instruct - Excellent multilingual, stronger reasoning
|
||||
# - model_name: qwen-2.5-14b
|
||||
# litellm_params:
|
||||
# model: openai/Qwen/Qwen2.5-14B-Instruct
|
||||
# api_base: http://10.8.0.2:8000/v1
|
||||
# api_key: dummy
|
||||
# rpm: 800
|
||||
# tpm: 80000
|
||||
|
||||
# Mistral 7B Instruct - Very fast, lightweight
|
||||
# - model_name: mistral-7b
|
||||
# litellm_params:
|
||||
# model: openai/mistralai/Mistral-7B-Instruct-v0.3
|
||||
# api_base: http://10.8.0.2:8000/v1
|
||||
# api_key: dummy
|
||||
# rpm: 1200
|
||||
# tpm: 120000
|
||||
|
||||
# DeepSeek Coder 6.7B - Code generation specialist
|
||||
# - model_name: deepseek-coder-6.7b
|
||||
# litellm_params:
|
||||
# model: openai/deepseek-ai/deepseek-coder-6.7b-instruct
|
||||
# api_base: http://10.8.0.2:8000/v1
|
||||
# api_key: dummy
|
||||
# rpm: 1000
|
||||
# tpm: 100000
|
||||
|
||||
# =============================================================================
|
||||
# Router Settings - Intelligent Model Selection
|
||||
# =============================================================================

# Model aliases for easy switching in Open WebUI
model_name_map:
  # Default model (self-hosted, fast)
  gpt-3.5-turbo: llama-3.1-8b

  # Power users can use Claude for complex tasks
  gpt-4: claude-sonnet-4.5
  gpt-4-turbo: claude-sonnet-4.5

# LiteLLM Settings
litellm_settings:
  drop_params: true
  set_verbose: false # Disable verbose logging for better performance

  # Enable caching with Redis for better performance
  cache: true
  cache_params:
    type: redis
    host: redis
    port: 6379
    ttl: 3600 # Cache for 1 hour

  # Force strip specific parameters globally
  allowed_fails: 0

  # Modify params before sending to provider
  modify_params: true

  # Enable success and failure logging but minimize overhead
  success_callback: [] # Disable all success callbacks to reduce DB writes
  failure_callback: [] # Disable all failure callbacks

# Router Settings
router_settings:
  allowed_fails: 0

  # Routing strategy: Try self-hosted first, fallback to Claude on failure
  routing_strategy: simple-shuffle

  # Cooldown for failed models
  cooldown_time: 30 # seconds

  # Drop unsupported parameters
  default_litellm_params:
    drop_params: true

# General Settings
general_settings:
  disable_responses_id_security: true

  # Disable spend tracking to reduce database overhead
  disable_spend_logs: false # Keep enabled to track API vs GPU costs

  # Disable tag tracking
  disable_tag_tracking: true

  # Disable daily spend updates
  disable_daily_spend_logs: false # Keep enabled for cost analysis

  # Master key for authentication (set via env var)
  master_key: os.environ/LITELLM_MASTER_KEY

  # Database for logging (optional but recommended for cost tracking)
  database_url: os.environ/DATABASE_URL

  # Enable OpenAPI docs
  docs_url: /docs

# =============================================================================
# Usage Guidelines (for Open WebUI users)
# =============================================================================
#
# Model Selection Guide:
#
# Use llama-3.1-8b for:
# - General chat and Q&A
# - Simple code generation
# - Data extraction
# - Summarization
# - Translation
# - Most routine tasks
# Cost: ~$0/month (self-hosted)
# Speed: ~50-80 tokens/second
#
# Use qwen-2.5-14b for:
# - Complex reasoning
# - Multi-step problems
# - Advanced code generation
# - Multilingual tasks
# Cost: ~$0/month (self-hosted)
# Speed: ~30-50 tokens/second
#
# Use claude-sonnet-4.5 for:
# - Very complex reasoning
# - Long documents (200K context)
# - Production-critical code
# - When quality matters most
# Cost: ~$3/million input tokens, ~$15/million output tokens
# Speed: ~30-40 tokens/second
#
# Use claude-3-haiku for:
# - API fallback (if self-hosted down)
# - Very fast responses needed
# Cost: ~$0.25/million input tokens, ~$1.25/million output tokens
# Speed: ~60-80 tokens/second
#
# =============================================================================

# Health Check Configuration
health_check:
  # Check vLLM health endpoint
  enabled: true
  interval: 30 # seconds
  timeout: 5 # seconds

# Fallback Configuration
# If GPU server is down, automatically use Claude
fallback:
  - ["llama-3.1-8b", "claude-3-haiku"]
  - ["qwen-2.5-14b", "claude-sonnet-4.5"]
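The alias map and fallback pairs above let Open WebUI keep sending OpenAI-style model names while LiteLLM routes them to the self-hosted vLLM servers or Anthropic. A minimal smoke test, assuming the proxy is reachable on the compose network as `litellm` on port 4000 (both hypothetical names, not taken from these files) and `LITELLM_MASTER_KEY` is exported:

# Hedged sketch: "gpt-4" is remapped to claude-sonnet-4.5 per the alias map above.
# Adjust host/port to wherever the LiteLLM proxy actually listens.
curl -s http://litellm:4000/v1/chat/completions \
  -H "Authorization: Bearer ${LITELLM_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "gpt-4",
        "messages": [{"role": "user", "content": "One-line summary of pgvector?"}]
      }'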
@@ -24,30 +24,57 @@ model_list:
      model: anthropic/claude-3-haiku-20240307
      api_key: os.environ/ANTHROPIC_API_KEY

  # ===========================================================================
  # SELF-HOSTED MODELS - DIRECT vLLM SERVERS (GPU Server via Tailscale VPN)
  # ===========================================================================
  # Direct connections to dedicated vLLM servers (no orchestrator)

  # Text Generation - Llama 3.1 8B (Port 8001)
  - model_name: llama-3.1-8b
    litellm_params:
      model: hosted_vllm/meta-llama/Llama-3.1-8B-Instruct # hosted_vllm/openai/ prefix for proper streaming
      api_base: os.environ/GPU_VLLM_LLAMA_URL # Direct to vLLM Llama server
      api_key: "EMPTY" # vLLM doesn't validate API keys
      rpm: 1000
      tpm: 100000
      timeout: 600 # 10 minutes for generation
      stream_timeout: 600
      supports_system_messages: true # Llama supports system messages
      stream: true # Enable streaming by default

  # Embeddings - BGE Large (Port 8002)
  - model_name: bge-large-en
    litellm_params:
      model: openai/BAAI/bge-large-en-v1.5
      api_base: os.environ/GPU_VLLM_BGE_URL
      api_key: "EMPTY"
      rpm: 1000
      tpm: 500000

litellm_settings:
  drop_params: true
  set_verbose: false # Disable verbose logging for better performance
  # Enable caching with Redis for better performance
  drop_params: false # DISABLED: Was breaking streaming
  set_verbose: true # Enable verbose logging for debugging streaming issues
  # Enable caching now that streaming is fixed
  cache: true
  cache_params:
    type: redis
    host: redis
    host: core_redis
    port: 6379
    ttl: 3600 # Cache for 1 hour
    ttl: 3600 # Cache for 1 hour
  # Force strip specific parameters globally
  allowed_fails: 0
  # Modify params before sending to provider
  modify_params: true
  modify_params: false # DISABLED: Was breaking streaming
  # Enable success and failure logging but minimize overhead
  success_callback: [] # Disable all success callbacks to reduce DB writes
  failure_callback: [] # Disable all failure callbacks
  success_callback: [] # Disable all success callbacks to reduce DB writes
  failure_callback: [] # Disable all failure callbacks

router_settings:
  allowed_fails: 0

  # Drop unsupported parameters
  default_litellm_params:
    drop_params: true
    drop_params: false # DISABLED: Was breaking streaming

general_settings:
  disable_responses_id_security: true
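This hunk trades `drop_params`/`modify_params` for working token streaming and points the cache at `core_redis`. A quick way to confirm streaming still behaves after a config reload, again assuming the proxy answers at `litellm:4000` (hypothetical address) with `LITELLM_MASTER_KEY` set:

# Hedged sketch: -N disables curl buffering so SSE chunks print as they arrive.
curl -N -s http://litellm:4000/v1/chat/completions \
  -H "Authorization: Bearer ${LITELLM_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "llama-3.1-8b",
        "stream": true,
        "messages": [{"role": "user", "content": "Count to five."}]
      }'
# Expect a stream of "data: {...}" chunks ending with "data: [DONE]".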
60
ai/nginx.conf.template
Normal file
@@ -0,0 +1,60 @@
events {
    worker_connections 1024;
}

http {
    # MIME types
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    # DNS resolver for Tailscale MagicDNS
    resolver 100.100.100.100 8.8.8.8 valid=30s;
    resolver_timeout 5s;

    # Proxy settings
    proxy_http_version 1.1;
    proxy_buffering off;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;

    # WebSocket support
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";

    # Timeouts for long-running audio/image generation
    proxy_connect_timeout 600;
    proxy_send_timeout 600;
    proxy_read_timeout 600;
    send_timeout 600;

    server {
        listen 80;
        server_name _;

        # Increase client body size for image uploads
        client_max_body_size 100M;

        location / {
            # Proxy to service on RunPod via Tailscale
            # Use variable to force runtime DNS resolution (not startup)
            set $backend http://${GPU_SERVICE_HOST}:${GPU_SERVICE_PORT};
            proxy_pass $backend;

            # WebSocket upgrade
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";

            # Proxy headers
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            # Disable buffering for real-time updates
            proxy_buffering off;
        }
    }
}
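The `${GPU_SERVICE_HOST}`/`${GPU_SERVICE_PORT}` placeholders mean this template has to be rendered before nginx starts, presumably with envsubst. A minimal sketch of that rendering step for a local sanity check; the example values are assumptions, not taken from the compose files:

# Render the template with hypothetical values and test the resulting config.
export GPU_SERVICE_HOST=runpod-gpu    # Tailscale MagicDNS name or 100.x address (placeholder)
export GPU_SERVICE_PORT=8188          # ComfyUI's default listen port
envsubst '${GPU_SERVICE_HOST} ${GPU_SERVICE_PORT}' \
  < ai/nginx.conf.template > /tmp/comfyui-nginx.conf
docker run --rm -v /tmp/comfyui-nginx.conf:/etc/nginx/nginx.conf:ro nginx:alpine nginx -t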
@@ -1,302 +0,0 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Simple vLLM server using AsyncLLMEngine directly
|
||||
Bypasses the multiprocessing issues we hit with the default vLLM API server
|
||||
OpenAI-compatible endpoints: /v1/models and /v1/completions
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
from typing import AsyncIterator, Dict, List, Optional
|
||||
|
||||
from fastapi import FastAPI, Request
|
||||
from fastapi.responses import JSONResponse, StreamingResponse
|
||||
from pydantic import BaseModel, Field
|
||||
from vllm import AsyncLLMEngine, AsyncEngineArgs, SamplingParams
|
||||
from vllm.utils import random_uuid
|
||||
|
||||
# Configure logging
|
||||
logging.basicConfig(
|
||||
level=logging.INFO,
|
||||
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
|
||||
)
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# FastAPI app
|
||||
app = FastAPI(title="Simple vLLM Server", version="1.0.0")
|
||||
|
||||
# Global engine instance
|
||||
engine: Optional[AsyncLLMEngine] = None
|
||||
model_name: str = "Qwen/Qwen2.5-7B-Instruct"
|
||||
|
||||
# Request/Response models
|
||||
class CompletionRequest(BaseModel):
|
||||
"""OpenAI-compatible completion request"""
|
||||
model: str = Field(default="qwen-2.5-7b")
|
||||
prompt: str | List[str] = Field(..., description="Text prompt(s)")
|
||||
max_tokens: int = Field(default=512, ge=1, le=4096)
|
||||
temperature: float = Field(default=0.7, ge=0.0, le=2.0)
|
||||
top_p: float = Field(default=1.0, ge=0.0, le=1.0)
|
||||
n: int = Field(default=1, ge=1, le=10)
|
||||
stream: bool = Field(default=False)
|
||||
stop: Optional[str | List[str]] = None
|
||||
presence_penalty: float = Field(default=0.0, ge=-2.0, le=2.0)
|
||||
frequency_penalty: float = Field(default=0.0, ge=-2.0, le=2.0)
|
||||
|
||||
class ChatMessage(BaseModel):
|
||||
"""Chat message format"""
|
||||
role: str = Field(..., description="Role: system, user, or assistant")
|
||||
content: str = Field(..., description="Message content")
|
||||
|
||||
class ChatCompletionRequest(BaseModel):
|
||||
"""OpenAI-compatible chat completion request"""
|
||||
model: str = Field(default="qwen-2.5-7b")
|
||||
messages: List[ChatMessage] = Field(..., description="Chat messages")
|
||||
max_tokens: int = Field(default=512, ge=1, le=4096)
|
||||
temperature: float = Field(default=0.7, ge=0.0, le=2.0)
|
||||
top_p: float = Field(default=1.0, ge=0.0, le=1.0)
|
||||
n: int = Field(default=1, ge=1, le=10)
|
||||
stream: bool = Field(default=False)
|
||||
stop: Optional[str | List[str]] = None
|
||||
|
||||
@app.on_event("startup")
|
||||
async def startup_event():
|
||||
"""Initialize vLLM engine on startup"""
|
||||
global engine, model_name
|
||||
|
||||
logger.info(f"Initializing vLLM AsyncLLMEngine with model: {model_name}")
|
||||
|
||||
# Configure engine
|
||||
engine_args = AsyncEngineArgs(
|
||||
model=model_name,
|
||||
tensor_parallel_size=1, # Single GPU
|
||||
gpu_memory_utilization=0.85, # Use 85% of GPU memory
|
||||
max_model_len=4096, # Context length
|
||||
dtype="auto", # Auto-detect dtype
|
||||
download_dir="/workspace/huggingface_cache", # Large disk
|
||||
trust_remote_code=True, # Some models require this
|
||||
enforce_eager=False, # Use CUDA graphs for better performance
|
||||
)
|
||||
|
||||
# Create async engine
|
||||
engine = AsyncLLMEngine.from_engine_args(engine_args)
|
||||
|
||||
logger.info("vLLM AsyncLLMEngine initialized successfully")
|
||||
|
||||
@app.get("/")
|
||||
async def root():
|
||||
"""Health check endpoint"""
|
||||
return {"status": "ok", "model": model_name}
|
||||
|
||||
@app.get("/health")
|
||||
async def health():
|
||||
"""Detailed health check"""
|
||||
return {
|
||||
"status": "healthy" if engine else "initializing",
|
||||
"model": model_name,
|
||||
"ready": engine is not None
|
||||
}
|
||||
|
||||
@app.get("/v1/models")
|
||||
async def list_models():
|
||||
"""OpenAI-compatible models endpoint"""
|
||||
return {
|
||||
"object": "list",
|
||||
"data": [
|
||||
{
|
||||
"id": "qwen-2.5-7b",
|
||||
"object": "model",
|
||||
"created": 1234567890,
|
||||
"owned_by": "pivoine-gpu",
|
||||
"permission": [],
|
||||
"root": model_name,
|
||||
"parent": None,
|
||||
}
|
||||
]
|
||||
}
|
||||
|
||||
def messages_to_prompt(messages: List[ChatMessage]) -> str:
|
||||
"""Convert chat messages to a single prompt string"""
|
||||
# Qwen 2.5 chat template format
|
||||
prompt_parts = []
|
||||
|
||||
for msg in messages:
|
||||
role = msg.role
|
||||
content = msg.content
|
||||
|
||||
if role == "system":
|
||||
prompt_parts.append(f"<|im_start|>system\n{content}<|im_end|>")
|
||||
elif role == "user":
|
||||
prompt_parts.append(f"<|im_start|>user\n{content}<|im_end|>")
|
||||
elif role == "assistant":
|
||||
prompt_parts.append(f"<|im_start|>assistant\n{content}<|im_end|>")
|
||||
|
||||
# Add final assistant prompt
|
||||
prompt_parts.append("<|im_start|>assistant\n")
|
||||
|
||||
return "\n".join(prompt_parts)
|
||||
|
||||
@app.post("/v1/completions")
|
||||
async def create_completion(request: CompletionRequest):
|
||||
"""OpenAI-compatible completion endpoint"""
|
||||
if not engine:
|
||||
return JSONResponse(
|
||||
status_code=503,
|
||||
content={"error": "Engine not initialized"}
|
||||
)
|
||||
|
||||
# Handle both single prompt and batch prompts
|
||||
prompts = [request.prompt] if isinstance(request.prompt, str) else request.prompt
|
||||
|
||||
# Configure sampling parameters
|
||||
sampling_params = SamplingParams(
|
||||
temperature=request.temperature,
|
||||
top_p=request.top_p,
|
||||
max_tokens=request.max_tokens,
|
||||
n=request.n,
|
||||
stop=request.stop if request.stop else [],
|
||||
presence_penalty=request.presence_penalty,
|
||||
frequency_penalty=request.frequency_penalty,
|
||||
)
|
||||
|
||||
# Generate completions
|
||||
results = []
|
||||
for prompt in prompts:
|
||||
request_id = random_uuid()
|
||||
|
||||
if request.stream:
|
||||
# Streaming response
|
||||
async def generate_stream():
|
||||
async for output in engine.generate(prompt, sampling_params, request_id):
|
||||
chunk = {
|
||||
"id": request_id,
|
||||
"object": "text_completion",
|
||||
"created": 1234567890,
|
||||
"model": request.model,
|
||||
"choices": [
|
||||
{
|
||||
"text": output.outputs[0].text,
|
||||
"index": 0,
|
||||
"logprobs": None,
|
||||
"finish_reason": output.outputs[0].finish_reason,
|
||||
}
|
||||
]
|
||||
}
|
||||
yield f"data: {json.dumps(chunk)}\n\n"
|
||||
yield "data: [DONE]\n\n"
|
||||
|
||||
return StreamingResponse(generate_stream(), media_type="text/event-stream")
|
||||
else:
|
||||
# Non-streaming response
|
||||
async for output in engine.generate(prompt, sampling_params, request_id):
|
||||
final_output = output
|
||||
|
||||
results.append({
|
||||
"text": final_output.outputs[0].text,
|
||||
"index": len(results),
|
||||
"logprobs": None,
|
||||
"finish_reason": final_output.outputs[0].finish_reason,
|
||||
})
|
||||
|
||||
return {
|
||||
"id": random_uuid(),
|
||||
"object": "text_completion",
|
||||
"created": 1234567890,
|
||||
"model": request.model,
|
||||
"choices": results,
|
||||
"usage": {
|
||||
"prompt_tokens": 0, # vLLM doesn't expose this easily
|
||||
"completion_tokens": 0,
|
||||
"total_tokens": 0,
|
||||
}
|
||||
}
|
||||
|
||||
@app.post("/v1/chat/completions")
|
||||
async def create_chat_completion(request: ChatCompletionRequest):
|
||||
"""OpenAI-compatible chat completion endpoint"""
|
||||
if not engine:
|
||||
return JSONResponse(
|
||||
status_code=503,
|
||||
content={"error": "Engine not initialized"}
|
||||
)
|
||||
|
||||
# Convert messages to prompt
|
||||
prompt = messages_to_prompt(request.messages)
|
||||
|
||||
# Configure sampling parameters
|
||||
sampling_params = SamplingParams(
|
||||
temperature=request.temperature,
|
||||
top_p=request.top_p,
|
||||
max_tokens=request.max_tokens,
|
||||
n=request.n,
|
||||
stop=request.stop if request.stop else ["<|im_end|>"],
|
||||
)
|
||||
|
||||
request_id = random_uuid()
|
||||
|
||||
if request.stream:
|
||||
# Streaming response
|
||||
async def generate_stream():
|
||||
async for output in engine.generate(prompt, sampling_params, request_id):
|
||||
chunk = {
|
||||
"id": request_id,
|
||||
"object": "chat.completion.chunk",
|
||||
"created": 1234567890,
|
||||
"model": request.model,
|
||||
"choices": [
|
||||
{
|
||||
"index": 0,
|
||||
"delta": {"content": output.outputs[0].text},
|
||||
"finish_reason": output.outputs[0].finish_reason,
|
||||
}
|
||||
]
|
||||
}
|
||||
yield f"data: {json.dumps(chunk)}\n\n"
|
||||
yield "data: [DONE]\n\n"
|
||||
|
||||
return StreamingResponse(generate_stream(), media_type="text/event-stream")
|
||||
else:
|
||||
# Non-streaming response
|
||||
async for output in engine.generate(prompt, sampling_params, request_id):
|
||||
final_output = output
|
||||
|
||||
return {
|
||||
"id": request_id,
|
||||
"object": "chat.completion",
|
||||
"created": 1234567890,
|
||||
"model": request.model,
|
||||
"choices": [
|
||||
{
|
||||
"index": 0,
|
||||
"message": {
|
||||
"role": "assistant",
|
||||
"content": final_output.outputs[0].text,
|
||||
},
|
||||
"finish_reason": final_output.outputs[0].finish_reason,
|
||||
}
|
||||
],
|
||||
"usage": {
|
||||
"prompt_tokens": 0,
|
||||
"completion_tokens": 0,
|
||||
"total_tokens": 0,
|
||||
}
|
||||
}
|
||||
|
||||
if __name__ == "__main__":
|
||||
import uvicorn
|
||||
|
||||
# Get configuration from environment
|
||||
host = os.getenv("VLLM_HOST", "0.0.0.0")
|
||||
port = int(os.getenv("VLLM_PORT", "8000"))
|
||||
|
||||
logger.info(f"Starting vLLM server on {host}:{port}")
|
||||
|
||||
uvicorn.run(
|
||||
app,
|
||||
host=host,
|
||||
port=port,
|
||||
log_level="info",
|
||||
access_log=True,
|
||||
)
|
||||
67
arty.yml
@@ -78,16 +78,20 @@ envs:
  UTIL_JOPLIN_DB_NAME: joplin
  # PairDrop
  UTIL_DROP_TRAEFIK_HOST: drop.pivoine.art
  # Media Stack (Jellyfin, Filestash, Pinchflat)
  # Media Stack (Jellyfin, Pinchflat)
  MEDIA_TRAEFIK_ENABLED: true
  MEDIA_COMPOSE_PROJECT_NAME: media
  MEDIA_JELLYFIN_IMAGE: jellyfin/jellyfin:latest
  MEDIA_JELLYFIN_TRAEFIK_HOST: jellyfin.media.pivoine.art
  MEDIA_FILESTASH_IMAGE: machines/filestash:latest
  MEDIA_FILESTASH_TRAEFIK_HOST: filestash.media.pivoine.art
  MEDIA_FILESTASH_CANARY: true
  MEDIA_PINCHFLAT_IMAGE: ghcr.io/kieraneglin/pinchflat:latest
  MEDIA_PINCHFLAT_TRAEFIK_HOST: pinchflat.media.pivoine.art
  # Immich - Photo and video management
  MEDIA_IMMICH_SERVER_IMAGE: ghcr.io/immich-app/immich-server:release
  MEDIA_IMMICH_ML_IMAGE: ghcr.io/immich-app/immich-machine-learning:release
  MEDIA_IMMICH_POSTGRES_IMAGE: ghcr.io/immich-app/postgres:14-vectorchord0.4.3-pgvectors0.2.0
  MEDIA_IMMICH_TRAEFIK_HOST: immich.media.pivoine.art
  MEDIA_IMMICH_DB_NAME: immich
  MEDIA_IMMICH_DB_USER: immich
  # Dev (Gitea + Coolify)
  DEV_TRAEFIK_ENABLED: true
  DEV_COMPOSE_PROJECT_NAME: dev
@@ -135,7 +139,6 @@ envs:
  AI_COMPOSE_PROJECT_NAME: ai
  AI_POSTGRES_IMAGE: pgvector/pgvector:pg16
  AI_WEBUI_IMAGE: ghcr.io/open-webui/open-webui:main
  AI_CRAWL4AI_IMAGE: unclecode/crawl4ai:latest
  AI_FACEFUSION_IMAGE: facefusion/facefusion:3.5.0-cpu
  AI_FACEFUSION_TRAEFIK_ENABLED: true
  AI_FACEFUSION_TRAEFIK_HOST: facefusion.ai.pivoine.art
@@ -263,3 +266,57 @@ scripts:
    docker restart sexy_api &&
    echo "✓ Directus API restarted"
  net/create: docker network create "$NETWORK_NAME"
  # Setup iptables NAT for Docker containers to reach Tailscale network
  # Requires Tailscale installed on host: curl -fsSL https://tailscale.com/install.sh | sh
  tailscale/setup: |
    echo "Setting up iptables for Docker-to-Tailscale routing..."

    # Enable IP forwarding
    sudo sysctl -w net.ipv4.ip_forward=1
    grep -q "net.ipv4.ip_forward=1" /etc/sysctl.conf || echo "net.ipv4.ip_forward=1" | sudo tee -a /etc/sysctl.conf

    # Get Docker network CIDR
    DOCKER_CIDR=$(docker network inspect ${NETWORK_NAME} --format '{{range .IPAM.Config}}{{.Subnet}}{{end}}' 2>/dev/null || echo "172.18.0.0/16")
    echo "Docker network CIDR: $DOCKER_CIDR"

    # Add NAT rule (check if already exists)
    if ! sudo iptables -t nat -C POSTROUTING -s "$DOCKER_CIDR" -o tailscale0 -j MASQUERADE 2>/dev/null; then
      sudo iptables -t nat -A POSTROUTING -s "$DOCKER_CIDR" -o tailscale0 -j MASQUERADE
      echo "✓ iptables NAT rule added"
    else
      echo "✓ iptables NAT rule already exists"
    fi

    # Persist rules
    sudo netfilter-persistent save 2>/dev/null || echo "Install iptables-persistent to persist rules: sudo apt install iptables-persistent"

    echo "✓ Tailscale routing configured"

  # Install and configure Tailscale on host with persistent state
  tailscale/install: |
    echo "Installing Tailscale..."

    # Install Tailscale if not present
    if ! command -v tailscale &> /dev/null; then
      curl -fsSL https://tailscale.com/install.sh | sh
    else
      echo "✓ Tailscale already installed"
    fi

    # Create state directory for persistence
    TAILSCALE_STATE="/var/lib/tailscale"
    sudo mkdir -p "$TAILSCALE_STATE"

    # Start and enable tailscaled service
    sudo systemctl enable --now tailscaled

    # Connect to Tailscale network
    echo "Connecting to Tailscale..."
    sudo tailscale up --authkey="$TAILSCALE_AUTHKEY" --hostname=vps

    # Show status
    echo ""
    tailscale status
    echo ""
    echo "✓ Tailscale installed and connected"
    echo "  Run 'arty tailscale/setup' to configure iptables routing for Docker"
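After `arty tailscale/install` and `arty tailscale/setup`, containers on the compose network should reach Tailscale peers through the host's MASQUERADE rule. A hedged verification sketch; the peer address and port are placeholders, not values from this file:

# Confirm the NAT rule exists on the host.
sudo iptables -t nat -L POSTROUTING -n | grep -E 'MASQUERADE.*tailscale0'

# From inside the Docker network, probe a Tailscale peer (address/port are hypothetical).
docker run --rm --network "$NETWORK_NAME" curlimages/curl:latest \
  -fsS --max-time 5 http://100.64.0.10:8188/ > /dev/null \
  && echo "✓ Docker-to-Tailscale routing works"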
370
core/backrest/config.json
Normal file
@@ -0,0 +1,370 @@
|
||||
{
|
||||
"modno": 1,
|
||||
"version": 4,
|
||||
"instance": "falcon",
|
||||
"repos": [
|
||||
{
|
||||
"id": "hidrive-backup",
|
||||
"uri": "/repos",
|
||||
"guid": "df03886ea215b0a3ff9730190d906d7034032bf0f1906ed4ad00f2c4f1748215",
|
||||
"password": "falcon-backup-2025",
|
||||
"prunePolicy": {
|
||||
"schedule": {
|
||||
"cron": "0 2 * * 0"
|
||||
}
|
||||
},
|
||||
"checkPolicy": {
|
||||
"schedule": {
|
||||
"cron": "0 3 * * 0"
|
||||
}
|
||||
},
|
||||
"autoUnlock": true
|
||||
}
|
||||
],
|
||||
"plans": [
|
||||
{
|
||||
"id": "ai-backup",
|
||||
"repo": "hidrive-backup",
|
||||
"paths": [
|
||||
"/volumes/ai_postgres_data",
|
||||
"/volumes/ai_webui_data"
|
||||
],
|
||||
"schedule": {
|
||||
"cron": "0 3 * * *"
|
||||
},
|
||||
"retention": {
|
||||
"policyTimeBucketed": {
|
||||
"daily": 7,
|
||||
"weekly": 4,
|
||||
"monthly": 6,
|
||||
"yearly": 2
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"id": "asciinema-backup",
|
||||
"repo": "hidrive-backup",
|
||||
"paths": [
|
||||
"/volumes/asciinema_data"
|
||||
],
|
||||
"schedule": {
|
||||
"cron": "0 11 * * *"
|
||||
},
|
||||
"retention": {
|
||||
"policyTimeBucketed": {
|
||||
"daily": 7,
|
||||
"weekly": 4,
|
||||
"monthly": 6,
|
||||
"yearly": 2
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"id": "coolify-backup",
|
||||
"repo": "hidrive-backup",
|
||||
"paths": [
|
||||
"/volumes/dev_coolify_data"
|
||||
],
|
||||
"schedule": {
|
||||
"cron": "0 0 * * *"
|
||||
},
|
||||
"retention": {
|
||||
"policyTimeBucketed": {
|
||||
"daily": 7,
|
||||
"weekly": 4,
|
||||
"monthly": 6,
|
||||
"yearly": 2
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"id": "directus-bundle-backup",
|
||||
"repo": "hidrive-backup",
|
||||
"paths": [
|
||||
"/volumes/directus_bundle"
|
||||
],
|
||||
"schedule": {
|
||||
"cron": "0 4 * * *"
|
||||
},
|
||||
"retention": {
|
||||
"policyTimeBucketed": {
|
||||
"daily": 7,
|
||||
"weekly": 4,
|
||||
"monthly": 3
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"id": "directus-uploads-backup",
|
||||
"repo": "hidrive-backup",
|
||||
"paths": [
|
||||
"/volumes/directus_uploads"
|
||||
],
|
||||
"schedule": {
|
||||
"cron": "0 4 * * *"
|
||||
},
|
||||
"retention": {
|
||||
"policyTimeBucketed": {
|
||||
"daily": 7,
|
||||
"weekly": 4,
|
||||
"monthly": 6,
|
||||
"yearly": 2
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"id": "gitea-backup",
|
||||
"repo": "hidrive-backup",
|
||||
"paths": [
|
||||
"/volumes/dev_gitea_config",
|
||||
"/volumes/dev_gitea_data",
|
||||
"/volumes/dev_gitea_runner_data"
|
||||
],
|
||||
"schedule": {
|
||||
"cron": "0 11 * * *"
|
||||
},
|
||||
"retention": {
|
||||
"policyTimeBucketed": {
|
||||
"daily": 7,
|
||||
"weekly": 4,
|
||||
"monthly": 6,
|
||||
"yearly": 2
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"id": "jellyfin-backup",
|
||||
"repo": "hidrive-backup",
|
||||
"paths": [
|
||||
"/volumes/jelly_config"
|
||||
],
|
||||
"schedule": {
|
||||
"cron": "0 9 * * *"
|
||||
},
|
||||
"retention": {
|
||||
"policyTimeBucketed": {
|
||||
"daily": 7,
|
||||
"weekly": 4,
|
||||
"monthly": 6,
|
||||
"yearly": 2
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"id": "joplin-backup",
|
||||
"repo": "hidrive-backup",
|
||||
"paths": [
|
||||
"/volumes/joplin_data"
|
||||
],
|
||||
"schedule": {
|
||||
"cron": "0 2 * * *"
|
||||
},
|
||||
"retention": {
|
||||
"policyTimeBucketed": {
|
||||
"daily": 7,
|
||||
"weekly": 4,
|
||||
"monthly": 6,
|
||||
"yearly": 2
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"id": "letsencrypt-backup",
|
||||
"repo": "hidrive-backup",
|
||||
"paths": [
|
||||
"/volumes/letsencrypt_data"
|
||||
],
|
||||
"schedule": {
|
||||
"cron": "0 8 * * *"
|
||||
},
|
||||
"retention": {
|
||||
"policyTimeBucketed": {
|
||||
"daily": 7,
|
||||
"weekly": 4,
|
||||
"monthly": 12,
|
||||
"yearly": 3
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"id": "linkwarden-backup",
|
||||
"repo": "hidrive-backup",
|
||||
"paths": [
|
||||
"/volumes/linkwarden_data",
|
||||
"/volumes/linkwarden_meili_data"
|
||||
],
|
||||
"schedule": {
|
||||
"cron": "0 7 * * *"
|
||||
},
|
||||
"retention": {
|
||||
"policyTimeBucketed": {
|
||||
"daily": 7,
|
||||
"weekly": 4,
|
||||
"monthly": 6
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"id": "mattermost-backup",
|
||||
"repo": "hidrive-backup",
|
||||
"paths": [
|
||||
"/volumes/mattermost_config",
|
||||
"/volumes/mattermost_data",
|
||||
"/volumes/mattermost_plugins"
|
||||
],
|
||||
"schedule": {
|
||||
"cron": "0 5 * * *"
|
||||
},
|
||||
"retention": {
|
||||
"policyTimeBucketed": {
|
||||
"daily": 7,
|
||||
"weekly": 4,
|
||||
"monthly": 6,
|
||||
"yearly": 2
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"id": "n8n-backup",
|
||||
"repo": "hidrive-backup",
|
||||
"paths": [
|
||||
"/volumes/n8n_data"
|
||||
],
|
||||
"schedule": {
|
||||
"cron": "0 6 * * *"
|
||||
},
|
||||
"retention": {
|
||||
"policyTimeBucketed": {
|
||||
"daily": 7,
|
||||
"weekly": 4,
|
||||
"monthly": 6
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"id": "netdata-backup",
|
||||
"repo": "hidrive-backup",
|
||||
"paths": [
|
||||
"/volumes/netdata_config"
|
||||
],
|
||||
"schedule": {
|
||||
"cron": "0 10 * * *"
|
||||
},
|
||||
"retention": {
|
||||
"policyTimeBucketed": {
|
||||
"daily": 7,
|
||||
"weekly": 4,
|
||||
"monthly": 3
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"id": "postgres-backup",
|
||||
"repo": "hidrive-backup",
|
||||
"paths": [
|
||||
"/volumes/core_postgres_data"
|
||||
],
|
||||
"schedule": {
|
||||
"cron": "0 2 * * *"
|
||||
},
|
||||
"retention": {
|
||||
"policyTimeBucketed": {
|
||||
"daily": 7,
|
||||
"weekly": 4,
|
||||
"monthly": 6,
|
||||
"yearly": 2
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"id": "redis-backup",
|
||||
"repo": "hidrive-backup",
|
||||
"paths": [
|
||||
"/volumes/core_redis_data"
|
||||
],
|
||||
"schedule": {
|
||||
"cron": "0 3 * * *"
|
||||
},
|
||||
"retention": {
|
||||
"policyTimeBucketed": {
|
||||
"daily": 7,
|
||||
"weekly": 4,
|
||||
"monthly": 3
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"id": "scrapy-backup",
|
||||
"repo": "hidrive-backup",
|
||||
"paths": [
|
||||
"/volumes/scrapy_code",
|
||||
"/volumes/scrapyd_data"
|
||||
],
|
||||
"schedule": {
|
||||
"cron": "0 6 * * *"
|
||||
},
|
||||
"retention": {
|
||||
"policyTimeBucketed": {
|
||||
"daily": 7,
|
||||
"weekly": 4,
|
||||
"monthly": 3
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"id": "tandoor-backup",
|
||||
"repo": "hidrive-backup",
|
||||
"paths": [
|
||||
"/volumes/tandoor_mediafiles",
|
||||
"/volumes/tandoor_staticfiles"
|
||||
],
|
||||
"schedule": {
|
||||
"cron": "0 5 * * *"
|
||||
},
|
||||
"retention": {
|
||||
"policyTimeBucketed": {
|
||||
"daily": 7,
|
||||
"weekly": 4,
|
||||
"monthly": 6
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"id": "vaultwarden-backup",
|
||||
"repo": "hidrive-backup",
|
||||
"paths": [
|
||||
"/volumes/vaultwarden_data"
|
||||
],
|
||||
"schedule": {
|
||||
"cron": "0 8 * * *"
|
||||
},
|
||||
"retention": {
|
||||
"policyTimeBucketed": {
|
||||
"daily": 7,
|
||||
"weekly": 4,
|
||||
"monthly": 12,
|
||||
"yearly": 3
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"id": "immich-backup",
|
||||
"repo": "hidrive-backup",
|
||||
"paths": [
|
||||
"/volumes/immich_postgres_data",
|
||||
"/volumes/immich_upload"
|
||||
],
|
||||
"schedule": {
|
||||
"cron": "0 12 * * *"
|
||||
},
|
||||
"retention": {
|
||||
"policyTimeBucketed": {
|
||||
"daily": 7,
|
||||
"weekly": 4,
|
||||
"monthly": 6,
|
||||
"yearly": 2
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
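Every plan above writes into the single `hidrive-backup` restic repository at `/repos`. A hedged spot-check that snapshots are actually landing; the container name `core_backrest` is an assumption (the real name follows `${CORE_COMPOSE_PROJECT_NAME}`), and it presumes a restic binary is available inside the container:

# List snapshots in the repo defined above, using the repo password from this config.
docker exec -e RESTIC_PASSWORD='falcon-backup-2025' core_backrest \
  restic -r /repos snapshots --compact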
|
||||
@@ -56,7 +56,7 @@ services:
|
||||
volumes:
|
||||
# Backrest application data
|
||||
- backrest_data:/data
|
||||
- backrest_config:/config
|
||||
- ./backrest:/config
|
||||
- backrest_cache:/cache
|
||||
- backrest_tmp:/tmp
|
||||
|
||||
@@ -74,7 +74,6 @@ services:
|
||||
- backup_util_tandoor_staticfiles:/volumes/tandoor_staticfiles:ro
|
||||
- backup_util_tandoor_mediafiles:/volumes/tandoor_mediafiles:ro
|
||||
- backup_n8n_data:/volumes/n8n_data:ro
|
||||
- backup_filestash_data:/volumes/filestash_data:ro
|
||||
- backup_util_linkwarden_data:/volumes/linkwarden_data:ro
|
||||
- backup_util_linkwarden_meili_data:/volumes/linkwarden_meili_data:ro
|
||||
- backup_letsencrypt_data:/volumes/letsencrypt_data:ro
|
||||
@@ -84,12 +83,13 @@ services:
|
||||
- backup_netdata_config:/volumes/netdata_config:ro
|
||||
- backup_ai_postgres_data:/volumes/ai_postgres_data:ro
|
||||
- backup_ai_webui_data:/volumes/ai_webui_data:ro
|
||||
- backup_ai_crawl4ai_data:/volumes/ai_crawl4ai_data:ro
|
||||
- backup_asciinema_data:/volumes/asciinema_data:ro
|
||||
- backup_dev_gitea_data:/volumes/dev_gitea_data:ro
|
||||
- backup_dev_gitea_config:/volumes/dev_gitea_config:ro
|
||||
- backup_dev_gitea_runner_data:/volumes/dev_gitea_runner_data:ro
|
||||
- backup_dev_coolify_data:/volumes/dev_coolify_data:ro
|
||||
- backup_media_immich_postgres_data:/volumes/immich_postgres_data:ro
|
||||
- backup_media_immich_upload:/volumes/immich_upload:ro
|
||||
|
||||
environment:
|
||||
TZ: ${TIMEZONE:-Europe/Berlin}
|
||||
@@ -124,8 +124,6 @@ volumes:
|
||||
name: ${CORE_COMPOSE_PROJECT_NAME}_redis_data
|
||||
backrest_data:
|
||||
name: ${CORE_COMPOSE_PROJECT_NAME}_backrest_data
|
||||
backrest_config:
|
||||
name: ${CORE_COMPOSE_PROJECT_NAME}_backrest_config
|
||||
backrest_cache:
|
||||
name: ${CORE_COMPOSE_PROJECT_NAME}_backrest_cache
|
||||
backrest_tmp:
|
||||
@@ -162,9 +160,6 @@ volumes:
|
||||
backup_n8n_data:
|
||||
name: dev_n8n_data
|
||||
external: true
|
||||
backup_filestash_data:
|
||||
name: stash_filestash_data
|
||||
external: true
|
||||
backup_util_linkwarden_data:
|
||||
name: util_linkwarden_data
|
||||
external: true
|
||||
@@ -192,9 +187,6 @@ volumes:
|
||||
backup_ai_webui_data:
|
||||
name: ai_webui_data
|
||||
external: true
|
||||
backup_ai_crawl4ai_data:
|
||||
name: ai_crawl4ai_data
|
||||
external: true
|
||||
backup_asciinema_data:
|
||||
name: dev_asciinema_data
|
||||
external: true
|
||||
@@ -210,3 +202,9 @@ volumes:
|
||||
backup_dev_coolify_data:
|
||||
name: dev_coolify_data
|
||||
external: true
|
||||
backup_media_immich_postgres_data:
|
||||
name: media_immich_postgres_data
|
||||
external: true
|
||||
backup_media_immich_upload:
|
||||
name: media_immich_upload
|
||||
external: true
|
||||
|
||||
@@ -33,36 +33,6 @@ services:
|
||||
# Watchtower
|
||||
- 'com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}'
|
||||
|
||||
# Filestash - Web-based file manager
|
||||
filestash:
|
||||
image: ${MEDIA_FILESTASH_IMAGE:-machines/filestash:latest}
|
||||
container_name: ${MEDIA_COMPOSE_PROJECT_NAME}_filestash
|
||||
restart: unless-stopped
|
||||
volumes:
|
||||
- filestash_data:/app/data/state/
|
||||
tmpfs:
|
||||
- /tmp:exec
|
||||
environment:
|
||||
TZ: ${TIMEZONE:-Europe/Berlin}
|
||||
APPLICATION_URL: ${MEDIA_FILESTASH_TRAEFIK_HOST}
|
||||
CANARY: ${MEDIA_FILESTASH_CANARY:-true}
|
||||
networks:
|
||||
- compose_network
|
||||
labels:
|
||||
- 'traefik.enable=${MEDIA_TRAEFIK_ENABLED}'
|
||||
- 'traefik.http.middlewares.${MEDIA_COMPOSE_PROJECT_NAME}-filestash-redirect-web-secure.redirectscheme.scheme=https'
|
||||
- 'traefik.http.routers.${MEDIA_COMPOSE_PROJECT_NAME}-filestash-web.middlewares=${MEDIA_COMPOSE_PROJECT_NAME}-filestash-redirect-web-secure'
|
||||
- 'traefik.http.routers.${MEDIA_COMPOSE_PROJECT_NAME}-filestash-web.rule=Host(`${MEDIA_FILESTASH_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${MEDIA_COMPOSE_PROJECT_NAME}-filestash-web.entrypoints=web'
|
||||
- 'traefik.http.routers.${MEDIA_COMPOSE_PROJECT_NAME}-filestash-web-secure.rule=Host(`${MEDIA_FILESTASH_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${MEDIA_COMPOSE_PROJECT_NAME}-filestash-web-secure.tls.certresolver=resolver'
|
||||
- 'traefik.http.routers.${MEDIA_COMPOSE_PROJECT_NAME}-filestash-web-secure.entrypoints=web-secure'
|
||||
- 'traefik.http.middlewares.${MEDIA_COMPOSE_PROJECT_NAME}-filestash-web-secure-compress.compress=true'
|
||||
- 'traefik.http.routers.${MEDIA_COMPOSE_PROJECT_NAME}-filestash-web-secure.middlewares=${MEDIA_COMPOSE_PROJECT_NAME}-filestash-web-secure-compress'
|
||||
- 'traefik.http.services.${MEDIA_COMPOSE_PROJECT_NAME}-filestash-web-secure.loadbalancer.server.port=8334'
|
||||
- 'traefik.docker.network=${NETWORK_NAME}'
|
||||
- 'com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}'
|
||||
|
||||
# Pinchflat - YouTube download manager
|
||||
pinchflat:
|
||||
image: ${MEDIA_PINCHFLAT_IMAGE:-ghcr.io/kieraneglin/pinchflat:latest}
|
||||
@@ -95,15 +65,98 @@ services:
|
||||
# Watchtower
|
||||
- 'com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}'
|
||||
|
||||
# Immich PostgreSQL - Dedicated database with vector extensions
|
||||
immich_postgres:
|
||||
image: ${MEDIA_IMMICH_POSTGRES_IMAGE:-ghcr.io/immich-app/postgres:14-vectorchord0.4.3-pgvectors0.2.0}
|
||||
container_name: ${MEDIA_COMPOSE_PROJECT_NAME}_immich_postgres
|
||||
restart: unless-stopped
|
||||
environment:
|
||||
TZ: ${TIMEZONE:-Europe/Berlin}
|
||||
POSTGRES_USER: ${MEDIA_IMMICH_DB_USER:-immich}
|
||||
POSTGRES_PASSWORD: ${MEDIA_IMMICH_DB_PASSWORD}
|
||||
POSTGRES_DB: ${MEDIA_IMMICH_DB_NAME:-immich}
|
||||
POSTGRES_INITDB_ARGS: --data-checksums
|
||||
volumes:
|
||||
- immich_postgres_data:/var/lib/postgresql/data
|
||||
healthcheck:
|
||||
test: ["CMD-SHELL", "pg_isready -U ${MEDIA_IMMICH_DB_USER:-immich} -d ${MEDIA_IMMICH_DB_NAME:-immich}"]
|
||||
interval: 30s
|
||||
timeout: 10s
|
||||
retries: 3
|
||||
start_period: 40s
|
||||
networks:
|
||||
- compose_network
|
||||
|
||||
# Immich Server - Main application
|
||||
immich_server:
|
||||
image: ${MEDIA_IMMICH_SERVER_IMAGE:-ghcr.io/immich-app/immich-server:release}
|
||||
container_name: ${MEDIA_COMPOSE_PROJECT_NAME}_immich_server
|
||||
restart: unless-stopped
|
||||
environment:
|
||||
TZ: ${TIMEZONE:-Europe/Berlin}
|
||||
DB_HOSTNAME: ${MEDIA_COMPOSE_PROJECT_NAME}_immich_postgres
|
||||
DB_PORT: 5432
|
||||
DB_USERNAME: ${MEDIA_IMMICH_DB_USER:-immich}
|
||||
DB_PASSWORD: ${MEDIA_IMMICH_DB_PASSWORD}
|
||||
DB_DATABASE_NAME: ${MEDIA_IMMICH_DB_NAME:-immich}
|
||||
REDIS_HOSTNAME: ${CORE_COMPOSE_PROJECT_NAME}_redis
|
||||
REDIS_PORT: ${CORE_REDIS_PORT:-6379}
|
||||
IMMICH_MACHINE_LEARNING_URL: http://${MEDIA_COMPOSE_PROJECT_NAME}_immich_ml:3003
|
||||
volumes:
|
||||
- immich_upload:/usr/src/app/upload
|
||||
- /etc/localtime:/etc/localtime:ro
|
||||
depends_on:
|
||||
immich_postgres:
|
||||
condition: service_healthy
|
||||
networks:
|
||||
- compose_network
|
||||
labels:
|
||||
- 'traefik.enable=${MEDIA_TRAEFIK_ENABLED}'
|
||||
# HTTP to HTTPS redirect
|
||||
- 'traefik.http.middlewares.${MEDIA_COMPOSE_PROJECT_NAME}-immich-redirect-web-secure.redirectscheme.scheme=https'
|
||||
- 'traefik.http.routers.${MEDIA_COMPOSE_PROJECT_NAME}-immich-web.middlewares=${MEDIA_COMPOSE_PROJECT_NAME}-immich-redirect-web-secure'
|
||||
- 'traefik.http.routers.${MEDIA_COMPOSE_PROJECT_NAME}-immich-web.rule=Host(`${MEDIA_IMMICH_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${MEDIA_COMPOSE_PROJECT_NAME}-immich-web.entrypoints=web'
|
||||
# HTTPS router
|
||||
- 'traefik.http.routers.${MEDIA_COMPOSE_PROJECT_NAME}-immich-web-secure.rule=Host(`${MEDIA_IMMICH_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${MEDIA_COMPOSE_PROJECT_NAME}-immich-web-secure.tls.certresolver=resolver'
|
||||
- 'traefik.http.routers.${MEDIA_COMPOSE_PROJECT_NAME}-immich-web-secure.entrypoints=web-secure'
|
||||
- 'traefik.http.middlewares.${MEDIA_COMPOSE_PROJECT_NAME}-immich-web-secure-compress.compress=true'
|
||||
- 'traefik.http.routers.${MEDIA_COMPOSE_PROJECT_NAME}-immich-web-secure.middlewares=${MEDIA_COMPOSE_PROJECT_NAME}-immich-web-secure-compress,security-headers@file'
|
||||
# Service
|
||||
- 'traefik.http.services.${MEDIA_COMPOSE_PROJECT_NAME}-immich-web-secure.loadbalancer.server.port=2283'
|
||||
- 'traefik.docker.network=${NETWORK_NAME}'
|
||||
# Watchtower
|
||||
- 'com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}'
|
||||
|
||||
# Immich Machine Learning - AI inference for faces, search, etc.
|
||||
immich_ml:
|
||||
image: ${MEDIA_IMMICH_ML_IMAGE:-ghcr.io/immich-app/immich-machine-learning:release}
|
||||
container_name: ${MEDIA_COMPOSE_PROJECT_NAME}_immich_ml
|
||||
restart: unless-stopped
|
||||
environment:
|
||||
TZ: ${TIMEZONE:-Europe/Berlin}
|
||||
volumes:
|
||||
- immich_model_cache:/cache
|
||||
networks:
|
||||
- compose_network
|
||||
labels:
|
||||
# Watchtower
|
||||
- 'com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}'
|
||||
|
||||
volumes:
|
||||
jellyfin_config:
|
||||
name: ${MEDIA_COMPOSE_PROJECT_NAME}_jellyfin_config
|
||||
jellyfin_cache:
|
||||
name: ${MEDIA_COMPOSE_PROJECT_NAME}_jellyfin_cache
|
||||
filestash_data:
|
||||
name: ${MEDIA_COMPOSE_PROJECT_NAME}_filestash_data
|
||||
pinchflat_config:
|
||||
name: ${MEDIA_COMPOSE_PROJECT_NAME}_pinchflat_config
|
||||
immich_postgres_data:
|
||||
name: ${MEDIA_COMPOSE_PROJECT_NAME}_immich_postgres_data
|
||||
immich_upload:
|
||||
name: ${MEDIA_COMPOSE_PROJECT_NAME}_immich_upload
|
||||
immich_model_cache:
|
||||
name: ${MEDIA_COMPOSE_PROJECT_NAME}_immich_model_cache
|
||||
|
||||
networks:
|
||||
compose_network:
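The Immich server only starts once its dedicated Postgres reports healthy, and it reaches Redis and the ML container by their compose container names. Two quick checks after bringing the stack up, using the `media_` project prefix set in arty.yml:

# Same probe the healthcheck runs, executed from the host.
docker exec media_immich_postgres pg_isready -U immich -d immich

# Confirm the server resolved its dependencies and finished its migrations.
docker logs --tail 50 media_immich_server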
|
||||
|
||||
@@ -74,23 +74,26 @@ access_control:
      - "admin.asciinema.dev.pivoine.art"
      - "facefusion.ai.pivoine.art"
      - "pinchflat.media.pivoine.art"
      - "comfy.ai.pivoine.art"
      - "supervisor.ai.pivoine.art"
      - "audiocraft.ai.pivoine.art"
      - "upscale.ai.pivoine.art"
    policy: one_factor

# session secret set via environment variable: AUTHELIA_SESSION_SECRET
session:
  name: 'authelia_session'
  same_site: 'lax'
  expiration: '1h'
  inactivity: '5m'
  remember_me: '1M'
  name: "authelia_session"
  same_site: "lax"
  expiration: "1h"
  inactivity: "15m"
  remember_me: "1M"
  cookies:
    - domain: 'pivoine.art'
      authelia_url: 'https://auth.pivoine.art'
      same_site: 'lax'
      expiration: '1h'
      inactivity: '5m'
      remember_me: '1M'
    - domain: "pivoine.art"
      authelia_url: "https://auth.pivoine.art"
      same_site: "lax"
      expiration: "1h"
      inactivity: "5m"
      remember_me: "1M"

regulation:
  max_retries: 3
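With `comfy.ai.pivoine.art` added to the one_factor domain list, any Traefik router carrying the Authelia forwardAuth middleware (defined further down in net/compose.yaml) should bounce anonymous visitors to the portal. A hedged check from outside:

# Anonymous browser-style request: expect a redirect toward https://auth.pivoine.art
curl -sI -H "Accept: text/html" https://comfy.ai.pivoine.art | head -n 5

# Non-browser clients typically see a 401 from the forward-auth endpoint instead.
curl -s -o /dev/null -w "%{http_code}\n" https://comfy.ai.pivoine.art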
179
net/compose.yaml
@@ -6,49 +6,49 @@ services:
|
||||
restart: unless-stopped
|
||||
command:
|
||||
# API & Dashboard
|
||||
- '--api.dashboard=true'
|
||||
- '--api.insecure=false'
|
||||
- "--api.dashboard=true"
|
||||
- "--api.insecure=false"
|
||||
|
||||
# Ping endpoint for healthcheck
|
||||
- '--ping=true'
|
||||
- "--ping=true"
|
||||
|
||||
# Experimental plugins
|
||||
- '--experimental.plugins.sablier.modulename=github.com/acouvreur/sablier'
|
||||
- '--experimental.plugins.sablier.version=v1.8.0'
|
||||
- "--experimental.plugins.sablier.modulename=github.com/acouvreur/sablier"
|
||||
- "--experimental.plugins.sablier.version=v1.8.0"
|
||||
|
||||
# Logging
|
||||
- '--log.level=${NET_PROXY_LOG_LEVEL:-INFO}'
|
||||
- '--accesslog=true'
|
||||
- "--log.level=${NET_PROXY_LOG_LEVEL:-INFO}"
|
||||
- "--accesslog=true"
|
||||
|
||||
# Global
|
||||
- '--global.sendAnonymousUsage=false'
|
||||
- '--global.checkNewVersion=true'
|
||||
- "--global.sendAnonymousUsage=false"
|
||||
- "--global.checkNewVersion=true"
|
||||
|
||||
# Docker Provider
|
||||
- '--providers.docker=true'
|
||||
- '--providers.docker.exposedbydefault=false'
|
||||
- '--providers.docker.network=${NETWORK_NAME}'
|
||||
- "--providers.docker=true"
|
||||
- "--providers.docker.exposedbydefault=false"
|
||||
- "--providers.docker.network=${NETWORK_NAME}"
|
||||
|
||||
# File Provider for dynamic configuration
|
||||
- '--providers.file.directory=/etc/traefik/dynamic'
|
||||
- '--providers.file.watch=true'
|
||||
- "--providers.file.directory=/etc/traefik/dynamic"
|
||||
- "--providers.file.watch=true"
|
||||
|
||||
# Entrypoints
|
||||
- '--entrypoints.web.address=:${NET_PROXY_PORT_HTTP:-80}'
|
||||
- '--entrypoints.web-secure.address=:${NET_PROXY_PORT_HTTPS:-443}'
|
||||
- "--entrypoints.web.address=:${NET_PROXY_PORT_HTTP:-80}"
|
||||
- "--entrypoints.web-secure.address=:${NET_PROXY_PORT_HTTPS:-443}"
|
||||
|
||||
# Global HTTP to HTTPS redirect
|
||||
- '--entrypoints.web.http.redirections.entryPoint.to=web-secure'
|
||||
- '--entrypoints.web.http.redirections.entryPoint.scheme=https'
|
||||
- '--entrypoints.web.http.redirections.entryPoint.permanent=true'
|
||||
- "--entrypoints.web.http.redirections.entryPoint.to=web-secure"
|
||||
- "--entrypoints.web.http.redirections.entryPoint.scheme=https"
|
||||
- "--entrypoints.web.http.redirections.entryPoint.permanent=true"
|
||||
|
||||
# Security Headers (applied globally)
|
||||
- '--entrypoints.web-secure.http.middlewares=security-headers@file'
|
||||
- "--entrypoints.web-secure.http.middlewares=security-headers@file"
|
||||
|
||||
# Let's Encrypt
|
||||
- '--certificatesresolvers.resolver.acme.tlschallenge=true'
|
||||
- '--certificatesresolvers.resolver.acme.email=${ADMIN_EMAIL}'
|
||||
- '--certificatesresolvers.resolver.acme.storage=/letsencrypt/acme.json'
|
||||
- "--certificatesresolvers.resolver.acme.tlschallenge=true"
|
||||
- "--certificatesresolvers.resolver.acme.email=${ADMIN_EMAIL}"
|
||||
- "--certificatesresolvers.resolver.acme.storage=/letsencrypt/acme.json"
|
||||
|
||||
healthcheck:
|
||||
test: ["CMD", "traefik", "healthcheck", "--ping"]
|
||||
@@ -74,20 +74,20 @@ services:
|
||||
- ./dynamic:/etc/traefik/dynamic:ro
|
||||
|
||||
labels:
|
||||
- 'traefik.enable=true'
|
||||
- "traefik.enable=true"
|
||||
# HTTP to HTTPS redirect
|
||||
- 'traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-traefik-redirect-web-secure.redirectscheme.scheme=https'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-traefik-web.middlewares=${NET_COMPOSE_PROJECT_NAME}-traefik-redirect-web-secure'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-traefik-web.rule=Host(`${NET_PROXY_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-traefik-web.entrypoints=web'
|
||||
- "traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-traefik-redirect-web-secure.redirectscheme.scheme=https"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-traefik-web.middlewares=${NET_COMPOSE_PROJECT_NAME}-traefik-redirect-web-secure"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-traefik-web.rule=Host(`${NET_PROXY_TRAEFIK_HOST}`)"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-traefik-web.entrypoints=web"
|
||||
# HTTPS router with auth
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-traefik-web-secure.rule=Host(`${NET_PROXY_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-traefik-web-secure.tls.certresolver=resolver'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-traefik-web-secure.entrypoints=web-secure'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-traefik-web-secure.service=api@internal'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-traefik-web-secure.middlewares=${NET_COMPOSE_PROJECT_NAME}-authelia,security-headers@file'
|
||||
- 'traefik.http.services.${NET_COMPOSE_PROJECT_NAME}-traefik-web-secure.loadbalancer.server.port=8080'
|
||||
- 'traefik.docker.network=${NETWORK_NAME}'
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-traefik-web-secure.rule=Host(`${NET_PROXY_TRAEFIK_HOST}`)"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-traefik-web-secure.tls.certresolver=resolver"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-traefik-web-secure.entrypoints=web-secure"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-traefik-web-secure.service=api@internal"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-traefik-web-secure.middlewares=${NET_COMPOSE_PROJECT_NAME}-authelia,security-headers@file"
|
||||
- "traefik.http.services.${NET_COMPOSE_PROJECT_NAME}-traefik-web-secure.loadbalancer.server.port=8080"
|
||||
- "traefik.docker.network=${NETWORK_NAME}"
|
||||
|
||||
# Netdata - Real-time monitoring
|
||||
netdata:
|
||||
@@ -128,23 +128,23 @@ services:
|
||||
networks:
|
||||
- compose_network
|
||||
labels:
|
||||
- 'traefik.enable=${NET_TRAEFIK_ENABLED}'
|
||||
- "traefik.enable=${NET_TRAEFIK_ENABLED}"
|
||||
# HTTP to HTTPS redirect
|
||||
- 'traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-netdata-redirect-web-secure.redirectscheme.scheme=https'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-netdata-web.middlewares=${NET_COMPOSE_PROJECT_NAME}-netdata-redirect-web-secure'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-netdata-web.rule=Host(`${NET_NETDATA_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-netdata-web.entrypoints=web'
|
||||
- "traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-netdata-redirect-web-secure.redirectscheme.scheme=https"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-netdata-web.middlewares=${NET_COMPOSE_PROJECT_NAME}-netdata-redirect-web-secure"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-netdata-web.rule=Host(`${NET_NETDATA_TRAEFIK_HOST}`)"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-netdata-web.entrypoints=web"
|
||||
# HTTPS router
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-netdata-web-secure.rule=Host(`${NET_NETDATA_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-netdata-web-secure.tls.certresolver=resolver'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-netdata-web-secure.entrypoints=web-secure'
|
||||
- 'traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-netdata-compress.compress=true'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-netdata-web-secure.middlewares=${NET_COMPOSE_PROJECT_NAME}-netdata-compress,${NET_COMPOSE_PROJECT_NAME}-authelia,security-headers@file'
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-netdata-web-secure.rule=Host(`${NET_NETDATA_TRAEFIK_HOST}`)"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-netdata-web-secure.tls.certresolver=resolver"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-netdata-web-secure.entrypoints=web-secure"
|
||||
- "traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-netdata-compress.compress=true"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-netdata-web-secure.middlewares=${NET_COMPOSE_PROJECT_NAME}-netdata-compress,${NET_COMPOSE_PROJECT_NAME}-authelia,security-headers@file"
|
||||
# Service
|
||||
- 'traefik.http.services.${NET_COMPOSE_PROJECT_NAME}-netdata.loadbalancer.server.port=19999'
|
||||
- 'traefik.docker.network=${NETWORK_NAME}'
|
||||
- "traefik.http.services.${NET_COMPOSE_PROJECT_NAME}-netdata.loadbalancer.server.port=19999"
|
||||
- "traefik.docker.network=${NETWORK_NAME}"
|
||||
# Watchtower
|
||||
- 'com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}'
|
||||
- "com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}"
|
||||
|
||||
# Watchtower - Automatic container updates
|
||||
watchtower:
|
||||
@@ -202,7 +202,8 @@ services:
|
||||
- compose_network
|
||||
|
||||
healthcheck:
|
||||
test: ["CMD-SHELL", "curl -f http://localhost:3000/api/heartbeat || exit 1"]
|
||||
test:
|
||||
["CMD-SHELL", "curl -f http://localhost:3000/api/heartbeat || exit 1"]
|
||||
interval: 30s
|
||||
timeout: 10s
|
||||
retries: 5
|
||||
@@ -210,21 +211,21 @@ services:
|
||||
|
||||
labels:
|
||||
# Traefik Configuration
|
||||
- 'traefik.enable=${NET_TRAEFIK_ENABLED}'
|
||||
- "traefik.enable=${NET_TRAEFIK_ENABLED}"
|
||||
|
||||
# HTTP to HTTPS redirect
|
||||
- 'traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-umami-redirect-web-secure.redirectscheme.scheme=https'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-umami-web.middlewares=${NET_COMPOSE_PROJECT_NAME}-umami-redirect-web-secure'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-umami-web.rule=Host(`${NET_TRACK_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-umami-web.entrypoints=web'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-umami-web-secure.rule=Host(`${NET_TRACK_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-umami-web-secure.tls.certresolver=resolver'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-umami-web-secure.entrypoints=web-secure'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-umami-web-secure.middlewares=security-headers@file'
|
||||
- 'traefik.http.services.${NET_COMPOSE_PROJECT_NAME}-umami-web-secure.loadbalancer.server.port=3000'
|
||||
- 'traefik.docker.network=${NETWORK_NAME}'
|
||||
- "traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-umami-redirect-web-secure.redirectscheme.scheme=https"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-umami-web.middlewares=${NET_COMPOSE_PROJECT_NAME}-umami-redirect-web-secure"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-umami-web.rule=Host(`${NET_TRACK_TRAEFIK_HOST}`)"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-umami-web.entrypoints=web"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-umami-web-secure.rule=Host(`${NET_TRACK_TRAEFIK_HOST}`)"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-umami-web-secure.tls.certresolver=resolver"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-umami-web-secure.entrypoints=web-secure"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-umami-web-secure.middlewares=security-headers@file"
|
||||
- "traefik.http.services.${NET_COMPOSE_PROJECT_NAME}-umami-web-secure.loadbalancer.server.port=3000"
|
||||
- "traefik.docker.network=${NETWORK_NAME}"
|
||||
# Watchtower
|
||||
- 'com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}'
|
||||
- "com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}"
|
||||
|
||||
# Mailpit - SMTP server with web UI
|
||||
mailpit:
|
||||
@@ -250,22 +251,22 @@ services:
|
||||
networks:
|
||||
- compose_network
|
||||
labels:
|
||||
- 'traefik.enable=${NET_TRAEFIK_ENABLED}'
|
||||
- "traefik.enable=${NET_TRAEFIK_ENABLED}"
|
||||
# HTTP to HTTPS redirect
|
||||
- 'traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-mailpit-redirect-web-secure.redirectscheme.scheme=https'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-mailpit-web.middlewares=${NET_COMPOSE_PROJECT_NAME}-mailpit-redirect-web-secure'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-mailpit-web.rule=Host(`${NET_MAILPIT_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-mailpit-web.entrypoints=web'
|
||||
- "traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-mailpit-redirect-web-secure.redirectscheme.scheme=https"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-mailpit-web.middlewares=${NET_COMPOSE_PROJECT_NAME}-mailpit-redirect-web-secure"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-mailpit-web.rule=Host(`${NET_MAILPIT_TRAEFIK_HOST}`)"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-mailpit-web.entrypoints=web"
|
||||
# HTTPS router with auth
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-mailpit-web-secure.rule=Host(`${NET_MAILPIT_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-mailpit-web-secure.tls.certresolver=resolver'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-mailpit-web-secure.entrypoints=web-secure'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-mailpit-web-secure.middlewares=${NET_COMPOSE_PROJECT_NAME}-authelia,security-headers@file'
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-mailpit-web-secure.rule=Host(`${NET_MAILPIT_TRAEFIK_HOST}`)"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-mailpit-web-secure.tls.certresolver=resolver"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-mailpit-web-secure.entrypoints=web-secure"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-mailpit-web-secure.middlewares=${NET_COMPOSE_PROJECT_NAME}-authelia,security-headers@file"
|
||||
# Service
|
||||
- 'traefik.http.services.${NET_COMPOSE_PROJECT_NAME}-mailpit-web-secure.loadbalancer.server.port=8025'
|
||||
- 'traefik.docker.network=${NETWORK_NAME}'
|
||||
- "traefik.http.services.${NET_COMPOSE_PROJECT_NAME}-mailpit-web-secure.loadbalancer.server.port=8025"
|
||||
- "traefik.docker.network=${NETWORK_NAME}"
|
||||
# Watchtower
|
||||
- 'com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}'
|
||||
- "com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}"
|
||||
|
||||
# Authelia - SSO and authentication portal
|
||||
authelia:
|
||||
@@ -285,27 +286,27 @@ services:
|
||||
networks:
|
||||
- compose_network
|
||||
labels:
|
||||
- 'traefik.enable=${NET_TRAEFIK_ENABLED}'
|
||||
- "traefik.enable=${NET_TRAEFIK_ENABLED}"
|
||||
# HTTP to HTTPS redirect
|
||||
- 'traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-authelia-redirect-web-secure.redirectscheme.scheme=https'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-authelia-web.middlewares=${NET_COMPOSE_PROJECT_NAME}-authelia-redirect-web-secure'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-authelia-web.rule=Host(`${NET_AUTHELIA_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-authelia-web.entrypoints=web'
|
||||
- "traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-authelia-redirect-web-secure.redirectscheme.scheme=https"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-authelia-web.middlewares=${NET_COMPOSE_PROJECT_NAME}-authelia-redirect-web-secure"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-authelia-web.rule=Host(`${NET_AUTHELIA_TRAEFIK_HOST}`)"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-authelia-web.entrypoints=web"
|
||||
# HTTPS router
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-authelia-web-secure.rule=Host(`${NET_AUTHELIA_TRAEFIK_HOST}`)'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-authelia-web-secure.tls.certresolver=resolver'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-authelia-web-secure.entrypoints=web-secure'
|
||||
- 'traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-authelia-web-secure.middlewares=security-headers@file'
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-authelia-web-secure.rule=Host(`${NET_AUTHELIA_TRAEFIK_HOST}`)"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-authelia-web-secure.tls.certresolver=resolver"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-authelia-web-secure.entrypoints=web-secure"
|
||||
- "traefik.http.routers.${NET_COMPOSE_PROJECT_NAME}-authelia-web-secure.middlewares=security-headers@file"
|
||||
# Service
|
||||
- 'traefik.http.services.${NET_COMPOSE_PROJECT_NAME}-authelia-web-secure.loadbalancer.server.port=9091'
|
||||
- 'traefik.docker.network=${NETWORK_NAME}'
|
||||
- "traefik.http.services.${NET_COMPOSE_PROJECT_NAME}-authelia-web-secure.loadbalancer.server.port=9091"
|
||||
- "traefik.docker.network=${NETWORK_NAME}"
|
||||
# ForwardAuth middleware for other services
|
||||
- 'traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-authelia.forwardAuth.address=http://net_authelia:9091/api/authz/forward-auth'
|
||||
- 'traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-authelia.forwardAuth.trustForwardHeader=true'
|
||||
- 'traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-authelia.forwardAuth.authResponseHeaders=Remote-User,Remote-Groups,Remote-Name,Remote-Email'
|
||||
- 'traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-authelia.forwardAuth.authResponseHeadersRegex=^Remote-'
|
||||
- "traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-authelia.forwardAuth.address=http://net_authelia:9091/api/authz/forward-auth"
|
||||
- "traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-authelia.forwardAuth.trustForwardHeader=true"
|
||||
- "traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-authelia.forwardAuth.authResponseHeaders=Remote-User,Remote-Groups,Remote-Name,Remote-Email"
|
||||
- "traefik.http.middlewares.${NET_COMPOSE_PROJECT_NAME}-authelia.forwardAuth.authResponseHeadersRegex=^Remote-"
|
||||
# Watchtower
|
||||
- 'com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}'
|
||||
- "com.centurylinklabs.watchtower.enable=${WATCHTOWER_LABEL_ENABLE}"
|
||||
|
||||
volumes:
|
||||
letsencrypt_data:
|
||||
|
||||