AI Infrastructure

This directory contains AI-related configurations for the VPS deployment.

Multi-Modal GPU Infrastructure (Migrated)

The multi-modal AI orchestration stack (text, image, music generation) has been moved to a dedicated repository:

Repository: https://dev.pivoine.art/valknar/runpod

The RunPod repository contains:

  • Model orchestrator for automatic switching between text, image, and music models
  • vLLM + Qwen 2.5 7B (text generation)
  • Flux.1 Schnell (image generation)
  • MusicGen Medium (music generation)
  • RunPod template creation scripts
  • Complete deployment documentation

This separation allows for independent management of:

  • VPS Services (this repo): Open WebUI, Crawl4AI, AI database
  • GPU Services (runpod repo): Model inference, orchestration, RunPod templates

VPS AI Services (ai/compose.yaml)

This compose stack manages the VPS-side AI infrastructure that integrates with the GPU server:

Services

ai_postgres

Dedicated PostgreSQL 16 instance with the pgvector extension for AI workloads:

  • Vector similarity search support
  • Isolated from the core database for performance
  • Used by Open WebUI for RAG and embeddings
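
A quick way to confirm the extension is active (container name from this stack; the postgres superuser is an assumption, substitute the user from .env):

# Check that the pgvector extension is installed in the AI database
docker exec ai_postgres psql -U postgres -c \
  "SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';"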

webui (Open WebUI)

ChatGPT-like interface exposed at ai.pivoine.art:8080:

  • Claude API integration via Anthropic
  • RAG support with document upload
  • Vector storage via pgvector
  • Web search capability
  • SMTP email via IONOS
  • User signup enabled
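
Besides the web interface, Open WebUI also exposes an OpenAI-compatible API; a minimal sketch, assuming an API key generated in the WebUI account settings and a model alias served through LiteLLM (both placeholders):

# Chat completion via Open WebUI's OpenAI-compatible API (key and model are placeholders)
curl http://ai.pivoine.art:8080/api/chat/completions \
  -H "Authorization: Bearer <webui_api_key>" \
  -H "Content-Type: application/json" \
  -d '{"model": "<model_alias>", "messages": [{"role": "user", "content": "Hello"}]}'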

crawl4ai

Internal web scraping service for LLM content preparation:

  • API on port 11235 (not exposed publicly)
  • Optimized for AI/RAG workflows
  • Integration with Open WebUI and n8n
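
Because the port is not published, calls come from containers on the same Docker network; a minimal sketch, assuming the service is reachable as ai_crawl4ai and the upstream /crawl endpoint shape:

# Scrape a URL via the internal Crawl4AI API (hostname and endpoint are assumptions)
curl http://ai_crawl4ai:11235/crawl \
  -H "Content-Type: application/json" \
  -d '{"urls": ["https://example.com"]}'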

Integration with GPU Server

The VPS AI services connect to the GPU server via Tailscale VPN:

  • VPS Tailscale IP: 100.102.217.79
  • GPU Tailscale IP: 100.100.108.13
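
Connectivity between the two hosts can be checked from the VPS with the Tailscale CLI:

# Confirm the GPU server is reachable over the tailnet
tailscale ping 100.100.108.13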

LiteLLM Proxy (port 4000 on the VPS) routes requests to:

  • Claude API for chat completions
  • GPU orchestrator for self-hosted models (text, image, music)

See ../litellm-config.yaml for routing configuration.
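
Clients use the standard OpenAI request shape against the proxy; a minimal sketch, assuming a model alias defined in litellm-config.yaml and the proxy's master key (both placeholders):

# Chat completion routed by the LiteLLM proxy (model alias and key are placeholders)
curl http://100.102.217.79:4000/v1/chat/completions \
  -H "Authorization: Bearer <litellm_master_key>" \
  -H "Content-Type: application/json" \
  -d '{"model": "<model_alias>", "messages": [{"role": "user", "content": "Hello"}]}'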

Environment Variables

Required in .env:

# AI Database
AI_DB_PASSWORD=<password>

# Open WebUI
AI_WEBUI_SECRET_KEY=<secret>

# Claude API
ANTHROPIC_API_KEY=<api_key>

# Email (IONOS SMTP)
ADMIN_EMAIL=<email>
SMTP_HOST=smtp.ionos.com
SMTP_PORT=587
SMTP_USER=<smtp_user>
SMTP_PASSWORD=<smtp_password>
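
The generated secrets can be produced locally before first start; a minimal sketch (the generation method is a suggestion, not a requirement of the stack):

# Append freshly generated secrets to .env
echo "AI_DB_PASSWORD=$(openssl rand -hex 24)" >> .env
echo "AI_WEBUI_SECRET_KEY=$(openssl rand -hex 32)" >> .env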

Backup Configuration

AI services are backed up daily via Restic:

  • ai_postgres_data: 3 AM (7 daily, 4 weekly, 6 monthly, 2 yearly)
  • ai_webui_data: 3 AM (same retention)
  • ai_crawl4ai_data: 3 AM (same retention)

Repository: /mnt/hidrive/users/valknar/Backup
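
To verify backups are landing, restic can query the repository directly; a minimal sketch (assumes the repository password is supplied via RESTIC_PASSWORD or a password file):

# List the most recent snapshots in the backup repository
restic -r /mnt/hidrive/users/valknar/Backup snapshots --latest 5

# For reference, the stated retention corresponds to these restic forget flags
restic -r /mnt/hidrive/users/valknar/Backup forget --prune \
  --keep-daily 7 --keep-weekly 4 --keep-monthly 6 --keep-yearly 2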

Management Commands

# Start AI stack
pnpm arty up ai_postgres webui crawl4ai

# View logs
docker logs -f ai_webui
docker logs -f ai_postgres
docker logs -f ai_crawl4ai

# Check Open WebUI
curl http://ai.pivoine.art:8080/health

# Restart AI services
pnpm arty restart ai_postgres webui crawl4ai

GPU Server Management

For GPU server operations (model orchestration, template creation, etc.):

# Clone the dedicated repository
git clone ssh://git@dev.pivoine.art:2222/valknar/runpod.git

# See runpod repository for:
# - Model orchestration setup
# - RunPod template creation
# - GPU deployment guides

Documentation

  • VPS AI Services: this README
  • GPU Server (separate repository): https://dev.pivoine.art/valknar/runpod

Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                 VPS (Tailscale: 100.102.217.79)                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │ LiteLLM Proxy (Port 4000)                                 │  │
│  │ Routes to: Claude API + GPU Orchestrator                  │  │
│  └───────┬───────────────────────────────────────────────────┘  │
│          │                                                      │
│  ┌───────▼─────────┐  ┌──────────────┐  ┌─────────────────┐     │
│  │ Open WebUI      │  │ Crawl4AI     │  │ AI PostgreSQL   │     │
│  │ Port: 8080      │  │ Port: 11235  │  │ + pgvector      │     │
│  └─────────────────┘  └──────────────┘  └─────────────────┘     │
└─────────────────────────────────────────────────────────────────┘
                               │ Tailscale VPN
┌──────────────────────────────┼──────────────────────────────────┐
│          RunPod GPU Server (Tailscale: 100.100.108.13)          │
│  ┌───────────────────────────▼──────────────────────────────┐   │
│  │ Orchestrator (Port 9000)                                 │   │
│  │ Manages sequential model loading                         │   │
│  └─────┬──────────────┬──────────────────┬──────────────────┘   │
│        │              │                  │                      │
│  ┌─────▼──────┐  ┌────▼────────┐  ┌──────▼───────┐              │
│  │vLLM        │  │Flux.1       │  │MusicGen      │              │
│  │Qwen 2.5 7B │  │Schnell      │  │Medium        │              │
│  │Port: 8001  │  │Port: 8002   │  │Port: 8003    │              │
│  └────────────┘  └─────────────┘  └──────────────┘              │
└─────────────────────────────────────────────────────────────────┘

Support

For issues:

  • VPS AI services: Check logs via docker logs
  • GPU server: See runpod repository documentation
  • LiteLLM routing: Review ../litellm-config.yaml