Initial commit: RunPod multi-modal AI orchestration stack

- Multi-modal AI infrastructure for RunPod RTX 4090
- Automatic model orchestration (text, image, music)
- Text: vLLM + Qwen 2.5 7B Instruct
- Image: Flux.1 Schnell via OpenEDAI
- Music: MusicGen Medium via AudioCraft
- Cost-optimized sequential loading on single GPU
- Template preparation scripts for rapid deployment
- Comprehensive documentation (README, DEPLOYMENT, TEMPLATE)
2025-11-21 14:34:55 +01:00
commit 277f1c95bd
35 changed files with 7654 additions and 0 deletions
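The "sequential loading on single GPU" bullet can be sketched in a few lines. This is an illustrative Python sketch only: the class and method names (`Orchestrator`, `switch`) and the placeholder loaders are assumptions, not the repo's actual API; real loaders would initialize vLLM, Flux.1 Schnell, or MusicGen.

```python
import gc


class Orchestrator:
    """Keeps at most one model resident in VRAM; swaps on demand.

    Illustrative sketch of cost-optimized sequential loading on a
    single GPU: switching modalities unloads the current model first.
    """

    # Placeholder loaders; in the real stack these would load the
    # actual models (vLLM + Qwen, Flux.1 Schnell, MusicGen Medium).
    LOADERS = {
        "text": lambda: "Qwen 2.5 7B Instruct (via vLLM)",
        "image": lambda: "Flux.1 Schnell",
        "music": lambda: "MusicGen Medium (via AudioCraft)",
    }

    def __init__(self):
        self.active_kind = None
        self.active_model = None

    def switch(self, kind):
        if kind == self.active_kind:
            return self.active_model       # already resident, no reload
        if self.active_model is not None:
            self.active_model = None       # drop the old model...
            gc.collect()                   # ...and reclaim memory
        self.active_model = self.LOADERS[kind]()
        self.active_kind = kind
        return self.active_model


orch = Orchestrator()
print(orch.switch("text"))    # loads the text model
print(orch.switch("text"))    # no-op: already loaded
print(orch.switch("image"))   # unloads text, loads image
```

On a real GPU the unload step would also release CUDA memory (e.g. an explicit cache flush) before the next load, which is what keeps a 7B LLM and an image model from ever needing to share VRAM.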

.env.example

@@ -0,0 +1,24 @@
# RunPod Multi-Modal AI Environment Configuration
# Copy this file to .env and fill in your values
# ============================================================================
# HuggingFace Token (Required for model downloads)
# ============================================================================
# Get your token from: https://huggingface.co/settings/tokens
# Required for downloading models: Qwen 2.5 7B, Flux.1 Schnell, MusicGen Medium
HF_TOKEN=hf_your_token_here
# ============================================================================
# GPU Tailscale IP (Optional, for LiteLLM integration)
# ============================================================================
# If integrating with a VPS-hosted LiteLLM proxy, set this to your GPU server's Tailscale IP
# Get it with: tailscale ip -4
# GPU_TAILSCALE_IP=100.100.108.13
# ============================================================================
# Notes
# ============================================================================
# - HF_TOKEN is the only required variable for basic operation
# - Models will be cached in /workspace/ directories on RunPod
# - Orchestrator automatically manages model switching
# - No database credentials needed (stateless architecture)