Initial commit: RunPod multi-modal AI orchestration stack

- Multi-modal AI infrastructure for RunPod RTX 4090
- Automatic model orchestration (text, image, music)
- Text: vLLM + Qwen 2.5 7B Instruct
- Image: Flux.1 Schnell via OpenEDAI
- Music: MusicGen Medium via AudioCraft
- Cost-optimized sequential loading on single GPU
- Template preparation scripts for rapid deployment
- Comprehensive documentation (README, DEPLOYMENT, TEMPLATE)
2025-11-21 14:34:55 +01:00
commit 277f1c95bd
35 changed files with 7654 additions and 0 deletions
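The "sequential loading on single GPU" bullet can be sketched in a few lines. This is an illustrative Python sketch only: the class and method names (`Orchestrator`, `switch`) and the placeholder loaders are assumptions, not the repo's actual API; real loaders would initialize vLLM, Flux.1 Schnell, or MusicGen.

```python
import gc


class Orchestrator:
    """Keeps at most one model resident in VRAM; swaps on demand.

    Illustrative sketch of cost-optimized sequential loading on a
    single GPU: switching modalities unloads the current model first.
    """

    # Placeholder loaders; in the real stack these would load the
    # actual models (vLLM + Qwen, Flux.1 Schnell, MusicGen Medium).
    LOADERS = {
        "text": lambda: "Qwen 2.5 7B Instruct (via vLLM)",
        "image": lambda: "Flux.1 Schnell",
        "music": lambda: "MusicGen Medium (via AudioCraft)",
    }

    def __init__(self):
        self.active_kind = None
        self.active_model = None

    def switch(self, kind):
        if kind == self.active_kind:
            return self.active_model       # already resident, no reload
        if self.active_model is not None:
            self.active_model = None       # drop the old model...
            gc.collect()                   # ...and reclaim memory
        self.active_model = self.LOADERS[kind]()
        self.active_kind = kind
        return self.active_model


orch = Orchestrator()
print(orch.switch("text"))    # loads the text model
print(orch.switch("text"))    # no-op: already loaded
print(orch.switch("image"))   # unloads text, loads image
```

On a real GPU the unload step would also release CUDA memory (e.g. an explicit cache flush) before the next load, which is what keeps a 7B LLM and an image model from ever needing to share VRAM.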

.env.example

@@ -0,0 +1,24 @@
# RunPod Multi-Modal AI Environment Configuration
# Copy this file to .env and fill in your values
# ============================================================================
# HuggingFace Token (Required for model downloads)
# ============================================================================
# Get your token from: https://huggingface.co/settings/tokens
# Required for downloading models: Qwen 2.5 7B, Flux.1 Schnell, MusicGen Medium
HF_TOKEN=hf_your_token_here
# ============================================================================
# GPU Tailscale IP (Optional, for LiteLLM integration)
# ============================================================================
# If integrating with a VPS-hosted LiteLLM proxy, set this to your GPU server's Tailscale IP
# Get it with: tailscale ip -4
# GPU_TAILSCALE_IP=100.100.108.13
# ============================================================================
# Notes
# ============================================================================
# - HF_TOKEN is the only required variable for basic operation
# - Models will be cached in /workspace/ directories on RunPod
# - Orchestrator automatically manages model switching
# - No database credentials needed (stateless architecture)