- Multi-modal AI infrastructure for RunPod RTX 4090
- Automatic model orchestration (text, image, music)
  - Text: vLLM + Qwen 2.5 7B Instruct
  - Image: Flux.1 Schnell via OpenEDAI
  - Music: MusicGen Medium via AudioCraft
- Cost-optimized sequential loading on a single GPU (see the sketch below)
- Template preparation scripts for rapid deployment
- Comprehensive documentation (README, DEPLOYMENT, TEMPLATE)
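The "cost-optimized sequential loading" point implies that only one model backend occupies VRAM at a time. The sketch below illustrates that idea under stated assumptions: the service commands, placeholder file names (`image_server.py`, `music_server.py`), and warm-up delay are illustrative, not this repo's actual entry points.

```python
# Hypothetical sketch of sequential loading: at most one backend on the GPU at a time.
# Service names, ports, commands, and timings are assumptions for illustration only.
import subprocess
import time

SERVICES = {
    "text":  ["python", "-m", "vllm.entrypoints.openai.api_server",
              "--model", "Qwen/Qwen2.5-7B-Instruct"],
    "image": ["python", "image_server.py"],   # placeholder for the Flux.1 Schnell backend
    "music": ["python", "music_server.py"],   # placeholder for the MusicGen Medium backend
}

class Orchestrator:
    """Keeps at most one model server alive so the single RTX 4090 is never oversubscribed."""

    def __init__(self):
        self.active_name = None
        self.active_proc = None

    def ensure(self, modality: str) -> None:
        if modality == self.active_name:
            return                            # requested model is already loaded
        if self.active_proc is not None:
            self.active_proc.terminate()      # unload the current model, freeing VRAM
            self.active_proc.wait()
        self.active_proc = subprocess.Popen(SERVICES[modality])
        self.active_name = modality
        time.sleep(30)                        # crude wait for the new server to warm up

orchestrator = Orchestrator()
orchestrator.ensure("text")    # e.g. route a chat request to vLLM
orchestrator.ensure("image")   # later, swap to the image backend for a generation job
```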
# RunPod Multi-Modal AI Environment Configuration
# Copy this file to .env and fill in your values

# ============================================================================
# HuggingFace Token (Required for model downloads)
# ============================================================================
# Get your token from: https://huggingface.co/settings/tokens
# Required for downloading models: Qwen 2.5 7B, Flux.1 Schnell, MusicGen Medium
HF_TOKEN=hf_your_token_here

# ============================================================================
# GPU Tailscale IP (Optional, for LiteLLM integration)
# ============================================================================
# If integrating with the VPS LiteLLM proxy, set this to your GPU server's Tailscale IP
# Get it with: tailscale ip -4
# GPU_TAILSCALE_IP=100.100.108.13

# ============================================================================
# Notes
# ============================================================================
# - HF_TOKEN is the only required variable for basic operation
# - Models will be cached in /workspace/ directories on RunPod
# - Orchestrator automatically manages model switching
# - No database credentials needed (stateless architecture)
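Since HF_TOKEN is the only mandatory value, a simple loader is enough to wire this file into the model-download step. The sketch below assumes a plain KEY=VALUE parser and uses huggingface_hub to pre-fetch one model; the loader function, the exact cache path under /workspace/, and the download call are illustrative, not the repo's actual scripts.

```python
# Illustrative sketch of consuming this .env file. Only HF_TOKEN and GPU_TAILSCALE_IP
# come from the file above; the function names and cache path are assumptions.
import os

from huggingface_hub import snapshot_download


def load_env(path: str = ".env") -> None:
    """Populate os.environ from simple KEY=VALUE lines, skipping comments and blanks."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())


load_env()

hf_token = os.environ["HF_TOKEN"]                    # required for gated model downloads
tailscale_ip = os.environ.get("GPU_TAILSCALE_IP")    # optional, only for LiteLLM routing

# Example: pre-fetch the text model into a /workspace/ cache directory on RunPod.
snapshot_download(
    "Qwen/Qwen2.5-7B-Instruct",
    token=hf_token,
    cache_dir="/workspace/huggingface",              # assumed cache location
)
```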