Initial commit: RunPod multi-modal AI orchestration stack

- Multi-modal AI infrastructure for RunPod RTX 4090
- Automatic model orchestration (text, image, music)
- Text: vLLM + Qwen 2.5 7B Instruct
- Image: Flux.1 Schnell via OpenEDAI
- Music: MusicGen Medium via AudioCraft
- Cost-optimized sequential loading on a single GPU
- Template preparation scripts for rapid deployment
- Comprehensive documentation (README, DEPLOYMENT, TEMPLATE)

# GPU Server Setup Guide - Week 1

## Day 1-2: RunPod Account & GPU Server

### Step 1: Create RunPod Account

1. **Go to RunPod**: https://www.runpod.io/
2. **Sign up** with email or GitHub
3. **Add billing method**:
   - Credit card required
   - No charges until you deploy a pod
   - Recommended: add $50 of initial credit

4. **Verify email** and complete account setup

### Step 2: Deploy Your First GPU Pod

#### 2.1 Navigate to Pods

1. Click **"Deploy"** in the top menu
2. Select **"GPU Pods"**

#### 2.2 Choose GPU Type

**Recommended: RTX 4090**
- 24GB VRAM
- ~$0.50/hour
- Handles LLMs up to ~14B parameters
- Great for SDXL/FLUX

**Filter options:**
- GPU Type: RTX 4090
- GPU Count: 1
- Sort by: Price (lowest first)
- Region: Europe (lower latency to Germany)

#### 2.3 Select Template

Choose the **"RunPod PyTorch"** template:
- Includes CUDA, PyTorch, and Python
- Pre-configured for GPU workloads
- Docker pre-installed

**Alternative**: "Ubuntu 22.04 with CUDA 12.1" (more control)
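Once you have deployed (step 2.5) and can SSH in (step 3), it is worth sanity-checking what the chosen template actually ships. A minimal sketch, assuming the RunPod PyTorch template exposes `python3` with `torch` preinstalled (exact version numbers will vary by template):

```bash
# Print the Python, PyTorch, and CUDA versions baked into the template
python3 --version
python3 -c "import torch; print('torch', torch.__version__, '| CUDA', torch.version.cuda)"
nvcc --version | tail -n1   # CUDA compiler, if the template includes the toolkit
```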
#### 2.4 Configure Pod

**Container Settings:**
- **Container Disk**: 50GB (temporary, auto-included)
- **Expose Ports**:
  - Add: 22 (SSH)
  - Add: 8000 (vLLM)
  - Add: 8188 (ComfyUI)
  - Add: 8888 (JupyterLab)

**Volume Settings:**
- Click **"+ Network Volume"**
- **Name**: `gpu-models-storage`
- **Size**: 500GB
- **Region**: Same as the pod
- **Cost**: ~$50/month

**Environment Variables:**
- Add later (not needed for initial setup)
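If you prefer scripting this over clicking through the dashboard, RunPod's CLI (`runpodctl`) can create pods as well. The sketch below mirrors the settings above, but treat the flag names and the image tag as assumptions; verify them against `runpodctl create pod --help` for your installed version before relying on this:

```bash
# Hypothetical CLI equivalent of the dashboard settings above -
# confirm flag names with: runpodctl create pod --help
runpodctl create pod \
  --name gpu-pivoine \
  --gpuType "NVIDIA GeForce RTX 4090" \
  --gpuCount 1 \
  --imageName "runpod/pytorch:2.1.0-py3.10-cuda11.8.0-devel-ubuntu22.04" \
  --containerDiskSize 50 \
  --volumeSize 500 \
  --ports "22/tcp,8000/http,8188/http,8888/http"
```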
#### 2.5 Deploy Pod

1. Review the configuration
2. Click **"Deploy On-Demand"** (not Spot, for reliability)
3. Wait 2-3 minutes for deployment

**Expected cost:**
- GPU: $0.50/hour = $360/month (24/7)
- Storage: $50/month
- **Total: $410/month**
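The $360 figure is just the hourly rate extrapolated to a 30-day month; a quick sanity check, which you can rescale if the pod only runs part-time:

```bash
# 0.50 $/hour x 24 hours/day x 30 days = $360, plus the $50 flat volume fee
awk 'BEGIN { gpu = 0.50 * 24 * 30; vol = 50; printf "GPU $%.0f + storage $%.0f = $%.0f/month\n", gpu, vol, gpu + vol }'
```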
### Step 3: Access Your GPU Server

#### 3.1 Get Connection Info

Once deployed, you'll see:
- **Pod ID**: e.g., `abc123def456`
- **SSH Command**: `ssh root@<pod-id>.runpod.io -p 12345`
- **Public IP**: May not be directly accessible (use SSH)

#### 3.2 SSH Access

RunPod automatically generates SSH keys for you:

```bash
# Copy the SSH command from the RunPod dashboard
ssh root@abc123def456.runpod.io -p 12345

# First time: accept the host key fingerprint
# You should now be on the GPU server!
```

**Verify GPU:**
```bash
nvidia-smi
```

Expected output:
```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.xx    Driver Version: 535.xx    CUDA Version: 12.1           |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 30%   45C    P0    50W / 450W |      0MiB / 24564MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
```
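For scripted checks, `nvidia-smi` can emit machine-readable fields instead of the full table, using its standard query flags:

```bash
# One line per GPU: name, total VRAM, driver version
nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv,noheader
# Expected here: NVIDIA GeForce RTX 4090, 24564 MiB, 535.xx
```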
### Step 4: Initial Server Configuration

#### 4.1 Update System

```bash
# Update package lists
apt update

# Upgrade existing packages
apt upgrade -y

# Install essential tools
apt install -y \
    vim \
    htop \
    tmux \
    curl \
    wget \
    git \
    net-tools \
    iptables-persistent
```
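`tmux` earns its place in that list: the model downloads later in the week can run for hours, and a tmux session survives an SSH disconnect. A minimal workflow:

```bash
tmux new -s setup        # start a named session
# ... run long commands here ...
# Detach with Ctrl-b then d; the session keeps running
tmux attach -t setup     # reattach after reconnecting
```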
#### 4.2 Set Timezone

```bash
timedatectl set-timezone Europe/Berlin
date  # Verify
```

#### 4.3 Create Working Directory

```bash
# Create workspace subdirectories
mkdir -p /workspace/{models,configs,data,scripts}

# Check the network volume mount
ls -la /workspace
# Should show your 500GB volume
```
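`ls` alone can't distinguish the network volume from the container's own disk. A stricter check with standard tools, assuming the volume is mounted at `/workspace` as configured in step 2.4:

```bash
# Confirm /workspace is a real mount point, not just a local directory
mountpoint /workspace && df -h /workspace
# The df line should show roughly 500G total for the network volume
```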
#### 4.4 Configure SSH (Optional but Recommended)

**Generate your own SSH key on your local machine:**

```bash
# On your local machine (not the GPU server)
ssh-keygen -t ed25519 -C "gpu-server-pivoine" -f ~/.ssh/gpu_pivoine

# Copy the public key to the GPU server (note: -p must come before the host)
ssh-copy-id -i ~/.ssh/gpu_pivoine.pub -p 12345 root@abc123def456.runpod.io
```

**Add to your local ~/.ssh/config:**

```
Host gpu-pivoine
    HostName abc123def456.runpod.io
    Port 12345
    User root
    IdentityFile ~/.ssh/gpu_pivoine
```

Now you can connect with: `ssh gpu-pivoine`
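The alias also works for file transfer, which you'll want when moving configs and scripts into `/workspace` (`deploy.sh` and `./configs/` below are stand-ins for your own files):

```bash
# Copy a local script to the server via the ssh config alias
scp deploy.sh gpu-pivoine:/workspace/scripts/

# Or sync a whole directory, with progress and resume support
rsync -avP ./configs/ gpu-pivoine:/workspace/configs/
```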
### Step 5: Verify GPU Access

Run this test:

```bash
# Test CUDA
python3 -c "import torch; print('CUDA available:', torch.cuda.is_available()); print('GPU count:', torch.cuda.device_count())"
```

Expected output:
```
CUDA available: True
GPU count: 1
```
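If you want the check to also confirm which card PyTorch sees and how much VRAM it reports, a slightly longer sketch using standard `torch.cuda` calls:

```bash
python3 - <<'EOF'
import torch

# Fail loudly if CUDA is missing, otherwise report the device
if not torch.cuda.is_available():
    raise SystemExit("CUDA not available - check drivers with nvidia-smi")

props = torch.cuda.get_device_properties(0)
print(f"{props.name}: {props.total_memory / 1024**3:.1f} GiB VRAM")
# Expected here: roughly 'NVIDIA GeForce RTX 4090: 24.0 GiB VRAM'
EOF
```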
### Troubleshooting

**Problem: Can't connect via SSH**
- Check the pod is running (not stopped)
- Verify the port number in the SSH command
- Try the web terminal in the RunPod dashboard

**Problem: GPU not detected**
- Run `nvidia-smi`
- Check that RunPod selected the correct GPU type
- Restart the pod if needed

**Problem: Network volume not mounted**
- Check RunPod dashboard → Volume tab
- Verify the volume is attached to the pod
- Try `df -h` to see mounts
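All three checks roll up into one quick health check you can paste after any pod restart; `ss` ships with `iproute2`, and the ports match those exposed in step 2.4:

```bash
# GPU visible?
nvidia-smi -L

# Network volume mounted?
df -h /workspace

# Anything listening on the app ports? (empty is normal before the services are installed)
ss -tlnp | grep -E ':(8000|8188|8888)' || echo "no app ports listening yet"
```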
### Next Steps

Once SSH access works and the GPU is verified:

✅ Proceed to **Day 3-4: Network Configuration (Tailscale VPN)**

### Save Important Info

Create a file to track your setup:

```bash
# On the GPU server
cat > /workspace/SERVER_INFO.md << 'EOF'
# GPU Server Information

## Connection
- SSH: ssh root@abc123def456.runpod.io -p 12345
- Pod ID: abc123def456
- Region: [YOUR_REGION]

## Hardware
- GPU: RTX 4090 24GB
- CPU: [Check with: lscpu]
- RAM: [Check with: free -h]
- Storage: 500GB network volume at /workspace

## Costs
- GPU: $0.50/hour
- Storage: $50/month
- Total: ~$410/month (24/7)

## Deployed: [DATE]
EOF
```
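The bracketed placeholders are left for you to fill in; these standard commands surface the values (outputs vary per pod):

```bash
# CPU model and core count
lscpu | grep -E 'Model name|^CPU\(s\)'

# Total RAM
free -h | awk '/^Mem:/ {print $2}'

# Today's date for the 'Deployed' line
date +%F
```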
---

## Checkpoint ✓

Before moving to Day 3, verify:
- [ ] RunPod account created and billing added
- [ ] RTX 4090 pod deployed successfully
- [ ] 500GB network volume attached
- [ ] SSH access working
- [ ] `nvidia-smi` shows the GPU
- [ ] `torch.cuda.is_available()` returns True
- [ ] Timezone set to Europe/Berlin
- [ ] Essential tools installed

**Ready for Tailscale setup? Let's go!**