# GPU Server Setup Guide - Week 1

## Day 1-2: RunPod Account & GPU Server

### Step 1: Create RunPod Account

1. **Go to RunPod**: https://www.runpod.io/
2. **Sign up** with email or GitHub
3. **Add billing method**:
   - Credit card required
   - No charges until you deploy a pod
   - Recommended: Add $50 initial credit

4. **Verify email** and complete account setup

### Step 2: Deploy Your First GPU Pod

#### 2.1 Navigate to Pods

1. Click **"Deploy"** in top menu
2. Select **"GPU Pods"**

#### 2.2 Choose GPU Type

**Recommended: RTX 4090**
- 24GB VRAM
- ~$0.50/hour
- Fits LLMs up to about 14B parameters when quantized (FP16 weights for a 14B model need about 28GB, more than the 24GB available)
- Great for SDXL/FLUX
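
As a rough sizing check for the 24GB figure, weight memory is approximately the parameter count times the bytes per parameter. The sketch below is a rule of thumb only: `weights_gb` is a hypothetical helper, and it ignores KV cache, activations, and framework overhead, all of which need additional VRAM.

```bash
#!/usr/bin/env bash
# weights_gb PARAMS_BILLIONS BITS_PER_PARAM
# Approximate weight memory in GB: params (billions) * bits / 8.
# Rule of thumb only; excludes KV cache, activations, and runtime overhead.
weights_gb() {
  local params_b="$1" bits="$2"
  echo $(( params_b * bits / 8 ))
}

echo "14B @ FP16:  $(weights_gb 14 16) GB"   # larger than 24GB
echo "14B @ 8-bit: $(weights_gb 14 8) GB"
echo "14B @ 4-bit: $(weights_gb 14 4) GB"
echo "7B  @ FP16:  $(weights_gb 7 16) GB"
```

By this estimate a 14B model needs 8-bit or 4-bit quantization to fit on a single RTX 4090, while models around 7B fit at FP16.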

**Filter options:**
- GPU Type: RTX 4090
- GPU Count: 1
- Sort by: Price (lowest first)
- Region: Europe (lower latency to Germany)

#### 2.3 Select Template

Choose: **"RunPod PyTorch"** template
- Includes: CUDA, PyTorch, Python
- Pre-configured for GPU workloads
- Runs as a container itself (Docker is typically not available inside the pod)

**Alternative**: "Ubuntu 22.04 with CUDA 12.1" (more control)

#### 2.4 Configure Pod

**Container Settings:**
- **Container Disk**: 50GB (temporary, auto-included)
- **Expose Ports**:
  - Add: 22 (SSH)
  - Add: 8000 (vLLM)
  - Add: 8188 (ComfyUI)
  - Add: 8888 (JupyterLab)

**Volume Settings:**
- Click **"+ Network Volume"**
- **Name**: `gpu-models-storage`
- **Size**: 500GB
- **Region**: Same as pod
- **Cost**: ~$50/month

**Environment Variables:**
- Add later (not needed for initial setup)

#### 2.5 Deploy Pod

1. Review configuration
2. Click **"Deploy On-Demand"** (choose On-Demand rather than Spot, since Spot pods can be interrupted)
3. Wait 2-3 minutes for deployment

**Expected cost:**
- GPU: $0.50/hour × 24 hours × 30 days = $360/month (24/7)
- Storage: $50/month
- **Total: $410/month**
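
The figures above assume the pod runs around the clock for a 30-day month. To estimate other usage patterns, the arithmetic can be sketched as a small shell function. `monthly_cost` is a hypothetical helper, and the prices are the approximate ones quoted in this guide, not live RunPod pricing.

```bash
#!/usr/bin/env bash
# monthly_cost [HOURS_PER_DAY]
# Estimated monthly cost in whole USD, assuming a 30-day month,
# $0.50/hour for the GPU and $50/month for the 500GB network volume.
monthly_cost() {
  local hours_per_day="${1:-24}"
  local gpu_cents_per_hour=50
  local storage_usd=50
  local days=30
  local gpu_usd=$(( gpu_cents_per_hour * hours_per_day * days / 100 ))
  echo $(( gpu_usd + storage_usd ))
}

echo "24/7:        \$$(monthly_cost 24)/month"
echo "8 hours/day: \$$(monthly_cost 8)/month"
```

Storage is billed whether or not the pod is running, so stopping the pod only reduces the GPU portion.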

### Step 3: Access Your GPU Server

#### 3.1 Get Connection Info

Once deployed, you'll see:
- **Pod ID**: e.g., `abc123def456`
- **SSH Command**: `ssh root@<pod-id>.runpod.io -p 12345`
- **Public IP**: May not be directly accessible (use SSH)

#### 3.2 SSH Access

RunPod automatically generates SSH keys for you:

```bash
# Copy the SSH command from RunPod dashboard
ssh root@abc123def456.runpod.io -p 12345

# First time: Accept fingerprint
# You should now be in the GPU server!
```

**Verify GPU:**
```bash
nvidia-smi
```

Expected output:
```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.xx       Driver Version: 535.xx       CUDA Version: 12.1    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 30%   45C    P0    50W / 450W |      0MiB / 24564MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
```
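
For a scriptable version of this check, `nvidia-smi` can print selected fields as CSV. The sketch below is one way to confirm the GPU reports at least 24GB of VRAM. `check_gpu_line` is a hypothetical helper, and the 24000 MiB threshold is an assumption chosen to match the RTX 4090.

```bash
#!/usr/bin/env bash
# check_gpu_line "NAME, MEMORY_MIB" [MIN_MIB]
# Parse one CSV line from nvidia-smi and compare VRAM to a minimum.
check_gpu_line() {
  local line="$1" min_mib="${2:-24000}"
  local name mem
  name="${line%,*}"                               # text before the last comma
  mem="$(printf '%s' "${line##*,}" | tr -d ' ')"  # number after it, spaces removed
  if [ "$mem" -ge "$min_mib" ]; then
    echo "OK: ${name} with ${mem} MiB"
  else
    echo "WARN: ${name} has only ${mem} MiB (expected >= ${min_mib})"
    return 1
  fi
}

# Only query when nvidia-smi is present (i.e. on the GPU server)
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,memory.total --format=csv,noheader,nounits |
    while IFS= read -r line; do
      check_gpu_line "$line" || true
    done
fi
```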

### Step 4: Initial Server Configuration

#### 4.1 Update System

```bash
# Update package lists
apt update

# Upgrade existing packages
apt upgrade -y

# Install essential tools (noninteractive, so iptables-persistent does not prompt)
DEBIAN_FRONTEND=noninteractive apt install -y \
  vim \
  htop \
  tmux \
  curl \
  wget \
  git \
  net-tools \
  iptables-persistent
```

#### 4.2 Set Timezone

```bash
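# timedatectl needs systemd; otherwise fall back to a zoneinfo symlink (requires tzdata)
timedatectl set-timezone Europe/Berlin 2>/dev/null \
  || ln -sf /usr/share/zoneinfo/Europe/Berlin /etc/localtime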
date  # Verify
```

#### 4.3 Create Working Directory

```bash
# Create workspace
mkdir -p /workspace/{models,configs,data,scripts}

# Check network volume mount
df -h /workspace
# Should show your 500GB volume
```
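
To confirm the volume size from a script rather than by eye, the total size of the filesystem behind `/workspace` can be read from `df`. This is a minimal sketch: `volume_size_gib` is a hypothetical helper, and the reported value may differ slightly from 500 because of unit conventions and filesystem overhead.

```bash
#!/usr/bin/env bash
# volume_size_gib DIR
# Print the total size, in whole GiB, of the filesystem that holds DIR.
volume_size_gib() {
  df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $2 / 1024 / 1024 }'
}

dir="${1:-/workspace}"
if [ -d "$dir" ]; then
  echo "${dir}: $(volume_size_gib "$dir") GiB total"
else
  echo "${dir} does not exist; is the network volume attached?"
fi
```

A result close to the container disk size (50GB) rather than 500GB suggests the network volume is not mounted at that path.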

#### 4.4 Configure SSH (Optional but Recommended)

**Generate your own SSH key on your local machine:**

```bash
# On your local machine (not GPU server)
ssh-keygen -t ed25519 -C "gpu-server-pivoine" -f ~/.ssh/gpu_pivoine

# Copy public key to GPU server
ssh-copy-id -i ~/.ssh/gpu_pivoine.pub -p 12345 root@abc123def456.runpod.io
```

**Add to your local ~/.ssh/config:**

```
Host gpu-pivoine
    HostName abc123def456.runpod.io
    Port 12345
    User root
    IdentityFile ~/.ssh/gpu_pivoine
```

Now you can connect with: `ssh gpu-pivoine`
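
The hostname and port in this entry are placeholders copied from the dashboard, so the entry has to be rewritten whenever they differ for your pod. One way to avoid hand-editing is to generate it. This is a sketch only, and `ssh_config_stanza` is a hypothetical helper, not part of OpenSSH or RunPod.

```bash
#!/usr/bin/env bash
# ssh_config_stanza ALIAS HOST PORT KEYFILE
# Print a Host block in the same shape as the entry above.
ssh_config_stanza() {
  printf 'Host %s\n' "$1"
  printf '    HostName %s\n' "$2"
  printf '    Port %s\n' "$3"
  printf '    User root\n'
  printf '    IdentityFile %s\n' "$4"
}

# Example with the placeholder values used in this guide
ssh_config_stanza gpu-pivoine abc123def456.runpod.io 12345 '~/.ssh/gpu_pivoine'
```

Append the output to `~/.ssh/config` with `>>`, then test key-based login without a password prompt using `ssh -o BatchMode=yes -o ConnectTimeout=10 gpu-pivoine true`.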

### Step 5: Verify GPU Access

Run this test:

```bash
# Test CUDA
python3 -c "import torch; print('CUDA available:', torch.cuda.is_available()); print('GPU count:', torch.cuda.device_count())"
```

Expected output:
```
CUDA available: True
GPU count: 1
```

### Troubleshooting

**Problem: Can't connect via SSH**
- Check that the pod is running (not stopped)
- Verify port number in SSH command
- Try web terminal in RunPod dashboard

**Problem: GPU not detected**
- Run `nvidia-smi`
- Check that the correct GPU type was selected in RunPod
- Restart pod if needed

**Problem: Network volume not mounted**
- Check RunPod dashboard → Volume tab
- Verify volume is attached to pod
- Try: `df -h` to see mounts

### Next Steps

Once SSH access works and the GPU is verified:

✅ Proceed to **Day 3-4: Network Configuration (Tailscale VPN)**

### Save Important Info

Create a file to track your setup:

```bash
# On GPU server
cat > /workspace/SERVER_INFO.md << 'EOF'
# GPU Server Information

## Connection
- SSH: ssh root@abc123def456.runpod.io -p 12345
- Pod ID: abc123def456
- Region: [YOUR_REGION]

## Hardware
- GPU: RTX 4090 24GB
- CPU: [Check with: lscpu]
- RAM: [Check with: free -h]
- Storage: 500GB network volume at /workspace

## Costs
- GPU: $0.50/hour
- Storage: $50/month
- Total: ~$410/month (24/7)

## Deployed: [DATE]
EOF
```
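
The CPU and RAM placeholders can be filled in from `/proc` instead of by hand. The sketch below uses two hypothetical helpers, `cpu_model` and `ram_gib`. Note that inside a container these files may report the host's values rather than the limits applied to the pod, so treat the output as approximate.

```bash
#!/usr/bin/env bash
# cpu_model [CPUINFO_FILE]: first "model name" entry (default /proc/cpuinfo)
cpu_model() {
  awk -F': *' '/^model name/ { print $2; exit }' "${1:-/proc/cpuinfo}"
}

# ram_gib [MEMINFO_FILE]: MemTotal in whole GiB (default /proc/meminfo)
ram_gib() {
  awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

# Print lines in the same format as the Hardware section above
if [ -r /proc/cpuinfo ] && [ -r /proc/meminfo ]; then
  echo "- CPU: $(cpu_model)"
  echo "- RAM: $(ram_gib) GiB"
fi
```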

---

## Checkpoint ✓

Before moving to Day 3, verify:
- [ ] RunPod account created and billing added
- [ ] RTX 4090 pod deployed successfully
- [ ] 500GB network volume attached
- [ ] SSH access working
- [ ] `nvidia-smi` shows GPU
- [ ] `torch.cuda.is_available()` returns True
- [ ] Timezone set to Europe/Berlin
- [ ] Essential tools installed
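
The items that can be checked from the command line can also be run as one script on the GPU server. This is a sketch under the assumptions of this guide (workspace at `/workspace`, the tool list from Step 4.1). `check` is a hypothetical helper, and the timezone test only confirms that `date` reports CET or CEST, which Europe/Berlin shares with other Central European zones.

```bash
#!/usr/bin/env bash
failures=0

# check NAME COMMAND [ARGS...]
# Run COMMAND quietly, print a pass/fail line, and count failures.
check() {
  local name="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "[ok]   ${name}"
  else
    echo "[FAIL] ${name}"
    failures=$((failures + 1))
  fi
}

check "nvidia-smi shows GPU" nvidia-smi
check "torch.cuda.is_available() returns True" \
  python3 -c "import sys, torch; sys.exit(0 if torch.cuda.is_available() else 1)"
check "/workspace exists" test -d /workspace
check "Timezone reports CET/CEST" \
  sh -c 'case "$(date +%Z)" in CET|CEST) exit 0 ;; *) exit 1 ;; esac'
check "Essential tools installed" \
  sh -c 'for t in vim htop tmux curl wget git; do command -v "$t" || exit 1; done'

echo "${failures} check(s) failed"
```

The account, billing, and volume attachment items still need to be confirmed in the RunPod dashboard.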

**Ready for Tailscale setup? Let's go!**