# GPU Server Setup Guide - Week 1

## Day 1-2: RunPod Account & GPU Server
### Step 1: Create RunPod Account

1. Go to RunPod: https://www.runpod.io/
2. Sign up with email or GitHub
3. Add a billing method:
   - Credit card required
   - No charges until you deploy a pod
   - Recommended: add $50 initial credit
4. Verify email and complete account setup
### Step 2: Deploy Your First GPU Pod

#### 2.1 Navigate to Pods

- Click "Deploy" in the top menu
- Select "GPU Pods"

#### 2.2 Choose GPU Type

Recommended: RTX 4090

- 24GB VRAM
- ~$0.50/hour
- Handles LLMs up to ~14B parameters
- Great for SDXL/FLUX

Filter options:

- GPU Type: RTX 4090
- GPU Count: 1
- Sort by: Price (lowest first)
- Region: Europe (lower latency to Germany)
#### 2.3 Select Template

Choose the "RunPod PyTorch" template:

- Includes CUDA, PyTorch, Python
- Pre-configured for GPU workloads
- Docker pre-installed

Alternative: "Ubuntu 22.04 with CUDA 12.1" gives you more control but more setup; see the GPU-in-Docker check below.
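If you take the bare Ubuntu route (and Docker is available in your template), it's worth confirming early that containers can reach the GPU. A minimal sketch, assuming the NVIDIA Container Toolkit is already present; the exact CUDA image tag is an assumption, substitute any base image you trust:

```bash
# Assumes the NVIDIA Container Toolkit is installed.
# The image tag is an assumption - any CUDA base image works for this check.
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
# Success looks like the same GPU table you get on the host.
```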
#### 2.4 Configure Pod

Container settings:

- Container Disk: 50GB (temporary, auto-included)
- Expose ports:
  - 22 (SSH)
  - 8000 (vLLM)
  - 8188 (ComfyUI)
  - 8888 (JupyterLab)

Volume settings:

- Click "+ Network Volume"
- Name: `gpu-models-storage`
- Size: 500GB
- Region: same as the pod
- Cost: ~$50/month

Environment variables:

- Add later (not needed for initial setup)
#### 2.5 Deploy Pod

- Review the configuration
- Click "Deploy On-Demand" (not Spot, for reliability)
- Wait 2-3 minutes for deployment

Expected cost:

- GPU: $0.50/hour = $360/month (24/7)
- Storage: $50/month
- Total: ~$410/month
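To sanity-check that arithmetic, or to re-run it for a different GPU rate, a throwaway `awk` snippet works (the figures below are just the assumed rates from this guide):

```bash
# Assumed rates from above: $0.50/hr GPU, $50/mo storage - adjust as needed
awk -v rate=0.50 -v storage=50 'BEGIN {
  gpu = rate * 24 * 30   # hours in a 30-day month
  printf "GPU: $%.0f/mo  Storage: $%.0f/mo  Total: $%.0f/mo\n", gpu, storage, gpu + storage
}'
```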
### Step 3: Access Your GPU Server

#### 3.1 Get Connection Info

Once deployed, you'll see:

- Pod ID: e.g., `abc123def456`
- SSH command: `ssh root@<pod-id>.runpod.io -p 12345`
- Public IP: may not be directly accessible (use SSH)
#### 3.2 SSH Access

RunPod automatically generates SSH keys for you:

```bash
# Copy the SSH command from the RunPod dashboard
ssh root@abc123def456.runpod.io -p 12345

# First time: accept the host fingerprint
# You should now be on the GPU server!
```

Verify the GPU:

```bash
nvidia-smi
```
Expected output:

```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.xx             Driver Version: 535.xx     CUDA Version: 12.1 |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 30%   45C    P0    50W / 450W |      0MiB / 24564MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
```
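If you just want the essentials rather than the full table, `nvidia-smi` supports a compact CSV query (these field names are standard, but `nvidia-smi --help-query-gpu` lists exactly what your driver version accepts):

```bash
# One-line summary: GPU name, total VRAM, driver version
nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv,noheader
```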
### Step 4: Initial Server Configuration

#### 4.1 Update System

```bash
# Update package lists
apt update

# Upgrade existing packages
apt upgrade -y

# Install essential tools
apt install -y \
    vim \
    htop \
    tmux \
    curl \
    wget \
    git \
    net-tools \
    iptables-persistent
```
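Optionally, a quick loop to confirm everything landed (plain POSIX shell, nothing assumed beyond the package list above):

```bash
# Report any of the essential tools that failed to install
for tool in vim htop tmux curl wget git netstat; do
  command -v "$tool" >/dev/null 2>&1 || echo "MISSING: $tool"
done
# netstat comes from the net-tools package
```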
#### 4.2 Set Timezone

```bash
timedatectl set-timezone Europe/Berlin
date  # Verify
```
#### 4.3 Create Working Directory

```bash
# Create workspace subdirectories
mkdir -p /workspace/{models,configs,data,scripts}

# Check the network volume mount
ls -la /workspace
# Should show your 500GB volume
```
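`ls` only shows that the directory exists; to confirm `/workspace` is actually backed by the 500GB network volume rather than the 50GB container disk:

```bash
df -h /workspace
# Size should read roughly 500G; if it shows ~50G, the network volume
# is not attached - see Troubleshooting below.
```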
#### 4.4 Configure SSH (Optional but Recommended)

Generate your own SSH key on your local machine:

```bash
# On your local machine (not the GPU server)
ssh-keygen -t ed25519 -C "gpu-server-pivoine" -f ~/.ssh/gpu_pivoine

# Copy the public key to the GPU server (note: -p goes before the host)
ssh-copy-id -i ~/.ssh/gpu_pivoine.pub -p 12345 root@abc123def456.runpod.io
```

Add to your local `~/.ssh/config`:

```
Host gpu-pivoine
    HostName abc123def456.runpod.io
    Port 12345
    User root
    IdentityFile ~/.ssh/gpu_pivoine
```

Now you can connect with `ssh gpu-pivoine`.
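Once the key-based login works, you can optionally disable password logins on the server. A hedged sketch, assuming stock OpenSSH on Ubuntu; keep a second session open until you've confirmed you can still get in:

```bash
# On the GPU server - only after 'ssh gpu-pivoine' works with the key!
sed -i 's/^#\?PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
# The service name varies by image; try both
service ssh restart 2>/dev/null || systemctl restart sshd
```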
### Step 5: Verify GPU Access

Run this test:

```bash
# Test CUDA from PyTorch
python3 -c "import torch; print('CUDA available:', torch.cuda.is_available()); print('GPU count:', torch.cuda.device_count())"
```

Expected output:

```
CUDA available: True
GPU count: 1
```
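For a slightly deeper look (device name and VRAM as PyTorch sees them), a short follow-up using standard `torch.cuda` calls:

```bash
python3 - <<'EOF'
import torch

# Print model and total VRAM for every GPU PyTorch can see
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.1f} GB VRAM")
EOF
```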
## Troubleshooting

Problem: Can't connect via SSH

- Check that the pod is running (not stopped)
- Verify the port number in the SSH command
- Try the web terminal in the RunPod dashboard

Problem: GPU not detected

- Run `nvidia-smi`
- Check that RunPod selected the correct GPU type
- Restart the pod if needed

Problem: Network volume not mounted

- Check RunPod dashboard → Volume tab
- Verify the volume is attached to the pod
- Try `df -h` to see mounts

A combined one-shot check for all three problems follows below.
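This sketch assumes the ports exposed in Step 2.4 (`ss` ships with Ubuntu's iproute2):

```bash
# GPU visible?
nvidia-smi -L || echo "GPU not visible - check pod GPU type, restart if needed"
# Network volume mounted?
df -h /workspace || echo "/workspace missing - check the Volume tab"
# Anything listening on the exposed ports yet? (empty is normal before Day 3+)
ss -tlnp | grep -E ':(22|8000|8188|8888)\b'
```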
## Next Steps

Once SSH access works and the GPU is verified: ✅ Proceed to Day 3-4: Network Configuration (Tailscale VPN)
## Save Important Info

Create a file to track your setup:

```bash
# On the GPU server
cat > /workspace/SERVER_INFO.md << 'EOF'
# GPU Server Information

## Connection
- SSH: ssh root@abc123def456.runpod.io -p 12345
- Pod ID: abc123def456
- Region: [YOUR_REGION]

## Hardware
- GPU: RTX 4090 24GB
- CPU: [Check with: lscpu]
- RAM: [Check with: free -h]
- Storage: 500GB network volume at /workspace

## Costs
- GPU: $0.50/hour
- Storage: $50/month
- Total: ~$410/month (24/7)

## Deployed: [DATE]
EOF
```
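You can also append the live hardware values instead of filling the placeholders by hand (a sketch; the patterns match standard `lscpu` and `free -h` output):

```bash
# Append actual CPU/RAM facts to the file created above
{
  echo "- CPU (actual): $(lscpu | sed -n 's/^Model name:[[:space:]]*//p')"
  echo "- RAM (actual): $(free -h | awk '/^Mem:/ {print $2}')"
} >> /workspace/SERVER_INFO.md
```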
## Checkpoint ✓

Before moving to Day 3, verify:

- RunPod account created and billing added
- RTX 4090 pod deployed successfully
- 500GB network volume attached
- SSH access working
- `nvidia-smi` shows the GPU
- `torch.cuda.is_available()` returns True
- Timezone set to Europe/Berlin
- Essential tools installed

Ready for Tailscale setup? Let's go!