Docker & NVIDIA Container Toolkit Setup
Day 5: Docker Configuration on GPU Server
This guide sets up Docker with GPU support on your RunPod server.
Step 1: Install Docker
Quick Install (Recommended)
# SSH into GPU server
ssh gpu-pivoine
# Download and run Docker install script
curl -fsSL https://get.docker.com -o get-docker.sh
sh get-docker.sh
# Verify installation
docker --version
docker compose version
Expected output:
Docker version 24.0.7, build afdd53b
Docker Compose version v2.23.0
Manual Install (Alternative)
# Add Docker's official GPG key
apt-get update
apt-get install -y ca-certificates curl gnupg
install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
chmod a+r /etc/apt/keyrings/docker.gpg
# Add repository
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
tee /etc/apt/sources.list.d/docker.list > /dev/null
# Install Docker
apt-get update
apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
# Start Docker
systemctl enable docker
systemctl start docker
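Whichever install path you used, a quick sanity check (no GPU involved yet) confirms the daemon can pull and run containers:
# Pull and run a throwaway container to confirm Docker works end to end
docker run --rm hello-world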
Step 2: Install NVIDIA Container Toolkit
This enables Docker containers to use the GPU.
# Add NVIDIA repository
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# Install toolkit
apt-get update
apt-get install -y nvidia-container-toolkit
# Configure Docker to use NVIDIA runtime
nvidia-ctk runtime configure --runtime=docker
# Restart Docker
systemctl restart docker
Step 3: Test GPU Access in Docker
Test 1: Basic CUDA Container
docker run --rm --runtime=nvidia --gpus all \
nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
Expected output: the same table you get from running nvidia-smi on the host, showing your RTX 4090.
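If you want a more script-friendly check, the same container can report just the key GPU fields in CSV form using standard nvidia-smi query flags:
# Same check, but print only the GPU name, driver version, and total memory
docker run --rm --runtime=nvidia --gpus all \
  nvidia/cuda:12.1.0-base-ubuntu22.04 \
  nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv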
Test 2: PyTorch Container
docker run --rm --runtime=nvidia --gpus all \
pytorch/pytorch:2.3.0-cuda12.1-cudnn8-runtime \
python -c "import torch; print('CUDA:', torch.cuda.is_available(), 'Device:', torch.cuda.get_device_name(0))"
Expected output:
CUDA: True Device: NVIDIA GeForce RTX 4090
Test 3: Multi-GPU Query (if you have multiple GPUs)
docker run --rm --runtime=nvidia --gpus all \
nvidia/cuda:12.1.0-base-ubuntu22.04 \
bash -c "echo 'GPU Count:' && nvidia-smi --list-gpus"
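With a single RTX 4090 the output should look roughly like this (UUID elided):
GPU Count:
GPU 0: NVIDIA GeForce RTX 4090 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)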
Step 4: Configure Docker Compose with GPU Support
Docker Compose needs to know about the NVIDIA runtime.
Create daemon.json
cat > /etc/docker/daemon.json << 'EOF'
{
"runtimes": {
"nvidia": {
"path": "nvidia-container-runtime",
"runtimeArgs": []
}
},
"default-runtime": "nvidia",
"log-driver": "json-file",
"log-opts": {
"max-size": "10m",
"max-file": "3"
}
}
EOF
# Restart Docker
systemctl restart docker
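To confirm the daemon picked up the new configuration, check the registered runtimes, then re-run the CUDA test without any GPU flags. Because default-runtime is now nvidia (and the CUDA base images request all visible devices), the GPU should still appear:
# The runtimes list should now include nvidia, with nvidia as the default
docker info | grep -iA2 runtime

# With default-runtime set to nvidia, the GPU should be visible even without --gpus
docker run --rm nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi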
Step 5: Create GPU Project Structure
cd /workspace
# Create directory structure
mkdir -p gpu-stack/{vllm,comfyui,training,jupyter,monitoring}
cd gpu-stack
# Create .env file
cat > .env << 'EOF'
# GPU Stack Environment Variables
# Timezone
TIMEZONE=Europe/Berlin
# VPN Network
VPS_IP=10.8.0.1
GPU_IP=10.8.0.2
# Model Storage
MODELS_PATH=/workspace/models
# Hugging Face (optional, for private models)
HF_TOKEN=
# PostgreSQL (on VPS)
DB_HOST=10.8.0.1
DB_PORT=5432
DB_USER=valknar
DB_PASSWORD=ragnarok98
DB_NAME=openwebui
# Weights & Biases (optional, for training logging)
WANDB_API_KEY=
EOF
chmod 600 .env
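Before any services try to reach the VPS over the VPN, it is worth confirming the tunnel IPs from the .env file actually answer. A quick hedged check (adjust the addresses if your VPN uses a different subnet):
# Confirm the VPS is reachable over the VPN tunnel
ping -c 2 10.8.0.1

# Confirm the VPS PostgreSQL port answers from the GPU server (uses bash's /dev/tcp)
timeout 3 bash -c 'cat < /dev/null > /dev/tcp/10.8.0.1/5432' \
  && echo "Postgres port reachable" || echo "Postgres port NOT reachable"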
Step 6: Test Full Stack (Quick Smoke Test)
Let's deploy a minimal vLLM container to verify everything works:
cd /workspace/gpu-stack
# Create test compose file
cat > test-compose.yaml << 'EOF'
services:
test-vllm:
image: vllm/vllm-openai:latest
container_name: test_vllm
runtime: nvidia
environment:
NVIDIA_VISIBLE_DEVICES: all
command:
- --model
- facebook/opt-125m # Tiny model for testing
- --host
- 0.0.0.0
- --port
- "8000"
ports:
- "8000:8000"
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
EOF
# Start test
docker compose -f test-compose.yaml up -d
# Wait 30 seconds for model download
sleep 30
# Check logs
docker compose -f test-compose.yaml logs
# Test inference
curl http://localhost:8000/v1/completions \
-H "Content-Type: application/json" \
-d '{
"model": "facebook/opt-125m",
"prompt": "Hello, my name is",
"max_tokens": 10
}'
Expected output: a JSON response containing the generated completion text.
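If jq is available (it may need apt install -y jq), a hedged one-liner pulls just the generated text out of that response:
# Extract only the generated text from the completion response (assumes jq is installed)
curl -s http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "facebook/opt-125m", "prompt": "Hello, my name is", "max_tokens": 10}' \
  | jq -r '.choices[0].text'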
Clean up test:
docker compose -f test-compose.yaml down
Step 7: Install Additional Tools
# Python tools
apt install -y python3-pip python3-venv
# Monitoring tools
apt install -y htop nvtop iotop
# Network tools
apt install -y iperf3 tcpdump
# Development tools
apt install -y build-essential
# Git LFS (for large model files)
apt install -y git-lfs
git lfs install
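Since model downloads and database traffic will cross the VPN, iperf3 can benchmark the tunnel between the two hosts. A hedged sketch, assuming iperf3 is also installed on the VPS and the 10.8.0.x addresses from the .env above:
# On the VPS (10.8.0.1): start an iperf3 server
iperf3 -s

# On the GPU server: run a 10-second throughput test against the VPS over the VPN
iperf3 -c 10.8.0.1 -t 10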
Step 8: Configure Automatic Updates (Optional)
# Install unattended-upgrades
apt install -y unattended-upgrades
# Configure
dpkg-reconfigure -plow unattended-upgrades
# Enable automatic security updates
cat > /etc/apt/apt.conf.d/50unattended-upgrades << 'EOF'
Unattended-Upgrade::Allowed-Origins {
"${distro_id}:${distro_codename}-security";
};
Unattended-Upgrade::Automatic-Reboot "false";
Unattended-Upgrade::Remove-Unused-Dependencies "true";
EOF
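To confirm the configuration parses and see what would be upgraded without changing anything:
# Dry run: show what unattended-upgrades would do, without applying changes
unattended-upgrade --dry-run --debug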
Troubleshooting
Docker can't access GPU
Problem: docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]
Solution:
# Verify NVIDIA runtime is configured
docker info | grep -i runtime
# Should show nvidia in runtimes list
# If not, reinstall nvidia-container-toolkit
# Check daemon.json
cat /etc/docker/daemon.json
# Restart Docker
systemctl restart docker
Permission denied on docker commands
Solution:
# Add your user to docker group (if not root)
usermod -aG docker $USER
# Or always use sudo
sudo docker ...
Out of disk space
Check usage:
df -h
du -sh /var/lib/docker
docker system df
Clean up:
# Remove unused images
docker image prune -a
# Remove unused volumes
docker volume prune
# Full cleanup
docker system prune -a --volumes
Verification Checklist
Before deploying the full stack, confirm each item below (a script after this checklist automates most of these checks):
- Docker installed and running
- docker --version shows 24.x or newer
- docker compose version works
- NVIDIA Container Toolkit installed
- docker run --gpus all nvidia/cuda:12.1.0-base nvidia-smi works
- PyTorch container can see GPU
- Test vLLM deployment successful
- /workspace directory structure created
- .env file configured with VPN IPs
- Additional tools installed (nvtop, htop, etc.)
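Most of the checklist can be run automatically. A hedged sketch, using the paths and image tag from the earlier steps (adjust if your layout differs):
#!/usr/bin/env bash
# Hedged sketch: run most of the checklist above automatically.
set -u

check() {
  # Print PASS/FAIL for a named command
  local name="$1"; shift
  if "$@" > /dev/null 2>&1; then echo "PASS  $name"; else echo "FAIL  $name"; fi
}

check "docker installed"           docker --version
check "docker compose plugin"      docker compose version
check "GPU visible in container"   docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
check "gpu-stack directory"        test -d /workspace/gpu-stack
check ".env file present"          test -f /workspace/gpu-stack/.env
check "nvtop installed"            command -v nvtop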
Performance Monitoring Commands
GPU Monitoring:
# Real-time GPU stats
watch -n 1 nvidia-smi
# Or with nvtop (prettier)
nvtop
# GPU memory usage
nvidia-smi --query-gpu=memory.used,memory.total --format=csv
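For a longer benchmark or training run, the same query can be logged continuously to CSV. A small example (the output path is just an illustration; --loop polls at the given interval in seconds):
# Log GPU utilization and memory every 5 seconds to a CSV file (Ctrl+C to stop)
nvidia-smi --query-gpu=timestamp,utilization.gpu,memory.used,memory.total \
  --format=csv --loop=5 >> /workspace/gpu-usage.csv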
Docker Stats:
# Container resource usage
docker stats
# Specific container
docker stats vllm --no-stream
System Resources:
# Overall system
htop
# I/O stats
iotop
# Network (requires iftop: apt install -y iftop)
iftop
Next: Deploy Production Stack
Now you're ready to deploy the full GPU stack with vLLM, ComfyUI, and training tools.
Proceed to: Deploying the production docker-compose.yaml
Save your progress:
cat >> /workspace/SERVER_INFO.md << 'EOF'
## Docker Configuration
- Docker Version: [docker --version]
- NVIDIA Runtime: Enabled
- GPU Access in Containers: ✓
- Test vLLM Deployment: Successful
- Directory: /workspace/gpu-stack
## Tools Installed
- nvtop: GPU monitoring
- htop: System monitoring
- Docker Compose: v2.x
- Git LFS: Large file support
EOF