docker-compose/ai at bb3dabcba73504c054222beaae5190237d34a59a - docker-compose - dev.pivoine.art

valknar/docker-compose

Files

History

Sebastian Krüger bb3dabcba7 feat(ai): complete GPU deployment with self-hosted Qwen 2.5 7B model

This commit finalizes the GPU infrastructure deployment on RunPod:

- Added qwen-2.5-7b model to LiteLLM configuration
  - Self-hosted on RunPod RTX 4090 GPU server
  - Connected via Tailscale VPN (100.100.108.13:8000)
  - OpenAI-compatible API endpoint
  - Rate limits: 1000 RPM, 100k TPM

- Marked GPU deployment as COMPLETE in deployment log
  - vLLM 0.6.4.post1 with custom AsyncLLMEngine server
  - Qwen/Qwen2.5-7B-Instruct model (14.25 GB)
  - 85% GPU memory utilization, 4096 context length
  - Successfully integrated with Open WebUI at ai.pivoine.art

Infrastructure:
- Provider: RunPod Spot Instance (~$0.50/hr)
- GPU: NVIDIA RTX 4090 24GB
- Disk: 50GB local SSD + 922TB network volume
- VPN: Tailscale (replaces WireGuard due to RunPod UDP restrictions)

Model now visible and accessible in Open WebUI for end users.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-11-21 13:18:17 +01:00

..

feat: add PostgreSQL initialization script for AI stack

2025-11-09 18:36:50 +01:00

compose.yaml

perf: optimize LiteLLM for better performance

2025-11-16 16:03:19 +01:00

deploy-gpu-stack.sh

docs(ai): add comprehensive GPU setup documentation and configs

2025-11-21 12:57:06 +01:00

disable-nsfw-filter.patch

fix: correct patch for Facefusion 3.5.0 content_analyser.py

2025-11-13 04:28:50 +00:00

DOCKER_GPU_SETUP.md

docs(ai): add comprehensive GPU setup documentation and configs

2025-11-21 12:57:06 +01:00

Dockerfile

fix: update content_analyser hash check in core.py for patched version

2025-11-13 06:16:14 +01:00

entrypoint.sh

fix: add Gradio environment variables and remove conflicting command

2025-11-13 05:52:13 +01:00

GPU_DEPLOYMENT_LOG.md

feat(ai): complete GPU deployment with self-hosted Qwen 2.5 7B model

2025-11-21 13:18:17 +01:00

GPU_EXPANSION_PLAN.md

docs(ai): add comprehensive GPU setup documentation and configs

2025-11-21 12:57:06 +01:00

gpu-server-compose.yaml

docs(ai): add comprehensive GPU setup documentation and configs

2025-11-21 12:57:06 +01:00

litellm-config-gpu.yaml

docs(ai): add comprehensive GPU setup documentation and configs

2025-11-21 12:57:06 +01:00

litellm-config.yaml

feat(ai): complete GPU deployment with self-hosted Qwen 2.5 7B model

2025-11-21 13:18:17 +01:00

README_GPU_SETUP.md

docs(ai): add comprehensive GPU setup documentation and configs

2025-11-21 12:57:06 +01:00

SETUP_GUIDE.md

docs(ai): add comprehensive GPU setup documentation and configs

2025-11-21 12:57:06 +01:00

simple_vllm_server.py

feat(ai): add GPU server deployment with vLLM and Tailscale

2025-11-21 12:56:57 +01:00

TAILSCALE_SETUP.md

docs(ai): add comprehensive GPU setup documentation and configs

2025-11-21 12:57:06 +01:00

WIREGUARD_SETUP.md

docs(ai): add comprehensive GPU setup documentation and configs

2025-11-21 12:57:06 +01:00