Configure WebUI with both LiteLLM and direct orchestrator API base URLs

- Bypasses LiteLLM's streaming issues for the qwen-2.5-7b model
- WebUI will now show models from both endpoints
- Allows testing whether LiteLLM is the bottleneck for streaming

Related to streaming fix in RunPod models/vllm/server.py
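A minimal sketch of how such a dual-endpoint setup might look, assuming Open WebUI is the frontend: it accepts multiple OpenAI-compatible backends via the semicolon-separated `OPENAI_API_BASE_URLS` environment variable. The hostnames, ports, and keys below are placeholders, not this project's actual endpoints.

```shell
# Point Open WebUI at two OpenAI-compatible backends at once:
# the LiteLLM proxy and the orchestrator's direct API.
# Hostnames/ports/keys are illustrative placeholders.
docker run -d -p 3000:8080 \
  -e OPENAI_API_BASE_URLS="http://litellm:4000/v1;http://orchestrator:8000/v1" \
  -e OPENAI_API_KEYS="sk-litellm-placeholder;sk-orchestrator-placeholder" \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```

With both base URLs registered, the model picker lists models from each endpoint, so the same prompt can be streamed through LiteLLM and through the direct API to compare behavior.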