docker-compose/ai
Sebastian Krüger 62fcf832da feat: add direct RunPod orchestrator connection to WebUI for streaming bypass
- Configure WebUI with both LiteLLM and direct orchestrator API base URLs
- This bypasses LiteLLM's streaming issues for the qwen-2.5-7b model
- WebUI will now show models from both endpoints
- Allows testing whether LiteLLM is the bottleneck for streaming

Related to streaming fix in RunPod models/vllm/server.py
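
For illustration only, the dual-endpoint setup described above could look roughly like the compose fragment below. It is a sketch, not the actual change in this commit. It assumes "WebUI" means Open WebUI, which reads a semicolon-separated `OPENAI_API_BASE_URLS` list with matching `OPENAI_API_KEYS`. The service name, hostnames, ports, and key variable names are hypothetical placeholders.

```yaml
# Hypothetical sketch: service name, hosts, ports, and key variables are placeholders.
services:
  webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      # Semicolon-separated list of OpenAI-compatible endpoints.
      # First entry: LiteLLM proxy. Second entry: direct orchestrator (assumed URL).
      OPENAI_API_BASE_URLS: "http://litellm:4000/v1;http://orchestrator.example:9000/v1"
      # One key per base URL, in the same order as above.
      OPENAI_API_KEYS: "${LITELLM_API_KEY};${ORCHESTRATOR_API_KEY}"
```

With both endpoints listed, the same model can be selected through either route, so streaming behaviour can be compared with and without LiteLLM in the path.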
2025-11-21 18:38:31 +01:00