Sebastian Krüger d21caa56bc fix: implement incremental streaming deltas for vLLM chat completions
- Track previous_text to calculate deltas instead of sending the full accumulated text (a minimal sketch follows below)
- Fixes WebUI streaming issue where responses appeared empty
- Only send new tokens in each SSE chunk delta
- Restores OpenAI API compatibility for streaming chat completions
2025-11-21 17:23:18 +01:00
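In OpenAI-compatible streaming, each SSE chunk's `delta` should carry only the text generated since the previous chunk. The sketch below illustrates the pattern the commit describes: remember the previously emitted text and send just the new suffix. It assumes vLLM-style generator outputs whose `.text` field holds the full accumulated text so far; the function name `stream_chat_completion` and its parameters are illustrative, not the repository's actual code.

```python
import json
from typing import Any, AsyncIterator


async def stream_chat_completion(
    request_id: str, model: str, outputs: AsyncIterator[Any]
) -> AsyncIterator[str]:
    """Yield OpenAI-style SSE chunks containing only newly generated text.

    Hypothetical sketch: `outputs` is assumed to yield objects whose
    `.text` attribute holds the full accumulated generation so far, as
    vLLM's request outputs typically do.
    """
    previous_text = ""
    async for output in outputs:
        full_text = output.text
        # Delta = suffix added since the last chunk, not the whole text.
        delta_text = full_text[len(previous_text):]
        previous_text = full_text
        if not delta_text:
            continue
        chunk = {
            "id": request_id,
            "object": "chat.completion.chunk",
            "model": model,
            "choices": [
                {
                    "index": 0,
                    "delta": {"content": delta_text},
                    "finish_reason": None,
                }
            ],
        }
        yield f"data: {json.dumps(chunk)}\n\n"
    yield "data: [DONE]\n\n"
```

Because the generator returns the cumulative text on every iteration, slicing off the already-sent prefix is what keeps clients such as the WebUI from receiving repeated or seemingly empty streaming content.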