Sebastian Krüger d21caa56bc fix: implement incremental streaming deltas for vLLM chat completions
- Track previous_text to calculate deltas instead of sending the full accumulated text (a minimal sketch follows below)
- Fixes WebUI streaming issue where responses appeared empty
- Only send new tokens in each SSE chunk delta
- Restores OpenAI API compatibility for streaming chat completions
2025-11-21 17:23:18 +01:00
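In OpenAI-compatible streaming, each SSE chunk's `delta` should carry only the text generated since the previous chunk. The sketch below illustrates the pattern the commit describes: remember the previously emitted text and send just the new suffix. It assumes vLLM-style generator outputs whose `.text` field holds the full accumulated text so far; the function name `stream_chat_completion` and its parameters are illustrative, not the repository's actual code.

```python
import json
from typing import Any, AsyncIterator


async def stream_chat_completion(
    request_id: str, model: str, outputs: AsyncIterator[Any]
) -> AsyncIterator[str]:
    """Yield OpenAI-style SSE chunks containing only newly generated text.

    Hypothetical sketch: `outputs` is assumed to yield objects whose
    `.text` attribute holds the full accumulated generation so far, as
    vLLM's request outputs typically do.
    """
    previous_text = ""
    async for output in outputs:
        full_text = output.text
        # Delta = suffix added since the last chunk, not the whole text.
        delta_text = full_text[len(previous_text):]
        previous_text = full_text
        if not delta_text:
            continue
        chunk = {
            "id": request_id,
            "object": "chat.completion.chunk",
            "model": model,
            "choices": [
                {
                    "index": 0,
                    "delta": {"content": delta_text},
                    "finish_reason": None,
                }
            ],
        }
        yield f"data: {json.dumps(chunk)}\n\n"
    yield "data: [DONE]\n\n"
```

Because the generator returns the cumulative text on every iteration, slicing off the already-sent prefix is what keeps clients such as the WebUI from receiving repeated or seemingly empty streaming content.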