- Add ComfyUI installation to Ansible playbook with 'comfyui' tag
- Create ComfyUI requirements.txt and start.sh script
- Clone ComfyUI from official GitHub repository
- Symlink HuggingFace cache for Flux model access
- ComfyUI runs on port 8188 with CORS enabled
- Add ComfyUI to services list in playbook
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Set enforce_eager=True to disable CUDA graphs, which were causing outputs to be batched
- Add disable_log_stats=True for better streaming performance
- This ensures AsyncLLMEngine yields tokens incrementally instead of returning the complete response
- Track previous_text to calculate deltas instead of sending full accumulated text
- Fixes WebUI streaming issue where responses appeared empty
- Only send new tokens in each SSE chunk delta
- Resolves OpenAI API compatibility for streaming chat completions
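
A minimal sketch of the delta tracking described above, assuming the engine yields the full accumulated text on each step. The helper names `iter_deltas` and `to_sse_chunk` are hypothetical and are not taken from the changed code; the chunk payload is a reduced form of an OpenAI-style `chat.completion.chunk` and omits fields such as `id` and `model`.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(accumulated_texts: Iterable[str]) -> Iterator[str]:
    """Yield only the text that is new since the previous step.

    Assumes each item is the full text generated so far (hypothetical input
    shape), so the delta is the suffix past the previously seen length.
    """
    previous_text = ""
    for text in accumulated_texts:
        delta = text[len(previous_text):]
        previous_text = text
        if delta:  # skip steps that produced no new characters
            yield delta


def to_sse_chunk(delta: str) -> str:
    """Wrap one delta in a single SSE 'data:' event (reduced payload)."""
    payload = {
        "object": "chat.completion.chunk",
        "choices": [{"index": 0, "delta": {"content": delta}}],
    }
    return f"data: {json.dumps(payload)}\n\n"


if __name__ == "__main__":
    steps = ["He", "Hello", "Hello wor", "Hello world"]
    for piece in iter_deltas(steps):
        print(to_sse_chunk(piece), end="")
    print("data: [DONE]\n")
```

Sending only the suffix keeps each event small and lets the client rebuild the response by concatenating the `delta.content` values in order.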