Commit Graph

11 Commits

18cd87fbd1 refactor: reorganize webdav-sync into dedicated directory
Clean up the project structure by moving the WebDAV sync service into its own directory.

Changes:
- Move scripts/comfyui_webdav_sync.py → webdav-sync/webdav_sync.py
- Create webdav-sync/requirements.txt with watchdog and webdavclient3
- Remove webdav dependencies from model-orchestrator/requirements.txt
- Delete unused scripts/ folder (start-all.sh, status.sh, stop-all.sh)
- Update supervisord.conf to use new path /workspace/ai/webdav-sync/webdav_sync.py

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 19:03:48 +01:00
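A plausible sketch of the updated supervisord entry: only the script path is confirmed by this commit; the interpreter, directory, and option set are assumptions, and the ENV forwarding uses supervisord's standard `%(ENV_x)s` expansion (the service commit below mentions ENV variable support):

```ini
[program:webdav-sync]
; Script path confirmed by the commit; the remaining options are assumed.
command=python3 /workspace/ai/webdav-sync/webdav_sync.py
directory=/workspace/ai/webdav-sync
autostart=true
autorestart=true
; Forward WebDAV credentials from the container environment (loaded from .env).
environment=WEBDAV_URL="%(ENV_WEBDAV_URL)s",WEBDAV_USERNAME="%(ENV_WEBDAV_USERNAME)s",WEBDAV_PASSWORD="%(ENV_WEBDAV_PASSWORD)s",WEBDAV_REMOTE_PATH="%(ENV_WEBDAV_REMOTE_PATH)s"
```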
79442bd62e feat: add WebDAV sync service for ComfyUI outputs
Add a Python watchdog service that automatically syncs ComfyUI outputs to HiDrive WebDAV storage.

Changes:
- Add scripts/comfyui_webdav_sync.py: File watcher service using watchdog + webdavclient3
- Update model-orchestrator/requirements.txt: Add watchdog and webdavclient3 dependencies
- Update supervisord.conf: Add webdav-sync program with ENV variable support
- Update arty.yml: Add service management scripts (start/stop/restart/status/logs)

WebDAV credentials are now loaded from the .env file (WEBDAV_URL, WEBDAV_USERNAME, WEBDAV_PASSWORD, WEBDAV_REMOTE_PATH).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 18:58:18 +01:00
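A minimal sketch of the watcher pattern this commit describes, assuming the environment-variable names from the message; the watched directory and remote naming scheme are illustrative, not the actual implementation:

```python
import os
import time

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer
from webdav3.client import Client  # provided by the webdavclient3 package

# Credentials come from the environment (populated from .env), per the commit.
client = Client({
    "webdav_hostname": os.environ["WEBDAV_URL"],
    "webdav_login": os.environ["WEBDAV_USERNAME"],
    "webdav_password": os.environ["WEBDAV_PASSWORD"],
})
REMOTE_PATH = os.environ.get("WEBDAV_REMOTE_PATH", "/comfyui-outputs")

class OutputHandler(FileSystemEventHandler):
    def on_created(self, event):
        if event.is_directory:
            return
        # A real service would likely wait for the file to finish writing;
        # this sketch uploads immediately, mirroring the local basename.
        remote = f"{REMOTE_PATH}/{os.path.basename(event.src_path)}"
        client.upload_sync(remote_path=remote, local_path=event.src_path)

if __name__ == "__main__":
    observer = Observer()
    # Watched directory is hypothetical; ComfyUI's output folder is configurable.
    observer.schedule(OutputHandler(), "/workspace/ComfyUI/output", recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
```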
7edf17551a Add Llama 3.1 8B model to orchestrator 2025-11-21 21:29:57 +01:00
c4426ccc58 Remove Flux and MusicGen models from orchestrator
ComfyUI now handles Flux image generation directly.
MusicGen is not being used and has been removed.
2025-11-21 21:14:43 +01:00
34fc456a3b feat: add /v1/models endpoint and fix streaming proxy in subprocess orchestrator 2025-11-21 19:34:20 +01:00
91089d3edc feat: add /v1/models endpoint and systemd service for orchestrator
- Add OpenAI-compatible /v1/models endpoint to list available models
- Create systemd service file for proper service management
- Service runs as root with automatic restart on failure
- Logs to systemd journal for easy debugging
2025-11-21 19:28:16 +01:00
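The /v1/models list shape is fixed by the OpenAI API; a minimal sketch of such an endpoint, with the FastAPI wiring and model registry as assumptions about this codebase:

```python
import time

from fastapi import FastAPI

app = FastAPI()

# Hypothetical registry; the real orchestrator presumably reads models.yaml.
AVAILABLE_MODELS = ["qwen-2.5-7b"]

@app.get("/v1/models")
def list_models():
    # OpenAI-compatible shape: an object of type "list" wrapping model entries.
    return {
        "object": "list",
        "data": [
            {
                "id": name,
                "object": "model",
                "created": int(time.time()),
                "owned_by": "orchestrator",
            }
            for name in AVAILABLE_MODELS
        ],
    }
```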
9947fe37bb fix: properly proxy streaming requests without buffering
The orchestrator was calling response.json(), which buffered the entire
streaming response before returning it. This caused LiteLLM to receive
only one chunk with empty content instead of token-by-token streaming.

Changes:
- Detect streaming requests by parsing request body for 'stream': true
- Use client.stream() with aiter_bytes() for streaming requests
- Return StreamingResponse with proper SSE headers
- Keep original JSONResponse behavior for non-streaming requests

This fixes streaming through the vLLM → orchestrator → LiteLLM chain.
2025-11-21 19:21:56 +01:00
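A sketch of the fix as described, using httpx with FastAPI; the route, upstream URL, and app wiring are assumptions:

```python
import json

import httpx
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse, StreamingResponse

app = FastAPI()
UPSTREAM = "http://localhost:8000/v1/chat/completions"  # hypothetical vLLM address

@app.post("/v1/chat/completions")
async def proxy(request: Request):
    body = await request.body()
    # Detect streaming by parsing the request body for "stream": true.
    is_stream = json.loads(body or b"{}").get("stream", False)
    headers = {"Content-Type": "application/json"}

    if not is_stream:
        # Non-streaming requests keep the original buffered JSON behavior.
        async with httpx.AsyncClient() as client:
            resp = await client.post(UPSTREAM, content=body, headers=headers,
                                     timeout=None)
            return JSONResponse(resp.json(), status_code=resp.status_code)

    async def relay():
        # Hold the upstream connection open and forward raw bytes as they
        # arrive, so tokens are never buffered into a single chunk.
        async with httpx.AsyncClient() as client:
            async with client.stream("POST", UPSTREAM, content=body,
                                     headers=headers, timeout=None) as upstream:
                async for chunk in upstream.aiter_bytes():
                    yield chunk

    return StreamingResponse(
        relay(),
        media_type="text/event-stream",
        headers={"Cache-Control": "no-cache"},
    )
```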
57b706abe6 fix: correct vLLM service port to 8000
- Updated qwen-2.5-7b port from 8001 to 8000 in models.yaml
- Matches the vLLM server's actual default port
- Tested and verified: the orchestrator successfully loaded the model and generated a response
2025-11-21 16:28:54 +01:00
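As a sketch, the corrected models.yaml entry might look like the following; only the model name and port are confirmed here, and the service_script key is borrowed from neighboring commits with an assumed path:

```yaml
models:
  qwen-2.5-7b:
    # vLLM's OpenAI-compatible server listens on 8000 by default; 8001 was wrong.
    port: 8000
    service_script: models/vllm/server.py  # assumed layout (models/*/server.py)
```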
9a637cc4fc refactor: clean Docker files and restore standalone model services
- Remove all Docker-related files (Dockerfiles, compose.yaml)
- Remove documentation files (README, ARCHITECTURE, docs/)
- Remove old core/ directory (base_service, service_manager)
- Update models.yaml with correct service_script paths (models/*/server.py)
- Simplify vLLM requirements.txt to let vLLM manage dependencies
- Restore original standalone vLLM server (no base_service dependency)
- Remove obsolete vllm/, musicgen/, flux/ directories

Process-based architecture is now fully functional on RunPod.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 16:17:38 +01:00
31be1932e7 wip: start architecture redesign for RunPod (no Docker)
Started redesigning the architecture to run services directly without Docker:

**Completed:**
- Created new process-based orchestrator (orchestrator_subprocess.py)
- Uses subprocess instead of Docker SDK for process management
- Updated models.yaml to reference service_script paths
- vLLM server already standalone-ready

**Still needed:**
- Create/update Flux and MusicGen standalone servers
- Create systemd service files or startup scripts
- Update prepare-template script for Python deployment
- Remove Docker/Compose dependencies
- Test full stack on RunPod
- Update documentation

Reason for change: RunPod's containerized environment doesn't support
Docker-in-Docker (which requires CAP_SYS_ADMIN). Direct Python execution is
simpler, faster, and more reliable for RunPod.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 15:09:30 +01:00
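A minimal sketch of the subprocess-based management described above; the class and method names are illustrative:

```python
import subprocess
import sys
from typing import Optional

class ProcessOrchestrator:
    """Swap models by starting and stopping OS processes instead of containers."""

    def __init__(self) -> None:
        self.current: Optional[subprocess.Popen] = None

    def start(self, service_script: str) -> None:
        # Sequential loading on a single GPU: stop the running model first.
        if self.current and self.current.poll() is None:
            self.current.terminate()
            self.current.wait(timeout=30)
        # Launch the standalone server referenced by models.yaml's service_script.
        self.current = subprocess.Popen([sys.executable, service_script])
```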
277f1c95bd Initial commit: RunPod multi-modal AI orchestration stack
- Multi-modal AI infrastructure for RunPod RTX 4090
- Automatic model orchestration (text, image, music)
- Text: vLLM + Qwen 2.5 7B Instruct
- Image: Flux.1 Schnell via OpenEDAI
- Music: MusicGen Medium via AudioCraft
- Cost-optimized sequential loading on single GPU
- Template preparation scripts for rapid deployment
- Comprehensive documentation (README, DEPLOYMENT, TEMPLATE)
2025-11-21 14:34:55 +01:00