runpod

Author	SHA1	Message	Date
Sebastian Krüger	7edf17551a	Add Llama 3.1 8B model to orchestrator	2025-11-21 21:29:57 +01:00
Sebastian Krüger	c4426ccc58	Remove Flux and MusicGen models from orchestrator. ComfyUI now handles Flux image generation directly. MusicGen is not being used and has been removed.	2025-11-21 21:14:43 +01:00
Sebastian Krüger	57b706abe6	fix: correct vLLM service port to 8000 - Updated qwen-2.5-7b port from 8001 to 8000 in models.yaml - Matches actual vLLM server default port configuration - Tested and verified: orchestrator successfully loaded model and generated response	2025-11-21 16:28:54 +01:00
Sebastian Krüger	9a637cc4fc	refactor: clean Docker files and restore standalone model services - Remove all Docker-related files (Dockerfiles, compose.yaml) - Remove documentation files (README, ARCHITECTURE, docs/) - Remove old core/ directory (base_service, service_manager) - Update models.yaml with correct service_script paths (models/*/server.py) - Simplify vLLM requirements.txt to let vLLM manage dependencies - Restore original standalone vLLM server (no base_service dependency) - Remove obsolete vllm/, musicgen/, flux/ directories Process-based architecture is now fully functional on RunPod. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-21 16:17:38 +01:00
Sebastian Krüger	31be1932e7	wip: start architecture redesign for RunPod (no Docker) Started redesigning architecture to run services directly without Docker: Completed: - Created new process-based orchestrator (orchestrator_subprocess.py) - Uses subprocess instead of Docker SDK for process management - Updated models.yaml to reference service_script paths - vLLM server already standalone-ready Still needed: - Create/update Flux and MusicGen standalone servers - Create systemd service files or startup scripts - Update prepare-template script for Python deployment - Remove Docker/Compose dependencies - Test full stack on RunPod - Update documentation Reason for change: RunPod's containerized environment doesn't support Docker-in-Docker (requires CAP_SYS_ADMIN). Direct Python execution is simpler, faster, and more reliable for RunPod. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-21 15:09:30 +01:00
Sebastian Krüger	277f1c95bd	Initial commit: RunPod multi-modal AI orchestration stack - Multi-modal AI infrastructure for RunPod RTX 4090 - Automatic model orchestration (text, image, music) - Text: vLLM + Qwen 2.5 7B Instruct - Image: Flux.1 Schnell via OpenEDAI - Music: MusicGen Medium via AudioCraft - Cost-optimized sequential loading on single GPU - Template preparation scripts for rapid deployment - Comprehensive documentation (README, DEPLOYMENT, TEMPLATE)	2025-11-21 14:34:55 +01:00

6 Commits
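
Commits 31be1932e7 and 9a637cc4fc describe replacing the Docker SDK with subprocess-based process management, and the initial commit mentions sequential model loading on a single GPU. The following is a minimal sketch of that idea, not the repository's actual orchestrator: the class name `SequentialOrchestrator`, the helper `service_command`, and the `--port` flag are all hypothetical names introduced for illustration.

```python
import subprocess
import sys
from typing import Dict, List, Optional


class SequentialOrchestrator:
    """Hypothetical sketch: keep at most one model service process alive."""

    def __init__(self, commands: Dict[str, List[str]], stop_timeout: float = 10.0):
        # Maps a model name to the command line that starts its service.
        self.commands = commands
        self.stop_timeout = stop_timeout
        self.current_model: Optional[str] = None
        self.process: Optional[subprocess.Popen] = None

    def stop(self) -> None:
        """Terminate the running service, escalating to kill on timeout."""
        if self.process is not None and self.process.poll() is None:
            self.process.terminate()
            try:
                self.process.wait(timeout=self.stop_timeout)
            except subprocess.TimeoutExpired:
                self.process.kill()
                self.process.wait()
        self.process = None
        self.current_model = None

    def switch_to(self, model: str) -> subprocess.Popen:
        """Stop the current service (if any), then start the requested one."""
        if model not in self.commands:
            raise KeyError(f"unknown model: {model}")
        already_running = (
            model == self.current_model
            and self.process is not None
            and self.process.poll() is None
        )
        if already_running:
            return self.process
        self.stop()  # frees the GPU before the next model loads
        self.process = subprocess.Popen(self.commands[model])
        self.current_model = model
        return self.process


def service_command(service_script: str, port: int) -> List[str]:
    """Build a command line for a standalone server script.

    Assumption: the script accepts a --port flag; the real scripts may not.
    """
    return [sys.executable, service_script, "--port", str(port)]
```

Under these assumptions, an entry from models.yaml would map to something like `service_command("models/<name>/server.py", 8000)`, and calling `switch_to` with a different model name stops the running service before starting the next one, so only one model occupies the GPU at a time.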