runpod

58 Commits 1 Branch 0 Tags

Author	SHA1	Message	Date
Sebastian Krüger	9ee626a78e	feat: implement Ansible-based process architecture for RunPod Major architecture overhaul to address RunPod Docker limitations: Core Infrastructure: - Add base_service.py: Abstract base class for all AI services - Add service_manager.py: Process lifecycle management - Add core/requirements.txt: Core dependencies Model Services (Standalone Python): - Add models/vllm/server.py: Qwen 2.5 7B text generation - Add models/flux/server.py: Flux.1 Schnell image generation - Add models/musicgen/server.py: MusicGen Medium music generation - Each service inherits from GPUService base class - OpenAI-compatible APIs - Standalone execution support Ansible Deployment: - Add playbook.yml: Comprehensive deployment automation - Add ansible.cfg: Ansible configuration - Add inventory.yml: Localhost inventory - Tags: base, python, dependencies, models, tailscale, validate, cleanup Scripts: - Add scripts/install.sh: Full installation wrapper - Add scripts/download-models.sh: Model download wrapper - Add scripts/start-all.sh: Start orchestrator - Add scripts/stop-all.sh: Stop all services Documentation: - Update ARCHITECTURE.md: Document distributed VPS+GPU architecture Benefits: - No Docker: Avoids RunPod CAP_SYS_ADMIN limitations - Fully reproducible via Ansible - Extensible: Add models in 3 steps - Direct Python execution (no container overhead) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-21 15:37:18 +01:00
Sebastian Krüger	03a430894d	docs: add clean extensible architecture design Created comprehensive architecture document for RunPod deployment: Key Design Principles: - No Docker (direct Python for RunPod compatibility) - Extensible (add models in 3 simple steps) - Maintainable (clear structure, base classes) - Simple (one command startup) Structure: - core/ - Base service class + service manager - model-orchestrator/ - Request routing - models/ - Service implementations (vllm, flux, musicgen) - scripts/ - Install, start, stop, template prep - docs/ - Adding models, deployment, templates Adding New Models: 1. Create server.py inheriting BaseService 2. Add entry to models.yaml 3. Add requirements.txt That's it! Orchestrator handles lifecycle automatically. Next: Implement base_service.py and refactor existing services. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-21 15:16:51 +01:00
Sebastian Krüger	31be1932e7	wip: start architecture redesign for RunPod (no Docker) Started redesigning architecture to run services directly without Docker: Completed: - Created new process-based orchestrator (orchestrator_subprocess.py) - Uses subprocess instead of Docker SDK for process management - Updated models.yaml to reference service_script paths - vLLM server already standalone-ready Still needed: - Create/update Flux and MusicGen standalone servers - Create systemd service files or startup scripts - Update prepare-template script for Python deployment - Remove Docker/Compose dependencies - Test full stack on RunPod - Update documentation Reason for change: RunPod's containerized environment doesn't support Docker-in-Docker (requires CAP_SYS_ADMIN). Direct Python execution is simpler, faster, and more reliable for RunPod. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-21 15:09:30 +01:00
Sebastian Krüger	cd9e2eee2e	fix: use legacy Docker builder for RunPod compatibility - Set DOCKER_BUILDKIT=0 to use legacy builder - BuildKit has permission issues in RunPod's containerized environment - Legacy builder works reliably with RunPod's security constraints 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-21 15:01:16 +01:00
Sebastian Krüger	8f1d4bedd2	fix: update Docker daemon startup for RunPod environment - Changed from systemctl/service to direct dockerd command - Added --iptables=false --bridge=none flags (required for RunPod) - Added proper error checking and 10s wait time - Improved logging with verification step This fixes Docker startup in RunPod's containerized environment where systemd is not available and iptables require special handling. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-21 15:00:42 +01:00
Sebastian Krüger	0fa69cae28	refactor: rename docker-compose.gpu.yaml to compose.yaml Simplified compose file naming to follow Docker Compose best practices: - Renamed docker-compose.gpu.yaml to compose.yaml - Updated all references in documentation files (README.md, DEPLOYMENT.md, GPU_DEPLOYMENT_LOG.md, RUNPOD_TEMPLATE.md) - Updated references in scripts (prepare-template.sh) This change enables simpler command syntax: - Before: docker compose -f docker-compose.gpu.yaml up -d orchestrator - After: docker compose up -d orchestrator Generated with Claude Code (https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-21 14:49:42 +01:00
Sebastian Krüger	cafa0a1147	refactor: clean up runpod repository structure Removed facefusion and VPS-related files: - compose.yaml, postgres/, litellm-config.yaml (VPS services) - Dockerfile, entrypoint.sh, disable-nsfw-filter.patch (facefusion) Removed outdated documentation: - DOCKER_GPU_SETUP.md, README_GPU_SETUP.md, SETUP_GUIDE.md - TAILSCALE_SETUP.md, WIREGUARD_SETUP.md (covered in DEPLOYMENT.md) - GPU_EXPANSION_PLAN.md (historical planning doc) - gpu-server-compose.yaml, litellm-config-gpu.yaml (old versions) - deploy-gpu-stack.sh, simple_vllm_server.py (old scripts) Organized documentation: - Created docs/ directory - Moved DEPLOYMENT.md, RUNPOD_TEMPLATE.md, GPU_DEPLOYMENT_LOG.md to docs/ - Updated all documentation links in README.md Final structure: - Clean root directory with only GPU-specific files - Organized documentation in docs/ - Model services in dedicated directories (model-orchestrator/, vllm/, flux/, musicgen/) - Automation scripts in scripts/	2025-11-21 14:45:49 +01:00
Sebastian Krüger	277f1c95bd	Initial commit: RunPod multi-modal AI orchestration stack - Multi-modal AI infrastructure for RunPod RTX 4090 - Automatic model orchestration (text, image, music) - Text: vLLM + Qwen 2.5 7B Instruct - Image: Flux.1 Schnell via OpenEDAI - Music: MusicGen Medium via AudioCraft - Cost-optimized sequential loading on single GPU - Template preparation scripts for rapid deployment - Comprehensive documentation (README, DEPLOYMENT, TEMPLATE)	2025-11-21 14:34:55 +01:00
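
Commits 03a430894d and 9ee626a78e describe adding a model by creating a server.py that inherits from a shared base class (named BaseService in one message and GPUService in the other). The following is a minimal sketch of what such a base class might look like. The repository's actual base_service.py is not shown in this log, so every name below (ServiceConfig, load_model, handle, health, EchoService) is an assumption made for illustration only.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class ServiceConfig:
    """Hypothetical per-service settings; the field names are assumptions."""
    name: str
    port: int
    model_id: str


class BaseService(ABC):
    """Hypothetical stand-in for the base class mentioned in the commit log."""

    def __init__(self, config: ServiceConfig) -> None:
        self.config = config
        self.loaded = False

    @abstractmethod
    def load_model(self) -> None:
        """Load the model; each concrete service implements this."""

    @abstractmethod
    def handle(self, request: dict) -> dict:
        """Serve one request; the payload shape here is illustrative."""

    def start(self) -> None:
        # Shared lifecycle step: load the model once, then mark it ready.
        if not self.loaded:
            self.load_model()
            self.loaded = True

    def health(self) -> dict:
        # Simple readiness report that a manager process could poll.
        return {"service": self.config.name, "ready": self.loaded}


class EchoService(BaseService):
    """Toy subclass standing in for a real models/<name>/server.py."""

    def load_model(self) -> None:
        # A real service would load weights here; this only sets a prefix.
        self.prefix = f"[{self.config.model_id}] "

    def handle(self, request: dict) -> dict:
        return {"output": self.prefix + request["prompt"]}
```

In this sketch, a new model only needs to implement load_model and handle. The shared start and health methods stay in the base class, which is one plausible reading of the "add models in 3 steps" claim.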
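
Commit 31be1932e7 mentions replacing the Docker SDK with subprocess for process management, and the initial commit describes sequential model loading on a single GPU. The sketch below shows one way those two ideas could fit together: a manager that keeps at most one model process alive and stops the current one before starting another. It is not the repository's orchestrator_subprocess.py or service_manager.py. ServiceManager, ensure_running, and placeholder_command are hypothetical names, and the launched commands are placeholders rather than real model servers.

```python
import subprocess
import sys
from typing import Dict, List, Optional


def placeholder_command(seconds: int) -> List[str]:
    """Build an argv that just sleeps; stands in for a real server.py launch."""
    return [sys.executable, "-c", f"import time; time.sleep({seconds})"]


class ServiceManager:
    """Hypothetical manager that runs at most one model process at a time."""

    def __init__(self, commands: Dict[str, List[str]]) -> None:
        self.commands = commands  # model name -> argv used to launch it
        self.active: Optional[str] = None
        self.proc: Optional[subprocess.Popen] = None

    def stop(self, timeout: float = 10.0) -> None:
        """Terminate the active process, escalating to kill on timeout."""
        if self.proc is not None and self.proc.poll() is None:
            self.proc.terminate()
            try:
                self.proc.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                self.proc.kill()
                self.proc.wait()
        self.proc = None
        self.active = None

    def ensure_running(self, name: str) -> int:
        """Start the named service, stopping any other one first."""
        if name not in self.commands:
            raise KeyError(f"unknown service: {name}")
        already_up = (
            self.active == name
            and self.proc is not None
            and self.proc.poll() is None
        )
        if already_up:
            return self.proc.pid
        # Free the GPU by stopping the current service before switching.
        self.stop()
        self.proc = subprocess.Popen(self.commands[name])
        self.active = name
        return self.proc.pid
```

A caller might construct the manager with `ServiceManager({"text": placeholder_command(30), "image": placeholder_command(30)})` and call `ensure_running("text")` before routing a text request. A real deployment would also need a readiness check before forwarding traffic, which this sketch leaves out.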