refactor: remove Docker files and restore standalone model services

- Remove all Docker-related files (Dockerfiles, compose.yaml)
- Remove documentation files (README, ARCHITECTURE, docs/)
- Remove old core/ directory (base_service, service_manager)
- Update models.yaml with correct service_script paths (models/*/server.py)
- Simplify vLLM requirements.txt to let vLLM manage dependencies
- Restore original standalone vLLM server (no base_service dependency)
- Remove obsolete top-level vllm/, musicgen/, flux/ directories (superseded by models/*/)

Process-based architecture is now fully functional on RunPod.
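
As a rough illustration of what "process-based" could mean here, the sketch below builds the command and environment for running one model's `service_script` as a child process. It is not taken from `orchestrator.py`: the `build_launch` and `start_service` helpers are hypothetical, and passing the port through a `PORT` environment variable is an assumption about how the server scripts are configured.

```python
import os
import subprocess
import sys

# Mirrors one entry from models.yaml as it appears in this commit's diff.
QWEN_ENTRY = {
    "framework": "vllm",
    "service_script": "models/vllm/server.py",
    "port": 8001,
    "startup_time_seconds": 120,
}


def build_launch(entry, base_env=None):
    """Return (argv, env) for running a model's server script as a child process."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: the server script reads its port from a PORT variable.
    env["PORT"] = str(entry["port"])
    argv = [sys.executable, entry["service_script"]]
    return argv, env


def start_service(entry):
    """Spawn the service; the caller is responsible for terminating it."""
    argv, env = build_launch(entry)
    return subprocess.Popen(argv, env=env)
```

Keeping `build_launch` separate from `start_service` lets the command be inspected or logged without actually spawning a process.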

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 16:17:38 +01:00
parent 9ee626a78e
commit 9a637cc4fc
20 changed files with 228 additions and 3122 deletions

@@ -1,22 +0,0 @@
FROM python:3.11-slim
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y \
curl \
&& rm -rf /var/lib/apt/lists/*
# Copy requirements and install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY orchestrator.py .
COPY models.yaml .
# Expose port
EXPOSE 9000
# Run the orchestrator
CMD ["python", "orchestrator.py"]

@@ -6,7 +6,7 @@ models:
qwen-2.5-7b:
type: text
framework: vllm
-service_script: vllm/server.py
+service_script: models/vllm/server.py
port: 8001
vram_gb: 14
startup_time_seconds: 120
@@ -17,7 +17,7 @@ models:
flux-schnell:
type: image
framework: openedai-images
-service_script: flux/server.py
+service_script: models/flux/server.py
port: 8002
vram_gb: 14
startup_time_seconds: 60
@@ -28,7 +28,7 @@ models:
musicgen-medium:
type: audio
framework: audiocraft
-service_script: musicgen/server.py
+service_script: models/musicgen/server.py
port: 8003
vram_gb: 11
startup_time_seconds: 45
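
Since the change to this file is purely a path update, a quick check that each `service_script` resolves to a real file can catch a stale entry before a service is started. The following is a minimal sketch under stated assumptions: `missing_service_scripts` is a hypothetical helper, and the `MODELS` dict simply restates the three updated paths instead of parsing `models.yaml`.

```python
from pathlib import Path

# The three service_script values after this commit, restated by hand.
MODELS = {
    "qwen-2.5-7b": {"service_script": "models/vllm/server.py"},
    "flux-schnell": {"service_script": "models/flux/server.py"},
    "musicgen-medium": {"service_script": "models/musicgen/server.py"},
}


def missing_service_scripts(models, repo_root):
    """Return sorted names of models whose service_script is not a file under repo_root."""
    root = Path(repo_root)
    return sorted(
        name
        for name, entry in models.items()
        if not (root / entry["service_script"]).is_file()
    )
```

Calling `missing_service_scripts(MODELS, ".")` from the repository root would return an empty list when all three scripts are present.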