Commit Graph

58 Commits

Author SHA1 Message Date
18cd87fbd1 refactor: reorganize webdav-sync into dedicated directory
Clean up project structure by organizing WebDAV sync service properly.

Changes:
- Move scripts/comfyui_webdav_sync.py → webdav-sync/webdav_sync.py
- Create webdav-sync/requirements.txt with watchdog and webdavclient3
- Remove webdav dependencies from model-orchestrator/requirements.txt
- Delete unused scripts/ folder (start-all.sh, status.sh, stop-all.sh)
- Update supervisord.conf to use new path /workspace/ai/webdav-sync/webdav_sync.py

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 19:03:48 +01:00
79442bd62e feat: add WebDAV sync service for ComfyUI outputs
Add Python watchdog service to automatically sync ComfyUI outputs to HiDrive WebDAV storage.

Changes:
- Add scripts/comfyui_webdav_sync.py: File watcher service using watchdog + webdavclient3
- Update model-orchestrator/requirements.txt: Add watchdog and webdavclient3 dependencies
- Update supervisord.conf: Add webdav-sync program with ENV variable support
- Update arty.yml: Add service management scripts (start/stop/restart/status/logs)

WebDAV credentials are now loaded from .env file (WEBDAV_URL, WEBDAV_USERNAME, WEBDAV_PASSWORD, WEBDAV_REMOTE_PATH)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 18:58:18 +01:00
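The credential handling described in 79442bd62e could look roughly like the sketch below. Only the four variable names come from the commit message. The `WebDavConfig` class, the `load_webdav_config` helper, and the fail-fast behaviour are assumptions made for illustration, and the sketch assumes the `.env` values have already been exported into the process environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Variable names taken from the commit message.
REQUIRED_VARS = (
    "WEBDAV_URL",
    "WEBDAV_USERNAME",
    "WEBDAV_PASSWORD",
    "WEBDAV_REMOTE_PATH",
)


@dataclass(frozen=True)
class WebDavConfig:  # hypothetical container, not from the repository
    url: str
    username: str
    password: str
    remote_path: str


def load_webdav_config(env: Mapping[str, str] = os.environ) -> WebDavConfig:
    """Read the WebDAV settings, failing early if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing WebDAV settings: " + ", ".join(missing))
    return WebDavConfig(
        url=env["WEBDAV_URL"].rstrip("/"),  # normalise a trailing slash
        username=env["WEBDAV_USERNAME"],
        password=env["WEBDAV_PASSWORD"],
        remote_path=env["WEBDAV_REMOTE_PATH"],
    )
```

Passing the environment as a parameter keeps the helper easy to test without touching real credentials.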
71d4e22240 fix: use non-interactive davfs2 installation 2025-11-22 18:35:37 +01:00
c4e26b8a84 fix: remove sudo commands from WebDAV setup for RunPod root user 2025-11-22 18:33:52 +01:00
c19ee7af88 feat: add WebDAV integration for HiDrive storage
Added setup/webdav script to arty.yml:
- Installs davfs2 package
- Configures HiDrive WebDAV credentials
- Mounts https://webdav.hidrive.ionos.com/ to /mnt/hidrive
- Creates ComfyUI output directory: /mnt/hidrive/users/valknar/Pictures/AI/ComfyUI
- Creates symlink: /workspace/ComfyUI/output_hidrive

Usage:
  arty run setup/webdav

This allows ComfyUI outputs to be saved directly to HiDrive cloud storage.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 18:32:51 +01:00
cabe2158be fix: remove invalid noise_mode parameter from KSamplerAdvanced
Removed "fixed" parameter that was causing broken pipe error.

KSamplerAdvanced correct parameters (9 values):
- add_noise: "enable"
- seed: 42
- steps: 25
- cfg: 7.5
- sampler_name: "euler"
- scheduler: "normal"
- start_at_step: 0
- end_at_step: 25
- return_with_leftover_noise: "enable"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 18:26:25 +01:00
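One way to avoid the ordering mistakes fixed in cabe2158be and the commits before it is to build the positional value list from named parameters. The sketch below uses the nine names and values exactly as listed in that commit message. The `build_widget_values` helper is hypothetical, and whether ComfyUI itself expects this exact layout is not verified here.

```python
# Parameter order and default values as listed in the commit message.
KSAMPLER_ADVANCED_PARAMS = (
    ("add_noise", "enable"),
    ("seed", 42),
    ("steps", 25),
    ("cfg", 7.5),
    ("sampler_name", "euler"),
    ("scheduler", "normal"),
    ("start_at_step", 0),
    ("end_at_step", 25),
    ("return_with_leftover_noise", "enable"),
)


def build_widget_values(overrides=None):
    """Return the positional value list, applying optional named overrides."""
    overrides = dict(overrides or {})
    known = {name for name, _ in KSAMPLER_ADVANCED_PARAMS}
    unknown = set(overrides) - known
    if unknown:
        # Rejects stray names such as the removed "noise_mode".
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    return [overrides.get(name, default) for name, default in KSAMPLER_ADVANCED_PARAMS]


values = build_widget_values({"seed": 7})
print(len(values))  # 9
```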
9d91ec3236 feat: add SD 1.5 model and update AnimateDiff workflow
Changes:
- Added Stable Diffusion v1.5 model to comfyui_models.yaml
  - Required for AnimateDiff motion modules (mm_sd_v15_v2.ckpt)
  - Size: 4GB, VRAM: 8GB
- Updated AnimateDiff workflow to use SD 1.5 instead of SDXL
  - Changed checkpoint from sd_xl_base_1.0.safetensors to v1-5-pruned-emaonly.safetensors
  - Updated VRAM requirement from 18GB to 12GB
  - Updated model requirements in metadata

AnimateDiff v15 motion modules are not compatible with SDXL.
This resolves: "Motion module 'mm_sd_v15_v2.ckpt' is intended for SD1.5 models"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 18:18:41 +01:00
8d1508c564 fix: correct KSamplerAdvanced parameter order
KSamplerAdvanced correct order:
["add_noise", seed, "noise_mode", steps, cfg, sampler_name, scheduler, start_at_step, end_at_step, "return_with_leftover_noise"]

Fixed values:
- add_noise: "enable"
- seed: 42
- noise_mode: "fixed"
- steps: 25
- cfg: 7.5
- sampler_name: "euler"
- scheduler: "normal"
- start_at_step: 0
- end_at_step: 25
- return_with_leftover_noise: "enable"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 18:15:56 +01:00
2b66caa7e2 fix: correct KSamplerAdvanced parameters in AnimateDiff workflow
Fixed widget_values order for KSamplerAdvanced node:
- Changed sampler from dpmpp_2m to euler (valid sampler)
- Changed scheduler from karras to normal (valid scheduler)
- Fixed end_at_step from 10000 to 25 (matching steps count)

This resolves validation errors:
- scheduler '0' not in list
- cfg could not convert 'dpmpp_2m' to float
- sampler_name 'karras' not in list
- end_at_step 'disable' invalid literal for int

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 18:14:40 +01:00
c85d036e61 fix: restore correct model types after accidental changes
- FLUX models: checkpoints → diffusion_models
- CogVideoX: checkpoints → diffusion_models
- MusicGen (all): checkpoints → musicgen
- Removed custom_nodes notes from MusicGen (using musicgen directory)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 18:00:15 +01:00
99f4708475 fix: correct ComfyUI directory types to match actual structure
Changed type fields to use actual ComfyUI directory names:
- diffusion_models → checkpoints (for FLUX, CogVideoX)
- musicgen → checkpoints (with notes about custom_nodes placement)

All models now properly map to real ComfyUI/models/* subdirectories.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 17:54:55 +01:00
ba47d34362 fix: add multi-part file mapping for MusicGen Large
MusicGen Large is split into multiple files (pytorch_model-00001-of-00002.bin and pytorch_model-00002-of-00002.bin).
Added all 3 files to the mapping (2 parts + index.json).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 17:39:35 +01:00
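The shard naming pattern mentioned in ba47d34362 can be generated rather than typed by hand. This is a minimal sketch: the `shard_filenames` helper is hypothetical, and it covers only the numbered parts, not the index file.

```python
def shard_filenames(total, stem="pytorch_model", suffix=".bin"):
    """Return names following the '<stem>-0000N-of-0000M<suffix>' pattern."""
    if total < 1:
        raise ValueError("total must be at least 1")
    return [f"{stem}-{index:05d}-of-{total:05d}{suffix}" for index in range(1, total + 1)]


print(shard_filenames(2))
# ['pytorch_model-00001-of-00002.bin', 'pytorch_model-00002-of-00002.bin']
```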
55efc6dddf fix: correct model type mappings based on ComfyUI folder_paths
- FLUX models: diffusers → diffusion_models
- SDXL models: diffusers → checkpoints
- SD 3.5: diffusers → checkpoints
- SVD video: diffusion_models → checkpoints
- MusicGen: audio_models → musicgen

This aligns with official ComfyUI directory structure.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 17:34:55 +01:00
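The mapping in 55efc6dddf amounts to a lookup from model family to a subdirectory under the ComfyUI models folder. A minimal sketch, assuming the models root is `/workspace/ComfyUI/models` (the root path and the `target_dir` helper are assumptions; the type names are the ones listed in the commit):

```python
from pathlib import PurePosixPath

MODELS_ROOT = PurePosixPath("/workspace/ComfyUI/models")  # assumed location

# Family -> type, as stated in the commit message.
MODEL_TYPES = {
    "FLUX": "diffusion_models",
    "SDXL": "checkpoints",
    "SD 3.5": "checkpoints",
    "SVD": "checkpoints",
    "MusicGen": "musicgen",
}


def target_dir(family):
    """Return the directory a model family should be linked into."""
    return MODELS_ROOT / MODEL_TYPES[family]


print(target_dir("FLUX"))  # /workspace/ComfyUI/models/diffusion_models
```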
0709dec1d4 feat: add explicit file mappings to comfyui_models.yaml
- Added 'files' array to all models specifying source and destination filenames
- Image models (FLUX, SDXL, SD 3.5): Use original checkpoint filenames
- Video models (CogVideoX, SVD): Use descriptive filenames
- Audio models (MusicGen): Prefix with model name for clarity
- Support models (CLIP, IP-Adapter, AnimateDiff): Keep original names
- This allows precise control over linked filenames in ComfyUI

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 17:20:18 +01:00
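As an illustration of how a `files` array like the one added in 0709dec1d4 might drive linking, the sketch below turns one model entry into source and link paths. The key names `source` and `dest`, the example entry, the directory arguments, and the `plan_links` helper are all assumptions; the commit only says that each file has a source and a destination filename.

```python
from pathlib import PurePosixPath


def plan_links(model, cache_dir, models_root):
    """Return (source_path, link_path) pairs for one model entry."""
    target = PurePosixPath(models_root) / model["type"]
    return [
        (PurePosixPath(cache_dir) / entry["source"], target / entry["dest"])
        for entry in model["files"]
    ]


# Hypothetical entry; real entries live in comfyui_models.yaml.
example = {
    "type": "checkpoints",
    "files": [
        {"source": "sd_xl_base_1.0.safetensors", "dest": "sd_xl_base_1.0.safetensors"},
    ],
}

for source, link in plan_links(example, "/cache", "/workspace/ComfyUI/models"):
    print(f"{link} -> {source}")
```

The function only plans the links and does not touch the filesystem, so the plan can be inspected before any symlink is created.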
7ebda1ae44 fix: change audio models type from checkpoints to audio_models
Audio model files (MusicGen) should not be mixed with image generation
checkpoints. Changed type from 'checkpoints' to 'audio_models' to ensure
they are linked to the correct ComfyUI directory.
2025-11-22 17:00:33 +01:00
bd8d1d7783 fix: use correct supervisor group names for service control
Supervisor groups programs as 'ai-services:comfyui' and 'ai-services:orchestrator'
so commands need to use these full names instead of just the program names.
2025-11-22 16:56:23 +01:00
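The naming rule from bd8d1d7783 can be captured in a small helper that always qualifies the program with its group. A sketch under stated assumptions: `supervisorctl_args` is a hypothetical helper that only builds the argument list, and the `group:*` form for addressing the whole group is standard supervisorctl syntax rather than something this commit mentions.

```python
SUPERVISOR_GROUP = "ai-services"  # group name from the commit message
ALLOWED_ACTIONS = {"start", "stop", "restart", "status"}


def supervisorctl_args(action, program=None):
    """Build a supervisorctl argument list using the full 'group:program' name."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    # Without a program, address every process in the group.
    target = f"{SUPERVISOR_GROUP}:{program}" if program else f"{SUPERVISOR_GROUP}:*"
    return ["supervisorctl", action, target]


print(supervisorctl_args("restart", "comfyui"))
# ['supervisorctl', 'restart', 'ai-services:comfyui']
```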
5361bef85c refactor: reorganize service management commands in arty.yml
- Replace shell script calls with direct supervisorctl commands
- Organize services hierarchically: services/{service}/action
- Add all service control commands: start, stop, restart, status, logs
- Support both individual service control and all services control
- Remove obsolete script-based service management

New command structure:
- All services: arty services/{start,stop,restart,status}
- ComfyUI: arty services/comfyui/{start,stop,restart,status,logs}
- Orchestrator: arty services/orchestrator/{start,stop,restart,status,logs}
2025-11-22 16:55:09 +01:00
79b861e687 fix: update AnimateDiff workflow to use correct checkpoint filename
Change CheckpointLoaderSimple to use 'sd_xl_base_1.0.safetensors' instead of
'diffusers/stable-diffusion-xl-base-1.0' to match the actual checkpoint file
linked in /root/ComfyUI/models/checkpoints/
2025-11-22 16:51:59 +01:00
24ace7dcda fix: add type field and missing models to comfyui_models.yaml
- Add 'type' field to all model entries (diffusers, clip_vision, checkpoints, etc.)
- Add animatediff_models category with guoyww/animatediff repo
- Add ipadapter_models category with h94/IP-Adapter repo
- Map models to correct ComfyUI subdirectories

This fixes model linking issues where models weren't appearing in ComfyUI.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 15:56:59 +01:00
e84579a22f fix: add missing VHS_VideoCombine widget values
- Add pingpong and save_output boolean parameters
- Add video/h264-mp4 format options: pix_fmt, crf, save_metadata, trim_to_audio
- Fixes "Failed to restore node: Combine Frames" error

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 15:50:06 +01:00
003caf668b fix: restructure AnimateDiff workflow for Evolved Gen2 nodes
- Replace deprecated AnimateDiffLoaderV1 with ADE_LoadAnimateDiffModel
- Add ADE_ApplyAnimateDiffModelSimple to create M_MODELS
- Add ADE_UseEvolvedSampling to inject motion into base model
- Replace AnimateDiffSampler with KSamplerAdvanced
- Add complete node links for all connections
- Update node IDs and execution order

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 15:47:52 +01:00
360fd52c59 fix: update AnimateDiffSampler to KSamplerAdvanced
- Replace ADE_AnimateDiffSampler with KSamplerAdvanced
- Adjust widget values for KSamplerAdvanced compatibility
- Add proper node name mappings in fix_workflows.py

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 15:42:29 +01:00
45f71e646d fix: update workflow schema validation and node names
- Add missing last_link_id and links fields to all workflows
- Update node name mappings:
  - AudioSave → SaveAudio (MusicGen workflows)
  - AnimateDiffSampler → ADE_AnimateDiffSampler
  - SeedGenerator → ImpactInt
  - BatchKSampler → KSampler
  - ImageBatchToList → GetImageSize
- Fix schema validation errors across all 20 workflows

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 15:38:38 +01:00
2213ed3c85 fix: complete ComfyUI workflow schema validation
Fix all 20 production workflows to comply with ComfyUI schema requirements:
- Add missing 'flags', 'order', 'mode', 'properties', 'size' fields to all nodes
- Update deprecated node names:
  - AnimateDiffLoader → AnimateDiffLoaderV1
  - VHSVideoCombine → VHS_VideoCombine
  - IPAdapterApply → IPAdapter
  - IPAdapterApplyFace → IPAdapterFaceID
- Remove deprecated nodes: PreviewVideo, SaveVideo
- Add fix_workflows.py script for future maintenance

Changes:
- 16 workflows updated with complete schema
- 4 workflows (FLUX, SD3.5) were already valid
- All workflows now pass zod schema validation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 15:30:08 +01:00
19d82108b0 fix: pin diffusers version to >=0.31.0 for CogVideoXWrapper compatibility
ComfyUI-CogVideoXWrapper requires diffusers>=0.31.0 to function properly. Pin the version requirement to ensure the correct version is installed during setup.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 15:07:27 +01:00
99b743c412 feat: add ComfyUI_UltimateSDUpscale to custom nodes
Add ComfyUI_UltimateSDUpscale repository to support the ultimate-sd-upscale-production-v1 workflow. This provides high-quality image upscaling capabilities.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 14:52:16 +01:00
d719bcdfcf feat: link workflows to ComfyUI user directory with category prefixes
Update workflows/link-comfyui to link workflow files to the correct ComfyUI user directory (/workspace/ComfyUI/user/default/workflows/) instead of the non-standard /workspace/ComfyUI/workflows/ location.

Add category prefixes to maintain organization in ComfyUI's flat workflow browser:
- t2i_: Text-to-Image workflows
- i2i_: Image-to-Image workflows
- i2v_: Image-to-Video workflows
- t2m_: Text-to-Music workflows
- upscale_: Upscaling workflows
- adv_: Advanced workflows

Changes:
- Target correct ComfyUI user workflows directory
- Add category prefixes to all workflow symlinks
- Clean up old symlinks from previous location
- Workflows will now appear in ComfyUI's workflow browser sorted by category

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 14:37:42 +01:00
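The prefixing scheme from d719bcdfcf can be sketched as a lookup keyed on the workflow's category directory. The prefixes are the ones listed in the commit, and the directory names come from the earlier workflow linking commit (b370c16eb9). Pairing each directory with a prefix, and the `link_name` helper itself, are assumptions.

```python
from pathlib import PurePosixPath

# Category directory -> prefix (pairing assumed from the two commit messages).
CATEGORY_PREFIXES = {
    "text-to-image": "t2i_",
    "image-to-image": "i2i_",
    "image-to-video": "i2v_",
    "text-to-music": "t2m_",
    "upscaling": "upscale_",
    "advanced": "adv_",
}


def link_name(workflow_path):
    """Return the flat symlink name for a workflow stored in a category directory."""
    path = PurePosixPath(workflow_path)
    return CATEGORY_PREFIXES[path.parent.name] + path.name


print(link_name("workflows/text-to-image/flux-dev-t2i-production-v1.json"))
# t2i_flux-dev-t2i-production-v1.json
```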
6daca7329a feat: flatten ComfyUI workflow structure for web UI visibility
Update workflows/link-comfyui script to link individual JSON files directly to /workspace/ComfyUI/workflows/ instead of subdirectories. ComfyUI web UI requires workflows to be in the root workflows directory to display them.

Changes:
- Remove old directory symlinks
- Link all 20 workflow JSON files directly to workflows root
- Preserve original descriptive filenames (e.g., flux-dev-t2i-production-v1.json)
- Add workflow count display
- Keep README and WORKFLOW_STANDARDS documentation links

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 14:13:30 +01:00
cc0d93060d feat: expose Supervisor web UI for proxy access
Disable Supervisor's built-in authentication and expose on all interfaces (0.0.0.0:9001) to enable reverse proxy access via nginx and Authelia SSO. Authentication is now handled by Authelia at supervisor.ai.pivoine.art.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 13:19:02 +01:00
16226c7b39 chore: remove Ansible infrastructure
Remove Ansible playbook and configuration files as all infrastructure
setup has been migrated to arty bash scripts.

Deleted:
- playbook.yml (~950 lines)
- inventory.yml
- ansible.cfg

All functionality now available via arty scripts:
- arty run install/essential
- arty run setup/system-packages
- arty run setup/comfyui-base
- arty run setup/comfyui-nodes
- arty run setup/supervisor
- arty run setup/tailscale

Model downloads will be handled separately (TBD).

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 12:23:11 +01:00
d5965ce0c4 refactor: replace Ansible with arty bash scripts
Completely migrate infrastructure setup from Ansible to pure bash scripts in arty.yml, enabling simpler deployment and eliminating Ansible dependency on RunPod.

## New Installation Scripts

**System Setup:**
- setup/system-packages: Install apt packages, verify GPU
- setup/python-env: Configure Python, pip, core packages
- setup/comfyui-base: Install ComfyUI, create model directories
- setup/comfyui-nodes: Install 5 essential custom nodes
- setup/supervisor: Install Supervisor process manager
- setup/tailscale: Install Tailscale VPN (optional)

**Utility Scripts:**
- setup/validate: Validate installation, check packages/GPU/cache
- setup/cleanup: Clean sensitive files for template creation

**Orchestration Scripts:**
- install/minimal: System + Python + ComfyUI + Supervisor
- install/essential: Minimal + Custom nodes (recommended)
- install/full: Essential + Tailscale VPN

## Changes

### Added
- 11 new bash scripts replacing Ansible playbook tasks
- Comprehensive deployment notes with all commands
- Script categorization: System Setup, Service Management, Health Checks
- Detailed progress output with visual indicators (✓/✗)

### Removed
- Ansible shortcut scripts (ansible/base, ansible/supervisor, etc.)
- deps/comfyui-nodes (integrated into setup/comfyui-nodes)
- References to Ansible in deployment workflow

### Modified
- Deployment notes: Complete rewrite with organized sections
- Fresh deployment: Now uses arty run install/essential
- Documentation: Updated paths, commands, and workflow

## Benefits

- No Ansible dependency
- Faster debugging (pure bash, no templates)
- RunPod compatible (no sudo issues)
- Modular: Run individual steps or orchestration
- Better error handling and user feedback
- Easier to maintain and extend

## Migration Path

Model downloads still use Ansible playbook (temporary):
- ansible-playbook playbook.yml --tags comfyui-essential
- ansible-playbook playbook.yml --tags comfyui-models-all

After successful testing, Ansible files can be safely removed:
- playbook.yml
- inventory.yml
- ansible.cfg

## Usage

```bash
# Fresh deployment
arty run install/essential
ansible-playbook playbook.yml --tags comfyui-essential
arty run models/link-comfyui
arty run workflows/link-comfyui
arty run services/start
```

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 12:21:05 +01:00
b370c16eb9 feat: add workflow linking script to arty.yml
Add new arty script `workflows/link-comfyui` to create symlinks for all
production workflows from git repository to ComfyUI workflows directory.

Script creates symlinks for:
- All 6 workflow category directories (text-to-image, image-to-image,
  image-to-video, text-to-music, upscaling, advanced)
- templates/ directory for future workflow templates
- README.md and WORKFLOW_STANDARDS.md documentation

Usage: arty run workflows/link-comfyui

Updates:
- Added workflows/link-comfyui script with informative output
- Updated Fresh Deployment guide to include workflow linking step
- Added section 6 "ComfyUI Workflows" with complete workflow listing
- Updated Storage section to document workflow symlinks
- Added workflow README to documentation list

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 12:12:06 +01:00
71a30c0e4d feat: add comprehensive ComfyUI workflow collection
Add 20 production-ready ComfyUI workflows across 6 categories:

Text-to-Image (4 workflows):
- FLUX Schnell (fast, 4 steps)
- FLUX Dev (high-quality, 20-50 steps)
- SDXL + Refiner (two-stage, detailed)
- SD3.5 Large (latest generation)

Image-to-Image (3 workflows):
- IP-Adapter Style Transfer
- IP-Adapter Face Portrait
- IP-Adapter Multi-Composition

Image-to-Video (3 workflows):
- CogVideoX (6s AI-driven video)
- SVD (14 frames, quick animations)
- SVD-XT (25 frames, extended)

Text-to-Music (4 workflows):
- MusicGen Small/Medium/Large
- MusicGen Melody (melody conditioning)

Upscaling (3 workflows):
- Ultimate SD Upscale (professional)
- Simple Upscale (fast)
- Face Upscale (portrait-focused)

Advanced (3 workflows):
- ControlNet Fusion (multi-control)
- AnimateDiff Video (text-to-video)
- Batch Pipeline (multiple variations)

Documentation:
- README.md: Usage guide, model requirements, examples
- WORKFLOW_STANDARDS.md: Development standards, best practices

All workflows include:
- API compatibility for orchestrator integration
- Error handling and validation
- VRAM optimization for 24GB GPUs
- Preview and save nodes
- Comprehensive metadata and parameters
- Performance benchmarks

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 12:08:18 +01:00
6323488591 fix: add missing ComfyUI-Impact-Pack dependencies
- Add scikit-image for image processing
- Add piexif for EXIF metadata handling
- Add segment-anything for SAM model support

These dependencies are required for ComfyUI-Impact-Pack to load correctly.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 09:33:43 +01:00
664da9f4ea feat: add Supervisor process manager for service management
- Add supervisord.conf with ComfyUI and orchestrator services
- Update Ansible playbook with supervisor installation tag
- Rewrite start-all.sh and stop-all.sh to use Supervisor
- Add status.sh script for checking service status
- Update arty.yml with supervisor commands and shortcuts
- Update CLAUDE.md with Supervisor documentation and troubleshooting
- Services now auto-restart on crashes with centralized logging

Benefits:
- Better process control than manual pkill/background jobs
- Auto-restart on service crashes
- Centralized log management in /workspace/logs/
- Web interface for monitoring (port 9001)
- Works perfectly in RunPod containers (no systemd needed)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 09:22:16 +01:00
2207d60f98 feat: add Arty configuration and Claude Code documentation
- Add arty.yml for repository management with environment profiles (prod/dev/minimal)
- Add CLAUDE.md with comprehensive architecture and usage documentation
- Add comfyui_models.yaml for ComfyUI model configuration
- Include deployment scripts for model linking and dependency installation
- Document all git repositories (ComfyUI + 10 custom nodes)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 02:50:36 +01:00
c9b01eef68 refactor: consolidate model management into Ansible playbook
Remove flux/musicgen standalone implementations in favor of ComfyUI:
- Delete models/flux/ and models/musicgen/ directories
- Remove redundant scripts (install.sh, download-models.sh, prepare-template.sh)
- Update README.md to reference Ansible playbook commands
- Update playbook.yml to remove flux/musicgen service definitions
- Add COMFYUI_MODELS.md with comprehensive model installation guide
- Update stop-all.sh to only manage orchestrator and vLLM services

All model downloads and dependency management now handled via
Ansible playbook tags (base, python, vllm, comfyui, comfyui-essential).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 00:31:26 +01:00
7edf17551a Add Llama 3.1 8B model to orchestrator 2025-11-21 21:29:57 +01:00
c4426ccc58 Remove Flux and MusicGen models from orchestrator
ComfyUI now handles Flux image generation directly.
MusicGen is not being used and has been removed.
2025-11-21 21:14:43 +01:00
205e591fb4 feat: add ComfyUI integration with Ansible playbook
- Add ComfyUI installation to Ansible playbook with 'comfyui' tag
- Create ComfyUI requirements.txt and start.sh script
- Clone ComfyUI from official GitHub repository
- Symlink HuggingFace cache for Flux model access
- ComfyUI runs on port 8188 with CORS enabled
- Add ComfyUI to services list in playbook

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 20:41:43 +01:00
538609da3e docs: add comprehensive README for AI model orchestrator
- Architecture overview (process-based orchestrator)
- Installation and setup instructions for RunPod
- Available models (text, image, music generation)
- API endpoints and usage examples
- Integration guide for LiteLLM proxy
- Troubleshooting section with streaming fixes
- Deployment options (RunPod, VPS, Tailscale VPN)
- Adding new models workflow
- Project structure documentation

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 19:47:43 +01:00
34fc456a3b feat: add /v1/models endpoint and fix streaming proxy in subprocess orchestrator 2025-11-21 19:34:20 +01:00
91089d3edc feat: add /v1/models endpoint and systemd service for orchestrator
- Add OpenAI-compatible /v1/models endpoint to list available models
- Create systemd service file for proper service management
- Service runs as root with automatic restart on failure
- Logs to systemd journal for easy debugging
2025-11-21 19:28:16 +01:00
9947fe37bb fix: properly proxy streaming requests without buffering
The orchestrator was calling response.json() which buffered the entire
streaming response before returning it. This caused LiteLLM to receive
only one chunk with empty content instead of token-by-token streaming.

Changes:
- Detect streaming requests by parsing request body for 'stream': true
- Use client.stream() with aiter_bytes() for streaming requests
- Return StreamingResponse with proper SSE headers
- Keep original JSONResponse behavior for non-streaming requests

This fixes streaming from vLLM → orchestrator → LiteLLM chain.
2025-11-21 19:21:56 +01:00
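The detection step from 9947fe37bb can be sketched with the standard library alone. The `is_streaming_request` helper is hypothetical and covers only the decision of which path to take; the actual relay through `client.stream()` and `StreamingResponse` is not shown.

```python
import json


def is_streaming_request(body: bytes) -> bool:
    """Return True only if the JSON request body sets "stream" to true."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return False
    return isinstance(payload, dict) and payload.get("stream") is True


print(is_streaming_request(b'{"model": "qwen-2.5-7b", "stream": true}'))  # True
print(is_streaming_request(b'{"model": "qwen-2.5-7b"}'))                  # False
```

Treating unparseable bodies as non-streaming keeps the original JSON response path as the default, which matches the commit's intent of leaving non-streaming behaviour unchanged.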
7f1890517d fix: enable eager execution for proper token streaming in vLLM
- Set enforce_eager=True to disable CUDA graphs which were batching outputs
- Add disable_log_stats=True for better streaming performance
- This ensures AsyncLLMEngine yields tokens incrementally instead of returning complete response
2025-11-21 18:25:50 +01:00
94080da341 fix: remove incorrect start-vllm.sh that would break orchestrator architecture 2025-11-21 18:10:53 +01:00
6944e4ebd5 feat: add vllm serve script with proper streaming support 2025-11-21 18:08:21 +01:00
d21caa56bc fix: implement incremental streaming deltas for vLLM chat completions
- Track previous_text to calculate deltas instead of sending full accumulated text
- Fixes WebUI streaming issue where responses appeared empty
- Only send new tokens in each SSE chunk delta
- Resolves OpenAI API compatibility for streaming chat completions
2025-11-21 17:23:18 +01:00
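The delta calculation described in d21caa56bc can be illustrated in isolation. This is a minimal sketch, assuming the engine yields the full accumulated text on every step; `iter_deltas` is a hypothetical helper, not the function in the repository.

```python
def iter_deltas(cumulative_outputs):
    """Yield only the newly generated suffix of each cumulative output."""
    previous_text = ""
    for text in cumulative_outputs:
        delta = text[len(previous_text):]  # part not sent yet
        previous_text = text
        if delta:  # skip steps that produced no new text
            yield delta


chunks = list(iter_deltas(["Hel", "Hello", "Hello", "Hello wor", "Hello world"]))
print(chunks)  # ['Hel', 'lo', ' wor', 'ld']
```

Concatenating the deltas reproduces the final text, which is what a client assembling SSE chunks expects.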
57b706abe6 fix: correct vLLM service port to 8000
- Updated qwen-2.5-7b port from 8001 to 8000 in models.yaml
- Matches actual vLLM server default port configuration
- Tested and verified: orchestrator successfully loaded model and generated response
2025-11-21 16:28:54 +01:00
9a637cc4fc refactor: clean Docker files and restore standalone model services
- Remove all Docker-related files (Dockerfiles, compose.yaml)
- Remove documentation files (README, ARCHITECTURE, docs/)
- Remove old core/ directory (base_service, service_manager)
- Update models.yaml with correct service_script paths (models/*/server.py)
- Simplify vLLM requirements.txt to let vLLM manage dependencies
- Restore original standalone vLLM server (no base_service dependency)
- Remove obsolete vllm/, musicgen/, flux/ directories

Process-based architecture is now fully functional on RunPod.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 16:17:38 +01:00