Commit Graph

194 Commits

Author SHA1 Message Date
b011c192f8 feat: add FFmpeg dependencies to system packages
All checks were successful
Build and Push RunPod Docker Image / build-and-push (push) Successful in 14s
Add FFmpeg and its development libraries to setup/system-packages script:
- ffmpeg: Main FFmpeg executable
- libavcodec-dev: Audio/video codec library
- libavformat-dev: Audio/video format library
- libavutil-dev: Utility library for FFmpeg
- libswscale-dev: Video scaling library

These libraries are required for torchcodec to function properly with
DiffRhythm audio generation. Also added FFmpeg version verification
after installation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-24 12:48:57 +01:00
a249dfc941 feat: add torchcodec dependency for DiffRhythm audio caching
All checks were successful
Build and Push RunPod Docker Image / build-and-push (push) Successful in 14s
Add torchcodec to ComfyUI requirements.txt to fix audio tensor caching
error in DiffRhythm. This package is required for save_with_torchcodec
function used by DiffRhythm audio nodes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-24 12:44:05 +01:00
19376d90a7 feat: add DiffRhythm eval-model download script
All checks were successful
Build and Push RunPod Docker Image / build-and-push (push) Successful in 15s
Add arty script to download eval.yaml and eval.safetensors files from
HuggingFace space for DiffRhythm node support. These files are required
for DiffRhythm evaluation model functionality.

- Add models/diffrhythm-eval script to download eval-model files
- Update setup/comfyui-nodes to create eval-model directory
- Files downloaded from ASLP-lab/DiffRhythm HuggingFace space
- Script includes file verification and size reporting

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-24 12:38:44 +01:00
cf3fcafbae feat: add DiffRhythm music generation support
All checks were successful
Build and Push RunPod Docker Image / build-and-push (push) Successful in 15s
- Add DiffRhythm dependencies to requirements.txt (19 packages)
- Add reference audio placeholder for style transfer workflow
- DiffRhythm nodes now loading in ComfyUI
- All four workflows ready for music generation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-24 12:17:46 +01:00
8fe87064f8 feat: add DiffRhythm dependencies to ComfyUI requirements
All checks were successful
Build and Push RunPod Docker Image / build-and-push (push) Successful in 14s
Added all required packages for ComfyUI_DiffRhythm extension:
- torchdiffeq: ODE solvers for diffusion models
- x-transformers: Transformer architecture components
- librosa: Audio analysis and feature extraction
- pandas, pyarrow: Data handling
- ema-pytorch, prefigure: Training utilities
- muq: Music quality model
- mutagen: Audio metadata handling
- pykakasi, jieba, cn2an, pypinyin: Chinese/Japanese text processing
- Unidecode, phonemizer, inflect: Text normalization and phonetic conversion
- py3langid: Language identification

These dependencies enable the DiffRhythm node to load and function properly in ComfyUI, fixing the "ModuleNotFoundError: No module named 'infer_utils'" error.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-24 10:51:50 +01:00
44762a063c fix: update DiffRhythm workflows with correct node names and parameters
All checks were successful
Build and Push RunPod Docker Image / build-and-push (push) Successful in 14s
Updated all 4 DiffRhythm workflow JSON files to use actual node class names from ComfyUI_DiffRhythm:

**Node Name Changes:**
- DiffRhythmTextToMusic → DiffRhythmRun
- DiffRhythmRandomGeneration → DiffRhythmRun (with empty style_prompt)
- DiffRhythmReferenceBasedGeneration → DiffRhythmRun (with audio input)

**Corrected Parameter Structure:**
All workflows now use proper widgets_values array matching DiffRhythmRun INPUT_TYPES:
1. model (string: "cfm_model_v1_2.pt", "cfm_model.pt", or "cfm_full_model.pt")
2. style_prompt (string: multiline text or empty for random)
3. unload_model (boolean: default true)
4. odeint_method (string: "euler", "midpoint", "rk4", "implicit_adams")
5. steps (int: 1-100, default 30)
6. cfg (int: 1-10, default 4)
7. quality_or_speed (string: "quality" or "speed")
8. seed (int: -1 for random, or specific number)
9. edit (boolean: default false)
10. edit_segments (string: "[-1, 20], [60, -1]")

**Workflow-Specific Updates:**

**diffrhythm-simple-t2m-v1.json:**
- Text-to-music workflow for 95s generation
- Uses cfm_model_v1_2.pt with text prompt guidance
- Default settings: steps=30, cfg=4, speed mode, seed=42

**diffrhythm-full-length-t2m-v1.json:**
- Full-length 4m45s (285s) generation
- Uses cfm_full_model.pt for extended compositions
- Quality mode enabled for better results
- Default seed=123

**diffrhythm-reference-based-v1.json:**
- Reference audio + text prompt workflow
- Uses LoadAudio node connected to style_audio_or_edit_song input
- Higher cfg=5 for stronger prompt adherence
- Demonstrates optional audio input connection

**diffrhythm-random-generation-v1.json:**
- Pure random generation (no prompt/guidance)
- Empty style_prompt string
- Minimal cfg=1 for maximum randomness
- Random seed=-1 for unique output each time

**Documentation Updates:**
- Removed PLACEHOLDER notes
- Updated usage sections with correct parameter descriptions
- Added notes about optional MultiLineLyricsDR node for lyrics
- Clarified parameter behavior and recommendations

These workflows are now ready to use in ComfyUI with the installed DiffRhythm extension.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-24 10:46:31 +01:00
e9a1536f1d chore: clean up arty.yml - remove unused scripts and envs
All checks were successful
Build and Push RunPod Docker Image / build-and-push (push) Successful in 14s
- Remove deprecated legacy setup scripts
- Remove unused environment definitions (prod, dev, minimal)
- Remove WebDAV setup script
- Remove redundant model linking script
- Streamline configuration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-24 10:26:52 +01:00
f2186db78e feat: integrate ComfyUI_DiffRhythm extension with 7 models and 4 workflows
All checks were successful
Build and Push RunPod Docker Image / build-and-push (push) Successful in 15s
- Add DiffRhythm to arty.yml references and setup/comfyui-nodes
- Install espeak-ng system dependency for phoneme processing
- Add 7 DiffRhythm models to models_huggingface.yaml with file mappings:
  * ASLP-lab/DiffRhythm-1_2 (95s generation)
  * ASLP-lab/DiffRhythm-full (4m45s generation)
  * ASLP-lab/DiffRhythm-base
  * ASLP-lab/DiffRhythm-vae
  * OpenMuQ/MuQ-MuLan-large
  * OpenMuQ/MuQ-large-msd-iter
  * FacebookAI/xlm-roberta-base
- Create 4 comprehensive workflows:
  * diffrhythm-simple-t2m-v1.json (basic 95s text-to-music)
  * diffrhythm-full-length-t2m-v1.json (4m45s full-length)
  * diffrhythm-reference-based-v1.json (style transfer with reference audio)
  * diffrhythm-random-generation-v1.json (no-prompt random generation)
- Update storage requirements: 90GB essential, 149GB total

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-24 09:50:45 +01:00
9439185b3d fix: update Docker registry from Docker Hub to dev.pivoine.art
All checks were successful
Build and Push RunPod Docker Image / build-and-push (push) Successful in 2m8s
- Use Gitea container registry instead of Docker Hub
- Update workflow to use gitea.actor and REGISTRY_TOKEN
- Update documentation to reflect correct registry URL
- Match supervisor-ui workflow configuration
2025-11-23 21:57:14 +01:00
571431955d feat: add RunPod Docker template with automated build workflow
- Add Dockerfile with minimal setup (supervisor, tailscale)
- Add start.sh bootstrap script for container initialization
- Add Gitea workflow for automated Docker image builds
- Add comprehensive RUNPOD_TEMPLATE.md documentation
- Add bootstrap-venvs.sh for Python venv health checks

This enables deployment of the AI orchestrator on RunPod using:
- Minimal Docker image (~2-3GB) for fast deployment
- Network volume for models and data persistence (~80-200GB)
- Automated builds on push to main or version tags
- Full Tailscale VPN integration
- Supervisor process management
2025-11-23 21:53:56 +01:00
0e3150e26c fix: correct Pony Diffusion workflow checkpoint reference
- Changed checkpoint from waiIllustriousSDXL_v150.safetensors to ponyDiffusionV6XL_v6StartWithThisOne.safetensors
- Fixed metadata model reference (was incorrectly referencing LoRA)
- Added files field to models_civitai.yaml for explicit filename mapping
- Aligns workflow with actual Pony Diffusion V6 XL model
2025-11-23 19:57:45 +01:00
f6de19bec1 feat: add files field for embeddings with different filenames
- Add files field to badx-sdxl, pony-pdxl-hq-v3, pony-pdxl-xxx
- Specifies actual downloaded filenames (BadX-neg.pt, zPDXL3.safetensors, zPDXLxxx.pt)
- Allows script to properly link embeddings where YAML name != filename
2025-11-23 19:54:41 +01:00
5770563d9a feat: add comprehensive negative embeddings support (SD 1.5, SDXL, Pony)
- Add 3 new embedding categories to models_civitai.yaml:
  - embeddings_sd15: 6 embeddings (BadDream, UnrealisticDream, badhandv4, EasyNegative, FastNegativeV2, BadNegAnatomyV1-neg)
  - embeddings_sdxl: 1 embedding (BadX v1.1)
  - embeddings_pony: 2 embeddings (zPDXL3, zPDXLxxx)
- Total storage: ~1.1 MB (9 embeddings)
- Add comprehensive embeddings documentation to NSFW README
- Include usage examples, compatibility notes, and syntax guide
- Document embedding weights and recommended combinations
2025-11-23 19:39:18 +01:00
68d3606cab fix: use WAI-NSFW-Illustrious checkpoint instead of non-existent Pony model
Changed checkpoint from 'add-detail-xl.safetensors' (which is a LoRA) to
'waiIllustriousSDXL_v150.safetensors' which is the downloaded anime NSFW model
2025-11-23 19:13:22 +01:00
9ca62724d0 feat: add LoRA models category to CivitAI config with add-detail-xl and siesta 2025-11-23 19:04:19 +01:00
5e7c65a95c feat: add NSFW workflow linking to arty workflow manager
Updated arty.yml workflow linking script to include NSFW workflows:
- Added nsfw_ prefix for NSFW workflow category
- Links 4 NSFW workflows (LUSTIFY, Pony, RealVisXL, Ultimate Upscale)
- Updated workflow count from 20 to 25 total production workflows
- Updated documentation to list all 7 workflow categories

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 18:48:46 +01:00
1d851bb11c feat: add NSFW ComfyUI workflow suite with LoRA fusion and upscaling
Added 5 production-ready workflows to leverage downloaded CivitAI NSFW models:

**NSFW Text-to-Image Workflows (3):**
- lustify-realistic-t2i-production-v1.json - Photorealistic NSFW with LUSTIFY v7.0
  - DPM++ 2M SDE, Exponential scheduler, 30 steps, CFG 6.0
  - Optimized for women in realistic scenarios with professional photography quality
- pony-anime-t2i-production-v1.json - Anime/cartoon/furry with Pony Diffusion V6 XL
  - Euler Ancestral, Normal scheduler, 35 steps, CFG 7.5
  - Danbooru tag support, balanced safe/questionable/explicit content
- realvisxl-lightning-t2i-production-v1.json - Ultra-fast photorealistic with RealVisXL V5.0 Lightning
  - DPM++ SDE Karras, 6 steps (vs 30+), CFG 2.0
  - 4-6 step generation for rapid high-quality output

**Enhancement Workflows (2):**
- lora-fusion-t2i-production-v1.json - Multi-LoRA stacking (text-to-image directory)
  - Stack up to 3 LoRAs with adjustable weights (0.2-1.0)
  - Compatible with all SDXL checkpoints including NSFW models
  - Hierarchical strength control for style mixing and enhancement
- nsfw-ultimate-upscale-production-v1.json - Professional 2x upscaling with LUSTIFY
  - RealESRGAN_x2 + diffusion refinement via Ultimate SD Upscale
  - Tiled processing, optimized for detailed skin texture
  - Denoise 0.25 preserves original composition

**Documentation:**
- Comprehensive README.md with usage examples, API integration, model comparison
- Optimized settings for each workflow based on model recommendations
- Advanced usage guide for LoRA stacking and upscaling pipelines
- Version history tracking

**Total additions:** 1,768 lines across 6 files

These workflows complement the 27GB of CivitAI NSFW models downloaded in previous commit.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 18:46:22 +01:00
5944767d3f fix: update CivitAI model version IDs to latest versions
- RealVisXL: 904692 → 798204 (V5.0 Lightning)
- WAI-NSFW-illustrious: 1239648 → 2167369 (v15.0)
- Big Lust: 649023 → 1081768 (v1.6)

Fixes 404 errors when downloading these models.
2025-11-23 18:16:32 +01:00
e29f77c90b feat: add dedicated CivitAI NSFW model downloader
- Add models_civitai.yaml with 6 NSFW SDXL checkpoints
- Create artifact_civitai_download.sh with beautiful purple/magenta CLI
- Update .env.example with CIVITAI_API_KEY documentation
- Update CLAUDE.md with CivitAI usage instructions
- Rename comfyui_models.yaml to models_huggingface.yaml for clarity

Features:
- Dedicated config and downloader for CivitAI models
- Same elegant architecture as HuggingFace downloader
- Retry logic, rate limiting, progress bars
- Models: LUSTIFY, Pony Diffusion V6, RealVisXL, etc.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 17:58:25 +01:00
76cf5b5e31 docs: update CLAUDE.md to reflect direct vLLM architecture
- Remove all orchestrator references
- Update to dedicated vLLM server model
- Update service management commands
- Update LiteLLM integration details
- Update testing examples
2025-11-23 16:26:59 +01:00
479201d338 chore: remove orchestrator - replaced with dedicated vLLM servers 2025-11-23 16:24:43 +01:00
1ad99cdb53 refactor: replace orchestrator with dedicated vLLM servers for Qwen and Llama 2025-11-23 16:00:03 +01:00
cc0f55df38 fix: reduce max_model_len to 20000 to fit in 24GB VRAM 2025-11-23 15:43:37 +01:00
5cfd03f1ef fix: improve streaming with proper delta format and increase max_model_len to 32768 2025-11-23 15:38:18 +01:00
3f812704a2 fix: use venv python for vLLM service startup 2025-11-23 15:21:52 +01:00
fdd724298a fix: increase max_tokens limit from 4096 to 32768 for LLMX CLI support 2025-11-23 15:10:06 +01:00
a8c2ee1b90 fix: make model name and port configurable via environment variables 2025-11-23 13:45:01 +01:00
16112e50f6 fix: relax dependency version constraints for vllm compatibility 2025-11-23 13:33:46 +01:00
e0a43259d4 fix: update pydantic version constraint to match vllm requirements 2025-11-23 13:33:22 +01:00
d351ec7172 fix: update service_script path to vllm/server.py
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 13:29:51 +01:00
b94df17845 feat: add requirements.txt for vLLM models
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 13:25:03 +01:00
d67667c79f fix: add psutil to orchestrator dependencies
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 13:11:30 +01:00
61fd0e9265 fix: correct widgets_values - remove upscale_model/custom params, fix seam_fix order (width before mask_blur) 2025-11-23 12:34:28 +01:00
b9afd68ddd fix: add control_after_generate parameter at position 2 (23 total params) 2025-11-23 12:26:27 +01:00
2f53f542e7 fix: add custom_sampler and custom_sigmas null placeholders (22 total parameters) 2025-11-23 12:21:40 +01:00
14a1fcf4a7 fix: add null placeholder for upscale_model in widgets_values (20th parameter) 2025-11-23 12:20:48 +01:00
626dab6f65 fix: back to function signature order for seam_fix params 2025-11-23 12:18:42 +01:00
abbd89981e fix: use USDU_base_inputs order (seam_fix_width before mask_blur) 2025-11-23 12:15:49 +01:00
f976dc2c74 fix: correct seam_fix parameter order - mask_blur comes before width in function signature 2025-11-23 12:14:19 +01:00
75c6c77391 fix: correct widgets_values array to match actual parameter order (19 widget values for unconnected parameters) 2025-11-23 12:11:54 +01:00
6f4ac14032 fix: correct seam_fix parameter order in widgets_values (seam_fix_denoise was 1.0, should be 0.3) 2025-11-23 12:10:23 +01:00
21efd3b86d fix: remove widget parameters from inputs array - they belong in widgets_values only 2025-11-23 12:09:11 +01:00
8b8a29a47e fix: add missing type fields to sampler_name and scheduler inputs 2025-11-23 12:07:43 +01:00
d6fbda38f1 fix: correct UltimateSDUpscale input indices in workflow
The upscale_model input was at index 5 instead of index 12, causing all
widget parameters to be misaligned. Fixed by:
- Updating link target index from 5 to 12 for upscale_model
- Adding explicit entries for widget parameters in inputs array
- Maintaining correct parameter order per custom node definition

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 12:06:25 +01:00
096d565f3d chore: reorganize workflow assets and remove unused files
- Move example images to their respective workflow directories
- Remove unused COMFYUI_MODELS.md (content consolidated elsewhere)
- Remove fix_workflows.py script (no longer needed)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 12:01:38 +01:00
d12c868e65 fix: add UpscaleModelLoader and correct widget order in UltimateSDUpscale workflow
- Added UpscaleModelLoader node (node 8) for RealESRGAN model
- Connected upscale_model input to UltimateSDUpscale
- Fixed widgets_values array to match correct parameter order:
  upscale_by, seed, steps, cfg, sampler_name, scheduler, denoise,
  mode_type, tile_width, tile_height, mask_blur, tile_padding,
  seam_fix_mode, seam_fix_denoise, seam_fix_width, seam_fix_mask_blur,
  seam_fix_padding, force_uniform_tiles, tiled_decode
- Updated version to 1.1.0
2025-11-23 11:45:28 +01:00
53a7faf2a8 feat: add RealESRGAN upscale models to configuration
Added RealESRGAN x2 and x4 upscaling models:
- RealESRGAN_x2.pth - Fast 2x upscaling
- RealESRGAN_x4.pth - High quality 4x upscaling
- Both marked as essential, minimal VRAM usage (~2-4GB)
- Total size: ~120MB combined
2025-11-23 11:35:04 +01:00
c114569309 feat: add placeholder input images for workflows
Added example images for testing workflows:
- input_image.png (512x512) - for general upscaling workflows
- input_portrait.png (512x768) - for portrait/face upscaling workflows
2025-11-23 11:33:00 +01:00
0df4c63412 fix: add missing links and rebuild upscaling workflows
- simple-upscale: Added proper node connections, changed ImageScale to ImageScaleBy
- ultimate-sd-upscale: Added CLIP text encoders, removed incorrect VAEDecode and UpscaleModelLoader nodes
- face-upscale: Simplified to basic upscaling workflow (FaceDetailer requires complex bbox detector setup)

All workflows now have proper inputs, outputs, and links arrays.
2025-11-23 11:30:29 +01:00
f1788f88ca fix: replace PreviewAudio with AudioPlay in MusicGen workflows
Sound Lab's Musicgen_ node outputs AUDIO format that is only compatible with Sound Lab nodes like AudioPlay, not the built-in ComfyUI audio nodes (SaveAudio/PreviewAudio).
2025-11-23 11:20:15 +01:00