feat: integrate ACE Step music generation with 19-language support

Added ACE Step v1 3.5B model for state-of-the-art music generation:
- 15x faster than LLM-based music generation baselines, with better structural coherence (as claimed by the ACE Step project)
- Supports 19 languages (en, zh, ja, ko, fr, es, de, it, pt, ru + 9 more)
- Voice cloning, lyric alignment, and multi-genre capabilities

Changes:
- Added ACE Step models to models_huggingface.yaml (checkpoint + Chinese RAP LoRA)
- Added ComfyUI_ACE-Step custom node to arty.yml with installation script
- Created 4 comprehensive workflows in comfyui/workflows/text-to-music/:
  * acestep-simple-t2m-v1.json - Basic 60s text-to-music generation
  * acestep-multilang-t2m-v1.json - 19-language music generation
  * acestep-remix-m2m-v1.json - Music-to-music remixing with style transfer
  * acestep-chinese-rap-v1.json - Chinese hip-hop with specialized LoRA

🤖 Generated with Claude Code (https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 08:40:17 +01:00
parent 5af3eeb333
commit 513062623c
6 changed files with 480 additions and 0 deletions
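
Each model entry added to models_huggingface.yaml pairs a Hugging Face `repo_id` with a list of `source`/`dest` file mappings. As a rough illustration of how a downloader might turn one entry into a list of transfers, here is a minimal sketch. The helper `plan_downloads`, the `models_root` parameter, and the `<models_root>/<type>/<dest>` layout are assumptions made for this example, not something this commit defines.

```python
from pathlib import PurePosixPath

# Dict mirroring the ACE Step checkpoint entry from models_huggingface.yaml
# (only the fields this sketch needs).
ACE_STEP_ENTRY = {
    "repo_id": "Comfy-Org/ACE-Step_ComfyUI_repackaged",
    "type": "checkpoints",
    "essential": True,
    "files": [
        {
            "source": "all_in_one/ace_step_v1_3.5b.safetensors",
            "dest": "ace_step_v1_3.5b.safetensors",
        },
    ],
}


def plan_downloads(entry, models_root="models"):
    """Return (repo_id, source, destination) tuples for one model entry.

    Hypothetical helper: assumes each file lands in
    <models_root>/<type>/<dest>, which the commit does not confirm.
    """
    plan = []
    for file_spec in entry["files"]:
        destination = PurePosixPath(models_root) / entry["type"] / file_spec["dest"]
        plan.append((entry["repo_id"], file_spec["source"], str(destination)))
    return plan


if __name__ == "__main__":
    for repo_id, source, destination in plan_downloads(ACE_STEP_ENTRY):
        print(f"{repo_id}:{source} -> {destination}")
```

The sketch only plans transfers. An actual fetch could pass each `repo_id` and `source` to `huggingface_hub.hf_hub_download(repo_id=..., filename=...)` and then move the result to the planned destination; whether this repository's tooling works that way is not shown in the commit.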

@@ -219,6 +219,34 @@ model_categories:
        - source: "pytorch_model.bin.index.json"
          dest: "musicgen-large-pytorch_model.bin.index.json"

    # ACE Step v1 3.5B - State-of-the-art music generation
    - repo_id: Comfy-Org/ACE-Step_ComfyUI_repackaged
      description: ACE Step v1 3.5B - Fast coherent music generation with 19-language support
      size_gb: 7.7
      essential: true
      category: audio
      type: checkpoints
      format: safetensors
      vram_gb: 16
      duration_seconds: 240
      notes: 15x faster than LLM baselines, superior structural coherence, voice cloning, 19-language lyrics
      files:
        - source: "all_in_one/ace_step_v1_3.5b.safetensors"
          dest: "ace_step_v1_3.5b.safetensors"

    # ACE Step Chinese RAP LoRA (optional)
    - repo_id: ACE-Step/ACE-Step-v1-chinese-rap-LoRA
      description: ACE Step Chinese RAP LoRA - Enhanced Chinese pronunciation and hip-hop genre
      size_gb: 0.3
      essential: false
      category: audio
      type: loras
      format: safetensors
      notes: Improves Chinese pronunciation accuracy and hip-hop/electronic genre adherence
      files:
        - source: "pytorch_lora_weights.safetensors"
          dest: "ace-step-chinese-rap-lora.safetensors"

  # ==========================================================================
  # SUPPORT MODELS (CLIP, IP-Adapter, etc.)
  # ==========================================================================