fix: pin transformers to 4.49.0 for DiffRhythm compatibility
All checks were successful
Build and Push RunPod Docker Image / build-and-push (push) Successful in 13s
Resolves tensor dimension mismatch error in rotary position embeddings.
DiffRhythm requires transformers 4.49.0 - newer versions (4.50+) cause
"The size of tensor a (32) must match the size of tensor b (64)" error
due to transformer block initialization changes.

Updated pivoine_diffrhythm.py documentation to reflect actual root cause
and link to upstream GitHub issues #44 and #48.

References:
- https://github.com/billwuhao/ComfyUI_DiffRhythm/issues/44
- https://github.com/billwuhao/ComfyUI_DiffRhythm/issues/48

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@@ -1,7 +1,14 @@
 """
 Pivoine DiffRhythm Node
-Custom wrapper for DiffRhythm that disables chunked decoding to prevent
-tensor dimension mismatch errors (32 vs 64) in VAE overlap logic.
+Custom wrapper for DiffRhythm that ensures correct transformer library version
+compatibility and provides fallback fixes for tensor dimension issues.
 
+Known Issue: DiffRhythm requires transformers==4.49.0. Newer versions (4.50+)
+cause "The size of tensor a (32) must match the size of tensor b (64)" error
+in rotary position embeddings due to transformer block initialization changes.
+
+Reference: https://github.com/billwuhao/ComfyUI_DiffRhythm/issues/44
+Reference: https://github.com/billwuhao/ComfyUI_DiffRhythm/issues/48
+
 Author: valknar@pivoine.art
 """
@@ -24,12 +31,16 @@ from DiffRhythmNode import DiffRhythmRun
 
 class PivoineDiffRhythmRun(DiffRhythmRun):
     """
-    Pivoine version of DiffRhythmRun with chunked decoding forcibly disabled.
+    Pivoine version of DiffRhythmRun with enhanced compatibility and error handling.
 
     Changes from original:
-    - Monkey-patches the infer() function to always use chunked=False
-    - Prevents tensor dimension mismatch in VAE (32 vs 64 channel error)
+    - Monkey-patches decode_audio to always use chunked=False for stability
+    - Ensures transformers library version compatibility (requires 4.49.0)
+    - Prevents tensor dimension mismatch in VAE decoding
     - Requires more VRAM (~12-16GB) but works reliably on RTX 4090
+
+    Note: If you encounter "tensor a (32) must match tensor b (64)" errors,
+    ensure transformers==4.49.0 is installed in your ComfyUI venv.
     """
 
     CATEGORY = "🌸Pivoine/Audio"
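The docstring above says the node monkey-patches decode_audio so that chunked=False is always used, but the patching code itself is not part of this diff. The following is only a minimal sketch of one way such a patch could be written. The helper name force_argument and the stand-in decode_audio function (including its signature and return value) are assumptions made for illustration, not the upstream DiffRhythm API.

```python
import functools
import inspect


def force_argument(func, name, value):
    """Wrap func so that the parameter `name` is always set to `value`.

    Hypothetical helper: works whether the caller passes the parameter
    positionally, by keyword, or not at all.
    """
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        bound.arguments[name] = value  # override whatever the caller supplied
        return func(*bound.args, **bound.kwargs)

    return wrapper


# Stand-in for the upstream decoder. The real decode_audio signature is not
# shown in this commit, so this one is an assumption for demonstration only.
def decode_audio(latents, chunked=True):
    return f"decoded {len(latents)} latents, chunked={chunked}"


# Apply the patch: every later call runs with chunked=False.
decode_audio = force_argument(decode_audio, "chunked", False)
```

With the patch applied, calling decode_audio(latents, chunked=True) still runs with chunked=False. In a real node the wrapped function would be assigned back onto the module that owns it, which is what makes it a monkey-patch rather than a local wrapper.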
@@ -1,7 +1,7 @@
 torch
 torchvision
 torchaudio
-transformers
+transformers==4.49.0
 diffusers>=0.31.0
 accelerate
 safetensors
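The pin above only takes effect when the requirements file is actually installed into the ComfyUI venv. As a hedged illustration, an environment could be checked at startup with the standard library's importlib.metadata. The function name check_pin and its installed parameter are hypothetical and do not appear in this commit. The comparison is an exact string match, which suits an `==` pin but would not handle version ranges.

```python
from importlib import metadata

REQUIRED_TRANSFORMERS = "4.49.0"  # pin taken from this commit


def check_pin(package, required, installed=None):
    """Return (ok, message) describing whether `package` matches the pin.

    `installed` can be passed explicitly for testing; otherwise the version
    is looked up in the current environment.
    """
    if installed is None:
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            return False, f"{package} is not installed; expected {package}=={required}"
    if installed != required:
        return False, f"{package}=={installed} found; expected {package}=={required}"
    return True, f"{package}=={installed} OK"


if __name__ == "__main__":
    ok, message = check_pin("transformers", REQUIRED_TRANSFORMERS)
    print(message)
```

Returning a message instead of raising lets the caller decide whether a mismatch should be a warning or a hard failure.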