Adds a monkey-patch for DiT.__init__() to properly configure LlamaConfig with the num_attention_heads and num_key_value_heads parameters, which are missing in the upstream DiffRhythm code.

Root cause: transformers 4.49.0+ requires these parameters, but DiffRhythm's dit.py only specifies hidden_size, so the library incorrectly infers head_dim as 32 instead of 64, leading to tensor dimension mismatches.

Solution:
- Sets num_attention_heads = hidden_size // 64 (standard Llama architecture)
- Sets num_key_value_heads = num_attention_heads // 4 (GQA configuration)
- Ensures head_dim = 64, fixing the "tensor a (32) vs tensor b (64)" error

This is a proper fix rather than just downgrading the transformers version.

References:
- https://github.com/billwuhao/ComfyUI_DiffRhythm/issues/44
- https://github.com/billwuhao/ComfyUI_DiffRhythm/issues/48

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
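A minimal sketch of what such a monkey-patch could look like. The helper names (_patched_llama_config_init, apply_dit_patch) and the way DiT constructs its LlamaConfig are assumptions for illustration, not the exact code from this commit; only the derived values (num_attention_heads = hidden_size // 64, num_key_value_heads = num_attention_heads // 4, head_dim = 64) come from the commit message.

```python
# Hypothetical sketch, not the exact patch from this commit.
# Idea: wrap DiT.__init__ so that any LlamaConfig created inside it gets
# explicit head counts instead of the transformers defaults.
import functools

from transformers import LlamaConfig


def _patched_llama_config_init(original_init):
    @functools.wraps(original_init)
    def wrapper(self, *args, **kwargs):
        original_init(self, *args, **kwargs)
        # transformers 4.49.0+ derives head_dim from these fields; DiffRhythm's
        # dit.py only passes hidden_size, so fill in values that give head_dim = 64.
        if "num_attention_heads" not in kwargs:
            self.num_attention_heads = self.hidden_size // 64   # standard Llama layout
        if "num_key_value_heads" not in kwargs:
            self.num_key_value_heads = self.num_attention_heads // 4  # GQA configuration
        self.head_dim = self.hidden_size // self.num_attention_heads
    return wrapper


def apply_dit_patch(dit_cls):
    """Wrap dit_cls.__init__ so LlamaConfig is patched for the duration of the call."""
    original_dit_init = dit_cls.__init__

    @functools.wraps(original_dit_init)
    def patched_init(self, *args, **kwargs):
        original_config_init = LlamaConfig.__init__
        LlamaConfig.__init__ = _patched_llama_config_init(original_config_init)
        try:
            original_dit_init(self, *args, **kwargs)
        finally:
            # Restore the original constructor so the patch stays scoped to DiT.
            LlamaConfig.__init__ = original_config_init

    dit_cls.__init__ = patched_init
```

Usage would be something like apply_dit_patch(DiT) at import time, where DiT is the class from DiffRhythm's dit.py (exact module path assumed). Scoping the LlamaConfig override to DiT.__init__ avoids changing head-count defaults for any other Llama models loaded in the same process.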