feat: add complete HunyuanVideo and Wan2.2 video generation integration
All checks were successful
Build and Push RunPod Docker Image / build-and-push (push) Successful in 15s

Integrated 35+ video generation models and 13 production workflows from ComfyUI docs tutorials for state-of-the-art text-to-video and image-to-video generation.

Models Added (models_huggingface.yaml):
- HunyuanVideo (5 models): Original T2V/I2V (720p), v1.5 (720p/1080p) with Qwen 2.5 VL
- Wan2.2 diffusion models (18 models):
  - 5B TI2V hybrid (8GB VRAM, efficient)
  - 14B variants: T2V, I2V (high/low noise), Animate, S2V (FP8/BF16), Fun Camera/Control (high/low noise)
- Support models (12): VAEs, UMT5-XXL, CLIP Vision H, Wav2Vec2, LLaVA encoders
- LoRA accelerators (4): Lightx2v 4-step distillation for 5x speedup

Workflows Added (comfyui/workflows/image-to-video/):
- HunyuanVideo (5 workflows): T2V original, I2V v1/v2 (webp embedded), v1.5 T2V/I2V (JSON)
- Wan2.2 (8 workflows): 5B TI2V, 14B T2V/I2V/FLF2V/Animate/S2V/Fun Camera/Fun Control
- Asset files (10): Reference images, videos, audio for workflow testing

Custom Nodes Added (arty.yml):
- ComfyUI-KJNodes: Kijai optimizations for HunyuanVideo/Wan2.2 (FP8 scaling, video helpers)
- comfyui_controlnet_aux: ControlNet preprocessors (Canny, Depth, OpenPose, MLSD) for Fun Control
- ComfyUI-GGUF: GGUF quantization support for memory optimization

VRAM Requirements:
- HunyuanVideo original: 24GB (720p T2V/I2V, 129 frames, 5s generation)
- HunyuanVideo 1.5: 30-60GB (720p/1080p, improved quality with Qwen 2.5 VL)
- Wan2.2 5B: 8GB (efficient dual-expert architecture with native offloading)
- Wan2.2 14B: 24GB (high-quality video generation, all modes)

Note: Wan2.2 Fun Inpaint workflow not available in official templates repository (404).

Tutorial Sources:
- https://docs.comfy.org/tutorials/video/hunyuan/hunyuan-video
- https://docs.comfy.org/tutorials/video/hunyuan/hunyuan-video-1-5
- https://docs.comfy.org/tutorials/video/wan/wan2_2
- https://docs.comfy.org/tutorials/video/wan/wan2-2-animate
- https://docs.comfy.org/tutorials/video/wan/wan2-2-s2v
- https://docs.comfy.org/tutorials/video/wan/wan2-2-fun-camera
- https://docs.comfy.org/tutorials/video/wan/wan2-2-fun-control

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
2025-11-25 10:43:39 +01:00
parent 06b8ec0064
commit 6efb55c59f
21 changed files with 32794 additions and 0 deletions

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.0 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.8 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.4 MiB

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,733 @@
{
"id": "91f6bbe2-ed41-4fd6-bac7-71d5b5864ecb",
"revision": 0,
"last_node_id": 59,
"last_link_id": 108,
"nodes": [
{
"id": 37,
"type": "UNETLoader",
"pos": [
-30,
50
],
"size": [
346.7470703125,
82
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"slot_index": 0,
"links": [
94
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.45",
"Node name for S&R": "UNETLoader",
"models": [
{
"name": "wan2.2_ti2v_5B_fp16.safetensors",
"url": "https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_ti2v_5B_fp16.safetensors",
"directory": "diffusion_models"
}
]
},
"widgets_values": [
"wan2.2_ti2v_5B_fp16.safetensors",
"default"
]
},
{
"id": 38,
"type": "CLIPLoader",
"pos": [
-30,
190
],
"size": [
350,
110
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"slot_index": 0,
"links": [
74,
75
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.45",
"Node name for S&R": "CLIPLoader",
"models": [
{
"name": "umt5_xxl_fp8_e4m3fn_scaled.safetensors",
"url": "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors",
"directory": "text_encoders"
}
]
},
"widgets_values": [
"umt5_xxl_fp8_e4m3fn_scaled.safetensors",
"wan",
"default"
]
},
{
"id": 39,
"type": "VAELoader",
"pos": [
-30,
350
],
"size": [
350,
60
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"type": "VAE",
"slot_index": 0,
"links": [
76,
105
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.45",
"Node name for S&R": "VAELoader",
"models": [
{
"name": "wan2.2_vae.safetensors",
"url": "https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/vae/wan2.2_vae.safetensors",
"directory": "vae"
}
]
},
"widgets_values": [
"wan2.2_vae.safetensors"
]
},
{
"id": 8,
"type": "VAEDecode",
"pos": [
1190,
150
],
"size": [
210,
46
],
"flags": {},
"order": 10,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 35
},
{
"name": "vae",
"type": "VAE",
"link": 76
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"slot_index": 0,
"links": [
107
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.45",
"Node name for S&R": "VAEDecode"
},
"widgets_values": []
},
{
"id": 57,
"type": "CreateVideo",
"pos": [
1200,
240
],
"size": [
270,
78
],
"flags": {},
"order": 11,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 107
},
{
"name": "audio",
"shape": 7,
"type": "AUDIO",
"link": null
}
],
"outputs": [
{
"name": "VIDEO",
"type": "VIDEO",
"links": [
108
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.45",
"Node name for S&R": "CreateVideo"
},
"widgets_values": [
24
]
},
{
"id": 58,
"type": "SaveVideo",
"pos": [
1200,
370
],
"size": [
660,
450
],
"flags": {},
"order": 12,
"mode": 0,
"inputs": [
{
"name": "video",
"type": "VIDEO",
"link": 108
}
],
"outputs": [],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.45",
"Node name for S&R": "SaveVideo"
},
"widgets_values": [
"video/ComfyUI",
"auto",
"auto"
]
},
{
"id": 55,
"type": "Wan22ImageToVideoLatent",
"pos": [
380,
540
],
"size": [
271.9126892089844,
150
],
"flags": {},
"order": 8,
"mode": 0,
"inputs": [
{
"name": "vae",
"type": "VAE",
"link": 105
},
{
"name": "start_image",
"shape": 7,
"type": "IMAGE",
"link": 106
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
104
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.45",
"Node name for S&R": "Wan22ImageToVideoLatent"
},
"widgets_values": [
1280,
704,
121,
1
]
},
{
"id": 56,
"type": "LoadImage",
"pos": [
0,
540
],
"size": [
274.080078125,
314
],
"flags": {},
"order": 3,
"mode": 4,
"inputs": [],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
106
]
},
{
"name": "MASK",
"type": "MASK",
"links": null
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.45",
"Node name for S&R": "LoadImage"
},
"widgets_values": [
"example.png",
"image"
]
},
{
"id": 7,
"type": "CLIPTextEncode",
"pos": [
380,
260
],
"size": [
425.27801513671875,
180.6060791015625
],
"flags": {},
"order": 7,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 75
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
52
]
}
],
"title": "CLIP Text Encode (Negative Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.45",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"色调艳丽过曝静态细节模糊不清字幕风格作品画作画面静止整体发灰最差质量低质量JPEG压缩残留丑陋的残缺的多余的手指画得不好的手部画得不好的脸部畸形的毁容的形态畸形的肢体手指融合静止不动的画面杂乱的背景三条腿背景人很多倒着走"
],
"color": "#322",
"bgcolor": "#533"
},
{
"id": 6,
"type": "CLIPTextEncode",
"pos": [
380,
50
],
"size": [
422.84503173828125,
164.31304931640625
],
"flags": {},
"order": 6,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 74
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"slot_index": 0,
"links": [
46
]
}
],
"title": "CLIP Text Encode (Positive Prompt)",
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.45",
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"Low contrast. In a retro 1970s-style subway station, a street musician plays in dim colors and rough textures. He wears an old jacket, playing guitar with focus. Commuters hurry by, and a small crowd gathers to listen. The camera slowly moves right, capturing the blend of music and city noise, with old subway signs and mottled walls in the background."
],
"color": "#232",
"bgcolor": "#353"
},
{
"id": 3,
"type": "KSampler",
"pos": [
850,
130
],
"size": [
315,
262
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 95
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 46
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 52
},
{
"name": "latent_image",
"type": "LATENT",
"link": 104
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"slot_index": 0,
"links": [
35
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.45",
"Node name for S&R": "KSampler"
},
"widgets_values": [
898471028164125,
"randomize",
20,
5,
"uni_pc",
"simple",
1
]
},
{
"id": 48,
"type": "ModelSamplingSD3",
"pos": [
850,
20
],
"size": [
210,
58
],
"flags": {
"collapsed": false
},
"order": 5,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 94
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"slot_index": 0,
"links": [
95
]
}
],
"properties": {
"cnr_id": "comfy-core",
"ver": "0.3.45",
"Node name for S&R": "ModelSamplingSD3"
},
"widgets_values": [
8
]
},
{
"id": 59,
"type": "MarkdownNote",
"pos": [
-550,
10
],
"size": [
480,
340
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [],
"title": "Model Links",
"properties": {},
"widgets_values": [
"[Tutorial](https://docs.comfy.org/tutorials/video/wan/wan2_2\n) \n\n**Diffusion Model**\n- [wan2.2_ti2v_5B_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_ti2v_5B_fp16.safetensors)\n\n**VAE**\n- [wan2.2_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/vae/wan2.2_vae.safetensors)\n\n**Text Encoder** \n- [umt5_xxl_fp8_e4m3fn_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors)\n\n\nFile save location\n\n```\nComfyUI/\n├───📂 models/\n│ ├───📂 diffusion_models/\n│ │ └───wan2.2_ti2v_5B_fp16.safetensors\n│ ├───📂 text_encoders/\n│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors \n│ └───📂 vae/\n│ └── wan2.2_vae.safetensors\n```\n"
],
"color": "#432",
"bgcolor": "#653"
}
],
"links": [
[
35,
3,
0,
8,
0,
"LATENT"
],
[
46,
6,
0,
3,
1,
"CONDITIONING"
],
[
52,
7,
0,
3,
2,
"CONDITIONING"
],
[
74,
38,
0,
6,
0,
"CLIP"
],
[
75,
38,
0,
7,
0,
"CLIP"
],
[
76,
39,
0,
8,
1,
"VAE"
],
[
94,
37,
0,
48,
0,
"MODEL"
],
[
95,
48,
0,
3,
0,
"MODEL"
],
[
104,
55,
0,
3,
3,
"LATENT"
],
[
105,
39,
0,
55,
0,
"VAE"
],
[
106,
56,
0,
55,
1,
"IMAGE"
],
[
107,
8,
0,
57,
0,
"IMAGE"
],
[
108,
57,
0,
58,
0,
"VIDEO"
]
],
"groups": [
{
"id": 1,
"title": "Step1 - Load models",
"bounding": [
-50,
-20,
400,
453.6000061035156
],
"color": "#3f789e",
"font_size": 24,
"flags": {}
},
{
"id": 2,
"title": "Step3 - Prompt",
"bounding": [
370,
-20,
448.27801513671875,
473.2060852050781
],
"color": "#3f789e",
"font_size": 24,
"flags": {}
},
{
"id": 3,
"title": "For i2v, use Ctrl + B to enable",
"bounding": [
-50,
450,
400,
420
],
"color": "#3f789e",
"font_size": 24,
"flags": {}
},
{
"id": 4,
"title": "Video Size & length",
"bounding": [
370,
470,
291.9127197265625,
233.60000610351562
],
"color": "#3f789e",
"font_size": 24,
"flags": {}
}
],
"config": {},
"extra": {
"ds": {
"scale": 0.46462425349300085,
"offset": [
847.5372059811432,
288.7938392118285
]
},
"frontendVersion": "1.27.10",
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true
},
"version": 0.4
}

Binary file not shown.

After

Width:  |  Height:  |  Size: 906 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.7 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.0 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 925 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 712 KiB