# ComfyUI Workflow Development Standards
Production standards and best practices for creating ComfyUI workflows in the RunPod AI Model Orchestrator.
## Table of Contents
- [Naming Conventions](#naming-conventions)
- [Workflow Structure](#workflow-structure)
- [API Integration](#api-integration)
- [Error Handling](#error-handling)
- [VRAM Optimization](#vram-optimization)
- [Quality Assurance](#quality-assurance)
- [Documentation Requirements](#documentation-requirements)
- [Testing Guidelines](#testing-guidelines)
- [Version Control](#version-control)
- [Best Practices](#best-practices)
- [Resources](#resources)
- [Support](#support)
## Naming Conventions
### Workflow Files
Format: `{category}-{model}-{type}-{environment}-v{version}.json`
**Components:**
- `category`: Descriptive category (flux, sdxl, cogvideox, musicgen, etc.)
- `model`: Specific model variant (schnell, dev, small, medium, large)
- `type`: Operation type (t2i, i2i, i2v, t2m, upscale)
- `environment`: `production` (stable) or `experimental` (testing)
- `version`: Whole-number version (1, 2, 3, etc.); bump only for major changes (see [Version Control](#version-control))
**Examples:**
- `flux-schnell-t2i-production-v1.json` - FLUX Schnell text-to-image, production version 1
- `sdxl-refiner-t2i-production-v2.json` - SDXL with refiner, production version 2
- `musicgen-large-t2m-experimental-v1.json` - MusicGen large, experimental version 1
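Naming drift is easiest to catch mechanically. A minimal pre-commit sketch that flags non-conforming filenames (the regex and glob are illustrative, not part of the standard; extend the type/environment alternations as new workflows land):
```bash
#!/bin/bash
# Flag workflow filenames that do not match
# {category}-{model}-{type}-{environment}-v{version}.json
shopt -s globstar nullglob
PATTERN='^[a-z0-9]+-[a-z0-9.]+-(t2i|i2i|i2v|t2m|upscale)-(production|experimental)-v[0-9]+\.json$'
for f in **/*.json; do
  [[ "$(basename "$f")" =~ $PATTERN ]] || echo "Non-conforming workflow name: $f"
done
```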
### Node Naming
**Descriptive names for all nodes:**
```json
{
  "title": "FLUX Schnell Checkpoint Loader",
  "type": "CheckpointLoaderSimple",
  "properties": {
    "Node name for S&R": "CheckpointLoaderSimple"
  }
}
```
**Naming patterns:**
- Loaders: `{Model} Checkpoint Loader`, `{Model} VAE Loader`
- Samplers: `{Model} KSampler`, `{Model} Advanced Sampler`
- Inputs: `API Text Input`, `API Image Input`, `API Seed Input`
- Outputs: `API Image Output`, `Preview Output`, `Save Output`
- Processing: `VAE Encode`, `VAE Decode`, `CLIP Text Encode`
## Workflow Structure
### Required Node Groups
Every production workflow MUST include node groups 1-5 below (group 6 is optional but recommended):
#### 1. Input Group
```
Purpose: Receive parameters from API or UI
Nodes:
- Text input nodes (prompts, negative prompts)
- Numeric input nodes (seed, steps, CFG scale)
- Image input nodes (for i2i, i2v workflows)
- Model selection nodes (if multiple models supported)
```
#### 2. Model Loading Group
```
Purpose: Load required models and components
Nodes:
- Checkpoint/Diffuser loaders
- VAE loaders
- CLIP text encoders
- ControlNet loaders (if applicable)
- IP-Adapter loaders (if applicable)
```
#### 3. Processing Group
```
Purpose: Main generation/transformation logic
Nodes:
- Samplers (KSampler, Advanced KSampler)
- Encoders (CLIP, VAE)
- Conditioning nodes
- ControlNet application (if applicable)
```
#### 4. Post-Processing Group
```
Purpose: Refinement and enhancement
Nodes:
- VAE decoding
- Upscaling (if applicable)
- Face enhancement (Impact-Pack)
- Image adjustments
```
#### 5. Output Group
```
Purpose: Save and return results
Nodes:
- SaveImage nodes (for file output)
- Preview nodes (for UI feedback)
- API output nodes (for orchestrator)
```
#### 6. Error Handling Group (Optional but Recommended)
```
Purpose: Validation and fallback
Nodes:
- Validation nodes
- Fallback nodes
- Error logging nodes
```
### Node Organization
**Logical flow (left to right, top to bottom):**
```
[Inputs] → [Model Loading] → [Processing] → [Post-Processing] → [Outputs]
[Error Handling] (parallel lane, applies across stages)
```
**Visual grouping:**
- Use node positions to create visual separation
- Group related nodes together
- Align nodes for readability
- Use consistent spacing
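Putting the groups together, a minimal API-format sketch of the Input → Model Loading → Processing → Post-Processing → Output chain (node IDs, the checkpoint filename, and parameter values are illustrative; links use ComfyUI's `["node_id", output_index]` form, and CheckpointLoaderSimple exposes MODEL/CLIP/VAE as outputs 0/1/2):
```json
{
  "1": {"class_type": "CheckpointLoaderSimple", "inputs": {"ckpt_name": "flux1-schnell.safetensors"}},
  "2": {"class_type": "CLIPTextEncode", "inputs": {"clip": ["1", 1], "text": "A beautiful landscape"}},
  "3": {"class_type": "CLIPTextEncode", "inputs": {"clip": ["1", 1], "text": ""}},
  "4": {"class_type": "EmptyLatentImage", "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
  "5": {"class_type": "KSampler",
        "inputs": {"model": ["1", 0], "positive": ["2", 0], "negative": ["3", 0],
                   "latent_image": ["4", 0], "seed": 42, "steps": 4, "cfg": 1.0,
                   "sampler_name": "euler", "scheduler": "simple", "denoise": 1.0}},
  "6": {"class_type": "VAEDecode", "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
  "7": {"class_type": "SaveImage", "inputs": {"images": ["6", 0], "filename_prefix": "ComfyUI"}}
}
```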
## API Integration
### Input Nodes
**Required for API compatibility:**
1. **Text Inputs** (prompts, negative prompts)
```json
{
  "inputs": {
    "text": "A beautiful sunset over mountains",
    "default": ""
  },
  "class_type": "CLIPTextEncode",
  "title": "API Prompt Input"
}
```
2. **Numeric Inputs** (seed, steps, CFG, etc.)
```json
{
  "inputs": {
    "seed": 42,
    "steps": 20,
    "cfg": 7.5,
    "sampler_name": "euler_ancestral",
    "scheduler": "normal"
  },
  "class_type": "KSampler",
  "title": "API Sampler Config"
}
```
3. **Image Inputs** (for i2i workflows)
```json
{
  "inputs": {
    "image": "",
    "upload": "image"
  },
  "class_type": "LoadImage",
  "title": "API Image Input"
}
```
### Output Nodes
**Required for orchestrator return:**
```json
{
  "inputs": {
    "images": ["node_id", 0],
    "filename_prefix": "ComfyUI"
  },
  "class_type": "SaveImage",
  "title": "API Image Output"
}
```
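Under the hood, the orchestrator queues these graphs against ComfyUI's native HTTP API. A minimal sketch of a direct submission, assuming the file was exported with "Save (API Format)" (`POST /prompt` on port 8188 is ComfyUI's default; the `client_id` value is illustrative):
```bash
# Queue an API-format workflow directly with ComfyUI.
# The graph goes under the "prompt" key of the request body.
curl -X POST http://127.0.0.1:8188/prompt \
  -H "Content-Type: application/json" \
  -d "{\"prompt\": $(cat flux-schnell-t2i-production-v1.json), \"client_id\": \"workflow-test\"}"
```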
### Parameter Validation
**Include validation for critical parameters:**
```json
{
  "inputs": {
    "value": "seed",
    "min": 0,
    "max": 4294967295,
    "default": 42
  },
  "class_type": "IntegerInput",
  "title": "Seed Validator"
}
```
## Error Handling
### Required Validations
1. **Model Availability**
- Check if checkpoint files exist
- Validate model paths
- Provide fallback to default models
2. **Parameter Bounds** (see the jq sketch after this list)
- Validate numeric ranges (seed, steps, CFG)
- Check dimension constraints (width, height)
- Validate string inputs (sampler names, scheduler types)
3. **VRAM Limits**
- Check batch size against VRAM
- Validate resolution against VRAM
- Enable tiling for large images
4. **Input Validation**
- Verify required inputs are provided
- Check image formats and dimensions
- Validate prompt lengths
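For the parameter-bounds case, the orchestrator can reject out-of-range values before a job is ever queued. A minimal jq sketch (the request shape follows the API examples in this document; the bounds are illustrative):
```bash
#!/bin/bash
# Pre-flight bounds check on a request payload before queueing.
REQUEST='{"workflow": "flux-schnell-t2i-production-v1.json", "inputs": {"seed": 123, "steps": 8}}'
echo "$REQUEST" | jq -e '
  (.inputs.seed  // 42) as $seed  |
  (.inputs.steps // 20) as $steps |
  ($seed >= 0 and $seed <= 4294967295 and $steps >= 1 and $steps <= 100)
' > /dev/null || { echo "Parameter out of range; rejecting request"; exit 1; }
```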
### Fallback Strategies
**Default values for missing inputs:**
```json
{
  "inputs": {
    "text": "{{prompt | default('A beautiful landscape')}}",
    "seed": "{{seed | default(42)}}",
    "steps": "{{steps | default(20)}}"
  }
}
```
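The same defaulting can happen at submission time instead of inside the template. A minimal sketch using jq's `//` alternative operator (field names follow the examples above; whether defaults live in the template or the submitter is a deployment choice):
```bash
# Fill in defaults for any inputs the caller omitted.
echo '{"prompt": null, "seed": 7}' | jq '{
  prompt: (.prompt // "A beautiful landscape"),
  seed:   (.seed   // 42),
  steps:  (.steps  // 20)
}'
# => {"prompt": "A beautiful landscape", "seed": 7, "steps": 20}
```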
**Graceful degradation:**
- If refiner unavailable, skip refinement step
- If upscaler fails, return base resolution
- If face enhancement errors, return unenhanced image
## VRAM Optimization
### Model Unloading
**Explicit model cleanup between stages:**
```json
{
  "inputs": {
    "model": ["checkpoint_loader", 0]
  },
  "class_type": "FreeModel",
  "title": "Unload Base Model"
}
```
**When to unload:**
- After base generation, before refinement
- After refinement, before upscaling
- Between different model types (diffusion → CLIP → VAE)
### VAE Tiling
**Enable for high-resolution processing:**
```json
{
  "inputs": {
    "samples": ["sampler", 0],
    "vae": ["vae_loader", 0],
    "tile_size": 512,
    "overlap": 64
  },
  "class_type": "VAEDecodeTiled",
  "title": "Tiled VAE Decode"
}
```
**Tiling thresholds** (a dispatch-time sketch follows this list):
- Use tiled VAE for images >1024x1024
- Tile size: 512 for 24GB VRAM, 256 for lower
- Overlap: 64px minimum for seamless tiles
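Tile size can also be picked at dispatch time from the GPU actually present, rather than hardcoded per workflow. A minimal sketch (the 20,000 MiB cut-off approximates the 24GB/lower split above):
```bash
# Choose a VAE tile size from total VRAM, per the thresholds above.
VRAM_MB=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n1)
if [ "$VRAM_MB" -ge 20000 ]; then
  TILE_SIZE=512   # 24GB-class cards
else
  TILE_SIZE=256   # smaller cards
fi
echo "Using tile_size=$TILE_SIZE, overlap=64"
```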
### Attention Slicing
**Reduce memory for large models:**
```json
{
  "inputs": {
    "model": ["checkpoint_loader", 0],
    "attention_mode": "sliced"
  },
  "class_type": "ModelOptimization",
  "title": "Enable Attention Slicing"
}
```
### Batch Processing
**VRAM-safe batch sizes:**
- FLUX models: batch_size=1
- SDXL: batch_size=1-2
- SD3.5: batch_size=1
- Upscaling: batch_size=1
**Sequential batching:**
```json
{
  "inputs": {
    "mode": "sequential",
    "batch_size": 1
  },
  "class_type": "BatchProcessor"
}
```
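Where no batching node is available, the same VRAM-safe behavior falls out of submitting jobs one at a time at the API level. A sketch against the orchestrator endpoint used throughout this document (sequential because each curl completes before the next starts):
```bash
# Generate 4 images sequentially with distinct seeds.
for i in {1..4}; do
  curl -X POST http://localhost:9000/api/comfyui/generate \
    -d "{\"workflow\": \"flux-schnell-t2i-production-v1.json\", \"inputs\": {\"prompt\": \"test $i\", \"seed\": $i}}"
done
```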
## Quality Assurance
### Preview Nodes
**Include preview at key stages:**
```json
{
  "inputs": {
    "images": ["vae_decode", 0]
  },
  "class_type": "PreviewImage",
  "title": "Preview Base Generation"
}
```
**Preview locations:**
- After base generation (before refinement)
- After refinement (before upscaling)
- After upscaling (final check)
- After face enhancement
### Quality Gates
**Checkpoints for validation:**
1. **Resolution Check**
```json
{
  "inputs": {
    "image": ["input", 0],
    "min_width": 512,
    "min_height": 512,
    "max_width": 2048,
    "max_height": 2048
  },
  "class_type": "ImageSizeValidator"
}
```
2. **Quality Metrics**
```json
{
  "inputs": {
    "image": ["vae_decode", 0],
    "min_quality_score": 0.7
  },
  "class_type": "QualityChecker"
}
```
### Save Points
**Save intermediate results:**
```json
{
  "inputs": {
    "images": ["base_generation", 0],
    "filename_prefix": "intermediate/base_"
  },
  "class_type": "SaveImage",
  "title": "Save Base Generation"
}
```
**When to save:**
- Base generation (before refinement)
- After each major processing stage
- Before potentially destructive operations
## Documentation Requirements
### Workflow Metadata
**Include in workflow JSON:**
```json
{
  "workflow_info": {
    "name": "FLUX Schnell Text-to-Image Production",
    "version": "1.0.0",
    "author": "RunPod AI Model Orchestrator",
    "description": "Fast text-to-image generation using FLUX.1-schnell (4 steps)",
    "category": "text-to-image",
    "tags": ["flux", "fast", "production"],
    "requirements": {
      "models": ["FLUX.1-schnell"],
      "custom_nodes": [],
      "vram_min": "16GB",
      "vram_recommended": "24GB"
    },
    "parameters": {
      "prompt": {
        "type": "string",
        "required": true,
        "description": "Text description of desired image"
      },
      "seed": {
        "type": "integer",
        "required": false,
        "default": 42,
        "min": 0,
        "max": 4294967295
      },
      "steps": {
        "type": "integer",
        "required": false,
        "default": 4,
        "min": 1,
        "max": 20
      }
    },
    "outputs": {
      "image": {
        "type": "image",
        "format": "PNG",
        "resolution": "1024x1024"
      }
    }
  }
}
```
### Node Comments
**Document complex nodes:**
```json
{
  "title": "FLUX KSampler - Main Generation",
  "notes": "Using euler_ancestral sampler with 4 steps for FLUX Schnell. CFG=1.0 is optimal for this model. Seed controls reproducibility.",
  "inputs": {
    "seed": 42,
    "steps": 4,
    "cfg": 1.0
  }
}
```
### Usage Examples
**Include in workflow or README:**
````markdown
## Example Usage
### ComfyUI Web Interface
1. Load workflow: `text-to-image/flux-schnell-t2i-production-v1.json`
2. Set prompt: "A serene mountain landscape at sunset"
3. Adjust seed: 42 (optional)
4. Click "Queue Prompt"
### Orchestrator API
```bash
curl -X POST http://localhost:9000/api/comfyui/generate \
  -d '{"workflow": "flux-schnell-t2i-production-v1.json", "inputs": {"prompt": "A serene mountain landscape"}}'
```
````
## Testing Guidelines
### Manual Testing
**Required tests before production:**
1. **UI Test**
- Load in ComfyUI web interface
- Execute with default parameters
- Verify output quality
- Check preview nodes
- Confirm save locations
2. **API Test**
- Call via orchestrator API
- Test with various parameter combinations
- Verify JSON response format
- Check error handling
3. **Edge Cases**
- Missing optional parameters
- Invalid parameter values
- Out-of-range inputs
- Missing models (graceful failure)
### Automated Testing
**Test script template:**
```bash
#!/bin/bash
# Test workflow: flux-schnell-t2i-production-v1.json
WORKFLOW="text-to-image/flux-schnell-t2i-production-v1.json"

# Test 1: Default parameters
curl -X POST http://localhost:9000/api/comfyui/generate \
  -d "{\"workflow\": \"$WORKFLOW\", \"inputs\": {\"prompt\": \"test image\"}}" \
  | jq '.status'  # Should return "success"

# Test 2: Custom parameters
curl -X POST http://localhost:9000/api/comfyui/generate \
  -d "{\"workflow\": \"$WORKFLOW\", \"inputs\": {\"prompt\": \"test\", \"seed\": 123, \"steps\": 8}}" \
  | jq '.status'

# Test 3: Missing prompt (should use default)
curl -X POST http://localhost:9000/api/comfyui/generate \
  -d "{\"workflow\": \"$WORKFLOW\", \"inputs\": {}}" \
  | jq '.status'
```
### Performance Testing
**Measure key metrics:**
```bash
# Generation time
time curl -X POST http://localhost:9000/api/comfyui/generate \
  -d '{"workflow": "flux-schnell-t2i-production-v1.json", "inputs": {"prompt": "benchmark"}}'

# VRAM usage
nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits -l 1

# GPU utilization
nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits -l 1
```
**Performance baselines (24GB VRAM):**
- FLUX Schnell (1024x1024, 4 steps): ~5-8 seconds
- FLUX Dev (1024x1024, 20 steps): ~25-35 seconds
- SDXL + Refiner (1024x1024): ~40-60 seconds
- CogVideoX (6s video): ~120-180 seconds
### Load Testing
**Concurrent request handling:**
```bash
# Test 5 concurrent generations
for i in {1..5}; do
  curl -X POST http://localhost:9000/api/comfyui/generate \
    -d "{\"workflow\": \"flux-schnell-t2i-production-v1.json\", \"inputs\": {\"prompt\": \"test $i\", \"seed\": $i}}" &
done
wait
```
## Version Control
### Semantic Versioning
**Version increments:**
- `v1` → `v2`: Major changes (different models, restructured workflow)
- Internal iterations: Keep same version, document changes in git commits
### Change Documentation
**Changelog format:**
```markdown
## flux-schnell-t2i-production-v2.json
### Changes from v1
- Added API input validation
- Optimized VRAM usage with model unloading
- Added preview node after generation
- Updated default steps from 4 to 6
### Breaking Changes
- Changed output node structure (requires orchestrator update)
### Migration Guide
- Update API calls to use new parameter names
- Clear ComfyUI cache before loading v2
```
### Deprecation Process
**Sunsetting old versions:**
1. Mark old version as deprecated in README
2. Keep deprecated version for 2 releases
3. Add deprecation warning in workflow metadata
4. Document migration path to new version
5. Archive deprecated workflows in `archive/` directory (see the sketch below)
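Steps 3 and 5 lend themselves to a small helper. A sketch, assuming the `workflow_info` metadata block described above (the `deprecated` and `deprecation_note` fields and the paths are this document's conventions, not ComfyUI's):
```bash
#!/bin/bash
# Deprecate a workflow: flag it in metadata, then move it to archive/.
OLD="text-to-image/flux-schnell-t2i-production-v1.json"

# Step 3: add a deprecation warning to the workflow metadata.
jq '.workflow_info.deprecated = true |
    .workflow_info.deprecation_note = "Superseded by v2"' \
  "$OLD" > "$OLD.tmp" && mv "$OLD.tmp" "$OLD"

# Step 5: archive it; git mv keeps history.
mkdir -p archive
git mv "$OLD" "archive/$(basename "$OLD")"
```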
## Best Practices
### DO
- Use descriptive node names
- Include preview nodes at key stages
- Validate all inputs
- Optimize for VRAM efficiency
- Document all parameters
- Test with both UI and API
- Version your workflows
- Include error handling
- Save intermediate results
- Use semantic naming
### DON'T
- Hardcode file paths
- Assume unlimited VRAM
- Skip input validation
- Omit documentation
- Create overly complex workflows
- Use experimental nodes in production
- Ignore VRAM optimization
- Skip testing edge cases
- Use unclear node names
- Forget to version
## Resources
- **ComfyUI Wiki**: https://github.com/comfyanonymous/ComfyUI/wiki
- **Custom Nodes List**: https://github.com/ltdrdata/ComfyUI-Manager
- **VRAM Optimization Guide**: `/workspace/ai/CLAUDE.md`
- **Model Documentation**: `/workspace/ai/COMFYUI_MODELS.md`
## Support
For questions or issues:
1. Review this standards document
2. Check ComfyUI logs: `supervisorctl tail -f comfyui`
3. Test workflow in UI before API
4. Validate JSON syntax (see the sketch below)
5. Check model availability
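For step 4, any JSON tool will do; `jq empty` parses the file and prints nothing on success (the path here is illustrative):
```bash
# Exit status 0 means the workflow file is syntactically valid JSON.
jq empty text-to-image/flux-schnell-t2i-production-v1.json \
  && echo "JSON OK" || echo "JSON syntax error"
```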