refactor: reorganize directory structure and remove hardcoded paths

Move comfyui and vllm out of models/ directory to top level for better organization. Replace all hardcoded /workspace paths with relative paths to make the configuration portable across different environments.

Changes:
- Move models/comfyui/ → comfyui/
- Move models/vllm/ → vllm/
- Remove models/ directory (empty)
- Update arty.yml: replace /workspace with environment variables
- Update supervisord.conf: use relative paths from /workspace/ai
- Update all script references to use new paths
- Maintain TQDM_DISABLE=1 to fix BrokenPipeError

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

comfyui/workflows/WORKFLOW_STANDARDS.md (new file, 657 lines)
@@ -0,0 +1,657 @@

# ComfyUI Workflow Development Standards

Production standards and best practices for creating ComfyUI workflows in the RunPod AI Model Orchestrator.

## Table of Contents

- [Naming Conventions](#naming-conventions)
- [Workflow Structure](#workflow-structure)
- [API Integration](#api-integration)
- [Error Handling](#error-handling)
- [VRAM Optimization](#vram-optimization)
- [Quality Assurance](#quality-assurance)
- [Documentation Requirements](#documentation-requirements)
- [Testing Guidelines](#testing-guidelines)
- [Version Control](#version-control)
- [Best Practices](#best-practices)
- [Resources](#resources)
- [Support](#support)

## Naming Conventions

### Workflow Files

Format: `{category}-{model}-{type}-{environment}-v{version}.json`

**Components:**
- `category`: Descriptive category (flux, sdxl, cogvideox, musicgen, etc.)
- `model`: Specific model variant (schnell, dev, small, medium, large)
- `type`: Operation type (t2i, i2i, i2v, t2m, upscale)
- `environment`: `production` (stable) or `experimental` (testing)
- `version`: Integer version number (1, 2, 3, etc.)

**Examples:**
- `flux-schnell-t2i-production-v1.json` - FLUX Schnell text-to-image, production version 1
- `sdxl-refiner-t2i-production-v2.json` - SDXL with refiner, production version 2
- `musicgen-large-t2m-experimental-v1.json` - MusicGen large, experimental version 1

### Node Naming

**Descriptive names for all nodes:**
```json
{
  "title": "FLUX Schnell Checkpoint Loader",
  "type": "CheckpointLoaderSimple",
  "properties": {
    "Node name for S&R": "CheckpointLoaderSimple"
  }
}
```

**Naming patterns:**
- Loaders: `{Model} Checkpoint Loader`, `{Model} VAE Loader`
- Samplers: `{Model} KSampler`, `{Model} Advanced Sampler`
- Inputs: `API Text Input`, `API Image Input`, `API Seed Input`
- Outputs: `API Image Output`, `Preview Output`, `Save Output`
- Processing: `VAE Encode`, `VAE Decode`, `CLIP Text Encode`

## Workflow Structure

### Required Node Groups

Every production workflow MUST include these node groups:

#### 1. Input Group
```
Purpose: Receive parameters from API or UI
Nodes:
- Text input nodes (prompts, negative prompts)
- Numeric input nodes (seed, steps, CFG scale)
- Image input nodes (for i2i, i2v workflows)
- Model selection nodes (if multiple models supported)
```

#### 2. Model Loading Group
```
Purpose: Load required models and components
Nodes:
- Checkpoint/Diffuser loaders
- VAE loaders
- CLIP text encoders
- ControlNet loaders (if applicable)
- IP-Adapter loaders (if applicable)
```

#### 3. Processing Group
```
Purpose: Main generation/transformation logic
Nodes:
- Samplers (KSampler, Advanced KSampler)
- Encoders (CLIP, VAE)
- Conditioning nodes
- ControlNet application (if applicable)
```

#### 4. Post-Processing Group
```
Purpose: Refinement and enhancement
Nodes:
- VAE decoding
- Upscaling (if applicable)
- Face enhancement (Impact-Pack)
- Image adjustments
```

#### 5. Output Group
```
Purpose: Save and return results
Nodes:
- SaveImage nodes (for file output)
- Preview nodes (for UI feedback)
- API output nodes (for orchestrator)
```

#### 6. Error Handling Group (Optional but Recommended)
```
Purpose: Validation and fallback
Nodes:
- Validation nodes
- Fallback nodes
- Error logging nodes
```

### Node Organization

**Logical flow (left to right, top to bottom):**
```
[Inputs] → [Model Loading] → [Processing] → [Post-Processing] → [Outputs]
                                  ↓
                          [Error Handling]
```

**Visual grouping:**
- Use node positions to create visual separation
- Group related nodes together
- Align nodes for readability
- Use consistent spacing
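
For orientation, here is a minimal sketch of how these groups typically chain together in ComfyUI's API (prompt) JSON format. It is an illustration only: the node IDs, checkpoint filename, and prompt text are placeholders, not values taken from this repository's workflows.

```json
{
  "1": {
    "class_type": "CheckpointLoaderSimple",
    "inputs": { "ckpt_name": "flux1-schnell.safetensors" },
    "title": "FLUX Schnell Checkpoint Loader"
  },
  "2": {
    "class_type": "CLIPTextEncode",
    "inputs": { "text": "a serene mountain landscape", "clip": ["1", 1] },
    "title": "API Prompt Input"
  },
  "3": {
    "class_type": "EmptyLatentImage",
    "inputs": { "width": 1024, "height": 1024, "batch_size": 1 },
    "title": "Latent Size"
  },
  "4": {
    "class_type": "KSampler",
    "inputs": {
      "model": ["1", 0], "positive": ["2", 0], "negative": ["2", 0],
      "latent_image": ["3", 0], "seed": 42, "steps": 4, "cfg": 1.0,
      "sampler_name": "euler", "scheduler": "simple", "denoise": 1.0
    },
    "title": "FLUX Schnell KSampler"
  },
  "5": {
    "class_type": "VAEDecode",
    "inputs": { "samples": ["4", 0], "vae": ["1", 2] },
    "title": "VAE Decode"
  },
  "6": {
    "class_type": "SaveImage",
    "inputs": { "images": ["5", 0], "filename_prefix": "ComfyUI" },
    "title": "API Image Output"
  }
}
```

Nodes 1-3 cover the Input and Model Loading groups, node 4 the Processing group, and nodes 5-6 the Post-Processing and Output groups.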

## API Integration

### Input Nodes

**Required for API compatibility:**

1. **Text Inputs** (prompts, negative prompts)
```json
{
  "inputs": {
    "text": "A beautiful sunset over mountains",
    "default": ""
  },
  "class_type": "CLIPTextEncode",
  "title": "API Prompt Input"
}
```

2. **Numeric Inputs** (seed, steps, CFG, etc.)
```json
{
  "inputs": {
    "seed": 42,
    "steps": 20,
    "cfg": 7.5,
    "sampler_name": "euler_ancestral",
    "scheduler": "normal"
  },
  "class_type": "KSampler",
  "title": "API Sampler Config"
}
```

3. **Image Inputs** (for i2i workflows)
```json
{
  "inputs": {
    "image": "",
    "upload": "image"
  },
  "class_type": "LoadImage",
  "title": "API Image Input"
}
```

### Output Nodes

**Required for orchestrator return:**

```json
{
  "inputs": {
    "images": ["node_id", 0],
    "filename_prefix": "ComfyUI"
  },
  "class_type": "SaveImage",
  "title": "API Image Output"
}
```

### Parameter Validation

**Include validation for critical parameters:**

```json
{
  "inputs": {
    "value": "seed",
    "min": 0,
    "max": 4294967295,
    "default": 42
  },
  "class_type": "IntegerInput",
  "title": "Seed Validator"
}
```

## Error Handling

### Required Validations

1. **Model Availability** (a pre-flight sketch follows this list)
   - Check if checkpoint files exist
   - Validate model paths
   - Provide fallback to default models

2. **Parameter Bounds**
   - Validate numeric ranges (seed, steps, CFG)
   - Check dimension constraints (width, height)
   - Validate string inputs (sampler names, scheduler types)

3. **VRAM Limits**
   - Check batch size against VRAM
   - Validate resolution against VRAM
   - Enable tiling for large images

4. **Input Validation**
   - Verify required inputs are provided
   - Check image formats and dimensions
   - Validate prompt lengths
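
These checks have no single built-in node; one lightweight approach is to validate before queueing the workflow. A minimal pre-flight sketch, assuming a local checkpoint directory layout and a default model name (both illustrative):

```bash
#!/bin/bash
# Pre-flight validation before queueing a workflow (paths and defaults are illustrative).
CHECKPOINT_DIR="comfyui/models/checkpoints"
REQUESTED_MODEL="${1:-flux1-schnell.safetensors}"
STEPS="${2:-4}"

# Model availability: fall back to a known-good default if the requested file is missing.
if [ ! -f "$CHECKPOINT_DIR/$REQUESTED_MODEL" ]; then
  echo "WARN: $REQUESTED_MODEL not found, falling back to flux1-schnell.safetensors" >&2
  REQUESTED_MODEL="flux1-schnell.safetensors"
fi

# Parameter bounds: reject steps outside a sane range.
if [ "$STEPS" -lt 1 ] || [ "$STEPS" -gt 50 ]; then
  echo "ERROR: steps=$STEPS out of range (1-50)" >&2
  exit 1
fi

echo "OK: model=$REQUESTED_MODEL steps=$STEPS"
```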

### Fallback Strategies

**Default values for missing inputs:**
```json
{
  "inputs": {
    "text": "{{prompt | default('A beautiful landscape')}}",
    "seed": "{{seed | default(42)}}",
    "steps": "{{steps | default(20)}}"
  }
}
```

**Graceful degradation:**
- If refiner unavailable, skip refinement step
- If upscaler fails, return base resolution
- If face enhancement errors, return unenhanced image

## VRAM Optimization

### Model Unloading

**Explicit model cleanup between stages:**

```json
{
  "inputs": {
    "model": ["checkpoint_loader", 0]
  },
  "class_type": "FreeModel",
  "title": "Unload Base Model"
}
```

**When to unload:**
- After base generation, before refinement
- After refinement, before upscaling
- Between different model types (diffusion → CLIP → VAE)

### VAE Tiling

**Enable for high-resolution processing:**

```json
{
  "inputs": {
    "samples": ["sampler", 0],
    "vae": ["vae_loader", 0],
    "tile_size": 512,
    "overlap": 64
  },
  "class_type": "VAEDecodeTiled",
  "title": "Tiled VAE Decode"
}
```

**Tiling thresholds:**
- Use tiled VAE for images >1024x1024
- Tile size: 512 for 24GB VRAM, 256 for lower
- Overlap: 64px minimum for seamless tiles

### Attention Slicing

**Reduce memory for large models:**

```json
{
  "inputs": {
    "model": ["checkpoint_loader", 0],
    "attention_mode": "sliced"
  },
  "class_type": "ModelOptimization",
  "title": "Enable Attention Slicing"
}
```

### Batch Processing

**VRAM-safe batch sizes:**
- FLUX models: batch_size=1
- SDXL: batch_size=1-2
- SD3.5: batch_size=1
- Upscaling: batch_size=1

**Sequential batching:**
```json
{
  "inputs": {
    "mode": "sequential",
    "batch_size": 1
  },
  "class_type": "BatchProcessor"
}
```

## Quality Assurance

### Preview Nodes

**Include preview at key stages:**

```json
{
  "inputs": {
    "images": ["vae_decode", 0]
  },
  "class_type": "PreviewImage",
  "title": "Preview Base Generation"
}
```

**Preview locations:**
- After base generation (before refinement)
- After refinement (before upscaling)
- After upscaling (final check)
- After face enhancement

### Quality Gates

**Checkpoints for validation:**

1. **Resolution Check**
```json
{
  "inputs": {
    "image": ["input", 0],
    "min_width": 512,
    "min_height": 512,
    "max_width": 2048,
    "max_height": 2048
  },
  "class_type": "ImageSizeValidator"
}
```

2. **Quality Metrics**
```json
{
  "inputs": {
    "image": ["vae_decode", 0],
    "min_quality_score": 0.7
  },
  "class_type": "QualityChecker"
}
```

### Save Points

**Save intermediate results:**

```json
{
  "inputs": {
    "images": ["base_generation", 0],
    "filename_prefix": "intermediate/base_"
  },
  "class_type": "SaveImage",
  "title": "Save Base Generation"
}
```

**When to save:**
- Base generation (before refinement)
- After each major processing stage
- Before potentially destructive operations

## Documentation Requirements

### Workflow Metadata

**Include in workflow JSON:**

```json
{
  "workflow_info": {
    "name": "FLUX Schnell Text-to-Image Production",
    "version": "1.0.0",
    "author": "RunPod AI Model Orchestrator",
    "description": "Fast text-to-image generation using FLUX.1-schnell (4 steps)",
    "category": "text-to-image",
    "tags": ["flux", "fast", "production"],
    "requirements": {
      "models": ["FLUX.1-schnell"],
      "custom_nodes": [],
      "vram_min": "16GB",
      "vram_recommended": "24GB"
    },
    "parameters": {
      "prompt": {
        "type": "string",
        "required": true,
        "description": "Text description of desired image"
      },
      "seed": {
        "type": "integer",
        "required": false,
        "default": 42,
        "min": 0,
        "max": 4294967295
      },
      "steps": {
        "type": "integer",
        "required": false,
        "default": 4,
        "min": 1,
        "max": 20
      }
    },
    "outputs": {
      "image": {
        "type": "image",
        "format": "PNG",
        "resolution": "1024x1024"
      }
    }
  }
}
```

### Node Comments

**Document complex nodes:**

```json
{
  "title": "FLUX KSampler - Main Generation",
  "notes": "Using euler_ancestral sampler with 4 steps for FLUX Schnell. CFG=1.0 is optimal for this model. Seed controls reproducibility.",
  "inputs": {
    "seed": 42,
    "steps": 4,
    "cfg": 1.0
  }
}
```

### Usage Examples

**Include in workflow or README:**

````markdown
## Example Usage

### ComfyUI Web Interface
1. Load workflow: `text-to-image/flux-schnell-t2i-production-v1.json`
2. Set prompt: "A serene mountain landscape at sunset"
3. Adjust seed: 42 (optional)
4. Click "Queue Prompt"

### Orchestrator API
```bash
curl -X POST http://localhost:9000/api/comfyui/generate \
  -d '{"workflow": "flux-schnell-t2i-production-v1.json", "inputs": {"prompt": "A serene mountain landscape"}}'
```
````

## Testing Guidelines

### Manual Testing

**Required tests before production:**

1. **UI Test**
   - Load in ComfyUI web interface
   - Execute with default parameters
   - Verify output quality
   - Check preview nodes
   - Confirm save locations

2. **API Test**
   - Call via orchestrator API
   - Test with various parameter combinations
   - Verify JSON response format
   - Check error handling

3. **Edge Cases**
   - Missing optional parameters
   - Invalid parameter values
   - Out-of-range inputs
   - Missing models (graceful failure)

### Automated Testing

**Test script template:**

```bash
#!/bin/bash
# Test workflow: flux-schnell-t2i-production-v1.json

WORKFLOW="text-to-image/flux-schnell-t2i-production-v1.json"

# Test 1: Default parameters
curl -X POST http://localhost:9000/api/comfyui/generate \
  -d "{\"workflow\": \"$WORKFLOW\", \"inputs\": {\"prompt\": \"test image\"}}" \
  | jq '.status'  # Should return "success"

# Test 2: Custom parameters
curl -X POST http://localhost:9000/api/comfyui/generate \
  -d "{\"workflow\": \"$WORKFLOW\", \"inputs\": {\"prompt\": \"test\", \"seed\": 123, \"steps\": 8}}" \
  | jq '.status'

# Test 3: Missing prompt (should use default)
curl -X POST http://localhost:9000/api/comfyui/generate \
  -d "{\"workflow\": \"$WORKFLOW\", \"inputs\": {}}" \
  | jq '.status'
```
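
The template above covers the happy path. A hedged addition for one of the edge cases listed under Manual Testing; the expected `"error"` status is an assumption about the orchestrator's response format and should be confirmed against the actual API:

```bash
WORKFLOW="text-to-image/flux-schnell-t2i-production-v1.json"

# Test 4: Out-of-range steps (should fail gracefully, not crash the worker)
curl -s -X POST http://localhost:9000/api/comfyui/generate \
  -d "{\"workflow\": \"$WORKFLOW\", \"inputs\": {\"prompt\": \"test\", \"steps\": 9999}}" \
  | jq '.status'  # Expected: "error" (or a clamped value, depending on validator behavior)
```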

### Performance Testing

**Measure key metrics:**

```bash
# Generation time
time curl -X POST http://localhost:9000/api/comfyui/generate \
  -d '{"workflow": "flux-schnell-t2i-production-v1.json", "inputs": {"prompt": "benchmark"}}'

# VRAM usage
nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits -l 1

# GPU utilization
nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits -l 1
```

**Performance baselines (24GB VRAM):**
- FLUX Schnell (1024x1024, 4 steps): ~5-8 seconds
- FLUX Dev (1024x1024, 20 steps): ~25-35 seconds
- SDXL + Refiner (1024x1024): ~40-60 seconds
- CogVideoX (6s video): ~120-180 seconds

### Load Testing

**Concurrent request handling:**

```bash
# Test 5 concurrent generations
for i in {1..5}; do
  curl -X POST http://localhost:9000/api/comfyui/generate \
    -d "{\"workflow\": \"flux-schnell-t2i-production-v1.json\", \"inputs\": {\"prompt\": \"test $i\", \"seed\": $i}}" &
done
wait
```

## Version Control

### Semantic Versioning

**Version increments:**
- `v1` → `v2`: Major changes (different models, restructured workflow)
- Internal iterations: Keep same version, document changes in git commits

### Change Documentation

**Changelog format:**

```markdown
## flux-schnell-t2i-production-v2.json

### Changes from v1
- Added API input validation
- Optimized VRAM usage with model unloading
- Added preview node after generation
- Updated default steps from 4 to 6

### Breaking Changes
- Changed output node structure (requires orchestrator update)

### Migration Guide
- Update API calls to use new parameter names
- Clear ComfyUI cache before loading v2
```

### Deprecation Process

**Sunsetting old versions:**

1. Mark old version as deprecated in README
2. Keep deprecated version for 2 releases
3. Add deprecation warning in workflow metadata (see the sketch below)
4. Document migration path to new version
5. Archive deprecated workflows in `archive/` directory
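
A hedged sketch of what step 3 could look like inside the `workflow_info` block; the `deprecated`, `deprecation_notice`, and `replacement` fields are illustrative, not an existing orchestrator contract:

```json
{
  "workflow_info": {
    "name": "FLUX Schnell Text-to-Image Production",
    "version": "1.0.0",
    "deprecated": true,
    "deprecation_notice": "Superseded by flux-schnell-t2i-production-v2.json; will be archived after two releases.",
    "replacement": "flux-schnell-t2i-production-v2.json"
  }
}
```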

## Best Practices

### DO

- Use descriptive node names
- Include preview nodes at key stages
- Validate all inputs
- Optimize for VRAM efficiency
- Document all parameters
- Test with both UI and API
- Version your workflows
- Include error handling
- Save intermediate results
- Use semantic naming

### DON'T

- Hardcode file paths
- Assume unlimited VRAM
- Skip input validation
- Omit documentation
- Create overly complex workflows
- Use experimental nodes in production
- Ignore VRAM optimization
- Skip testing edge cases
- Use unclear node names
- Forget to version

## Resources

- **ComfyUI Wiki**: https://github.com/comfyanonymous/ComfyUI/wiki
- **Custom Nodes List**: https://github.com/ltdrdata/ComfyUI-Manager
- **VRAM Optimization Guide**: `/workspace/ai/CLAUDE.md`
- **Model Documentation**: `/workspace/ai/COMFYUI_MODELS.md`

## Support

For questions or issues:

1. Review this standards document
2. Check ComfyUI logs: `supervisorctl tail -f comfyui`
3. Test workflow in UI before API
4. Validate JSON syntax (see below)
5. Check model availability