HuggingFace cache flattens repository directory structure into snapshots.
When YAML specifies "vae/model.safetensors", the cache stores it as just
"model.safetensors" without preserving the vae/ subdirectory.
Changed grep pattern from "/$source_pattern$" to "/$filename_only$" to
match the actual flattened cache structure instead of the logical repo path.
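
A minimal sketch of the pattern change, assuming the flattened layout described above. The snapshot listing and hash are made up for illustration; only `source_pattern` and `filename_only` come from the message.

```bash
# Hypothetical listing of files found under a snapshot directory.
listing='/cache/snapshots/abc123/model.safetensors
/cache/snapshots/abc123/config.json'

source_pattern="vae/model.safetensors"
filename_only="${source_pattern##*/}"   # drop everything up to the last slash

# Old pattern: needs the vae/ prefix, which the flattened listing lacks.
old_match=$(printf '%s\n' "$listing" | grep "/$source_pattern$" || true)

# New pattern: anchors on the bare filename at the end of the path.
new_match=$(printf '%s\n' "$listing" | grep "/$filename_only$" || true)

echo "old: ${old_match:-<none>}"
echo "new: ${new_match:-<none>}"
```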
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
The source_pattern can include subdirectories (e.g., "unet/model.safetensors"),
but find_model_files() filters by filename only. Now we:
1. Extract just the filename using basename
2. Find files matching that filename
3. Filter results to match the full source_pattern path
This allows proper matching of files in subdirectories.
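
The three steps can be sketched roughly as follows. This is an illustration only: a temporary directory stands in for the snapshot, and `find` stands in for find_model_files(), whose real interface is not shown here.

```bash
# Temporary stand-in for a snapshot directory (hypothetical layout).
snapshot=$(mktemp -d)
trap 'rm -rf "$snapshot"' EXIT
mkdir -p "$snapshot/unet" "$snapshot/vae"
: > "$snapshot/unet/model.safetensors"
: > "$snapshot/vae/model.safetensors"

source_pattern="unet/model.safetensors"

# 1. Extract just the filename.
filename=$(basename "$source_pattern")

# 2. Find every file with that filename (two candidates here).
by_name=$(find "$snapshot" -type f -name "$filename" | sort)

# 3. Keep only the candidates whose path contains the full source_pattern.
matched=$(printf '%s\n' "$by_name" | grep -F "/$source_pattern" || true)

echo "$matched"
```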
Co-Authored-By: Claude <noreply@anthropic.com>
The previous fix still called find_model_files() with an empty filter,
which doesn't work. Now we call find_model_files() for EACH file mapping
individually, passing the source_pattern as the filter.
This eliminates the "No files matched filter (filter: )" warnings and
allows the script to properly find and link downloaded models.
Co-Authored-By: Claude <noreply@anthropic.com>
Previously, both functions called find_model_files() with empty filters before
checking for explicit file mappings, causing premature exit when filters were empty.
Now both functions check for file_mappings FIRST, then call find_model_files()
only after confirming mappings exist. This fixes critical linking failures for
models with explicit file mappings but no filename filters.
Co-Authored-By: Claude <noreply@anthropic.com>
Updated link_model() and verify_model_links() functions to extract target
directories directly from dest paths instead of using a separate model_type parameter.
Key changes:
- link_model() now parses dest paths (e.g., "checkpoints/model.safetensors") to
determine target directories dynamically
- verify_model_links() updated to work with dest paths containing directory info
- Removed model_type parameter from both function signatures and all callers
- Support for models with files in multiple ComfyUI subdirectories
- Updated display output to remove model_type from status messages
This simplifies the script by eliminating redundant type information, as the
directory structure is now embedded in the dest paths within the YAML file.
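
A minimal sketch of deriving the target directory from a dest path. The helper name `resolve_target_dir` and the `/opt/ComfyUI/models` root are hypothetical, not taken from the script.

```bash
# Hypothetical helper: join the models root with the directory part of dest.
resolve_target_dir() {
  local models_root=$1 dest=$2
  printf '%s/%s\n' "$models_root" "$(dirname "$dest")"
}

models_root="/opt/ComfyUI/models"   # assumed location, for illustration only

for dest in "checkpoints/model.safetensors" "vae/ae.safetensors"; do
  target_dir=$(resolve_target_dir "$models_root" "$dest")
  echo "$(basename "$dest") -> $target_dir"
done
```

Because each dest carries its own directory, one model can place files in several subdirectories without a separate type argument.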
Co-Authored-By: Claude <noreply@anthropic.com>
Removed the DEBUG line that printed snapshot paths during model linking.
This cleans up the output to only show user-relevant information.
Co-Authored-By: Claude <noreply@anthropic.com>
Changed CROSS to CROSS_MARK to match the variable defined in the
Unicode characters section (line 91).
This fixes the "CROSS: unbound variable" error when running the
cleanup function in dry-run mode.
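
The failure mode can be reproduced in isolation. Bash only reports "unbound variable" when nounset (`set -u`) is active, so the sketch below enables it in a subshell; the value assigned to CROSS_MARK is a placeholder.

```bash
unset CROSS          # make sure the misspelled name really is undefined
CROSS_MARK="x"       # placeholder value; the script defines its own symbol

# Under set -u, expanding an undefined variable aborts the (sub)shell.
if ( set -u; : "$CROSS" ) 2>/dev/null; then old_ok=yes; else old_ok=no; fi

# The defined name expands normally.
if ( set -u; : "$CROSS_MARK" ) 2>/dev/null; then new_ok=yes; else new_ok=no; fi

echo "CROSS usable: $old_ok, CROSS_MARK usable: $new_ok"
```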
Co-Authored-By: Claude <noreply@anthropic.com>
Fix the bash parameter expansion to properly replace all forward slashes
with double hyphens when converting HuggingFace repo IDs to cache paths.
Changed from: ${repo_id//\/--} (incorrect: the pattern is "/--" with no replacement, so nothing is substituted)
Changed to: ${repo_id//\//--} (correct: replaces every / with --)
This fixes the 'Cache directory not found' error when using --cleanup flag.
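
The difference between the two expansions can be checked directly. The repo ID is only an example, and prefixing the result with `models--` reflects the HuggingFace cache naming convention rather than the script's exact code.

```bash
repo_id="black-forest-labs/FLUX.1-schnell"

# Pattern is the literal text "/--" and the replacement is empty.
# The repo ID contains no "/--", so the value comes back unchanged.
broken="${repo_id//\/--}"

# Pattern is "/" (escaped), then the separator slash, then "--".
fixed="${repo_id//\//--}"

echo "broken: $broken"
echo "fixed:  models--$fixed"
```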
Co-Authored-By: Claude <noreply@anthropic.com>
Fix bash string substitution to replace all forward slashes with
double hyphens when constructing HuggingFace cache paths.
Changed from: ${repo_id/\//-} (replaces first / with single -)
Changed to: ${repo_id//\/--} (intended to replace all / with --, but the slash separating pattern from replacement is missing)
This fixes the "Cache directory not found" warning when using the
--cleanup flag with repositories that have slashes in their names
(e.g., black-forest-labs/FLUX.1-schnell).
Co-Authored-By: Claude <noreply@anthropic.com>
Add three major features to artifact_huggingface_download.sh:
1. Category filtering (--category flag):
- Filter models by category (single or comma-separated multiple)
- Validates categories against YAML configuration
- Works with all commands (download, link, both, verify)
2. Cleanup mode (--cleanup flag):
- Removes unreferenced files from HuggingFace cache
- Only deletes files not referenced by symlinks
- Per-category cleanup (safe and isolated)
- Works with link and both commands only
3. Dry-run mode (--dry-run/-n flag):
- Preview operations without making changes
- Shows what would be downloaded, linked, or cleaned
- Includes file counts and size estimates
- Warning banner to indicate dry-run mode
Additional improvements:
- Updated help text with comprehensive examples
- Added validation for invalid flag combinations
- Enhanced user feedback with detailed preview information
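
A rough sketch of how a dry-run guard and the flag-combination check could look. The names `run`, `validate_flags`, and `DRY_RUN` are hypothetical; the script's actual implementation may differ.

```bash
DRY_RUN=true

# Hypothetical wrapper: print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = true ]; then
    echo "[dry-run] would run: $*"
  else
    "$@"
  fi
}

# Hypothetical check: --cleanup is only accepted with the link and both commands.
validate_flags() {
  local cmd=$1 cleanup=$2
  if [ "$cleanup" = true ]; then
    case "$cmd" in
      link|both) return 0 ;;
      *) echo "error: --cleanup only works with link or both" >&2; return 1 ;;
    esac
  fi
  return 0
}

validate_flags link true && run ln -s /src/model.safetensors /dest/model.safetensors
```

Routing every mutating command through one wrapper keeps the preview output and the real run on the same code path.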
Co-Authored-By: Claude <noreply@anthropic.com>
Updated find_model_files() to include .index.json files in the file
discovery logic. These files are essential for sharded models like
CogVideoX which split safetensors into multiple parts.
Changes:
- Modified JSON file filtering to allow files ending with .index.json
- Updated comment to document support for sharded model index files
Fixes automatic linking for:
- CogVideoX-5b (3 files including index.json)
- CogVideoX-5b-I2V (4 files including index.json)
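
The filtering rule can be sketched in shell as below. The helper name `is_model_file` is hypothetical, and which other extensions the real filter accepts is an assumption; only the .index.json exception comes from this change.

```bash
# Hypothetical predicate: succeed for files that should be linked.
is_model_file() {
  case "$1" in
    *.index.json)  return 0 ;;  # shard index, required by sharded models
    *.json)        return 1 ;;  # other JSON files stay excluded
    *.safetensors) return 0 ;;  # assumed weight-file extension
    *)             return 1 ;;
  esac
}

for f in model-00001-of-00002.safetensors model.safetensors.index.json config.json; do
  if is_model_file "$f"; then echo "keep $f"; else echo "skip $f"; fi
done
```

The order of the `case` branches matters: the .index.json branch has to come before the general .json branch.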
Co-Authored-By: Claude <noreply@anthropic.com>
HuggingFace models use symlinks pointing to blobs in the cache. The stat
command was returning the size of the symlink itself (76 bytes) instead of
the actual file it points to.
Fixed by:
- Adding -L flag to stat commands to follow symlinks
- Testing which stat format works (Linux -c vs macOS -f) before executing
- Using proper spacing and quotes in stat commands
- Checking for both regular files and symlinks in the condition
This fixes disk usage showing as 1 KB instead of the actual ~50+ GB for models
like FLUX.
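
A minimal sketch of the portable size lookup, assuming GNU stat (`-c '%s'`) on Linux and BSD stat (`-f '%z'`) on macOS. The helper name `file_size` and the temporary files are illustrative only.

```bash
# Hypothetical helper: size in bytes of the file a path resolves to.
file_size() {
  # Probe which stat dialect is available using a path that always exists.
  if stat -L -c '%s' / >/dev/null 2>&1; then
    stat -L -c '%s' "$1"    # GNU coreutils
  else
    stat -L -f '%z' "$1"    # BSD / macOS
  fi
}

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s' "0123456789" > "$tmp/blob"   # 10-byte stand-in for a cache blob
ln -s "$tmp/blob" "$tmp/link"

# With -L this reports the blob's size, not the size of the symlink itself.
file_size "$tmp/link"
```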
The stat command was producing multi-line output which broke the pipe-delimited
return format of verify_model_download(). Fixed by:
- Testing which stat format works (Linux vs macOS) before executing
- Using proper format flags to get single-line date output
- Defaulting to 'Unknown' instead of relying on a fallback chain
This fixes the issue where all models showed as 'NOT DOWNLOADED' even when
present in the cache.
The verify command was showing all models as "NOT DOWNLOADED" because find_model_files()
was exiting silently without diagnostic output. This made debugging impossible.
Changes:
- Added detailed error messages to Python script in find_model_files()
- Reports which directories were checked and why they failed
- Shows actual vs expected paths for model/snapshots directories
- Includes DEBUG messages showing which snapshot is being used
- Warns when no files match the filter
- Modified verify_model_download() to capture and display stderr
- Changed from suppressing stderr (2>/dev/null) to capturing it (2>&1)
- Filters ERROR/WARN/DEBUG prefixes from file paths
- Logs diagnostic messages to stderr for visibility
This will help identify the actual cache structure mismatch causing verification failures.
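
The capture-and-filter approach might look roughly like this. The stub function stands in for find_model_files(), and the "PREFIX:" message format is an assumption made for the example.

```bash
# Hypothetical stand-in for find_model_files(): diagnostics on stderr, paths on stdout.
find_model_files_stub() {
  echo "DEBUG: using snapshot abc123" >&2
  echo "/cache/snapshots/abc123/model.safetensors"
}

# Capture both streams instead of discarding stderr with 2>/dev/null.
output=$(find_model_files_stub 2>&1)

# File paths: every line that does not start with a diagnostic prefix.
files=$(printf '%s\n' "$output" | grep -Ev '^(ERROR|WARN|DEBUG):' || true)

# Diagnostics: forwarded to stderr so they stay visible to the user.
printf '%s\n' "$output" | grep -E '^(ERROR|WARN|DEBUG):' >&2 || true

echo "$files"
```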
Co-Authored-By: Claude <noreply@anthropic.com>
Added new 'verify' command to artifact_huggingface_download.sh that performs comprehensive health checks on all downloaded models and their symlinks.
Features:
- Download status verification (existence, size, location, timestamps)
- Link status verification (valid, broken, or missing symlinks)
- Size mismatch detection (warns if actual differs >10% from expected)
- Detailed per-model logging with formatted output
- Category-level and global statistics summaries
- Actionable fix suggestions for detected issues
- Disk space usage analysis
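
The size mismatch check can be done with integer arithmetic alone. This sketch assumes the 10% threshold is measured relative to the expected size; the helper name `size_mismatch` is hypothetical.

```bash
# Hypothetical helper: succeed when actual differs from expected by more than 10%.
size_mismatch() {
  local actual=$1 expected=$2 diff
  diff=$(( actual > expected ? actual - expected : expected - actual ))
  # diff / expected > 0.10 is equivalent to diff * 10 > expected,
  # which avoids floating point.
  [ $(( diff * 10 )) -gt "$expected" ]
}

if size_mismatch 120 100; then
  echo "warning: size differs by more than 10% from expected"
fi
```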
New Functions:
- get_model_disk_usage() - Calculate actual model file sizes
- format_bytes() - Human-readable size formatting
- verify_model_download() - Check model download status
- verify_model_links() - Verify symlink integrity
- verify_category() - Process category with verification
- display_verification_summary() - Show global results
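
As an illustration of the three link states (valid, broken, missing), a symlink check could be sketched as follows. The helper name `link_status` is hypothetical and is not the body of verify_model_links(); treating a non-symlink path as missing is also an assumption.

```bash
# Hypothetical helper: classify a path as a valid, broken, or missing symlink.
link_status() {
  if [ -L "$1" ]; then
    # -e follows the symlink, so it fails when the target no longer exists.
    if [ -e "$1" ]; then echo valid; else echo broken; fi
  else
    echo missing
  fi
}

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
: > "$tmp/target"
ln -s "$tmp/target" "$tmp/good"
ln -s "$tmp/does-not-exist" "$tmp/dangling"

for p in "$tmp/good" "$tmp/dangling" "$tmp/absent"; do
  echo "$(basename "$p"): $(link_status "$p")"
done
```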
Usage:
artifact_huggingface_download.sh verify -c models.yaml
Output includes:
✓ Downloaded/Missing model counts
✓ Properly linked/Broken link statistics
✓ File sizes and locations
✓ Last modified timestamps
⚠️ Size mismatch warnings
📊 Disk space usage per category
💡 Fix suggestions with exact commands
Co-Authored-By: Claude <noreply@anthropic.com>