19 Commits

Author SHA1 Message Date
8291a3b662 feat: rewrite CivitAI and HuggingFace download scripts with curl
Complete rewrite of both model download scripts with:
- Beautiful colorful CLI output with progress indicators
- Pure bash/curl downloads (no Python dependencies for downloading)
- yq-based YAML parsing (consistent with arty.sh)
- Three commands: download, link, verify
- Filtering by --category and --repo-id (comma-separated)
- --dry-run mode for previewing operations
- Respects format field for file extensions (.safetensors, .pt, etc.)
- Uses type field for output subdirectories (checkpoints, embeddings, loras)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-26 15:01:27 +01:00
71f7391fd1 fix: match filename only in grep pattern for HuggingFace cache structure
HuggingFace cache flattens repository directory structure into snapshots.
When YAML specifies "vae/model.safetensors", the cache stores it as just
"model.safetensors" without preserving the vae/ subdirectory.

Changed grep pattern from "/$source_pattern$" to "/$filename_only$" to
match the actual flattened cache structure instead of the logical repo path.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 20:08:28 +01:00
7a1e64862a fix: extract filename from source_pattern before calling find_model_files()
The source_pattern can include subdirectories (e.g., "unet/model.safetensors"),
but find_model_files() filters by filename only. Now we:

1. Extract just the filename using basename
2. Find files matching that filename
3. Filter results to match the full source_pattern path

This allows proper matching of files in subdirectories.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 20:01:10 +01:00
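Note on 7a1e64862a: the three steps in this message can be sketched in plain bash. This is a minimal illustration, not the script's actual code. match_source_pattern is a hypothetical helper, and the candidate paths are passed as arguments here instead of coming from find_model_files().

```shell
# Hypothetical helper: print the candidate paths that match source_pattern.
# Usage: match_source_pattern <source_pattern> <path>...
match_source_pattern() {
    local source_pattern="$1"; shift
    local filename_only path

    # Step 1: extract just the filename from the pattern.
    filename_only="$(basename "$source_pattern")"

    for path in "$@"; do
        # Step 2: keep only paths whose filename matches.
        [[ "$(basename "$path")" == "$filename_only" ]] || continue
        # Step 3: keep only paths that end in the full source_pattern.
        if [[ "$path" == */"$source_pattern" ]]; then
            printf '%s\n' "$path"
        fi
    done
    return 0
}
```

With the pattern "unet/model.safetensors", a path ending in vae/model.safetensors passes step 2 but is dropped in step 3.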
bb43639b09 fix: call find_model_files() with source_pattern instead of empty filter
The previous fix still called find_model_files() with an empty filter,
which matches no files. find_model_files() is now called once for EACH file
mapping, with that mapping's source_pattern as the filter.

This eliminates the "No files matched filter (filter: )" warnings and
allows the script to properly find and link downloaded models.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 19:57:37 +01:00
5199f0fa98 fix: reorder logic in link_model() and verify_model_links() to check file_mappings first
Previously, both functions called find_model_files() with empty filters before
checking for explicit file mappings, causing premature exit when filters were empty.

Now both functions check for file_mappings FIRST, then call find_model_files()
only after confirming mappings exist. This fixes critical linking failures for
models with explicit file mappings but no filename filters.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 19:45:32 +01:00
362ed1f816 refactor: remove model_type parameter in favor of dest path parsing
Updated link_model() and verify_model_links() functions to extract target
directories directly from dest paths instead of using separate model_type parameter.

Key changes:
- link_model() now parses dest paths (e.g., "checkpoints/model.safetensors") to
  determine target directories dynamically
- verify_model_links() updated to work with dest paths containing directory info
- Removed model_type parameter from both function signatures and all callers
- Support for models with files in multiple ComfyUI subdirectories
- Updated display output to remove model_type from status messages

This simplifies the script by eliminating redundant type information, as the
directory structure is now embedded in the dest paths within the YAML file.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 19:24:41 +01:00
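Note on 362ed1f816: deriving the target directory from a dest path needs only dirname and basename. The sketch below assumes dest values shaped like the "checkpoints/model.safetensors" example in the message. split_dest is a hypothetical name, not a function from the script.

```shell
# Hypothetical helper: print the target directory and the file name of a
# dest path on two separate lines.
split_dest() {
    local dest="$1"
    printf '%s\n%s\n' "$(dirname "$dest")" "$(basename "$dest")"
}

# Example: read both parts into variables.
{ read -r target_dir; read -r target_name; } < <(split_dest "checkpoints/model.safetensors")
echo "directory: $target_dir, file: $target_name"
```

A dest with no directory component yields "." as the directory, so callers may want to treat that case explicitly.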
ec4ab72661 chore: remove DEBUG message from model snapshot output
Removed the DEBUG line that printed snapshot paths during model linking.
This cleans up the output to only show user-relevant information.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 17:12:10 +01:00
060779544f fix: undefined CROSS variable in cleanup dry-run mode
Changed CROSS to CROSS_MARK to match the variable defined in the
Unicode characters section (line 91).

This fixes the "CROSS: unbound variable" error when running the
cleanup function in dry-run mode.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 17:05:51 +01:00
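Note on 060779544f: an "unbound variable" error is what bash reports when an unset name is expanded under `set -u`. The sketch below assumes the script enables that option, which the error text suggests but the log does not state. It probes both names in subshells so the failure does not abort the caller.

```shell
CROSS_MARK="✗"   # the name the commit says is defined
unset CROSS      # the misspelled name was never defined

# Expand each name in a subshell with nounset enabled; the subshell exits
# with a non-zero status when the variable is unbound.
if ( set -u; : "$CROSS" ) 2>/dev/null; then old_name="ok"; else old_name="unbound"; fi
if ( set -u; : "$CROSS_MARK" ) 2>/dev/null; then new_name="ok"; else new_name="unbound"; fi

echo "CROSS: $old_name, CROSS_MARK: $new_name"
```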
0c62e90006 fix: correct bash string substitution for cache directory path
Fix the bash parameter expansion to properly replace all forward slashes
with double hyphens when converting HuggingFace repo IDs to cache paths.

Changed from: ${repo_id//\/--}  (pattern "/--" with no replacement, so nothing changes)
Changed to:   ${repo_id//\//--} (pattern "/", replacement "--")

This fixes the 'Cache directory not found' error when using --cleanup flag.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 17:01:05 +01:00
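Note on 0c62e90006 and bd34036039: the three expansions mentioned in these two messages behave differently, which is easy to check in isolation. The repo ID is the one quoted in bd34036039; the variable names are only for illustration.

```shell
repo_id="black-forest-labs/FLUX.1-schnell"

# ${var/pat/rep}: replace the FIRST "/" with a single "-".
first_only="${repo_id/\//-}"

# ${var//pat}: the pattern here is "/--" and the replacement is empty.
# The ID contains no "/--", so the value comes back unchanged.
missing_slash="${repo_id//\/--}"

# ${var//pat/rep}: replace EVERY "/" with "--".
fixed="${repo_id//\//--}"

printf '%s\n' "$first_only" "$missing_slash" "$fixed"
```

Only the last form produces the double-hyphen name that the cleanup code looks for in the cache.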
bd34036039 fix: correct cache directory path construction in cleanup function
Fix bash string substitution to replace all forward slashes with
double hyphens when constructing HuggingFace cache paths.

Changed from: ${repo_id/\//-}  (replaces first / with single -)
Changed to:   ${repo_id//\/--} (intended to replace all / with --; a slash is
              still missing before the replacement, see 0c62e90006)

This fixes the "Cache directory not found" warning when using the
--cleanup flag with repositories that have slashes in their names
(e.g., black-forest-labs/FLUX.1-schnell).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 16:53:28 +01:00
2c71c49893 feat: add category filter, cleanup mode, and dry-run support to HuggingFace downloader
Add three major features to artifact_huggingface_download.sh:

1. Category filtering (--category flag):
   - Filter models by category (single or comma-separated multiple)
   - Validates categories against YAML configuration
   - Works with all commands (download, link, both, verify)

2. Cleanup mode (--cleanup flag):
   - Removes unreferenced files from HuggingFace cache
   - Only deletes files not referenced by symlinks
   - Per-category cleanup (safe and isolated)
   - Works with link and both commands only

3. Dry-run mode (--dry-run/-n flag):
   - Preview operations without making changes
   - Shows what would be downloaded, linked, or cleaned
   - Includes file counts and size estimates
   - Warning banner to indicate dry-run mode

Additional improvements:
- Updated help text with comprehensive examples
- Added validation for invalid flag combinations
- Enhanced user feedback with detailed preview information

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 16:46:44 +01:00
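Note on 2c71c49893: one way a comma-separated --category value could be split and validated is sketched below. validate_categories is a hypothetical helper, and the known categories are passed in as arguments, whereas the message says the script validates against the YAML configuration.

```shell
# Hypothetical helper: split "a,b,c" and check every entry against a list
# of known categories. Prints the accepted categories, one per line.
# Usage: validate_categories <comma-separated> <known>...
validate_categories() {
    local requested="$1"; shift
    local -a known=("$@")
    local -a wanted
    local name candidate found

    IFS=',' read -r -a wanted <<< "$requested"

    for name in "${wanted[@]}"; do
        found=0
        for candidate in "${known[@]}"; do
            if [[ "$name" == "$candidate" ]]; then
                found=1
                break
            fi
        done
        if (( ! found )); then
            echo "Unknown category: $name" >&2
            return 1
        fi
    done

    printf '%s\n' "${wanted[@]}"
}
```

Rejecting the whole request on the first unknown name is a design choice of this sketch; the log does not say how the script reports invalid categories.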
f68ee52297 fix: allow .index.json files for sharded model linking
Updated find_model_files() to include .index.json files in the file
discovery logic. These files are essential for sharded models like
CogVideoX which split safetensors into multiple parts.

Changes:
- Modified JSON file filtering to allow files ending with .index.json
- Updated comment to document support for sharded model index files

Fixes automatic linking for:
- CogVideoX-5b (3 files including index.json)
- CogVideoX-5b-I2V (4 files including index.json)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 16:04:19 +01:00
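Note on f68ee52297: the rule "skip JSON files, except those ending in .index.json" can be expressed as a case statement. This is a simplified bash sketch with a hypothetical function name. Per 4500228941 the real filter lives in a Python snippet inside find_model_files(), and per c31b1e87bd it has further exceptions that are not modeled here.

```shell
# Hypothetical predicate: succeed if the file should be considered for linking.
is_linkable_file() {
    case "$1" in
        *.index.json) return 0 ;;  # shard index for split safetensors files
        *.json)       return 1 ;;  # other JSON metadata is skipped in this sketch
        *)            return 0 ;;  # everything else is kept
    esac
}
```

The order of the patterns matters: the .index.json case has to come before the general .json case.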
94e2326158 fix: follow symlinks when calculating model disk usage
HuggingFace models use symlinks pointing to blobs in the cache. The stat
command was returning the size of the symlink itself (76 bytes) instead of
the actual file it points to.

Fixed by:
- Adding -L flag to stat commands to follow symlinks
- Testing which stat format works (Linux -c vs macOS -f) before executing
- Using proper spacing and quotes in stat commands
- Checking for both regular files and symlinks in the condition

This fixes disk usage being reported as 1 KB instead of the actual 50+ GB for
models like FLUX.
2025-11-25 15:21:27 +01:00
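Note on 94e2326158: a portable size lookup that follows symlinks might look like the sketch below. It tries the GNU form first and the BSD/macOS form second, which is slightly different from probing the format up front as the message describes. file_size_bytes is a hypothetical name.

```shell
# Hypothetical helper: print the size in bytes of the file a path points to.
# -L makes stat follow symlinks instead of reporting the link itself.
file_size_bytes() {
    local path="$1" size
    if size=$(stat -L -c '%s' "$path" 2>/dev/null); then
        printf '%s\n' "$size"        # GNU coreutils (Linux)
    elif size=$(stat -L -f '%z' "$path" 2>/dev/null); then
        printf '%s\n' "$size"        # BSD stat (macOS)
    else
        printf '0\n'                 # missing file or unsupported stat
    fi
}
```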
33e0a0f2d0 fix: resolve stat command multi-line output breaking verify command
The stat command was producing multi-line output which broke the pipe-delimited
return format of verify_model_download(). Fixed by:
- Testing which stat format works (Linux vs macOS) before executing
- Using proper format flags to get single-line date output
- Defaulting to 'Unknown' instead of relying on a fallback chain

This fixes the issue where all models showed as 'NOT DOWNLOADED' even when
present in the cache.
2025-11-25 15:14:01 +01:00
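Note on 33e0a0f2d0: when a function returns pipe-delimited fields, every field has to stay on one line. The sketch below is one way to get a single-line modification time on both Linux and macOS, falling back to 'Unknown'. file_mtime is a hypothetical name and the date formats are an assumption, since the log does not show the format flags the script uses.

```shell
# Hypothetical helper: print a file's modification time on exactly one line.
file_mtime() {
    local path="$1" out
    if out=$(stat -L -c '%y' "$path" 2>/dev/null); then
        :   # GNU coreutils (Linux)
    elif out=$(stat -L -f '%Sm' -t '%Y-%m-%d %H:%M:%S' "$path" 2>/dev/null); then
        :   # BSD stat (macOS)
    else
        out="Unknown"
    fi
    # Keep only the first line so the value is safe inside a delimited record.
    printf '%s\n' "${out%%$'\n'*}"
}
```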
56f1ee8c69 debug: add bash-level debugging to verify_model_download 2025-11-25 15:06:30 +01:00
4500228941 fix: add comprehensive error logging to find_model_files for verify command
The verify command was showing all models as "NOT DOWNLOADED" because find_model_files()
was exiting silently without diagnostic output. This made debugging impossible.

Changes:
- Added detailed error messages to Python script in find_model_files()
  - Reports which directories were checked and why they failed
  - Shows actual vs expected paths for model/snapshots directories
  - Includes DEBUG messages showing which snapshot is being used
  - Warns when no files match the filter

- Modified verify_model_download() to capture and display stderr
  - Changed from suppressing stderr (2>/dev/null) to capturing it (2>&1)
  - Filters ERROR/WARN/DEBUG prefixes from file paths
  - Logs diagnostic messages to stderr for visibility

This will help identify the actual cache structure mismatch causing verification failures.

🤖 Generated with Claude Code (https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 14:45:57 +01:00
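Note on 4500228941: once stderr is merged into stdout with 2>&1, diagnostic lines have to be separated from file paths again. A minimal sketch of that split follows. It assumes diagnostics start with "ERROR:", "WARN:" or "DEBUG:", which is a guess at the exact prefix format, and split_diagnostics is a hypothetical name.

```shell
# Hypothetical filter: read merged output on stdin, send diagnostic lines to
# stderr and pass everything else (the file paths) through on stdout.
split_diagnostics() {
    local line
    while IFS= read -r line; do
        case "$line" in
            ERROR:*|WARN:*|DEBUG:*) printf '%s\n' "$line" >&2 ;;
            *)                      printf '%s\n' "$line" ;;
        esac
    done
}
```

A caller would use it as `paths=$(some_command 2>&1 | split_diagnostics)`, so the diagnostics stay visible on the terminal while only paths are captured.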
3308349e78 feat: add comprehensive verify command to HuggingFace downloader
Added new 'verify' command to artifact_huggingface_download.sh that performs comprehensive health checks on all downloaded models and their symlinks.

Features:
- Download status verification (existence, size, location, timestamps)
- Link status verification (valid, broken, or missing symlinks)
- Size mismatch detection (warns if actual differs >10% from expected)
- Per-model detailed logging with beautiful formatting
- Category-level and global statistics summaries
- Actionable fix suggestions for detected issues
- Disk space usage analysis

New Functions:
- get_model_disk_usage() - Calculate actual model file sizes
- format_bytes() - Human-readable size formatting
- verify_model_download() - Check model download status
- verify_model_links() - Verify symlink integrity
- verify_category() - Process category with verification
- display_verification_summary() - Show global results

Usage:
  artifact_huggingface_download.sh verify -c models.yaml

Output includes:
  ✓ Downloaded/Missing model counts
  ✓ Properly linked/Broken link statistics
  ✓ File sizes and locations
  ✓ Last modified timestamps
  ⚠️ Size mismatch warnings
  📊 Disk space usage per category
  💡 Fix suggestions with exact commands

🤖 Generated with Claude Code (https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 13:33:37 +01:00
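Note on 3308349e78: the ">10%" size check can be done with integer arithmetic alone. The sketch assumes both sizes are whole numbers of bytes; the log does not say which unit the expected size is stored in. size_mismatch is a hypothetical name.

```shell
# Hypothetical predicate: succeed if actual differs from expected by more
# than 10% of expected. Both arguments are integers in the same unit.
size_mismatch() {
    local actual="$1" expected="$2"
    local diff=$(( actual > expected ? actual - expected : expected - actual ))
    # diff / expected > 0.10, rewritten to avoid floating point.
    (( diff * 10 > expected ))
}

if size_mismatch 1200 1000; then
    echo "size mismatch: 1200 vs expected 1000"
fi
```

A difference of exactly 10% does not count as a mismatch in this version.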
c31b1e87bd fix: allow config.json and tokenizer files for DiffRhythm models 2025-11-24 12:26:33 +01:00
a1548ed490 fix: correct typo in filename (hugginface -> huggingface)
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 17:59:56 +01:00