Added full control over incremental indexing behavior:
**CLI Changes:**
- Added `--incremental` flag (default: true)
- Added `--full` flag to disable incremental mode
- Updated interactive prompt to ask about incremental mode
**Function Changes:**
- Updated the `buildIndex(force, mode, incremental)` signature
- Added `incremental` parameter (default: `true`)
- Conditional logic: when `incremental` is true, skip unchanged repos; otherwise re-index all repos
- Added console logging to show incremental mode status
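
The conditional logic above can be sketched roughly as follows. This is a minimal illustration, not the actual `buildIndex` body: `buildIndexSketch`, `isUnchanged`, and `indexRepo` are hypothetical names, and the real function also takes `force` and `mode`, which are omitted here.

```javascript
// Sketch only: `repos`, `isUnchanged`, and `indexRepo` are hypothetical
// stand-ins for whatever lib/indexer.js actually uses.
async function buildIndexSketch(repos, { incremental = true, isUnchanged, indexRepo }) {
  // Log the mode up front so CI output shows which path was taken.
  console.log(`Incremental mode: ${incremental ? "on" : "off"}`);

  const stats = { indexed: 0, skipped: 0 };
  for (const repo of repos) {
    // In incremental mode, unchanged repositories are skipped entirely.
    if (incremental && (await isUnchanged(repo))) {
      stats.skipped += 1;
      continue;
    }
    // Full mode, or a new/changed repository: index it.
    await indexRepo(repo);
    stats.indexed += 1;
  }
  return stats;
}
```

Because `isUnchanged` is only consulted when `incremental` is true, a full run never pays for the change check.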
**Workflow Changes:**
- Added `incremental` input (boolean, default: true)
- Passes incremental setting to buildIndex via environment variable
- Defaults to true for scheduled (cron) runs
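
One way the build step could read that setting is sketched below. The variable name `INCREMENTAL` and the helper `parseIncremental` are assumptions for illustration; the commit does not name the actual environment variable. Scheduled runs carry no workflow inputs, so an unset or empty value is treated as the default.

```javascript
// Hypothetical helper: the variable name INCREMENTAL is an assumption.
function parseIncremental(env = process.env) {
  const raw = env.INCREMENTAL;
  // Unset or empty (e.g. a cron run with no inputs) falls back to the default.
  if (raw === undefined || raw.trim() === "") return true;
  // Only an explicit "false" (any casing) disables incremental mode.
  return raw.trim().toLowerCase() !== "false";
}
```

Treating anything other than an explicit `false` as true keeps the opt-out semantics: a typo in the input cannot silently trigger a full re-index.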
**Usage Examples:**
```bash
# CLI - incremental mode (default)
./awesome index
# CLI - force full re-index
./awesome index --full
# CLI - explicit incremental
./awesome index --incremental
# Workflow - incremental (default)
gh workflow run build-database.yml
# Workflow - full re-index
gh workflow run build-database.yml -f incremental=false
```
This makes incremental indexing opt-out instead of hardcoded, giving users full control over indexing behavior.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Major performance improvements for CI builds:
1. **Removed proactive rate limit threshold**
- No longer waits at 500 remaining requests
- Uses full 5000 request quota before forced wait
- Maximizes work per rate limit cycle
2. **Implemented incremental indexing**
- Checks if repository already exists in database
- Compares last_commit (pushedAt) to detect changes
- Only fetches README for new or updated repositories
- Skips README fetch for unchanged repos (major time savings)
3. **Increased timeout to GitHub maximum**
- Job timeout: 180m → 360m (6 hours, the per-job limit on GitHub-hosted runners)
- Script timeout: 170m → 350m
- Allows full first-run indexing to complete
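
The change detection in item 2 might look something like the sketch below. The field names `last_commit` and `pushedAt` come from the description above; the function name `needsReadmeFetch` and the shape of the stored row are assumptions, not the code in lib/indexer.js.

```javascript
// Sketch: decide whether a repository's README must be (re)fetched.
// `existingRow` is the stored database record, or null/undefined if the
// repository has not been indexed yet (hypothetical shape).
function needsReadmeFetch(existingRow, repo) {
  // New repository: nothing stored yet, so fetch.
  if (!existingRow) return true;

  // Compare as timestamps so differing string formats do not cause
  // spurious mismatches.
  const stored = Date.parse(existingRow.last_commit);
  const latest = Date.parse(repo.pushedAt);

  // If either value is unparseable, be conservative and fetch.
  if (Number.isNaN(stored) || Number.isNaN(latest)) return true;

  // Any difference in push time counts as a change.
  return latest !== stored;
}
```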
Impact on performance:
**First run (empty database):**
- Same cost as before: all ~25,000 repos need full indexing
- Expected to use most of the 360-minute limit, but should complete
**Subsequent runs (incremental):**
- Only fetches READMEs for changed repos (~5-10% typically)
- Dramatically faster: estimated 30-60 minutes instead of 360
- Makes daily automated builds sustainable
Files changed:
- lib/github-api.js: Removed proactive rate limit check
- lib/indexer.js: Added incremental indexing logic
- .github/workflows/build-database.yml: Increased timeout to 360m
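
For the rate limit change in lib/github-api.js, the new behavior amounts to waiting only once the quota is exhausted. A minimal sketch, assuming the client already knows the remaining request count and the reset time in epoch seconds (for example from the REST API's `x-ratelimit-remaining` and `x-ratelimit-reset` headers); `rateLimitWaitMs` is a hypothetical name:

```javascript
// Sketch: how long to wait before the next request, in milliseconds.
// No proactive threshold: any remaining quota means "go ahead".
function rateLimitWaitMs(remaining, resetEpochSeconds, nowMs = Date.now()) {
  if (remaining > 0) return 0;
  // Quota exhausted: wait until the reset time (never a negative duration).
  return Math.max(0, resetEpochSeconds * 1000 - nowMs);
}
```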
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Initialize database inside indexer process to ensure connection exists
- Configure GitHub token in same process as indexer
- Make indexer throw errors instead of returning early so CI detects failures
- Remove duplicate token configuration step
- Pass GITHUB_TOKEN as environment variable to build step
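
The throw-instead-of-return pattern can be illustrated as below. Both `requireToken` and `runIndexer` are hypothetical helpers written for this sketch; only the `GITHUB_TOKEN` variable name comes from the change itself.

```javascript
// Sketch: fail loudly when the token is missing instead of returning early.
function requireToken(env = process.env) {
  const token = env.GITHUB_TOKEN;
  if (!token) {
    // Throwing (rather than logging and returning) lets the caller
    // translate the failure into a non-zero exit code.
    throw new Error("GITHUB_TOKEN is not set");
  }
  return token;
}

// Runs an indexing task and maps success/failure to an exit code.
async function runIndexer(task) {
  try {
    await task();
    return 0;
  } catch (err) {
    console.error(`Indexing failed: ${err.message}`);
    return 1;
  }
}
```

A caller would then set `process.exitCode` from the value `runIndexer` resolves to, so a failed build step is reported as failed by the workflow.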
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>