awesome/.github/workflows/build-database.yml
valknarness 99cf83330c feat: make incremental indexing configurable via CLI and workflow
Added full control over incremental indexing behavior:

**CLI Changes:**
- Added `--incremental` flag (default: true)
- Added `--full` flag to disable incremental mode
- Updated interactive prompt to ask about incremental mode
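
The flag behavior above can be sketched as follows. This is a minimal illustration only: `resolveIncremental` is a hypothetical helper, and the real CLI's argument parsing is not shown in this commit. It assumes that the last flag on the command line wins when both are given.

```javascript
// Hypothetical helper (not from the repository): resolves the two CLI flags
// described above into a single boolean.
function resolveIncremental(argv) {
  // Incremental indexing is the default, so --full is the opt-out.
  let incremental = true;
  for (const arg of argv) {
    if (arg === '--full') incremental = false;
    if (arg === '--incremental') incremental = true;
  }
  return incremental;
}

console.log(resolveIncremental(['index']));           // true
console.log(resolveIncremental(['index', '--full'])); // false
```

Keeping a single boolean as the result means the rest of the code only needs to check one value, regardless of which flag the user typed.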

**Function Changes:**
- Updated the `buildIndex(force, mode, incremental)` signature
- Added the `incremental` parameter with a default value of `true`
- Conditional logic: if `incremental` is `true`, skip unchanged repos; otherwise re-index all
- Added console logging to show incremental mode status
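
The conditional logic above might look roughly like the sketch below. The commit does not show how `buildIndex` decides that a repo is unchanged, so this example assumes a comparison of a `pushedAt` timestamp against the value stored at the last index run. `selectReposToIndex`, `fullName`, `pushedAt`, and the `Map` of stored timestamps are all hypothetical names chosen for illustration.

```javascript
// Hypothetical sketch of the selection step inside
// buildIndex(force, mode, incremental). Assumption: a repo counts as
// "unchanged" when its pushedAt timestamp equals the one recorded at the
// last index run. The real change-detection rule is not shown in the source.
function selectReposToIndex(repos, lastIndexedPushedAt, incremental = true) {
  if (!incremental) {
    console.log('Incremental mode: off (re-indexing all repos)');
    return repos.slice();
  }
  console.log('Incremental mode: on (skipping unchanged repos)');
  return repos.filter((repo) => {
    const previous = lastIndexedPushedAt.get(repo.fullName);
    // Index repos that were never seen before or whose timestamp moved.
    return previous === undefined || previous !== repo.pushedAt;
  });
}

// Usage with made-up data.
const repos = [
  { fullName: 'a/x', pushedAt: '2025-10-01T00:00:00Z' },
  { fullName: 'b/y', pushedAt: '2025-10-20T00:00:00Z' },
];
const seen = new Map([
  ['a/x', '2025-10-01T00:00:00Z'],
  ['b/y', '2025-09-01T00:00:00Z'],
]);
const changedOnly = selectReposToIndex(repos, seen, true);
const everything = selectReposToIndex(repos, seen, false);
```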

**Workflow Changes:**
- Added `incremental` input (boolean, default: true)
- Passes the incremental setting to `buildIndex` via the `INCREMENTAL` environment variable
- Defaults to true for scheduled (cron) runs
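
The environment-variable handoff above can be sketched in isolation. The workflow itself compares `process.env.INCREMENTAL` to the string `'true'` and falls back to `'true'` in the shell when the input is empty. The helper below, `parseIncrementalEnv`, is a hypothetical name that folds both steps into one function for illustration.

```javascript
// Hypothetical helper (not from the repository) mirroring how the workflow
// treats INCREMENTAL: an unset or empty value means a scheduled run, which
// defaults to incremental; otherwise only the exact string 'true' enables it.
function parseIncrementalEnv(env = process.env) {
  const raw = env.INCREMENTAL;
  if (raw === undefined || raw === '') return true;
  return raw === 'true';
}

console.log(parseIncrementalEnv({}));                       // true
console.log(parseIncrementalEnv({ INCREMENTAL: 'false' })); // false
```

Environment variables are always strings, so the comparison has to be against `'true'` rather than a boolean.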

**Usage Examples:**
```bash
# CLI - incremental mode (default)
./awesome index

# CLI - force full re-index
./awesome index --full

# CLI - explicit incremental
./awesome index --incremental

# Workflow - incremental (default)
gh workflow run build-database.yml

# Workflow - full re-index
gh workflow run build-database.yml -f incremental=false
```

This makes incremental indexing opt-out instead of hardcoded, giving users full control over indexing behavior.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-28 09:59:47 +01:00

name: Build Awesome Database

on:
  schedule:
    # Run daily at 02:00 UTC
    - cron: '0 2 * * *'
  workflow_dispatch: # Allow manual triggering
    inputs:
      index_mode:
        description: 'Indexing mode'
        required: false
        default: 'full'
        type: choice
        options:
          - full
          - sample
      incremental:
        description: 'Incremental mode (skip unchanged repos)'
        required: false
        default: true
        type: boolean

permissions:
  contents: read
  actions: write

jobs:
  build-database:
    runs-on: ubuntu-latest
    timeout-minutes: 360 # 6 hours (maximum job run time on GitHub-hosted runners)
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '22'

      - name: Setup pnpm
        uses: pnpm/action-setup@v3
        with:
          version: 10

      - name: Install dependencies
        run: |
          pnpm install
          pnpm rebuild better-sqlite3
          chmod +x awesome

      - name: Build awesome database
        id: build
        env:
          CI: true
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          INCREMENTAL: ${{ github.event.inputs.incremental || 'true' }}
        run: |
          # Default CI to false if it is unset
          CI=${CI:-false}

          # Capture start time
          START_TIME=$(date -u +"%Y-%m-%d %H:%M:%S UTC")
          echo "start_time=$START_TIME" >> "$GITHUB_OUTPUT"

          # Determine index mode and incremental setting
          INDEX_MODE="${{ github.event.inputs.index_mode || 'full' }}"
          INCREMENTAL="${{ github.event.inputs.incremental }}"

          # Default to true if not specified (for scheduled runs)
          if [ -z "$INCREMENTAL" ]; then
            INCREMENTAL="true"
          fi

          echo "Index mode: $INDEX_MODE"
          echo "Incremental: $INCREMENTAL"

          # Build the index in non-interactive mode (350m timeout, job timeout is 360m)
          timeout 350m node -e "
            const db = require('./lib/database');
            const dbOps = require('./lib/db-operations');
            const indexer = require('./lib/indexer');

            (async () => {
              try {
                // Initialize database
                db.initialize();

                // Set GitHub token if available
                if (process.env.GITHUB_TOKEN) {
                  dbOps.setSetting('githubToken', process.env.GITHUB_TOKEN);
                  console.log('GitHub token configured');
                } else {
                  console.warn('⚠️ WARNING: No GitHub token found! Rate limit will be 60/hour instead of 5000/hour');
                }

                // Build index with incremental flag
                const incremental = process.env.INCREMENTAL === 'true';
                await indexer.buildIndex(false, '${INDEX_MODE}', incremental);

                // Close database
                db.close();

                console.log('Index built successfully');
                process.exit(0);
              } catch (error) {
                console.error('Failed to build index:', error.message);
                console.error(error.stack);
                process.exit(1);
              }
            })();
          " || {
            EXIT_CODE=$?
            # timeout(1) exits with status 124 when the time limit is reached
            if [ "$EXIT_CODE" -eq 124 ]; then
              echo "❌ Index building timed out after 350 minutes"
              echo "This may indicate rate limiting issues or too many lists to index"
            fi
            exit $EXIT_CODE
          }

          # Capture end time
          END_TIME=$(date -u +"%Y-%m-%d %H:%M:%S UTC")
          echo "end_time=$END_TIME" >> "$GITHUB_OUTPUT"

      - name: Gather database statistics
        id: stats
        run: |
          # Get database stats
          STATS=$(node -e "
            const db = require('./lib/database');
            const dbOps = require('./lib/db-operations');
            db.initialize();

            const stats = dbOps.getIndexStats();
            const dbPath = require('path').join(require('os').homedir(), '.awesome', 'awesome.db');
            const fs = require('fs');
            const fileSize = fs.existsSync(dbPath) ? fs.statSync(dbPath).size : 0;
            const fileSizeMB = (fileSize / (1024 * 1024)).toFixed(2);

            console.log(JSON.stringify({
              totalLists: stats.totalLists || 0,
              totalRepos: stats.totalRepositories || 0,
              totalReadmes: stats.totalReadmes || 0,
              sizeBytes: fileSize,
              sizeMB: fileSizeMB
            }));

            db.close();
          ")

          echo "Database statistics:"
          echo "$STATS" | jq .

          # Extract values for outputs
          TOTAL_LISTS=$(echo "$STATS" | jq -r '.totalLists')
          TOTAL_REPOS=$(echo "$STATS" | jq -r '.totalRepos')
          TOTAL_READMES=$(echo "$STATS" | jq -r '.totalReadmes')
          SIZE_MB=$(echo "$STATS" | jq -r '.sizeMB')

          echo "total_lists=$TOTAL_LISTS" >> "$GITHUB_OUTPUT"
          echo "total_repos=$TOTAL_REPOS" >> "$GITHUB_OUTPUT"
          echo "total_readmes=$TOTAL_READMES" >> "$GITHUB_OUTPUT"
          echo "size_mb=$SIZE_MB" >> "$GITHUB_OUTPUT"

      - name: Prepare database artifact
        run: |
          # Copy database from home directory
          DB_PATH="$HOME/.awesome/awesome.db"

          if [ ! -f "$DB_PATH" ]; then
            echo "Error: Database file not found at $DB_PATH"
            exit 1
          fi

          # Create artifact directory
          mkdir -p artifacts

          # Copy database with timestamp
          BUILD_DATE=$(date -u +"%Y%m%d-%H%M%S")
          cp "$DB_PATH" "artifacts/awesome-${BUILD_DATE}.db"
          cp "$DB_PATH" "artifacts/awesome-latest.db"

          # Create metadata file
          cat > artifacts/metadata.json <<EOF
          {
            "build_date": "$(date -u +"%Y-%m-%d %H:%M:%S UTC")",
            "build_timestamp": "$(date -u +%s)",
            "git_sha": "${{ github.sha }}",
            "workflow_run_id": "${{ github.run_id }}",
            "total_lists": ${{ steps.stats.outputs.total_lists }},
            "total_repos": ${{ steps.stats.outputs.total_repos }},
            "total_readmes": ${{ steps.stats.outputs.total_readmes }},
            "size_mb": ${{ steps.stats.outputs.size_mb }},
            "node_version": "$(node --version)",
            "index_mode": "${{ github.event.inputs.index_mode || 'full' }}"
          }
          EOF

          echo "Artifact prepared: awesome-${BUILD_DATE}.db"
          ls -lh artifacts/

      - name: Upload database artifact
        uses: actions/upload-artifact@v4
        with:
          name: awesome-database-${{ github.run_id }}
          path: |
            artifacts/awesome-*.db
            artifacts/metadata.json
          retention-days: 90
          compression-level: 9

      - name: Create build summary
        run: |
          cat >> "$GITHUB_STEP_SUMMARY" <<EOF
          # 🎉 Awesome Database Build Complete

          ## 📊 Statistics

          | Metric | Value |
          |--------|-------|
          | 📚 Total Lists | ${{ steps.stats.outputs.total_lists }} |
          | 📦 Total Repositories | ${{ steps.stats.outputs.total_repos }} |
          | 📖 Total READMEs | ${{ steps.stats.outputs.total_readmes }} |
          | 💾 Database Size | ${{ steps.stats.outputs.size_mb }} MB |

          ## ⏱️ Build Information

          - **Started:** ${{ steps.build.outputs.start_time }}
          - **Completed:** ${{ steps.build.outputs.end_time }}
          - **Workflow Run:** [\#${{ github.run_id }}](${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }})
          - **Commit:** \`${{ github.sha }}\`
          - **Index Mode:** ${{ github.event.inputs.index_mode || 'full' }}

          ## 📥 Download Instructions

          \`\`\`bash
          # Using GitHub CLI
          gh run download ${{ github.run_id }} -n awesome-database-${{ github.run_id }}

          # Or using our helper script
          curl -sSL https://raw.githubusercontent.com/${{ github.repository }}/main/scripts/download-db.sh | bash
          \`\`\`

          ## 🔗 Artifact Link

          [Download Database Artifact](${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }})
          EOF

      - name: Notify on failure
        if: failure()
        run: |
          cat >> "$GITHUB_STEP_SUMMARY" <<EOF
          # ❌ Database Build Failed

          The automated database build encountered an error.

          **Workflow Run:** [\#${{ github.run_id }}](${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }})

          Please check the logs for details.
          EOF