awesome

Author	SHA1	Message	Date
valknarness	279cc2fa25	fix: bypass rate limiting for raw.githubusercontent.com requests CRITICAL FIX: raw.githubusercontent.com does NOT count against GitHub API rate limits, but the code was treating all requests the same way. Problem: - README fetches (~25,000) were going through rateLimitedRequest() - Added artificial delays, proactive checks, and unnecessary waits - Build took ~7 hours instead of ~2-3 hours - Only getRepoInfo() API calls actually count against rate limits Solution: 1. Created fetchRawContent() function for direct raw content fetches 2. Updated getReadme() to use fetchRawContent() 3. Updated getAwesomeListsIndex() to use fetchRawContent() 4. Reduced workflow timeout: 330m → 180m (3 hours) Impact: - Build time: ~7 hours → ~2-3 hours (60% reduction) - Only ~25K API calls (getRepoInfo) count against 5000/hour limit - ~25K README fetches are now unrestricted via raw.githubusercontent.com - Will complete well within GitHub Actions 6-hour free tier limit Files changed: - lib/github-api.js: Add fetchRawContent(), update getReadme() and getAwesomeListsIndex() to use it - .github/workflows/build-database.yml: Reduce timeout to 180 minutes 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-28 06:04:14 +01:00
valknarness	9c166fe56f	fix: increase workflow timeout to 330 minutes (5.5 hours) Accurate calculation for ~25,000 READMEs with 5000/hour rate limit: - Requests needed: ~25,000 (1 per README + metadata) - Cycles needed: 25,000 / 4,500 ≈ 5.5 cycles - Time per cycle: ~44 min work + ~16 min wait = 60 min total - Total time: 5.5 × 60 = 330 minutes (5.5 hours) Previous timeouts were insufficient: - 170 minutes: completed 2.9 cycles (~13,500 requests) - 270 minutes: would complete 4.5 cycles (~20,000 requests) - 330 minutes: allows full completion with buffer Changes: - Job timeout: 180m → 330m (5.5 hours) - Script timeout: 170m → 320m - Within GitHub Actions free tier limit (360 minutes/6 hours) Alternative: Use 'sample' mode for faster builds if full index is not immediately needed. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-28 00:31:58 +01:00
valknarness	326186f4a8	fix: optimize CI rate limit strategy for batch efficiency Changes for CI mode (process.env.CI === 'true'): - Remove grace period (was 10min) to enable continuous monitoring - Increase check frequency from 1% to 10% to catch low rate limits early - Raise proactive threshold from 200 to 500 requests - Increase resume threshold from 100 to 1000 requests This prevents wasting time on small batches (e.g. 184 requests = 2min work + 13min wait) by ensuring we work in larger 1000-5000 request batches for better time efficiency within the 170-minute timeout. Local mode unchanged: maintains user-friendly behavior with fewer interruptions. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-27 20:52:41 +01:00
valknarness	958192899d	fix: use adaptive rate limit threshold for CI vs local Use MIN_REMAINING_TO_CONTINUE = 100 in CI environments to allow incremental progress within the 170-minute timeout constraint, while maintaining 4500 locally for better user experience with fewer interruptions during indexing. This fixes the timeout issue where waiting for nearly full rate limit reset (4500/5000) required ~58 minutes per cycle, causing builds to exceed the 170-minute timeout after just 3 cycles. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-27 17:56:55 +01:00
valknarness	946513fbbb	fix: suspend	2025-10-27 03:53:52 +01:00
valknarness	a136b929b0	fix: suspend	2025-10-27 02:36:46 +01:00
valknarness	c0d3ffd328	fix: CI indexing	2025-10-26 22:04:46 +01:00
valknarness	9eb428dd63	fix: cli no user input	2025-10-26 19:12:49 +01:00
valknarness	1bc5e564b1	docs: readme	2025-10-26 15:26:02 +01:00
valknarness	509795ab82	Fix workflow database initialization and error handling - Initialize database inside indexer process to ensure connection exists - Configure GitHub token in same process as indexer - Make indexer throw errors instead of returning early for CI failure detection - Remove duplicate token configuration step - Pass GITHUB_TOKEN as environment variable to build step 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-26 14:05:21 +01:00
valknarness	6518cd8e76	Add getIndexStats function to db-operations This function is required by the GitHub Actions workflow for gathering database statistics after the build completes. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-26 14:03:11 +01:00
valknarness	10910d8537	fix: github workflow	2025-10-26 14:00:45 +01:00
valknarness	c73a14510b	feat: github workflow	2025-10-26 13:55:27 +01:00
valknarness	4cdcc62e15	feat: github workflow	2025-10-26 13:48:23 +01:00
valknarness	700c73bcbf	a new start	2025-10-25 15:52:06 +02:00

15 Commits