refactor: consolidate data/ into single root directory, fix historical player names

Merge data/wikipedia/{year}/ into data/{year}/ so there is a single
canonical location for World Cup JSON files. Update scrape and seed
scripts to use data/ instead of data/wikipedia/.

Re-scrape all 22 tournament years (1930-2022) with fixed player name
extraction (full name from <a title="..."> rather than the abbreviated
display text) so historical goals now show e.g. "Thomas Müller" rather
than "Müller".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
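
The name fix described above can be illustrated with a minimal sketch. The real extraction lives in lib/wiki-scraper and is not shown in this diff, so `playerNameFromAnchor` is a hypothetical helper, and the regex-based parsing and the disambiguator stripping are assumptions made only to keep the example self-contained.

```typescript
// Hypothetical helper, not the project's actual implementation.
// Given the HTML of a single <a> element, prefer the title attribute
// (the linked page title, i.e. the full name) over the visible link text.
function playerNameFromAnchor(anchorHtml: string): string | null {
  // Full name as carried by title="...", if the attribute is present.
  const title = /<a\b[^>]*\stitle="([^"]*)"/i.exec(anchorHtml)?.[1]
  // Abbreviated display text between <a ...> and </a>, used as a fallback.
  const text = /<a\b[^>]*>([^<]*)<\/a>/i.exec(anchorHtml)?.[1]

  const raw = title?.trim() || text?.trim() || ''
  if (raw === '') return null

  // Assumption: a page title may end in a disambiguator such as
  // " (footballer)"; drop it so only the name remains.
  return raw.replace(/\s+\([^)]*\)$/, '')
}

const html = '<a href="/wiki/Thomas_M%C3%BCller" title="Thomas Müller">Müller</a>'
console.log(playerNameFromAnchor(html)) // "Thomas Müller"
```

Falling back to the link text keeps older pages usable when an anchor has no title attribute, at the cost of an abbreviated name in those cases.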
@@ -1,6 +1,6 @@
/**
* Scrape English Wikipedia for World Cup data and write JSON files to
- * data/wikipedia/{year}/.
+ * data/{year}/.
*
* Usage:
* pnpm scrape # all years, matches + squads
@@ -17,7 +17,7 @@ import {
} from '../lib/wiki-scraper'
const __dirname = path.dirname(fileURLToPath(import.meta.url))
- const DATA_DIR = path.join(__dirname, '../data/wikipedia')
+ const DATA_DIR = path.join(__dirname, '../data')
const YEARS = [
1930,1934,1938,1950,1954,1958,1962,1966,1970,1974,
@@ -95,7 +95,7 @@ async function main() {
console.log()
}
- console.log('\nDone! Files written to data/wikipedia/{year}/')
+ console.log('\nDone! Files written to data/{year}/')
}
main().catch(e => { console.error(e); process.exit(1) })
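
The path change in this diff amounts to dropping the `wikipedia/` segment from every data path. As a rough sketch of that mapping, assuming paths are written with forward slashes relative to the repository root, the function below rewrites a legacy path to its consolidated location. `toConsolidatedPath`, `LEGACY_PREFIX`, and the file names in the usage lines are hypothetical and do not appear in the commit.

```typescript
// Hypothetical sketch of the old-to-new path mapping; not part of the commit.
const LEGACY_PREFIX = 'data/wikipedia/'

function toConsolidatedPath(relativePath: string): string {
  // Paths under data/wikipedia/ move directly under data/;
  // anything else is returned unchanged.
  return relativePath.startsWith(LEGACY_PREFIX)
    ? 'data/' + relativePath.slice(LEGACY_PREFIX.length)
    : relativePath
}

console.log(toConsolidatedPath('data/wikipedia/1930/matches.json')) // "data/1930/matches.json"
console.log(toConsolidatedPath('data/2022/squads.json'))            // "data/2022/squads.json"
```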