Files
llmx/.github/workflows/ci.yml
Michael Bolin c00ae2dcc1 Enforce ASCII in README.md (#513)
This all started because I was going to write a script to autogenerate
the Table of Contents in the root `README.md`, but I noticed that the
`href` for the "Why Codex?" heading was `#whycodex` instead of
`#why-codex`. This piqued my curiosity and it turned out that the space
in "Why Codex?" was not an ASCII space but **U+00A0**, a non-breaking
space, and so GitHub ignored it when generating the `href` for the
heading.

This also meant that when I did a text search for `why codex` in the
`README.md` in VS Code, the "Why Codex" heading did not match because of
the presence of **U+00A0**.

In short, these types of Unicode characters seem like a hazard, so I
decided to introduce this script to flag them, and if desired, to
replace them with "good enough" ASCII equivalents. For now, this only
applies to the root `README.md` file, but I think we should ultimately
apply this across our source code, as well, as we seem to have quite a
lot of non-ASCII Unicode and it's probably going to cause `rg` to miss
things.

Contributions of this PR:

* `./scripts/asciicheck.py`, which takes a list of filepaths and returns
non-zero if any of them contain non-ASCII characters. (Currently, there
is one exception for  aka **U+2728**, though I would like to default to
an empty allowlist and then require all exceptions to be specified as
flags.)
* A `--fix` option that will attempt to rewrite files with violations
using a equivalents from a hardcoded substitution list.
* An update to `ci.yml` to verify `./scripts/asciicheck.py README.md`
succeeds.
* A cleanup of `README.md` using the `--fix` option as well as some
editorial decisions on my part.
* I tried to update the `href`s in the Table of Contents to reflect the
changes in the heading titles. (TIL that if a heading has a character
like `&` surrounded by spaces, it becomes `--` in the generated `href`.)
2025-04-22 10:07:40 -04:00

73 lines
1.8 KiB
YAML

name: ci
on:
pull_request: { branches: [main] }
push: { branches: [main] }
jobs:
build-test:
runs-on: ubuntu-latest
timeout-minutes: 10
env:
NODE_OPTIONS: --max-old-space-size=4096
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 22
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 10.8.1
run_install: false
- name: Get pnpm store directory
id: pnpm-cache
shell: bash
run: |
echo "store_path=$(pnpm store path --silent)" >> $GITHUB_OUTPUT
- name: Setup pnpm cache
uses: actions/cache@v4
with:
path: ${{ steps.pnpm-cache.outputs.store_path }}
key: ${{ runner.os }}-pnpm-store-${{ hashFiles('**/pnpm-lock.yaml') }}
restore-keys: |
${{ runner.os }}-pnpm-store-
- name: Install dependencies
run: pnpm install
# Run all tasks using workspace filters
- name: Check TypeScript code formatting
working-directory: codex-cli
run: pnpm run format
- name: Check Markdown and config file formatting
run: pnpm run format
- name: Run tests
run: pnpm run test
- name: Lint
run: |
pnpm --filter @openai/codex exec -- eslint src tests --ext ts --ext tsx \
--report-unused-disable-directives \
--rule "no-console:error" \
--rule "no-debugger:error" \
--max-warnings=-1
- name: Type-check
run: pnpm run typecheck
- name: Build
run: pnpm run build
- name: Ensure README.md contains only ASCII and certain Unicode code points
run: ./scripts/asciicheck.py README.md