feat: introduce npm module for codex-responses-api-proxy (#4417)

This PR expands `.github/workflows/rust-release.yml` so that it also builds and publishes the `npm` module for `@openai/codex-responses-api-proxy` in addition to `@openai/codex`. Note both `npm` modules are similar, in that they each contain a single `.js` file that is a thin launcher around the appropriate native executable. (Since we have a minimal dependency on Node.js, I also lowered the minimum version from 20 to 16 and verified that works on my machine.)

As part of this change, we tighten up some of the docs around `codex-responses-api-proxy` and ensure the details regarding protecting the `OPENAI_API_KEY` in memory match the implementation.

To test the `npm` build process, I ran:

```
./codex-cli/scripts/build_npm_package.py --package codex-responses-api-proxy --version 0.43.0-alpha.3
```

which stages the `npm` module for `@openai/codex-responses-api-proxy` in a temp directory, using the binary artifacts from https://github.com/openai/codex/releases/tag/rust-v0.43.0-alpha.3.
2025-09-28 19:34:06 -07:00
parent a9d54b9e92
commit 618a42adf5
13 changed files with 387 additions and 64 deletions
--- a/codex-rs/responses-api-proxy/README.md
+++ b/codex-rs/responses-api-proxy/README.md
@@ -4,12 +4,12 @@ A strict HTTP proxy that only forwards `POST` requests to `/v1/responses` to the

 ## Expected Usage

-**IMPORTANT:** This is designed to be used with `CODEX_SECURE_MODE=1` so that an unprivileged user cannot inspect or tamper with this process. Though if `--http-shutdown` is specified, an unprivileged user _can_ shutdown the server.
+**IMPORTANT:** `codex-responses-api-proxy` is designed to be run by a privileged user with access to `OPENAI_API_KEY` so that an unprivileged user cannot inspect or tamper with the process. Though if `--http-shutdown` is specified, an unprivileged user _can_ make a `GET` request to `/shutdown` to shut down the server, as an unprivileged user could not send `SIGTERM` to kill the process.

-A privileged user (i.e., `root` or a user with `sudo`) who has access to `OPENAI_API_KEY` would run the following to start the server:
+A privileged user (i.e., `root` or a user with `sudo`) who has access to `OPENAI_API_KEY` would run the following to start the server, as `codex-responses-api-proxy` reads the auth token from `stdin`:

 ```shell
-printenv OPENAI_API_KEY | CODEX_SECURE_MODE=1 codex responses-api-proxy --http-shutdown --server-info /tmp/server-info.json
+printenv OPENAI_API_KEY | codex-responses-api-proxy --http-shutdown --server-info /tmp/server-info.json
 ```

 A non-privileged user would then run Codex as follows, specifying the `model_provider` dynamically:
@@ -22,7 +22,7 @@ codex exec -c "model_providers.openai-proxy={ name = 'OpenAI Proxy', base_url =
    'Your prompt here'
 ```

-When the unprivileged user was finished, they could shutdown the server using `curl` (since `kill -9` is not an option):
+When the unprivileged user was finished, they could shut down the server using `curl` (since `kill -SIGTERM` is not an option):

 ```shell
 curl --fail --silent --show-error "${PROXY_BASE_URL}/shutdown"
@@ -30,17 +30,17 @@ curl --fail --silent --show-error "${PROXY_BASE_URL}/shutdown"

 ## Behavior

-- Reads the API key from `stdin`. All callers should pipe the key in (for example, `printenv OPENAI_API_KEY | codex responses-api-proxy`).
+- Reads the API key from `stdin`. All callers should pipe the key in (for example, `printenv OPENAI_API_KEY | codex-responses-api-proxy`).
 - Formats the header value as `Bearer <key>` and attempts to `mlock(2)` the memory holding that header so it is not swapped to disk.
 - Listens on the provided port or an ephemeral port if `--port` is not specified.
 - Accepts exactly `POST /v1/responses` (no query string). The request body is forwarded to `https://api.openai.com/v1/responses` with `Authorization: Bearer <key>` set. All original request headers (except any incoming `Authorization`) are forwarded upstream. For other requests, it responds with `403`.
 - Optionally writes a single-line JSON file with server info, currently `{ "port": <u16> }`.
-- Optional `--http-shutdown` enables `GET /shutdown` to terminate the process with exit code 0. This allows one user (e.g., root) to start the proxy and another unprivileged user on the host to shut it down.
+- Optional `--http-shutdown` enables `GET /shutdown` to terminate the process with exit code 0. This allows one user (e.g., `root`) to start the proxy and another unprivileged user on the host to shut it down.

 ## CLI

 ```
-responses-api-proxy [--port <PORT>] [--server-info <FILE>] [--http-shutdown]
+codex-responses-api-proxy [--port <PORT>] [--server-info <FILE>] [--http-shutdown]
 ```

 - `--port <PORT>`: Port to bind on `127.0.0.1`. If omitted, an ephemeral port is chosen.
@@ -51,3 +51,19 @@ responses-api-proxy [--port <PORT>] [--server-info <FILE>] [--http-shutdown]

 - Only `POST /v1/responses` is permitted. No query strings are allowed.
 - All request headers are forwarded to the upstream call (aside from overriding `Authorization`). Response status and content-type are mirrored from upstream.
+
+## Hardening Details
+
+Care is taken to restrict access/copying to the value of `OPENAI_API_KEY` retained in memory:
+
+- We leverage [`codex_process_hardening`](https://github.com/openai/codex/blob/main/codex-rs/process-hardening/README.md) so `codex-responses-api-proxy` is run with standard process-hardening techniques.
+- At startup, we allocate a `1024` byte buffer on the stack and write `"Bearer "` as the first `7` bytes.
+- We then read from `stdin`, copying the contents into the buffer after `"Bearer "`.
+- After verifying the key matches `/^[a-zA-Z0-9_-]+$/` (and does not exceed the buffer), we create a `String` from that buffer (so the data is now on the heap).
+- We zero out the stack-allocated buffer using https://crates.io/crates/zeroize so the zeroing is not optimized away by the compiler.
+- We invoke `.leak()` on the `String` so we can treat its contents as a `&'static str`, as it will live for the rest of the process.
+- On UNIX, we `mlock(2)` the memory backing the `&'static str`.
+- When building an HTTP request from the `&'static str`, we use `HeaderValue::from_static()` to avoid copying the `&str`.
+- We also invoke `.set_sensitive(true)` on the `HeaderValue`, which in theory indicates to other parts of the HTTP stack that the header should be treated with "special care" to avoid leakage:
+
+https://github.com/hyperium/http/blob/439d1c50d71e3be3204b6c4a1bf2255ed78e1f93/src/header/value.rs#L346-L376
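
The "Hardening Details" steps added above can be approximated with a standard-library-only sketch. This is not the code from the PR: the names `read_auth_header`, `fill_and_copy`, and `wipe` are hypothetical, `wipe` is a volatile-write stand-in for the `zeroize` crate, trimming of the trailing newline from `printenv` is an assumption, and the `mlock(2)` and `HeaderValue` steps are left out because they need external crates.

```rust
use std::io::Read;
use std::sync::atomic::{compiler_fence, Ordering};

const PREFIX: &[u8] = b"Bearer ";

// Hypothetical stand-in for the `zeroize` crate: volatile writes keep the
// compiler from eliding the stores as dead code.
fn wipe(buf: &mut [u8]) {
    for b in buf.iter_mut() {
        unsafe { std::ptr::write_volatile(b, 0) };
    }
    compiler_fence(Ordering::SeqCst);
}

// Writes "Bearer " into `buf`, appends the key read from `input`, validates
// the key, and returns a heap-allocated copy of "Bearer <key>".
fn fill_and_copy(input: &mut impl Read, buf: &mut [u8]) -> Result<String, String> {
    buf[..PREFIX.len()].copy_from_slice(PREFIX);
    let mut len = PREFIX.len();
    loop {
        // Conservative: input that fills the whole buffer is rejected
        // without probing for EOF.
        if len == buf.len() {
            return Err("key does not fit in the buffer".into());
        }
        match input.read(&mut buf[len..]) {
            Ok(0) => break,
            Ok(n) => len += n,
            Err(e) => return Err(e.to_string()),
        }
    }
    // Assumption: trailing newlines (as emitted by `printenv`) are trimmed.
    while len > PREFIX.len() && matches!(buf[len - 1], b'\n' | b'\r') {
        len -= 1;
    }
    // Byte-level equivalent of /^[a-zA-Z0-9_-]+$/.
    let key = &buf[PREFIX.len()..len];
    let valid = !key.is_empty()
        && key
            .iter()
            .all(|b| b.is_ascii_alphanumeric() || *b == b'_' || *b == b'-');
    if !valid {
        return Err("key is empty or contains invalid characters".into());
    }
    // The validated bytes are ASCII, so this conversion succeeds; the
    // resulting `String` is the single heap copy of the header value.
    std::str::from_utf8(&buf[..len])
        .map(str::to_owned)
        .map_err(|e| e.to_string())
}

// Returns "Bearer <key>" as a `&'static str` that lives until process exit.
fn read_auth_header(input: &mut impl Read) -> Result<&'static str, String> {
    let mut buf = [0u8; 1024]; // stack-allocated staging buffer
    let header = fill_and_copy(input, &mut buf);
    wipe(&mut buf); // runs on both the success and the error path
    // `String::leak` (Rust 1.72+) hands back a `'static` reference. A real
    // implementation would `mlock(2)` this memory next (omitted here).
    let leaked: &'static str = header?.leak();
    Ok(leaked)
}
```

A caller in a real binary would presumably pass `&mut std::io::stdin().lock()` as `input`. Keeping the staging buffer on the stack and wiping it on every path means the only long-lived copy of the key is the leaked heap allocation, which is the region the README says is then locked with `mlock(2)`.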