fix: use low-level stdin read logic to avoid a BufReader (#4778)
`codex-responses-api-proxy` is designed so that there should be exactly one copy of the API key in memory (that is `mlock`'d on UNIX), but in practice, I was seeing two when I dumped the process data from `/proc/$PID/mem`. It appears that `std::io::stdin()` maintains an internal `BufReader` that we cannot zero out, so this PR changes the implementation on UNIX so that we use a low-level `read(2)` instead. Even though it seems like it would be incredibly unlikely, we also make this logic tolerant of short reads. Either `\n` or `EOF` must be sent to signal the end of the key written to stdin.
This commit is contained in:
@@ -4,12 +4,12 @@ A strict HTTP proxy that only forwards `POST` requests to `/v1/responses` to the
|
|||||||
|
|
||||||
## Expected Usage
|
## Expected Usage
|
||||||
|
|
||||||
**IMPORTANT:** `codex-responses-api-proxy` is designed to be run by a privileged user with access to `OPENAI_API_KEY` so that an unprivileged user cannot inspect or tamper with the process. Though if `--http-shutdown` is specified, an unprivileged user _can_ make a `GET` request to `/shutdown` to shutdown the server, as an unprivileged could not send `SIGTERM` to kill the process.
|
**IMPORTANT:** `codex-responses-api-proxy` is designed to be run by a privileged user with access to `OPENAI_API_KEY` so that an unprivileged user cannot inspect or tamper with the process. Though if `--http-shutdown` is specified, an unprivileged user _can_ make a `GET` request to `/shutdown` to shutdown the server, as an unprivileged user could not send `SIGTERM` to kill the process.
|
||||||
|
|
||||||
A privileged user (i.e., `root` or a user with `sudo`) who has access to `OPENAI_API_KEY` would run the following to start the server, as `codex-responses-api-proxy` reads the auth token from `stdin`:
|
A privileged user (i.e., `root` or a user with `sudo`) who has access to `OPENAI_API_KEY` would run the following to start the server, as `codex-responses-api-proxy` reads the auth token from `stdin`:
|
||||||
|
|
||||||
```shell
|
```shell
|
||||||
printenv OPENAI_API_KEY | codex-responses-api-proxy --http-shutdown --server-info /tmp/server-info.json
|
printenv OPENAI_API_KEY | env -u OPENAI_API_KEY codex-responses-api-proxy --http-shutdown --server-info /tmp/server-info.json
|
||||||
```
|
```
|
||||||
|
|
||||||
A non-privileged user would then run Codex as follows, specifying the `model_provider` dynamically:
|
A non-privileged user would then run Codex as follows, specifying the `model_provider` dynamically:
|
||||||
@@ -35,7 +35,7 @@ curl --fail --silent --show-error "${PROXY_BASE_URL}/shutdown"
|
|||||||
- Listens on the provided port or an ephemeral port if `--port` is not specified.
|
- Listens on the provided port or an ephemeral port if `--port` is not specified.
|
||||||
- Accepts exactly `POST /v1/responses` (no query string). The request body is forwarded to `https://api.openai.com/v1/responses` with `Authorization: Bearer <key>` set. All original request headers (except any incoming `Authorization`) are forwarded upstream. For other requests, it responds with `403`.
|
- Accepts exactly `POST /v1/responses` (no query string). The request body is forwarded to `https://api.openai.com/v1/responses` with `Authorization: Bearer <key>` set. All original request headers (except any incoming `Authorization`) are forwarded upstream. For other requests, it responds with `403`.
|
||||||
- Optionally writes a single-line JSON file with server info, currently `{ "port": <u16> }`.
|
- Optionally writes a single-line JSON file with server info, currently `{ "port": <u16> }`.
|
||||||
- Optional `--http-shutdown` enables `GET /shutdown` to terminate the process with exit code 0. This allows one user (e.g., `root`) to start the proxy and another unprivileged user on the host to shut it down.
|
- Optional `--http-shutdown` enables `GET /shutdown` to terminate the process with exit code `0`. This allows one user (e.g., `root`) to start the proxy and another unprivileged user on the host to shut it down.
|
||||||
|
|
||||||
## CLI
|
## CLI
|
||||||
|
|
||||||
@@ -44,7 +44,7 @@ codex-responses-api-proxy [--port <PORT>] [--server-info <FILE>] [--http-shutdow
|
|||||||
```
|
```
|
||||||
|
|
||||||
- `--port <PORT>`: Port to bind on `127.0.0.1`. If omitted, an ephemeral port is chosen.
|
- `--port <PORT>`: Port to bind on `127.0.0.1`. If omitted, an ephemeral port is chosen.
|
||||||
- `--server-info <FILE>`: If set, the proxy writes a single line of JSON with `{ "port": <PORT> }` once listening.
|
- `--server-info <FILE>`: If set, the proxy writes a single line of JSON with `{ "port": <PORT>, "pid": <PID> }` once listening.
|
||||||
- `--http-shutdown`: If set, enables `GET /shutdown` to exit the process with code `0`.
|
- `--http-shutdown`: If set, enables `GET /shutdown` to exit the process with code `0`.
|
||||||
|
|
||||||
## Notes
|
## Notes
|
||||||
|
|||||||
@@ -1,7 +1,6 @@
|
|||||||
use anyhow::Context;
|
use anyhow::Context;
|
||||||
use anyhow::Result;
|
use anyhow::Result;
|
||||||
use anyhow::anyhow;
|
use anyhow::anyhow;
|
||||||
use std::io::Read;
|
|
||||||
use zeroize::Zeroize;
|
use zeroize::Zeroize;
|
||||||
|
|
||||||
/// Use a generous buffer size to avoid truncation and to allow for longer API
|
/// Use a generous buffer size to avoid truncation and to allow for longer API
|
||||||
@@ -13,13 +12,66 @@ const AUTH_HEADER_PREFIX: &[u8] = b"Bearer ";
|
|||||||
/// value with the auth token used with `Bearer`. The header value is returned
|
/// value with the auth token used with `Bearer`. The header value is returned
|
||||||
/// as a `&'static str` whose bytes are locked in memory to avoid accidental
|
/// as a `&'static str` whose bytes are locked in memory to avoid accidental
|
||||||
/// exposure.
|
/// exposure.
|
||||||
|
#[cfg(unix)]
|
||||||
pub(crate) fn read_auth_header_from_stdin() -> Result<&'static str> {
|
pub(crate) fn read_auth_header_from_stdin() -> Result<&'static str> {
|
||||||
|
read_auth_header_with(read_from_unix_stdin)
|
||||||
|
}
|
||||||
|
|
||||||
|
#[cfg(windows)]
|
||||||
|
pub(crate) fn read_auth_header_from_stdin() -> Result<&'static str> {
|
||||||
|
use std::io::Read;
|
||||||
|
|
||||||
|
// Use of `stdio::io::stdin()` has the problem mentioned in the docstring on
|
||||||
|
// the UNIX version of `read_from_unix_stdin()`, so this should ultimately
|
||||||
|
// be replaced the low-level Windows equivalent. Because we do not have an
|
||||||
|
// equivalent of mlock() on Windows right now, it is not pressing until we
|
||||||
|
// address that issue.
|
||||||
read_auth_header_with(|buffer| std::io::stdin().read(buffer))
|
read_auth_header_with(|buffer| std::io::stdin().read(buffer))
|
||||||
}
|
}
|
||||||
|
|
||||||
fn read_auth_header_with<F>(read_fn: F) -> Result<&'static str>
|
/// We perform a low-level read with `read(2)` because `stdio::io::stdin()` has
|
||||||
|
/// an internal BufReader:
|
||||||
|
///
|
||||||
|
/// https://github.com/rust-lang/rust/blob/bcbbdcb8522fd3cb4a8dde62313b251ab107694d/library/std/src/io/stdio.rs#L250-L252
|
||||||
|
///
|
||||||
|
/// that can end up retaining a copy of stdin data in memory with no way to zero
|
||||||
|
/// it out, whereas we aim to guarantee there is exactly one copy of the API key
|
||||||
|
/// in memory, protected by mlock(2).
|
||||||
|
#[cfg(unix)]
|
||||||
|
fn read_from_unix_stdin(buffer: &mut [u8]) -> std::io::Result<usize> {
|
||||||
|
use libc::c_void;
|
||||||
|
use libc::read;
|
||||||
|
|
||||||
|
// Perform a single read(2) call into the provided buffer slice.
|
||||||
|
// Looping and newline/EOF handling are managed by the caller.
|
||||||
|
loop {
|
||||||
|
let result = unsafe {
|
||||||
|
read(
|
||||||
|
libc::STDIN_FILENO,
|
||||||
|
buffer.as_mut_ptr().cast::<c_void>(),
|
||||||
|
buffer.len(),
|
||||||
|
)
|
||||||
|
};
|
||||||
|
|
||||||
|
if result == 0 {
|
||||||
|
return Ok(0);
|
||||||
|
}
|
||||||
|
|
||||||
|
if result < 0 {
|
||||||
|
let err = std::io::Error::last_os_error();
|
||||||
|
if err.kind() == std::io::ErrorKind::Interrupted {
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
return Err(err);
|
||||||
|
}
|
||||||
|
|
||||||
|
return Ok(result as usize);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn read_auth_header_with<F>(mut read_fn: F) -> Result<&'static str>
|
||||||
where
|
where
|
||||||
F: FnOnce(&mut [u8]) -> std::io::Result<usize>,
|
F: FnMut(&mut [u8]) -> std::io::Result<usize>,
|
||||||
{
|
{
|
||||||
// TAKE CARE WHEN MODIFYING THIS CODE!!!
|
// TAKE CARE WHEN MODIFYING THIS CODE!!!
|
||||||
//
|
//
|
||||||
@@ -31,19 +83,50 @@ where
|
|||||||
let mut buf = [0u8; BUFFER_SIZE];
|
let mut buf = [0u8; BUFFER_SIZE];
|
||||||
buf[..AUTH_HEADER_PREFIX.len()].copy_from_slice(AUTH_HEADER_PREFIX);
|
buf[..AUTH_HEADER_PREFIX.len()].copy_from_slice(AUTH_HEADER_PREFIX);
|
||||||
|
|
||||||
let read = read_fn(&mut buf[AUTH_HEADER_PREFIX.len()..]).inspect_err(|_err| {
|
let prefix_len = AUTH_HEADER_PREFIX.len();
|
||||||
buf.zeroize();
|
let capacity = buf.len() - prefix_len;
|
||||||
})?;
|
let mut total_read = 0usize; // number of bytes read into the token region
|
||||||
|
let mut saw_newline = false;
|
||||||
|
let mut saw_eof = false;
|
||||||
|
|
||||||
if read == buf.len() - AUTH_HEADER_PREFIX.len() {
|
while total_read < capacity {
|
||||||
|
let slice = &mut buf[prefix_len + total_read..];
|
||||||
|
let read = match read_fn(slice) {
|
||||||
|
Ok(n) => n,
|
||||||
|
Err(err) => {
|
||||||
|
buf.zeroize();
|
||||||
|
return Err(err.into());
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
if read == 0 {
|
||||||
|
saw_eof = true;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Search only the newly written region for a newline.
|
||||||
|
let newly_written = &slice[..read];
|
||||||
|
if let Some(pos) = newly_written.iter().position(|&b| b == b'\n') {
|
||||||
|
total_read += pos + 1; // include the newline for trimming below
|
||||||
|
saw_newline = true;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
|
||||||
|
total_read += read;
|
||||||
|
|
||||||
|
// Continue loop; if buffer fills without newline/EOF we'll error below.
|
||||||
|
}
|
||||||
|
|
||||||
|
// If buffer filled and we did not see newline or EOF, error out.
|
||||||
|
if total_read == capacity && !saw_newline && !saw_eof {
|
||||||
buf.zeroize();
|
buf.zeroize();
|
||||||
return Err(anyhow!(
|
return Err(anyhow!(
|
||||||
"OPENAI_API_KEY is too large to fit in the 512-byte buffer"
|
"OPENAI_API_KEY is too large to fit in the 512-byte buffer"
|
||||||
));
|
));
|
||||||
}
|
}
|
||||||
|
|
||||||
let mut total = AUTH_HEADER_PREFIX.len() + read;
|
let mut total = prefix_len + total_read;
|
||||||
while total > AUTH_HEADER_PREFIX.len() && (buf[total - 1] == b'\n' || buf[total - 1] == b'\r') {
|
while total > prefix_len && (buf[total - 1] == b'\n' || buf[total - 1] == b'\r') {
|
||||||
total -= 1;
|
total -= 1;
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -138,13 +221,19 @@ fn validate_auth_header_bytes(key_bytes: &[u8]) -> Result<()> {
|
|||||||
#[cfg(test)]
|
#[cfg(test)]
|
||||||
mod tests {
|
mod tests {
|
||||||
use super::*;
|
use super::*;
|
||||||
|
use std::collections::VecDeque;
|
||||||
use std::io;
|
use std::io;
|
||||||
|
|
||||||
#[test]
|
#[test]
|
||||||
fn reads_key_with_no_newlines() {
|
fn reads_key_with_no_newlines() {
|
||||||
|
let mut sent = false;
|
||||||
let result = read_auth_header_with(|buf| {
|
let result = read_auth_header_with(|buf| {
|
||||||
|
if sent {
|
||||||
|
return Ok(0);
|
||||||
|
}
|
||||||
let data = b"sk-abc123";
|
let data = b"sk-abc123";
|
||||||
buf[..data.len()].copy_from_slice(data);
|
buf[..data.len()].copy_from_slice(data);
|
||||||
|
sent = true;
|
||||||
Ok(data.len())
|
Ok(data.len())
|
||||||
})
|
})
|
||||||
.unwrap();
|
.unwrap();
|
||||||
@@ -152,11 +241,32 @@ mod tests {
|
|||||||
assert_eq!(result, "Bearer sk-abc123");
|
assert_eq!(result, "Bearer sk-abc123");
|
||||||
}
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn reads_key_with_short_reads() {
|
||||||
|
let mut chunks: VecDeque<&[u8]> =
|
||||||
|
VecDeque::from(vec![b"sk-".as_ref(), b"abc".as_ref(), b"123\n".as_ref()]);
|
||||||
|
let result = read_auth_header_with(|buf| match chunks.pop_front() {
|
||||||
|
Some(chunk) if !chunk.is_empty() => {
|
||||||
|
buf[..chunk.len()].copy_from_slice(chunk);
|
||||||
|
Ok(chunk.len())
|
||||||
|
}
|
||||||
|
_ => Ok(0),
|
||||||
|
})
|
||||||
|
.unwrap();
|
||||||
|
|
||||||
|
assert_eq!(result, "Bearer sk-abc123");
|
||||||
|
}
|
||||||
|
|
||||||
#[test]
|
#[test]
|
||||||
fn reads_key_and_trims_newlines() {
|
fn reads_key_and_trims_newlines() {
|
||||||
|
let mut sent = false;
|
||||||
let result = read_auth_header_with(|buf| {
|
let result = read_auth_header_with(|buf| {
|
||||||
|
if sent {
|
||||||
|
return Ok(0);
|
||||||
|
}
|
||||||
let data = b"sk-abc123\r\n";
|
let data = b"sk-abc123\r\n";
|
||||||
buf[..data.len()].copy_from_slice(data);
|
buf[..data.len()].copy_from_slice(data);
|
||||||
|
sent = true;
|
||||||
Ok(data.len())
|
Ok(data.len())
|
||||||
})
|
})
|
||||||
.unwrap();
|
.unwrap();
|
||||||
@@ -194,9 +304,14 @@ mod tests {
|
|||||||
|
|
||||||
#[test]
|
#[test]
|
||||||
fn errors_on_invalid_utf8() {
|
fn errors_on_invalid_utf8() {
|
||||||
|
let mut sent = false;
|
||||||
let err = read_auth_header_with(|buf| {
|
let err = read_auth_header_with(|buf| {
|
||||||
|
if sent {
|
||||||
|
return Ok(0);
|
||||||
|
}
|
||||||
let data = b"sk-abc\xff";
|
let data = b"sk-abc\xff";
|
||||||
buf[..data.len()].copy_from_slice(data);
|
buf[..data.len()].copy_from_slice(data);
|
||||||
|
sent = true;
|
||||||
Ok(data.len())
|
Ok(data.len())
|
||||||
})
|
})
|
||||||
.unwrap_err();
|
.unwrap_err();
|
||||||
@@ -209,9 +324,14 @@ mod tests {
|
|||||||
|
|
||||||
#[test]
|
#[test]
|
||||||
fn errors_on_invalid_characters() {
|
fn errors_on_invalid_characters() {
|
||||||
|
let mut sent = false;
|
||||||
let err = read_auth_header_with(|buf| {
|
let err = read_auth_header_with(|buf| {
|
||||||
|
if sent {
|
||||||
|
return Ok(0);
|
||||||
|
}
|
||||||
let data = b"sk-abc!23";
|
let data = b"sk-abc!23";
|
||||||
buf[..data.len()].copy_from_slice(data);
|
buf[..data.len()].copy_from_slice(data);
|
||||||
|
sent = true;
|
||||||
Ok(data.len())
|
Ok(data.len())
|
||||||
})
|
})
|
||||||
.unwrap_err();
|
.unwrap_err();
|
||||||
|
|||||||
Reference in New Issue
Block a user