Files
Sebastian Krüger 3c7efc58c8 feat: Complete LLMX v0.1.0 - Rebrand from Codex with LiteLLM Integration
This release represents a comprehensive transformation of the codebase from Codex to LLMX,
enhanced with LiteLLM integration to support 100+ LLM providers through a unified API.

## Major Changes

### Phase 1: Repository & Infrastructure Setup
- Established new repository structure and branching strategy
- Created comprehensive project documentation (CLAUDE.md, LITELLM-SETUP.md)
- Set up development environment and tooling configuration

### Phase 2: Rust Workspace Transformation
- Renamed all Rust crates from `codex-*` to `llmx-*` (30+ crates)
- Updated package names, binary names, and workspace members
- Renamed core modules: codex.rs → llmx.rs, codex_delegate.rs → llmx_delegate.rs
- Updated all internal references, imports, and type names
- Renamed directories: codex-rs/ → llmx-rs/, codex-backend-openapi-models/ → llmx-backend-openapi-models/
- Fixed all Rust compilation errors after mass rename

### Phase 3: LiteLLM Integration
- Integrated LiteLLM for multi-provider LLM support (Anthropic, OpenAI, Azure, Google AI, AWS Bedrock, etc.)
- Implemented OpenAI-compatible Chat Completions API support
- Added model family detection and provider-specific handling
- Updated authentication to support LiteLLM API keys
- Renamed environment variables: OPENAI_BASE_URL → LLMX_BASE_URL
- Added LLMX_API_KEY for unified authentication
- Enhanced error handling for Chat Completions API responses
- Implemented fallback mechanisms between Responses API and Chat Completions API

### Phase 4: TypeScript/Node.js Components
- Renamed npm package: @codex/codex-cli → @valknar/llmx
- Updated TypeScript SDK to use new LLMX APIs and endpoints
- Fixed all TypeScript compilation and linting errors
- Updated SDK tests to support both API backends
- Enhanced mock server to handle multiple API formats
- Updated build scripts for cross-platform packaging

### Phase 5: Configuration & Documentation
- Updated all configuration files to use LLMX naming
- Rewrote README and documentation for LLMX branding
- Updated config paths: ~/.codex/ → ~/.llmx/
- Added comprehensive LiteLLM setup guide
- Updated all user-facing strings and help text
- Created release plan and migration documentation

### Phase 6: Testing & Validation
- Fixed all Rust tests for new naming scheme
- Updated snapshot tests in TUI (36 frame files)
- Fixed authentication storage tests
- Updated Chat Completions payload and SSE tests
- Fixed SDK tests for new API endpoints
- Ensured compatibility with Claude Sonnet 4.5 model
- Fixed test environment variables (LLMX_API_KEY, LLMX_BASE_URL)

### Phase 7: Build & Release Pipeline
- Updated GitHub Actions workflows for LLMX binary names
- Fixed rust-release.yml to reference llmx-rs/ instead of codex-rs/
- Updated CI/CD pipelines for new package names
- Made Apple code signing optional in release workflow
- Enhanced npm packaging resilience for partial platform builds
- Added Windows sandbox support to workspace
- Updated dotslash configuration for new binary names

### Phase 8: Final Polish
- Renamed all assets (.github images, labels, templates)
- Updated VSCode and DevContainer configurations
- Fixed all clippy warnings and formatting issues
- Applied cargo fmt and prettier formatting across codebase
- Updated issue templates and pull request templates
- Fixed all remaining UI text references

## Technical Details

**Breaking Changes:**
- Binary name changed from `codex` to `llmx`
- Config directory changed from `~/.codex/` to `~/.llmx/`
- Environment variables renamed (CODEX_* → LLMX_*)
- npm package renamed to `@valknar/llmx`

**New Features:**
- Support for 100+ LLM providers via LiteLLM
- Unified authentication with LLMX_API_KEY
- Enhanced model provider detection and handling
- Improved error handling and fallback mechanisms

**Files Changed:**
- 578 files modified across Rust, TypeScript, and documentation
- 30+ Rust crates renamed and updated
- Complete rebrand of UI, CLI, and documentation
- All tests updated and passing

**Dependencies:**
- Updated Cargo.lock with new package names
- Updated npm dependencies in llmx-cli
- Enhanced OpenAPI models for LLMX backend

This release establishes LLMX as a standalone project with comprehensive LiteLLM
integration, maintaining full backward compatibility with existing functionality
while opening support for a wide ecosystem of LLM providers.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Sebastian Krüger <support@pivoine.art>
2025-11-12 20:40:44 +01:00

81 lines
5.2 KiB
Markdown

# llmx-responses-api-proxy
A strict HTTP proxy that only forwards `POST` requests to `/v1/responses` to the OpenAI API (`https://api.openai.com`), injecting the `Authorization: Bearer $OPENAI_API_KEY` header. Everything else is rejected with `403 Forbidden`.
## Expected Usage
**IMPORTANT:** `llmx-responses-api-proxy` is designed to be run by a privileged user with access to `OPENAI_API_KEY` so that an unprivileged user cannot inspect or tamper with the process. Though if `--http-shutdown` is specified, an unprivileged user _can_ make a `GET` request to `/shutdown` to shutdown the server, as an unprivileged user could not send `SIGTERM` to kill the process.
A privileged user (i.e., `root` or a user with `sudo`) who has access to `OPENAI_API_KEY` would run the following to start the server, as `llmx-responses-api-proxy` reads the auth token from `stdin`:
```shell
printenv OPENAI_API_KEY | env -u OPENAI_API_KEY llmx-responses-api-proxy --http-shutdown --server-info /tmp/server-info.json
```
A non-privileged user would then run LLMX as follows, specifying the `model_provider` dynamically:
```shell
PROXY_PORT=$(jq .port /tmp/server-info.json)
PROXY_BASE_URL="http://127.0.0.1:${PROXY_PORT}"
llmx exec -c "model_providers.openai-proxy={ name = 'OpenAI Proxy', base_url = '${PROXY_BASE_URL}/v1', wire_api='responses' }" \
-c model_provider="openai-proxy" \
'Your prompt here'
```
When the unprivileged user was finished, they could shutdown the server using `curl` (since `kill -SIGTERM` is not an option):
```shell
curl --fail --silent --show-error "${PROXY_BASE_URL}/shutdown"
```
## Behavior
- Reads the API key from `stdin`. All callers should pipe the key in (for example, `printenv OPENAI_API_KEY | llmx-responses-api-proxy`).
- Formats the header value as `Bearer <key>` and attempts to `mlock(2)` the memory holding that header so it is not swapped to disk.
- Listens on the provided port or an ephemeral port if `--port` is not specified.
- Accepts exactly `POST /v1/responses` (no query string). The request body is forwarded to `https://api.openai.com/v1/responses` with `Authorization: Bearer <key>` set. All original request headers (except any incoming `Authorization`) are forwarded upstream, with `Host` overridden to `api.openai.com`. For other requests, it responds with `403`.
- Optionally writes a single-line JSON file with server info, currently `{ "port": <u16>, "pid": <u32> }`.
- Optional `--http-shutdown` enables `GET /shutdown` to terminate the process with exit code `0`. This allows one user (e.g., `root`) to start the proxy and another unprivileged user on the host to shut it down.
## CLI
```
llmx-responses-api-proxy [--port <PORT>] [--server-info <FILE>] [--http-shutdown] [--upstream-url <URL>]
```
- `--port <PORT>`: Port to bind on `127.0.0.1`. If omitted, an ephemeral port is chosen.
- `--server-info <FILE>`: If set, the proxy writes a single line of JSON with `{ "port": <PORT>, "pid": <PID> }` once listening.
- `--http-shutdown`: If set, enables `GET /shutdown` to exit the process with code `0`.
- `--upstream-url <URL>`: Absolute URL to forward requests to. Defaults to `https://api.openai.com/v1/responses`.
- Authentication is fixed to `Authorization: Bearer <key>` to match the LLMX CLI expectations.
For Azure, for example (ensure your deployment accepts `Authorization: Bearer <key>`):
```shell
printenv AZURE_OPENAI_API_KEY | env -u AZURE_OPENAI_API_KEY llmx-responses-api-proxy \
--http-shutdown \
--server-info /tmp/server-info.json \
--upstream-url "https://YOUR_PROJECT_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT/responses?api-version=2025-04-01-preview"
```
## Notes
- Only `POST /v1/responses` is permitted. No query strings are allowed.
- All request headers are forwarded to the upstream call (aside from overriding `Authorization` and `Host`). Response status and content-type are mirrored from upstream.
## Hardening Details
Care is taken to restrict access/copying to the value of `OPENAI_API_KEY` retained in memory:
- We leverage [`llmx_process_hardening`](https://github.com/valknar/llmx/blob/main/llmx-rs/process-hardening/README.md) so `llmx-responses-api-proxy` is run with standard process-hardening techniques.
- At startup, we allocate a `1024` byte buffer on the stack and copy `"Bearer "` into the start of the buffer.
- We then read from `stdin`, copying the contents into the buffer after `"Bearer "`.
- After verifying the key matches `/^[a-zA-Z0-9_-]+$/` (and does not exceed the buffer), we create a `String` from that buffer (so the data is now on the heap).
- We zero out the stack-allocated buffer using https://crates.io/crates/zeroize so it is not optimized away by the compiler.
- We invoke `.leak()` on the `String` so we can treat its contents as a `&'static str`, as it will live for the rest of the process.
- On UNIX, we `mlock(2)` the memory backing the `&'static str`.
- When using the `&'static str` when building an HTTP request, we use `HeaderValue::from_static()` to avoid copying the `&str`.
- We also invoke `.set_sensitive(true)` on the `HeaderValue`, which in theory indicates to other parts of the HTTP stack that the header should be treated with "special care" to avoid leakage:
https://github.com/hyperium/http/blob/439d1c50d71e3be3204b6c4a1bf2255ed78e1f93/src/header/value.rs#L346-L376