feat: Complete LLMX v0.1.0 - Rebrand from Codex with LiteLLM Integration

This release represents a comprehensive transformation of the codebase from Codex to LLMX,
enhanced with LiteLLM integration to support 100+ LLM providers through a unified API.

## Major Changes

### Phase 1: Repository & Infrastructure Setup
- Established new repository structure and branching strategy
- Created comprehensive project documentation (CLAUDE.md, LITELLM-SETUP.md)
- Set up development environment and tooling configuration

### Phase 2: Rust Workspace Transformation
- Renamed all Rust crates from `codex-*` to `llmx-*` (30+ crates)
- Updated package names, binary names, and workspace members
- Renamed core modules: codex.rs → llmx.rs, codex_delegate.rs → llmx_delegate.rs
- Updated all internal references, imports, and type names
- Renamed directories: codex-rs/ → llmx-rs/, codex-backend-openapi-models/ → llmx-backend-openapi-models/
- Fixed all Rust compilation errors after mass rename

### Phase 3: LiteLLM Integration
- Integrated LiteLLM for multi-provider LLM support (Anthropic, OpenAI, Azure, Google AI, AWS Bedrock, etc.)
- Implemented OpenAI-compatible Chat Completions API support
- Added model family detection and provider-specific handling
- Updated authentication to support LiteLLM API keys
- Renamed environment variables: OPENAI_BASE_URL → LLMX_BASE_URL
- Added LLMX_API_KEY for unified authentication
- Enhanced error handling for Chat Completions API responses
- Implemented fallback mechanisms between Responses API and Chat Completions API
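
The `LLMX_BASE_URL` and `LLMX_API_KEY` variables named above could be resolved roughly as follows. This is a minimal sketch, not the code in this commit: the `Endpoint` struct, the `resolve_endpoint` helper, and the `DEFAULT_BASE_URL` value (a LiteLLM proxy on a local port) are assumptions made for illustration.

```rust
use std::env;

/// Connection settings resolved from the environment (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct Endpoint {
    pub base_url: String,
    pub api_key: Option<String>,
}

/// Assumed default for illustration only: a LiteLLM proxy running locally.
const DEFAULT_BASE_URL: &str = "http://localhost:4000";

/// Resolve settings through `lookup` so the logic can be exercised without
/// touching the real process environment.
pub fn resolve_endpoint<F>(lookup: F) -> Endpoint
where
    F: Fn(&str) -> Option<String>,
{
    // Treat unset and blank variables the same way.
    let non_empty = |name: &str| lookup(name).filter(|v| !v.trim().is_empty());

    let base_url = non_empty("LLMX_BASE_URL")
        // Drop trailing slashes so callers can append paths uniformly.
        .map(|v| v.trim_end_matches('/').to_string())
        .unwrap_or_else(|| DEFAULT_BASE_URL.to_string());

    Endpoint {
        base_url,
        api_key: non_empty("LLMX_API_KEY"),
    }
}

/// Convenience wrapper that reads the real environment.
pub fn resolve_endpoint_from_env() -> Endpoint {
    resolve_endpoint(|name| env::var(name).ok())
}
```

Passing the lookup as a closure keeps the resolution logic pure, so tests can supply their own values instead of mutating process-wide state.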

### Phase 4: TypeScript/Node.js Components
- Renamed npm package: @codex/codex-cli → @valknar/llmx
- Updated TypeScript SDK to use new LLMX APIs and endpoints
- Fixed all TypeScript compilation and linting errors
- Updated SDK tests to support both API backends
- Enhanced mock server to handle multiple API formats
- Updated build scripts for cross-platform packaging

### Phase 5: Configuration & Documentation
- Updated all configuration files to use LLMX naming
- Rewrote README and documentation for LLMX branding
- Updated config paths: ~/.codex/ → ~/.llmx/
- Added comprehensive LiteLLM setup guide
- Updated all user-facing strings and help text
- Created release plan and migration documentation

### Phase 6: Testing & Validation
- Fixed all Rust tests for new naming scheme
- Updated snapshot tests in TUI (36 frame files)
- Fixed authentication storage tests
- Updated Chat Completions payload and SSE tests
- Fixed SDK tests for new API endpoints
- Ensured compatibility with Claude Sonnet 4.5 model
- Fixed test environment variables (LLMX_API_KEY, LLMX_BASE_URL)

### Phase 7: Build & Release Pipeline
- Updated GitHub Actions workflows for LLMX binary names
- Fixed rust-release.yml to reference llmx-rs/ instead of codex-rs/
- Updated CI/CD pipelines for new package names
- Made Apple code signing optional in release workflow
- Enhanced npm packaging resilience for partial platform builds
- Added Windows sandbox support to workspace
- Updated dotslash configuration for new binary names

### Phase 8: Final Polish
- Renamed all assets (.github images, labels, templates)
- Updated VSCode and DevContainer configurations
- Fixed all clippy warnings and formatting issues
- Applied cargo fmt and prettier formatting across codebase
- Updated issue templates and pull request templates
- Fixed all remaining UI text references

## Technical Details

**Breaking Changes:**
- Binary name changed from `codex` to `llmx`
- Config directory changed from `~/.codex/` to `~/.llmx/`
- Environment variables renamed (CODEX_* → LLMX_*)
- npm package renamed to `@valknar/llmx`
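
For users upgrading across these breaking changes, a migration helper might look like the sketch below. It only plans the move from `~/.codex/` to `~/.llmx/` and maps `CODEX_*` variable names to `LLMX_*`. The names `ConfigDirPlan`, `plan_config_dir`, and `renamed_env_var` are hypothetical, and nothing here is confirmed to exist in the LLMX codebase.

```rust
use std::path::{Path, PathBuf};

/// What to do about the configuration directory (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum ConfigDirPlan {
    /// Use the new directory as-is (it exists, or there is nothing to migrate).
    UseNew(PathBuf),
    /// Only the legacy directory exists; it could be moved to the new path.
    MigrateFrom { old: PathBuf, new: PathBuf },
}

/// Decide between `~/.llmx/` and a migration from `~/.codex/`.
/// `exists` is injected so the decision can be tested without a filesystem.
pub fn plan_config_dir<F>(home: &Path, exists: F) -> ConfigDirPlan
where
    F: Fn(&Path) -> bool,
{
    let new = home.join(".llmx");
    let old = home.join(".codex");
    if !exists(&new) && exists(&old) {
        ConfigDirPlan::MigrateFrom { old, new }
    } else {
        ConfigDirPlan::UseNew(new)
    }
}

/// Map a legacy `CODEX_*` variable name to its `LLMX_*` counterpart.
/// Returns `None` for names that do not carry the legacy prefix.
pub fn renamed_env_var(name: &str) -> Option<String> {
    name.strip_prefix("CODEX_")
        .map(|rest| format!("LLMX_{rest}"))
}
```

In real use, `exists` would typically be `|p| p.exists()`, and the caller would decide whether to rename the directory or merely warn.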

**New Features:**
- Support for 100+ LLM providers via LiteLLM
- Unified authentication with LLMX_API_KEY
- Enhanced model provider detection and handling
- Improved error handling and fallback mechanisms

**Files Changed:**
- 578 files modified across Rust, TypeScript, and documentation
- 30+ Rust crates renamed and updated
- Complete rebrand of UI, CLI, and documentation
- All tests updated and passing

**Dependencies:**
- Updated Cargo.lock with new package names
- Updated npm dependencies in llmx-cli
- Enhanced OpenAPI models for LLMX backend

This release establishes LLMX as a standalone project with comprehensive LiteLLM
integration. Existing functionality is preserved, apart from the breaking changes
listed above, while support opens up for a wide ecosystem of LLM providers.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Sebastian Krüger <support@pivoine.art>
Sebastian Krüger
2025-11-12 20:40:44 +01:00
parent 052b052832
commit 3c7efc58c8
1248 changed files with 10085 additions and 9580 deletions


@@ -0,0 +1,237 @@
use std::fs::File;
use std::fs::{self};
use std::io::Write;
use std::net::SocketAddr;
use std::net::TcpListener;
use std::path::Path;
use std::path::PathBuf;
use std::sync::Arc;
use std::time::Duration;
use anyhow::Context;
use anyhow::Result;
use anyhow::anyhow;
use clap::Parser;
use reqwest::Url;
use reqwest::blocking::Client;
use reqwest::header::AUTHORIZATION;
use reqwest::header::HOST;
use reqwest::header::HeaderMap;
use reqwest::header::HeaderName;
use reqwest::header::HeaderValue;
use serde::Serialize;
use tiny_http::Header;
use tiny_http::Method;
use tiny_http::Request;
use tiny_http::Response;
use tiny_http::Server;
use tiny_http::StatusCode;
mod read_api_key;
use read_api_key::read_auth_header_from_stdin;
/// CLI arguments for the proxy.
#[derive(Debug, Clone, Parser)]
#[command(name = "responses-api-proxy", about = "Minimal OpenAI responses proxy")]
pub struct Args {
/// Port to listen on. If not set, an ephemeral port is used.
#[arg(long)]
pub port: Option<u16>,
/// Path to a JSON file to write startup info (single line). Includes {"port": <u16>, "pid": <u32>}.
#[arg(long, value_name = "FILE")]
pub server_info: Option<PathBuf>,
/// Enable HTTP shutdown endpoint at GET /shutdown
#[arg(long)]
pub http_shutdown: bool,
/// Absolute URL the proxy should forward requests to (defaults to OpenAI).
#[arg(long, default_value = "https://api.openai.com/v1/responses")]
pub upstream_url: String,
}
#[derive(Serialize)]
struct ServerInfo {
port: u16,
pid: u32,
}
struct ForwardConfig {
upstream_url: Url,
host_header: HeaderValue,
}
/// Entry point for the library main, for parity with other crates.
pub fn run_main(args: Args) -> Result<()> {
let auth_header = read_auth_header_from_stdin()?;
let upstream_url = Url::parse(&args.upstream_url).context("parsing --upstream-url")?;
let host = match (upstream_url.host_str(), upstream_url.port()) {
(Some(host), Some(port)) => format!("{host}:{port}"),
(Some(host), None) => host.to_string(),
_ => return Err(anyhow!("upstream URL must include a host")),
};
let host_header =
HeaderValue::from_str(&host).context("constructing Host header from upstream URL")?;
let forward_config = Arc::new(ForwardConfig {
upstream_url,
host_header,
});
let (listener, bound_addr) = bind_listener(args.port)?;
if let Some(path) = args.server_info.as_ref() {
write_server_info(path, bound_addr.port())?;
}
let server = Server::from_listener(listener, None)
.map_err(|err| anyhow!("creating HTTP server: {err}"))?;
let client = Arc::new(
Client::builder()
// Disable reqwest's 30s default so long-lived response streams keep flowing.
.timeout(None::<Duration>)
.build()
.context("building reqwest client")?,
);
eprintln!("responses-api-proxy listening on {bound_addr}");
let http_shutdown = args.http_shutdown;
for request in server.incoming_requests() {
let client = client.clone();
let forward_config = forward_config.clone();
std::thread::spawn(move || {
if http_shutdown && request.method() == &Method::Get && request.url() == "/shutdown" {
let _ = request.respond(Response::new_empty(StatusCode(200)));
std::process::exit(0);
}
if let Err(e) = forward_request(&client, auth_header, &forward_config, request) {
eprintln!("forwarding error: {e}");
}
});
}
Err(anyhow!("server stopped unexpectedly"))
}
fn bind_listener(port: Option<u16>) -> Result<(TcpListener, SocketAddr)> {
let addr = SocketAddr::from(([127, 0, 0, 1], port.unwrap_or(0)));
let listener = TcpListener::bind(addr).with_context(|| format!("failed to bind {addr}"))?;
let bound = listener.local_addr().context("failed to read local_addr")?;
Ok((listener, bound))
}
fn write_server_info(path: &Path, port: u16) -> Result<()> {
if let Some(parent) = path.parent()
&& !parent.as_os_str().is_empty()
{
fs::create_dir_all(parent)?;
}
let info = ServerInfo {
port,
pid: std::process::id(),
};
let mut data = serde_json::to_string(&info)?;
data.push('\n');
let mut f = File::create(path)?;
f.write_all(data.as_bytes())?;
Ok(())
}
fn forward_request(
client: &Client,
auth_header: &'static str,
config: &ForwardConfig,
mut req: Request,
) -> Result<()> {
// Only allow POST /v1/responses exactly, no query string.
let method = req.method().clone();
let url_path = req.url().to_string();
let allow = method == Method::Post && url_path == "/v1/responses";
if !allow {
let resp = Response::new_empty(StatusCode(403));
let _ = req.respond(resp);
return Ok(());
}
// Read request body
let mut body = Vec::new();
let mut reader = req.as_reader();
std::io::Read::read_to_end(&mut reader, &mut body)?;
// Build headers for upstream, forwarding everything from the incoming
// request except Authorization and Host (we replace both below).
let mut headers = HeaderMap::new();
for header in req.headers() {
let name_ascii = header.field.as_str();
let lower = name_ascii.to_ascii_lowercase();
if lower.as_str() == "authorization" || lower.as_str() == "host" {
continue;
}
let header_name = match HeaderName::from_bytes(lower.as_bytes()) {
Ok(name) => name,
Err(_) => continue,
};
if let Ok(value) = HeaderValue::from_bytes(header.value.as_bytes()) {
headers.append(header_name, value);
}
}
// As part of our effort to keep `auth_header` secret, we use a
// combination of `from_static()` and `set_sensitive(true)`.
let mut auth_header_value = HeaderValue::from_static(auth_header);
auth_header_value.set_sensitive(true);
headers.insert(AUTHORIZATION, auth_header_value);
headers.insert(HOST, config.host_header.clone());
let upstream_resp = client
.post(config.upstream_url.clone())
.headers(headers)
.body(body)
.send()
.context("forwarding request to upstream")?;
// We have to create an adapter between a `reqwest::blocking::Response`
// and a `tiny_http::Response`. Fortunately, `reqwest::blocking::Response`
// implements `Read`, so we can use it directly as the body of the
// `tiny_http::Response`.
let status = upstream_resp.status();
let mut response_headers = Vec::new();
for (name, value) in upstream_resp.headers().iter() {
// Skip headers that tiny_http manages itself.
if matches!(
name.as_str(),
"content-length" | "transfer-encoding" | "connection" | "trailer" | "upgrade"
) {
continue;
}
if let Ok(header) = Header::from_bytes(name.as_str().as_bytes(), value.as_bytes()) {
response_headers.push(header);
}
}
let content_length = upstream_resp.content_length().and_then(|len| {
if len <= usize::MAX as u64 {
Some(len as usize)
} else {
None
}
});
let response = Response::new(
StatusCode(status.as_u16()),
response_headers,
upstream_resp,
content_length,
None,
);
let _ = req.respond(response);
Ok(())
}
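
The request gating in `forward_request` above can be exercised in isolation. The sketch below restates the two checks as pure functions over strings. It is a simplification for illustration: the real code compares `tiny_http::Method` values, and the names `is_allowed` and `should_forward_request_header` are hypothetical.

```rust
/// Returns true only for `POST` requests whose request target is exactly
/// `/v1/responses`. A query string or trailing slash makes the target
/// differ, so such requests are rejected.
pub fn is_allowed(method: &str, request_target: &str) -> bool {
    method == "POST" && request_target == "/v1/responses"
}

/// Returns false for the headers the proxy sets itself (`Authorization` and
/// `Host`), compared case-insensitively, and true for everything else.
pub fn should_forward_request_header(name: &str) -> bool {
    let lower = name.to_ascii_lowercase();
    lower != "authorization" && lower != "host"
}
```

Matching on the exact target keeps the proxy from becoming a general-purpose gateway to other upstream endpoints with the caller's credentials attached.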


@@ -0,0 +1,12 @@
use clap::Parser;
use llmx_responses_api_proxy::Args as ResponsesApiProxyArgs;
#[ctor::ctor]
fn pre_main() {
llmx_process_hardening::pre_main_hardening();
}
pub fn main() -> anyhow::Result<()> {
let args = ResponsesApiProxyArgs::parse();
llmx_responses_api_proxy::run_main(args)
}


@@ -0,0 +1,342 @@
use anyhow::Context;
use anyhow::Result;
use anyhow::anyhow;
use zeroize::Zeroize;
/// Use a generous buffer size to avoid truncation and to allow for longer API
/// keys in the future.
const BUFFER_SIZE: usize = 1024;
const AUTH_HEADER_PREFIX: &[u8] = b"Bearer ";
/// Reads the auth token from stdin and returns a static `Authorization` header
/// value of the form `Bearer <token>`. The header value is returned as a
/// `&'static str` whose bytes are locked in memory (via mlock(2) on UNIX) to
/// avoid accidental exposure.
#[cfg(unix)]
pub(crate) fn read_auth_header_from_stdin() -> Result<&'static str> {
read_auth_header_with(read_from_unix_stdin)
}
#[cfg(windows)]
pub(crate) fn read_auth_header_from_stdin() -> Result<&'static str> {
use std::io::Read;
// Use of `std::io::stdin()` has the problem mentioned in the docstring on
// `read_from_unix_stdin()`, so this should ultimately be replaced with the
// low-level Windows equivalent. Because we do not have an equivalent of
// mlock() on Windows right now, it is not pressing until we address that
// issue.
read_auth_header_with(|buffer| std::io::stdin().read(buffer))
}
/// We perform a low-level read with `read(2)` because `std::io::stdin()` has
/// an internal BufReader:
///
/// https://github.com/rust-lang/rust/blob/bcbbdcb8522fd3cb4a8dde62313b251ab107694d/library/std/src/io/stdio.rs#L250-L252
///
/// that can end up retaining a copy of stdin data in memory with no way to zero
/// it out, whereas we aim to guarantee there is exactly one copy of the API key
/// in memory, protected by mlock(2).
#[cfg(unix)]
fn read_from_unix_stdin(buffer: &mut [u8]) -> std::io::Result<usize> {
use libc::c_void;
use libc::read;
// Perform one successful read(2) into the provided buffer slice, retrying
// only when the call is interrupted (EINTR). Accumulating reads and
// newline/EOF handling are managed by the caller.
loop {
let result = unsafe {
read(
libc::STDIN_FILENO,
buffer.as_mut_ptr().cast::<c_void>(),
buffer.len(),
)
};
if result == 0 {
return Ok(0);
}
if result < 0 {
let err = std::io::Error::last_os_error();
if err.kind() == std::io::ErrorKind::Interrupted {
continue;
}
return Err(err);
}
return Ok(result as usize);
}
}
fn read_auth_header_with<F>(mut read_fn: F) -> Result<&'static str>
where
F: FnMut(&mut [u8]) -> std::io::Result<usize>,
{
// TAKE CARE WHEN MODIFYING THIS CODE!!!
//
// This function goes to great lengths to avoid leaving the API key in
// memory longer than necessary and to avoid copying it around. We read
// directly into a stack buffer so the only heap allocation should be the
// one to create the String (with the exact size) for the header value,
// which we then immediately protect with mlock(2).
let mut buf = [0u8; BUFFER_SIZE];
buf[..AUTH_HEADER_PREFIX.len()].copy_from_slice(AUTH_HEADER_PREFIX);
let prefix_len = AUTH_HEADER_PREFIX.len();
let capacity = buf.len() - prefix_len;
let mut total_read = 0usize; // number of bytes read into the token region
let mut saw_newline = false;
let mut saw_eof = false;
while total_read < capacity {
let slice = &mut buf[prefix_len + total_read..];
let read = match read_fn(slice) {
Ok(n) => n,
Err(err) => {
buf.zeroize();
return Err(err.into());
}
};
if read == 0 {
saw_eof = true;
break;
}
// Search only the newly written region for a newline.
let newly_written = &slice[..read];
if let Some(pos) = newly_written.iter().position(|&b| b == b'\n') {
total_read += pos + 1; // include the newline for trimming below
saw_newline = true;
break;
}
total_read += read;
// Continue loop; if buffer fills without newline/EOF we'll error below.
}
// If buffer filled and we did not see newline or EOF, error out.
if total_read == capacity && !saw_newline && !saw_eof {
buf.zeroize();
return Err(anyhow!(
"API key is too large to fit in the {BUFFER_SIZE}-byte buffer"
));
}
let mut total = prefix_len + total_read;
while total > prefix_len && (buf[total - 1] == b'\n' || buf[total - 1] == b'\r') {
total -= 1;
}
if total == AUTH_HEADER_PREFIX.len() {
buf.zeroize();
return Err(anyhow!(
"API key must be provided via stdin (e.g. printenv OPENAI_API_KEY | llmx responses-api-proxy)"
));
}
if let Err(err) = validate_auth_header_bytes(&buf[AUTH_HEADER_PREFIX.len()..total]) {
buf.zeroize();
return Err(err);
}
let header_str = match std::str::from_utf8(&buf[..total]) {
Ok(value) => value,
Err(err) => {
// In theory, validate_auth_header_bytes() should have caught
// any invalid UTF-8 sequences, but just in case...
buf.zeroize();
return Err(err).context("reading Authorization header from stdin as UTF-8");
}
};
let header_value = String::from(header_str);
buf.zeroize();
let leaked: &'static mut str = header_value.leak();
mlock_str(leaked);
Ok(leaked)
}
#[cfg(unix)]
fn mlock_str(value: &str) {
use libc::_SC_PAGESIZE;
use libc::c_void;
use libc::mlock;
use libc::sysconf;
if value.is_empty() {
return;
}
let page_size = unsafe { sysconf(_SC_PAGESIZE) };
if page_size <= 0 {
return;
}
// `page_size` is strictly positive here, so the cast cannot yield zero.
let page_size = page_size as usize;
let addr = value.as_ptr() as usize;
let len = value.len();
let start = addr & !(page_size - 1);
let addr_end = match addr.checked_add(len) {
Some(v) => match v.checked_add(page_size - 1) {
Some(total) => total,
None => return,
},
None => return,
};
let end = addr_end & !(page_size - 1);
let size = end.saturating_sub(start);
if size == 0 {
return;
}
let _ = unsafe { mlock(start as *const c_void, size) };
}
#[cfg(not(unix))]
fn mlock_str(_value: &str) {}
/// The key should match /^[A-Za-z0-9\-_]+$/. Ensure there is no funny business
/// with NUL characters and whatnot.
fn validate_auth_header_bytes(key_bytes: &[u8]) -> Result<()> {
if key_bytes
.iter()
.all(|byte| byte.is_ascii_alphanumeric() || matches!(byte, b'-' | b'_'))
{
return Ok(());
}
Err(anyhow!(
"API key may only contain ASCII letters, numbers, '-' or '_'"
))
}
#[cfg(test)]
mod tests {
use super::*;
use std::collections::VecDeque;
use std::io;
#[test]
fn reads_key_with_no_newlines() {
let mut sent = false;
let result = read_auth_header_with(|buf| {
if sent {
return Ok(0);
}
let data = b"sk-abc123";
buf[..data.len()].copy_from_slice(data);
sent = true;
Ok(data.len())
})
.unwrap();
assert_eq!(result, "Bearer sk-abc123");
}
#[test]
fn reads_key_with_short_reads() {
let mut chunks: VecDeque<&[u8]> =
VecDeque::from(vec![b"sk-".as_ref(), b"abc".as_ref(), b"123\n".as_ref()]);
let result = read_auth_header_with(|buf| match chunks.pop_front() {
Some(chunk) if !chunk.is_empty() => {
buf[..chunk.len()].copy_from_slice(chunk);
Ok(chunk.len())
}
_ => Ok(0),
})
.unwrap();
assert_eq!(result, "Bearer sk-abc123");
}
#[test]
fn reads_key_and_trims_newlines() {
let mut sent = false;
let result = read_auth_header_with(|buf| {
if sent {
return Ok(0);
}
let data = b"sk-abc123\r\n";
buf[..data.len()].copy_from_slice(data);
sent = true;
Ok(data.len())
})
.unwrap();
assert_eq!(result, "Bearer sk-abc123");
}
#[test]
fn errors_when_no_input_provided() {
let err = read_auth_header_with(|_| Ok(0)).unwrap_err();
let message = format!("{err:#}");
assert!(message.contains("must be provided"));
}
#[test]
fn errors_when_buffer_filled() {
let err = read_auth_header_with(|buf| {
let data = vec![b'a'; BUFFER_SIZE - AUTH_HEADER_PREFIX.len()];
buf[..data.len()].copy_from_slice(&data);
Ok(data.len())
})
.unwrap_err();
let message = format!("{err:#}");
let expected_error =
format!("API key is too large to fit in the {BUFFER_SIZE}-byte buffer");
assert!(message.contains(&expected_error));
}
#[test]
fn propagates_io_error() {
let err = read_auth_header_with(|_| Err(io::Error::other("boom"))).unwrap_err();
let io_error = err.downcast_ref::<io::Error>().unwrap();
assert_eq!(io_error.kind(), io::ErrorKind::Other);
assert_eq!(io_error.to_string(), "boom");
}
#[test]
fn errors_on_invalid_utf8() {
let mut sent = false;
let err = read_auth_header_with(|buf| {
if sent {
return Ok(0);
}
let data = b"sk-abc\xff";
buf[..data.len()].copy_from_slice(data);
sent = true;
Ok(data.len())
})
.unwrap_err();
let message = format!("{err:#}");
assert!(message.contains("API key may only contain ASCII letters, numbers, '-' or '_'"));
}
#[test]
fn errors_on_invalid_characters() {
let mut sent = false;
let err = read_auth_header_with(|buf| {
if sent {
return Ok(0);
}
let data = b"sk-abc!23";
buf[..data.len()].copy_from_slice(data);
sent = true;
Ok(data.len())
})
.unwrap_err();
let message = format!("{err:#}");
assert!(message.contains("API key may only contain ASCII letters, numbers, '-' or '_'"));
}
}
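
The page-rounding arithmetic in `mlock_str` above is easy to get wrong, so it helps to see it as a pure function. The sketch below uses a hypothetical helper `page_span` to compute the page-aligned start address and byte count that would be passed to `mlock(2)`. It assumes a power-of-two page size and calls no system APIs.

```rust
/// Compute the page-aligned `(start, size)` covering `len` bytes at `addr`.
/// Returns `None` for empty ranges, invalid page sizes, or address overflow.
pub fn page_span(addr: usize, len: usize, page_size: usize) -> Option<(usize, usize)> {
    if len == 0 || page_size == 0 || !page_size.is_power_of_two() {
        return None;
    }
    // Clearing the low bits rounds an address down to a page boundary.
    let mask = !(page_size - 1);
    let start = addr & mask;
    // Round the end of the range up to the next page boundary.
    let end = addr.checked_add(len)?.checked_add(page_size - 1)? & mask;
    let size = end.checked_sub(start)?;
    (size > 0).then_some((start, size))
}
```

With 4096-byte pages, a 10-byte value at address 4090 straddles a boundary, so the span starts at 0 and covers 8192 bytes (two pages).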