Files
llmx/llmx-rs/mcp-server/tests/suite/llmx_tool.rs
Sebastian Krüger 3c7efc58c8 feat: Complete LLMX v0.1.0 - Rebrand from Codex with LiteLLM Integration
This release represents a comprehensive transformation of the codebase from Codex to LLMX,
enhanced with LiteLLM integration to support 100+ LLM providers through a unified API.

## Major Changes

### Phase 1: Repository & Infrastructure Setup
- Established new repository structure and branching strategy
- Created comprehensive project documentation (CLAUDE.md, LITELLM-SETUP.md)
- Set up development environment and tooling configuration

### Phase 2: Rust Workspace Transformation
- Renamed all Rust crates from `codex-*` to `llmx-*` (30+ crates)
- Updated package names, binary names, and workspace members
- Renamed core modules: codex.rs → llmx.rs, codex_delegate.rs → llmx_delegate.rs
- Updated all internal references, imports, and type names
- Renamed directories: codex-rs/ → llmx-rs/, codex-backend-openapi-models/ → llmx-backend-openapi-models/
- Fixed all Rust compilation errors after mass rename

### Phase 3: LiteLLM Integration
- Integrated LiteLLM for multi-provider LLM support (Anthropic, OpenAI, Azure, Google AI, AWS Bedrock, etc.)
- Implemented OpenAI-compatible Chat Completions API support
- Added model family detection and provider-specific handling
- Updated authentication to support LiteLLM API keys
- Renamed environment variables: OPENAI_BASE_URL → LLMX_BASE_URL
- Added LLMX_API_KEY for unified authentication
- Enhanced error handling for Chat Completions API responses
- Implemented fallback mechanisms between Responses API and Chat Completions API

### Phase 4: TypeScript/Node.js Components
- Renamed npm package: @codex/codex-cli → @valknar/llmx
- Updated TypeScript SDK to use new LLMX APIs and endpoints
- Fixed all TypeScript compilation and linting errors
- Updated SDK tests to support both API backends
- Enhanced mock server to handle multiple API formats
- Updated build scripts for cross-platform packaging

### Phase 5: Configuration & Documentation
- Updated all configuration files to use LLMX naming
- Rewrote README and documentation for LLMX branding
- Updated config paths: ~/.codex/ → ~/.llmx/
- Added comprehensive LiteLLM setup guide
- Updated all user-facing strings and help text
- Created release plan and migration documentation

### Phase 6: Testing & Validation
- Fixed all Rust tests for new naming scheme
- Updated snapshot tests in TUI (36 frame files)
- Fixed authentication storage tests
- Updated Chat Completions payload and SSE tests
- Fixed SDK tests for new API endpoints
- Ensured compatibility with Claude Sonnet 4.5 model
- Fixed test environment variables (LLMX_API_KEY, LLMX_BASE_URL)

### Phase 7: Build & Release Pipeline
- Updated GitHub Actions workflows for LLMX binary names
- Fixed rust-release.yml to reference llmx-rs/ instead of codex-rs/
- Updated CI/CD pipelines for new package names
- Made Apple code signing optional in release workflow
- Enhanced npm packaging resilience for partial platform builds
- Added Windows sandbox support to workspace
- Updated dotslash configuration for new binary names

### Phase 8: Final Polish
- Renamed all assets (.github images, labels, templates)
- Updated VSCode and DevContainer configurations
- Fixed all clippy warnings and formatting issues
- Applied cargo fmt and prettier formatting across codebase
- Updated issue templates and pull request templates
- Fixed all remaining UI text references

## Technical Details

**Breaking Changes:**
- Binary name changed from `codex` to `llmx`
- Config directory changed from `~/.codex/` to `~/.llmx/`
- Environment variables renamed (CODEX_* → LLMX_*)
- npm package renamed to `@valknar/llmx`

**New Features:**
- Support for 100+ LLM providers via LiteLLM
- Unified authentication with LLMX_API_KEY
- Enhanced model provider detection and handling
- Improved error handling and fallback mechanisms

**Files Changed:**
- 578 files modified across Rust, TypeScript, and documentation
- 30+ Rust crates renamed and updated
- Complete rebrand of UI, CLI, and documentation
- All tests updated and passing

**Dependencies:**
- Updated Cargo.lock with new package names
- Updated npm dependencies in llmx-cli
- Enhanced OpenAPI models for LLMX backend

This release establishes LLMX as a standalone project with comprehensive LiteLLM
integration, maintaining full backward compatibility with existing functionality
while opening support for a wide ecosystem of LLM providers.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Sebastian Krüger <support@pivoine.art>
2025-11-12 20:40:44 +01:00

482 lines
16 KiB
Rust
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
use std::collections::HashMap;
use std::env;
use std::path::Path;
use std::path::PathBuf;
use llmx_core::parse_command;
use llmx_core::protocol::FileChange;
use llmx_core::protocol::ReviewDecision;
use llmx_core::spawn::LLMX_SANDBOX_NETWORK_DISABLED_ENV_VAR;
use llmx_mcp_server::ExecApprovalElicitRequestParams;
use llmx_mcp_server::ExecApprovalResponse;
use llmx_mcp_server::LlmxToolCallParam;
use llmx_mcp_server::PatchApprovalElicitRequestParams;
use llmx_mcp_server::PatchApprovalResponse;
use mcp_types::ElicitRequest;
use mcp_types::ElicitRequestParamsRequestedSchema;
use mcp_types::JSONRPC_VERSION;
use mcp_types::JSONRPCRequest;
use mcp_types::JSONRPCResponse;
use mcp_types::ModelContextProtocolRequest;
use mcp_types::RequestId;
use pretty_assertions::assert_eq;
use serde_json::json;
use tempfile::TempDir;
use tokio::time::timeout;
use wiremock::MockServer;
use core_test_support::skip_if_no_network;
use mcp_test_support::McpProcess;
use mcp_test_support::create_apply_patch_sse_response;
use mcp_test_support::create_final_assistant_message_sse_response;
use mcp_test_support::create_mock_chat_completions_server;
use mcp_test_support::create_shell_sse_response;
// Allow ample time on slower CI or under load to avoid flakes.
const DEFAULT_READ_TIMEOUT: std::time::Duration = std::time::Duration::from_secs(20);
/// Test that a shell command that is not on the "trusted" list triggers an
/// elicitation request to the MCP and that sending the approval runs the
/// command, as expected.
#[tokio::test(flavor = "multi_thread", worker_threads = 4)]
async fn test_shell_command_approval_triggers_elicitation() {
if env::var(LLMX_SANDBOX_NETWORK_DISABLED_ENV_VAR).is_ok() {
println!(
"Skipping test because it cannot execute when network is disabled in an LLMX sandbox."
);
return;
}
// Apparently `#[tokio::test]` must return `()`, so we create a helper
// function that returns `Result` so we can use `?` in favor of `unwrap`.
if let Err(err) = shell_command_approval_triggers_elicitation().await {
panic!("failure: {err}");
}
}
async fn shell_command_approval_triggers_elicitation() -> anyhow::Result<()> {
// Use a simple, untrusted command that creates a file so we can
// observe a side-effect.
//
// Crossplatform approach: run a tiny Python snippet to touch the file
// using `python3 -c ...` on all platforms.
let workdir_for_shell_function_call = TempDir::new()?;
let created_filename = "created_by_shell_tool.txt";
let created_file = workdir_for_shell_function_call
.path()
.join(created_filename);
let shell_command = vec![
"python3".to_string(),
"-c".to_string(),
format!("import pathlib; pathlib.Path('{created_filename}').touch()"),
];
let McpHandle {
process: mut mcp_process,
server: _server,
dir: _dir,
} = create_mcp_process(vec![
create_shell_sse_response(
shell_command.clone(),
Some(workdir_for_shell_function_call.path()),
Some(5_000),
"call1234",
)?,
create_final_assistant_message_sse_response("File created!")?,
])
.await?;
// Send a "llmx" tool request, which should hit the completions endpoint.
// In turn, it should reply with a tool call, which the MCP should forward
// as an elicitation.
let llmx_request_id = mcp_process
.send_llmx_tool_call(LlmxToolCallParam {
prompt: "run `git init`".to_string(),
..Default::default()
})
.await?;
let elicitation_request = timeout(
DEFAULT_READ_TIMEOUT,
mcp_process.read_stream_until_request_message(),
)
.await??;
let elicitation_request_id = elicitation_request.id.clone();
let params = serde_json::from_value::<ExecApprovalElicitRequestParams>(
elicitation_request
.params
.clone()
.ok_or_else(|| anyhow::anyhow!("elicitation_request.params must be set"))?,
)?;
let expected_elicitation_request = create_expected_elicitation_request(
elicitation_request_id.clone(),
shell_command.clone(),
workdir_for_shell_function_call.path(),
llmx_request_id.to_string(),
params.llmx_event_id.clone(),
)?;
assert_eq!(expected_elicitation_request, elicitation_request);
// Accept the `git init` request by responding to the elicitation.
mcp_process
.send_response(
elicitation_request_id,
serde_json::to_value(ExecApprovalResponse {
decision: ReviewDecision::Approved,
})?,
)
.await?;
// Verify task_complete notification arrives before the tool call completes.
#[expect(clippy::expect_used)]
let _task_complete = timeout(
DEFAULT_READ_TIMEOUT,
mcp_process.read_stream_until_legacy_task_complete_notification(),
)
.await
.expect("task_complete_notification timeout")
.expect("task_complete_notification resp");
// Verify the original `llmx` tool call completes and that the file was created.
let llmx_response = timeout(
DEFAULT_READ_TIMEOUT,
mcp_process.read_stream_until_response_message(RequestId::Integer(llmx_request_id)),
)
.await??;
assert_eq!(
JSONRPCResponse {
jsonrpc: JSONRPC_VERSION.into(),
id: RequestId::Integer(llmx_request_id),
result: json!({
"content": [
{
"text": "File created!",
"type": "text"
}
]
}),
},
llmx_response
);
assert!(created_file.is_file(), "created file should exist");
Ok(())
}
fn create_expected_elicitation_request(
elicitation_request_id: RequestId,
command: Vec<String>,
workdir: &Path,
llmx_mcp_tool_call_id: String,
llmx_event_id: String,
) -> anyhow::Result<JSONRPCRequest> {
let expected_message = format!(
"Allow LLMX to run `{}` in `{}`?",
shlex::try_join(command.iter().map(std::convert::AsRef::as_ref))?,
workdir.to_string_lossy()
);
let llmx_parsed_cmd = parse_command::parse_command(&command);
Ok(JSONRPCRequest {
jsonrpc: JSONRPC_VERSION.into(),
id: elicitation_request_id,
method: ElicitRequest::METHOD.to_string(),
params: Some(serde_json::to_value(&ExecApprovalElicitRequestParams {
message: expected_message,
requested_schema: ElicitRequestParamsRequestedSchema {
r#type: "object".to_string(),
properties: json!({}),
required: None,
},
llmx_elicitation: "exec-approval".to_string(),
llmx_mcp_tool_call_id,
llmx_event_id,
llmx_command: command,
llmx_cwd: workdir.to_path_buf(),
llmx_call_id: "call1234".to_string(),
llmx_parsed_cmd,
llmx_risk: None,
})?),
})
}
/// Test that patch approval triggers an elicitation request to the MCP and that
/// sending the approval applies the patch, as expected.
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn test_patch_approval_triggers_elicitation() {
if env::var(LLMX_SANDBOX_NETWORK_DISABLED_ENV_VAR).is_ok() {
println!(
"Skipping test because it cannot execute when network is disabled in an LLMX sandbox."
);
return;
}
if let Err(err) = patch_approval_triggers_elicitation().await {
panic!("failure: {err}");
}
}
async fn patch_approval_triggers_elicitation() -> anyhow::Result<()> {
let cwd = TempDir::new()?;
let test_file = cwd.path().join("destination_file.txt");
std::fs::write(&test_file, "original content\n")?;
let patch_content = format!(
"*** Begin Patch\n*** Update File: {}\n-original content\n+modified content\n*** End Patch",
test_file.as_path().to_string_lossy()
);
let McpHandle {
process: mut mcp_process,
server: _server,
dir: _dir,
} = create_mcp_process(vec![
create_apply_patch_sse_response(&patch_content, "call1234")?,
create_final_assistant_message_sse_response("Patch has been applied successfully!")?,
])
.await?;
// Send a "llmx" tool request that will trigger the apply_patch command
let llmx_request_id = mcp_process
.send_llmx_tool_call(LlmxToolCallParam {
cwd: Some(cwd.path().to_string_lossy().to_string()),
prompt: "please modify the test file".to_string(),
..Default::default()
})
.await?;
let elicitation_request = timeout(
DEFAULT_READ_TIMEOUT,
mcp_process.read_stream_until_request_message(),
)
.await??;
let elicitation_request_id = RequestId::Integer(0);
let mut expected_changes = HashMap::new();
expected_changes.insert(
test_file.as_path().to_path_buf(),
FileChange::Update {
unified_diff: "@@ -1 +1 @@\n-original content\n+modified content\n".to_string(),
move_path: None,
},
);
let expected_elicitation_request = create_expected_patch_approval_elicitation_request(
elicitation_request_id.clone(),
expected_changes,
None, // No grant_root expected
None, // No reason expected
llmx_request_id.to_string(),
"1".to_string(),
)?;
assert_eq!(expected_elicitation_request, elicitation_request);
// Accept the patch approval request by responding to the elicitation
mcp_process
.send_response(
elicitation_request_id,
serde_json::to_value(PatchApprovalResponse {
decision: ReviewDecision::Approved,
})?,
)
.await?;
// Verify the original `llmx` tool call completes
let llmx_response = timeout(
DEFAULT_READ_TIMEOUT,
mcp_process.read_stream_until_response_message(RequestId::Integer(llmx_request_id)),
)
.await??;
assert_eq!(
JSONRPCResponse {
jsonrpc: JSONRPC_VERSION.into(),
id: RequestId::Integer(llmx_request_id),
result: json!({
"content": [
{
"text": "Patch has been applied successfully!",
"type": "text"
}
]
}),
},
llmx_response
);
let file_contents = std::fs::read_to_string(test_file.as_path())?;
assert_eq!(file_contents, "modified content\n");
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn test_llmx_tool_passes_base_instructions() {
skip_if_no_network!();
// Apparently `#[tokio::test]` must return `()`, so we create a helper
// function that returns `Result` so we can use `?` in favor of `unwrap`.
if let Err(err) = llmx_tool_passes_base_instructions().await {
panic!("failure: {err}");
}
}
async fn llmx_tool_passes_base_instructions() -> anyhow::Result<()> {
#![expect(clippy::unwrap_used)]
let server =
create_mock_chat_completions_server(vec![create_final_assistant_message_sse_response(
"Enjoy!",
)?])
.await;
// Run `llmx mcp` with a specific config.toml.
let llmx_home = TempDir::new()?;
create_config_toml(llmx_home.path(), &server.uri())?;
let mut mcp_process = McpProcess::new(llmx_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp_process.initialize()).await??;
// Send a "llmx" tool request, which should hit the completions endpoint.
let llmx_request_id = mcp_process
.send_llmx_tool_call(LlmxToolCallParam {
prompt: "How are you?".to_string(),
base_instructions: Some("You are a helpful assistant.".to_string()),
developer_instructions: Some("Foreshadow upcoming tool calls.".to_string()),
..Default::default()
})
.await?;
let llmx_response = timeout(
DEFAULT_READ_TIMEOUT,
mcp_process.read_stream_until_response_message(RequestId::Integer(llmx_request_id)),
)
.await??;
assert_eq!(
JSONRPCResponse {
jsonrpc: JSONRPC_VERSION.into(),
id: RequestId::Integer(llmx_request_id),
result: json!({
"content": [
{
"text": "Enjoy!",
"type": "text"
}
]
}),
},
llmx_response
);
let requests = server.received_requests().await.unwrap();
let request = requests[0].body_json::<serde_json::Value>()?;
let instructions = request["messages"][0]["content"].as_str().unwrap();
assert!(instructions.starts_with("You are a helpful assistant."));
let developer_msg = request["messages"]
.as_array()
.and_then(|messages| {
messages
.iter()
.find(|msg| msg.get("role").and_then(|role| role.as_str()) == Some("developer"))
})
.unwrap();
let developer_content = developer_msg
.get("content")
.and_then(|value| value.as_str())
.unwrap();
assert!(
!developer_content.contains('<'),
"expected developer instructions without XML tags, got `{developer_content}`"
);
assert_eq!(developer_content, "Foreshadow upcoming tool calls.");
Ok(())
}
fn create_expected_patch_approval_elicitation_request(
elicitation_request_id: RequestId,
changes: HashMap<PathBuf, FileChange>,
grant_root: Option<PathBuf>,
reason: Option<String>,
llmx_mcp_tool_call_id: String,
llmx_event_id: String,
) -> anyhow::Result<JSONRPCRequest> {
let mut message_lines = Vec::new();
if let Some(r) = &reason {
message_lines.push(r.clone());
}
message_lines.push("Allow LLMX to apply proposed code changes?".to_string());
Ok(JSONRPCRequest {
jsonrpc: JSONRPC_VERSION.into(),
id: elicitation_request_id,
method: ElicitRequest::METHOD.to_string(),
params: Some(serde_json::to_value(&PatchApprovalElicitRequestParams {
message: message_lines.join("\n"),
requested_schema: ElicitRequestParamsRequestedSchema {
r#type: "object".to_string(),
properties: json!({}),
required: None,
},
llmx_elicitation: "patch-approval".to_string(),
llmx_mcp_tool_call_id,
llmx_event_id,
llmx_reason: reason,
llmx_grant_root: grant_root,
llmx_changes: changes,
llmx_call_id: "call1234".to_string(),
})?),
})
}
/// This handle is used to ensure that the MockServer and TempDir are not dropped while
/// the McpProcess is still running.
pub struct McpHandle {
pub process: McpProcess,
/// Retain the server for the lifetime of the McpProcess.
#[allow(dead_code)]
server: MockServer,
/// Retain the temporary directory for the lifetime of the McpProcess.
#[allow(dead_code)]
dir: TempDir,
}
async fn create_mcp_process(responses: Vec<String>) -> anyhow::Result<McpHandle> {
let server = create_mock_chat_completions_server(responses).await;
let llmx_home = TempDir::new()?;
create_config_toml(llmx_home.path(), &server.uri())?;
let mut mcp_process = McpProcess::new(llmx_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp_process.initialize()).await??;
Ok(McpHandle {
process: mcp_process,
server,
dir: llmx_home,
})
}
/// Create a Llmx config that uses the mock server as the model provider.
/// It also uses `approval_policy = "untrusted"` so that we exercise the
/// elicitation code path for shell commands.
fn create_config_toml(llmx_home: &Path, server_uri: &str) -> std::io::Result<()> {
let config_toml = llmx_home.join("config.toml");
std::fs::write(
config_toml,
format!(
r#"
model = "mock-model"
approval_policy = "untrusted"
sandbox_policy = "workspace-write"
model_provider = "mock_provider"
[model_providers.mock_provider]
name = "Mock provider for test"
base_url = "{server_uri}/v1"
wire_api = "chat"
request_max_retries = 0
stream_max_retries = 0
"#
),
)
}