Files
llmx/codex-rs/mcp-server/src/codex_tool_runner.rs

201 lines
8.2 KiB
Rust
Raw Normal View History

feat: make Codex available as a tool when running it as an MCP server (#811) This PR replaces the placeholder `"echo"` tool call in the MCP server with a `"codex"` tool that calls Codex. Events such as `ExecApprovalRequest` and `ApplyPatchApprovalRequest` are not handled properly yet, but I have `approval_policy = "never"` set in my `~/.codex/config.toml` such that those codepaths are not exercised. The schema for this MPC tool is defined by a new `CodexToolCallParam` struct introduced in this PR. It is fairly similar to `ConfigOverrides`, as the param is used to help create the `Config` used to start the Codex session, though it also includes the `prompt` used to kick off the session. This PR also introduces the use of the third-party `schemars` crate to generate the JSON schema, which is verified in the `verify_codex_tool_json_schema()` unit test. Events that are dispatched during the Codex session are sent back to the MCP client as MCP notifications. This gives the client a way to monitor progress as the tool call itself may take minutes to complete depending on the complexity of the task requested by the user. In the video below, I launched the server via: ```shell mcp-server$ RUST_LOG=debug npx @modelcontextprotocol/inspector cargo run -- ``` In the video, you can see the flow of: * requesting the list of tools * choosing the **codex** tool * entering a value for **prompt** and then making the tool call Note that I left the other fields blank because when unspecified, the values in my `~/.codex/config.toml` were used: https://github.com/user-attachments/assets/1975058c-b004-43ef-8c8d-800a953b8192 Note that while using the inspector, I did run into https://github.com/modelcontextprotocol/inspector/issues/293, though the tip about ensuring I had only one instance of the **MCP Inspector** tab open in my browser seemed to fix things.
2025-05-05 07:16:19 -07:00
//! Asynchronous worker that executes a **Codex** tool-call inside a spawned
//! Tokio task. Separated from `message_processor.rs` to keep that file small
//! and to make future feature-growth easier to manage.
use codex_core::codex_wrapper::init_codex;
use codex_core::config::Config as CodexConfig;
use codex_core::protocol::AgentMessageEvent;
feat: make Codex available as a tool when running it as an MCP server (#811) This PR replaces the placeholder `"echo"` tool call in the MCP server with a `"codex"` tool that calls Codex. Events such as `ExecApprovalRequest` and `ApplyPatchApprovalRequest` are not handled properly yet, but I have `approval_policy = "never"` set in my `~/.codex/config.toml` such that those codepaths are not exercised. The schema for this MPC tool is defined by a new `CodexToolCallParam` struct introduced in this PR. It is fairly similar to `ConfigOverrides`, as the param is used to help create the `Config` used to start the Codex session, though it also includes the `prompt` used to kick off the session. This PR also introduces the use of the third-party `schemars` crate to generate the JSON schema, which is verified in the `verify_codex_tool_json_schema()` unit test. Events that are dispatched during the Codex session are sent back to the MCP client as MCP notifications. This gives the client a way to monitor progress as the tool call itself may take minutes to complete depending on the complexity of the task requested by the user. In the video below, I launched the server via: ```shell mcp-server$ RUST_LOG=debug npx @modelcontextprotocol/inspector cargo run -- ``` In the video, you can see the flow of: * requesting the list of tools * choosing the **codex** tool * entering a value for **prompt** and then making the tool call Note that I left the other fields blank because when unspecified, the values in my `~/.codex/config.toml` were used: https://github.com/user-attachments/assets/1975058c-b004-43ef-8c8d-800a953b8192 Note that while using the inspector, I did run into https://github.com/modelcontextprotocol/inspector/issues/293, though the tip about ensuring I had only one instance of the **MCP Inspector** tab open in my browser seemed to fix things.
2025-05-05 07:16:19 -07:00
use codex_core::protocol::Event;
use codex_core::protocol::EventMsg;
use codex_core::protocol::InputItem;
use codex_core::protocol::Op;
use mcp_types::CallToolResult;
use mcp_types::CallToolResultContent;
use mcp_types::JSONRPC_VERSION;
feat: make Codex available as a tool when running it as an MCP server (#811) This PR replaces the placeholder `"echo"` tool call in the MCP server with a `"codex"` tool that calls Codex. Events such as `ExecApprovalRequest` and `ApplyPatchApprovalRequest` are not handled properly yet, but I have `approval_policy = "never"` set in my `~/.codex/config.toml` such that those codepaths are not exercised. The schema for this MPC tool is defined by a new `CodexToolCallParam` struct introduced in this PR. It is fairly similar to `ConfigOverrides`, as the param is used to help create the `Config` used to start the Codex session, though it also includes the `prompt` used to kick off the session. This PR also introduces the use of the third-party `schemars` crate to generate the JSON schema, which is verified in the `verify_codex_tool_json_schema()` unit test. Events that are dispatched during the Codex session are sent back to the MCP client as MCP notifications. This gives the client a way to monitor progress as the tool call itself may take minutes to complete depending on the complexity of the task requested by the user. In the video below, I launched the server via: ```shell mcp-server$ RUST_LOG=debug npx @modelcontextprotocol/inspector cargo run -- ``` In the video, you can see the flow of: * requesting the list of tools * choosing the **codex** tool * entering a value for **prompt** and then making the tool call Note that I left the other fields blank because when unspecified, the values in my `~/.codex/config.toml` were used: https://github.com/user-attachments/assets/1975058c-b004-43ef-8c8d-800a953b8192 Note that while using the inspector, I did run into https://github.com/modelcontextprotocol/inspector/issues/293, though the tip about ensuring I had only one instance of the **MCP Inspector** tab open in my browser seemed to fix things.
2025-05-05 07:16:19 -07:00
use mcp_types::JSONRPCMessage;
use mcp_types::JSONRPCResponse;
use mcp_types::RequestId;
use mcp_types::TextContent;
use tokio::sync::mpsc::Sender;
/// Convert a Codex [`Event`] to an MCP notification.
fn codex_event_to_notification(event: &Event) -> JSONRPCMessage {
#[expect(clippy::expect_used)]
feat: make Codex available as a tool when running it as an MCP server (#811) This PR replaces the placeholder `"echo"` tool call in the MCP server with a `"codex"` tool that calls Codex. Events such as `ExecApprovalRequest` and `ApplyPatchApprovalRequest` are not handled properly yet, but I have `approval_policy = "never"` set in my `~/.codex/config.toml` such that those codepaths are not exercised. The schema for this MPC tool is defined by a new `CodexToolCallParam` struct introduced in this PR. It is fairly similar to `ConfigOverrides`, as the param is used to help create the `Config` used to start the Codex session, though it also includes the `prompt` used to kick off the session. This PR also introduces the use of the third-party `schemars` crate to generate the JSON schema, which is verified in the `verify_codex_tool_json_schema()` unit test. Events that are dispatched during the Codex session are sent back to the MCP client as MCP notifications. This gives the client a way to monitor progress as the tool call itself may take minutes to complete depending on the complexity of the task requested by the user. In the video below, I launched the server via: ```shell mcp-server$ RUST_LOG=debug npx @modelcontextprotocol/inspector cargo run -- ``` In the video, you can see the flow of: * requesting the list of tools * choosing the **codex** tool * entering a value for **prompt** and then making the tool call Note that I left the other fields blank because when unspecified, the values in my `~/.codex/config.toml` were used: https://github.com/user-attachments/assets/1975058c-b004-43ef-8c8d-800a953b8192 Note that while using the inspector, I did run into https://github.com/modelcontextprotocol/inspector/issues/293, though the tip about ensuring I had only one instance of the **MCP Inspector** tab open in my browser seemed to fix things.
2025-05-05 07:16:19 -07:00
JSONRPCMessage::Notification(mcp_types::JSONRPCNotification {
jsonrpc: JSONRPC_VERSION.into(),
method: "codex/event".into(),
params: Some(serde_json::to_value(event).expect("Event must serialize")),
})
}
/// Run a complete Codex session and stream events back to the client.
///
/// On completion (success or error) the function sends the appropriate
/// `tools/call` response so the LLM can continue the conversation.
pub async fn run_codex_tool_session(
id: RequestId,
initial_prompt: String,
config: CodexConfig,
outgoing: Sender<JSONRPCMessage>,
) {
let (codex, first_event, _ctrl_c) = match init_codex(config).await {
Ok(res) => res,
Err(e) => {
let result = CallToolResult {
content: vec![CallToolResultContent::TextContent(TextContent {
r#type: "text".to_string(),
text: format!("Failed to start Codex session: {e}"),
annotations: None,
})],
is_error: Some(true),
};
let _ = outgoing
.send(JSONRPCMessage::Response(JSONRPCResponse {
jsonrpc: JSONRPC_VERSION.into(),
id,
result: result.into(),
}))
.await;
return;
}
};
// Send initial SessionConfigured event.
let _ = outgoing
.send(codex_event_to_notification(&first_event))
.await;
if let Err(e) = codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
text: initial_prompt.clone(),
}],
})
.await
{
tracing::error!("Failed to submit initial prompt: {e}");
}
let mut last_agent_message: Option<String> = None;
// Stream events until the task needs to pause for user interaction or
// completes.
loop {
match codex.next_event().await {
Ok(event) => {
let _ = outgoing.send(codex_event_to_notification(&event)).await;
match &event.msg {
EventMsg::AgentMessage(AgentMessageEvent { message }) => {
feat: make Codex available as a tool when running it as an MCP server (#811) This PR replaces the placeholder `"echo"` tool call in the MCP server with a `"codex"` tool that calls Codex. Events such as `ExecApprovalRequest` and `ApplyPatchApprovalRequest` are not handled properly yet, but I have `approval_policy = "never"` set in my `~/.codex/config.toml` such that those codepaths are not exercised. The schema for this MPC tool is defined by a new `CodexToolCallParam` struct introduced in this PR. It is fairly similar to `ConfigOverrides`, as the param is used to help create the `Config` used to start the Codex session, though it also includes the `prompt` used to kick off the session. This PR also introduces the use of the third-party `schemars` crate to generate the JSON schema, which is verified in the `verify_codex_tool_json_schema()` unit test. Events that are dispatched during the Codex session are sent back to the MCP client as MCP notifications. This gives the client a way to monitor progress as the tool call itself may take minutes to complete depending on the complexity of the task requested by the user. In the video below, I launched the server via: ```shell mcp-server$ RUST_LOG=debug npx @modelcontextprotocol/inspector cargo run -- ``` In the video, you can see the flow of: * requesting the list of tools * choosing the **codex** tool * entering a value for **prompt** and then making the tool call Note that I left the other fields blank because when unspecified, the values in my `~/.codex/config.toml` were used: https://github.com/user-attachments/assets/1975058c-b004-43ef-8c8d-800a953b8192 Note that while using the inspector, I did run into https://github.com/modelcontextprotocol/inspector/issues/293, though the tip about ensuring I had only one instance of the **MCP Inspector** tab open in my browser seemed to fix things.
2025-05-05 07:16:19 -07:00
last_agent_message = Some(message.clone());
}
EventMsg::ExecApprovalRequest(_) => {
feat: make Codex available as a tool when running it as an MCP server (#811) This PR replaces the placeholder `"echo"` tool call in the MCP server with a `"codex"` tool that calls Codex. Events such as `ExecApprovalRequest` and `ApplyPatchApprovalRequest` are not handled properly yet, but I have `approval_policy = "never"` set in my `~/.codex/config.toml` such that those codepaths are not exercised. The schema for this MPC tool is defined by a new `CodexToolCallParam` struct introduced in this PR. It is fairly similar to `ConfigOverrides`, as the param is used to help create the `Config` used to start the Codex session, though it also includes the `prompt` used to kick off the session. This PR also introduces the use of the third-party `schemars` crate to generate the JSON schema, which is verified in the `verify_codex_tool_json_schema()` unit test. Events that are dispatched during the Codex session are sent back to the MCP client as MCP notifications. This gives the client a way to monitor progress as the tool call itself may take minutes to complete depending on the complexity of the task requested by the user. In the video below, I launched the server via: ```shell mcp-server$ RUST_LOG=debug npx @modelcontextprotocol/inspector cargo run -- ``` In the video, you can see the flow of: * requesting the list of tools * choosing the **codex** tool * entering a value for **prompt** and then making the tool call Note that I left the other fields blank because when unspecified, the values in my `~/.codex/config.toml` were used: https://github.com/user-attachments/assets/1975058c-b004-43ef-8c8d-800a953b8192 Note that while using the inspector, I did run into https://github.com/modelcontextprotocol/inspector/issues/293, though the tip about ensuring I had only one instance of the **MCP Inspector** tab open in my browser seemed to fix things.
2025-05-05 07:16:19 -07:00
let result = CallToolResult {
content: vec![CallToolResultContent::TextContent(TextContent {
r#type: "text".to_string(),
text: "EXEC_APPROVAL_REQUIRED".to_string(),
annotations: None,
})],
is_error: None,
};
let _ = outgoing
.send(JSONRPCMessage::Response(JSONRPCResponse {
jsonrpc: JSONRPC_VERSION.into(),
id: id.clone(),
result: result.into(),
}))
.await;
break;
}
EventMsg::ApplyPatchApprovalRequest(_) => {
feat: make Codex available as a tool when running it as an MCP server (#811) This PR replaces the placeholder `"echo"` tool call in the MCP server with a `"codex"` tool that calls Codex. Events such as `ExecApprovalRequest` and `ApplyPatchApprovalRequest` are not handled properly yet, but I have `approval_policy = "never"` set in my `~/.codex/config.toml` such that those codepaths are not exercised. The schema for this MPC tool is defined by a new `CodexToolCallParam` struct introduced in this PR. It is fairly similar to `ConfigOverrides`, as the param is used to help create the `Config` used to start the Codex session, though it also includes the `prompt` used to kick off the session. This PR also introduces the use of the third-party `schemars` crate to generate the JSON schema, which is verified in the `verify_codex_tool_json_schema()` unit test. Events that are dispatched during the Codex session are sent back to the MCP client as MCP notifications. This gives the client a way to monitor progress as the tool call itself may take minutes to complete depending on the complexity of the task requested by the user. In the video below, I launched the server via: ```shell mcp-server$ RUST_LOG=debug npx @modelcontextprotocol/inspector cargo run -- ``` In the video, you can see the flow of: * requesting the list of tools * choosing the **codex** tool * entering a value for **prompt** and then making the tool call Note that I left the other fields blank because when unspecified, the values in my `~/.codex/config.toml` were used: https://github.com/user-attachments/assets/1975058c-b004-43ef-8c8d-800a953b8192 Note that while using the inspector, I did run into https://github.com/modelcontextprotocol/inspector/issues/293, though the tip about ensuring I had only one instance of the **MCP Inspector** tab open in my browser seemed to fix things.
2025-05-05 07:16:19 -07:00
let result = CallToolResult {
content: vec![CallToolResultContent::TextContent(TextContent {
r#type: "text".to_string(),
text: "PATCH_APPROVAL_REQUIRED".to_string(),
annotations: None,
})],
is_error: None,
};
let _ = outgoing
.send(JSONRPCMessage::Response(JSONRPCResponse {
jsonrpc: JSONRPC_VERSION.into(),
id: id.clone(),
result: result.into(),
}))
.await;
break;
}
EventMsg::TaskComplete => {
let result = if let Some(msg) = last_agent_message {
CallToolResult {
content: vec![CallToolResultContent::TextContent(TextContent {
r#type: "text".to_string(),
text: msg,
annotations: None,
})],
is_error: None,
}
} else {
CallToolResult {
content: vec![CallToolResultContent::TextContent(TextContent {
r#type: "text".to_string(),
text: String::new(),
annotations: None,
})],
is_error: None,
}
};
let _ = outgoing
.send(JSONRPCMessage::Response(JSONRPCResponse {
jsonrpc: JSONRPC_VERSION.into(),
id: id.clone(),
result: result.into(),
}))
.await;
break;
}
EventMsg::SessionConfigured(_) => {
feat: make Codex available as a tool when running it as an MCP server (#811) This PR replaces the placeholder `"echo"` tool call in the MCP server with a `"codex"` tool that calls Codex. Events such as `ExecApprovalRequest` and `ApplyPatchApprovalRequest` are not handled properly yet, but I have `approval_policy = "never"` set in my `~/.codex/config.toml` such that those codepaths are not exercised. The schema for this MPC tool is defined by a new `CodexToolCallParam` struct introduced in this PR. It is fairly similar to `ConfigOverrides`, as the param is used to help create the `Config` used to start the Codex session, though it also includes the `prompt` used to kick off the session. This PR also introduces the use of the third-party `schemars` crate to generate the JSON schema, which is verified in the `verify_codex_tool_json_schema()` unit test. Events that are dispatched during the Codex session are sent back to the MCP client as MCP notifications. This gives the client a way to monitor progress as the tool call itself may take minutes to complete depending on the complexity of the task requested by the user. In the video below, I launched the server via: ```shell mcp-server$ RUST_LOG=debug npx @modelcontextprotocol/inspector cargo run -- ``` In the video, you can see the flow of: * requesting the list of tools * choosing the **codex** tool * entering a value for **prompt** and then making the tool call Note that I left the other fields blank because when unspecified, the values in my `~/.codex/config.toml` were used: https://github.com/user-attachments/assets/1975058c-b004-43ef-8c8d-800a953b8192 Note that while using the inspector, I did run into https://github.com/modelcontextprotocol/inspector/issues/293, though the tip about ensuring I had only one instance of the **MCP Inspector** tab open in my browser seemed to fix things.
2025-05-05 07:16:19 -07:00
tracing::error!("unexpected SessionConfigured event");
}
EventMsg::Error(_)
| EventMsg::TaskStarted
| EventMsg::AgentReasoning(_)
| EventMsg::McpToolCallBegin(_)
| EventMsg::McpToolCallEnd(_)
| EventMsg::ExecCommandBegin(_)
| EventMsg::ExecCommandEnd(_)
| EventMsg::BackgroundEvent(_)
| EventMsg::PatchApplyBegin(_)
feat: record messages from user in ~/.codex/history.jsonl (#939) This is a large change to support a "history" feature like you would expect in a shell like Bash. History events are recorded in `$CODEX_HOME/history.jsonl`. Because it is a JSONL file, it is straightforward to append new entries (as opposed to the TypeScript file that uses `$CODEX_HOME/history.json`, so to be valid JSON, each new entry entails rewriting the entire file). Because it is possible for there to be multiple instances of Codex CLI writing to `history.jsonl` at once, we use advisory file locking when working with `history.jsonl` in `codex-rs/core/src/message_history.rs`. Because we believe history is a sufficiently useful feature, we enable it by default. Though to provide some safety, we set the file permissions of `history.jsonl` to be `o600` so that other users on the system cannot read the user's history. We do not yet support a default list of `SENSITIVE_PATTERNS` as the TypeScript CLI does: https://github.com/openai/codex/blob/3fdf9df1335ac9501e3fb0e61715359145711e8b/codex-cli/src/utils/storage/command-history.ts#L10-L17 We are going to take a more conservative approach to this list in the Rust CLI. For example, while `/\b[A-Za-z0-9-_]{20,}\b/` might exclude sensitive information like API tokens, it would also exclude valuable information such as references to Git commits. As noted in the updated documentation, users can opt-out of history by adding the following to `config.toml`: ```toml [history] persistence = "none" ``` Because `history.jsonl` could, in theory, be quite large, we take a[n arguably overly pedantic] approach in reading history entries into memory. Specifically, we start by telling the client the current number of entries in the history file (`history_entry_count`) as well as the inode (`history_log_id`) of `history.jsonl` (see the new fields on `SessionConfiguredEvent`). The client is responsible for keeping new entries in memory to create a "local history," but if the user hits up enough times to go "past" the end of local history, then the client should use the new `GetHistoryEntryRequest` in the protocol to fetch older entries. Specifically, it should pass the `history_log_id` it was given originally and work backwards from `history_entry_count`. (It should really fetch history in batches rather than one-at-a-time, but that is something we can improve upon in subsequent PRs.) The motivation behind this crazy scheme is that it is designed to defend against: * The `history.jsonl` being truncated during the session such that the index into the history is no longer consistent with what had been read up to that point. We do not yet have logic to enforce a `max_bytes` for `history.jsonl`, but once we do, we will aspire to implement it in a way that should result in a new inode for the file on most systems. * New items from concurrent Codex CLI sessions amending to the history. Because, in absence of truncation, `history.jsonl` is an append-only log, so long as the client reads backwards from `history_entry_count`, it should always get a consistent view of history. (That said, it will not be able to read _new_ commands from concurrent sessions, but perhaps we will introduce a `/` command to reload latest history or something down the road.) Admittedly, my testing of this feature thus far has been fairly light. I expect we will find bugs and introduce enhancements/fixes going forward.
2025-05-15 16:26:23 -07:00
| EventMsg::PatchApplyEnd(_)
| EventMsg::GetHistoryEntryResponse(_) => {
// For now, we do not do anything extra for these
// events. Note that
// send(codex_event_to_notification(&event)) above has
// already dispatched these events as notifications,
// though we may want to do give different treatment to
// individual events in the future.
}
feat: make Codex available as a tool when running it as an MCP server (#811) This PR replaces the placeholder `"echo"` tool call in the MCP server with a `"codex"` tool that calls Codex. Events such as `ExecApprovalRequest` and `ApplyPatchApprovalRequest` are not handled properly yet, but I have `approval_policy = "never"` set in my `~/.codex/config.toml` such that those codepaths are not exercised. The schema for this MPC tool is defined by a new `CodexToolCallParam` struct introduced in this PR. It is fairly similar to `ConfigOverrides`, as the param is used to help create the `Config` used to start the Codex session, though it also includes the `prompt` used to kick off the session. This PR also introduces the use of the third-party `schemars` crate to generate the JSON schema, which is verified in the `verify_codex_tool_json_schema()` unit test. Events that are dispatched during the Codex session are sent back to the MCP client as MCP notifications. This gives the client a way to monitor progress as the tool call itself may take minutes to complete depending on the complexity of the task requested by the user. In the video below, I launched the server via: ```shell mcp-server$ RUST_LOG=debug npx @modelcontextprotocol/inspector cargo run -- ``` In the video, you can see the flow of: * requesting the list of tools * choosing the **codex** tool * entering a value for **prompt** and then making the tool call Note that I left the other fields blank because when unspecified, the values in my `~/.codex/config.toml` were used: https://github.com/user-attachments/assets/1975058c-b004-43ef-8c8d-800a953b8192 Note that while using the inspector, I did run into https://github.com/modelcontextprotocol/inspector/issues/293, though the tip about ensuring I had only one instance of the **MCP Inspector** tab open in my browser seemed to fix things.
2025-05-05 07:16:19 -07:00
}
}
Err(e) => {
let result = CallToolResult {
content: vec![CallToolResultContent::TextContent(TextContent {
r#type: "text".to_string(),
text: format!("Codex runtime error: {e}"),
annotations: None,
})],
is_error: Some(true),
};
let _ = outgoing
.send(JSONRPCMessage::Response(JSONRPCResponse {
jsonrpc: JSONRPC_VERSION.into(),
id: id.clone(),
result: result.into(),
}))
.await;
break;
}
}
}
}