chore: refactor tool handling (#4510)
# Tool System Refactor
- Centralizes tool definitions and execution in `core/src/tools/*`:
specs (`spec.rs`), handlers (`handlers/*`), router (`router.rs`),
registry/dispatch (`registry.rs`), and shared context (`context.rs`).
One registry now builds the model-visible tool list and binds handlers.
- Router converts model responses to tool calls; Registry dispatches
with consistent telemetry via `codex-rs/otel` and unified error
handling. Function, Local Shell, MCP, and experimental `unified_exec`
all flow through this path; legacy shell aliases still work.
- Rationale: reduce per‑tool boilerplate, keep spec/handler in sync, and
make adding tools predictable and testable.
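The registry idea above can be illustrated with a minimal, synchronous sketch. Everything in it (`ToolRegistry`, `EchoHandler`, the string-based error type, the reduced `ToolKind`) is a simplified stand-in assumed for illustration; the real trait in `registry.rs` is async and takes a richer invocation type.

```rust
use std::collections::HashMap;

/// Hypothetical, reduced version of the tool kinds a handler can report.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolKind {
    Function,
}

/// Simplified, synchronous stand-in for a handler trait (assumption:
/// the real trait is async and receives a full invocation context).
pub trait ToolHandler {
    fn kind(&self) -> ToolKind;
    fn handle(&self, arguments: &str) -> Result<String, String>;
}

/// Maps tool names to handlers so lookup and error handling live in one place.
#[derive(Default)]
pub struct ToolRegistry {
    handlers: HashMap<String, Box<dyn ToolHandler>>,
}

impl ToolRegistry {
    /// Binds a handler to the name the model will use to call it.
    pub fn register(&mut self, name: &str, handler: Box<dyn ToolHandler>) {
        self.handlers.insert(name.to_string(), handler);
    }

    /// Looks up the handler by name and runs it; unknown names become an error.
    pub fn dispatch(&self, name: &str, arguments: &str) -> Result<String, String> {
        match self.handlers.get(name) {
            Some(handler) => handler.handle(arguments),
            None => Err(format!("unsupported tool: {name}")),
        }
    }
}

/// Toy handler that echoes its arguments back.
pub struct EchoHandler;

impl ToolHandler for EchoHandler {
    fn kind(&self) -> ToolKind {
        ToolKind::Function
    }

    fn handle(&self, arguments: &str) -> Result<String, String> {
        Ok(format!("echo: {arguments}"))
    }
}
```

Registering `EchoHandler` under a name and calling `dispatch` with that name runs the handler; any other name takes the single error path, which is the property the refactor is after.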
Example: `read_file`
- Spec: `core/src/tools/spec.rs` (see `create_read_file_tool`,
registered by `build_specs`).
- Handler: `core/src/tools/handlers/read_file.rs` (absolute `file_path`,
1‑indexed `offset`, `limit`, `L#: ` prefixes, safe truncation).
- E2E test: `core/tests/suite/read_file.rs` validates the tool returns
the requested lines.
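A rough sketch of the line-selection behavior described for the handler, assuming a hypothetical helper named `format_lines`. It covers only the 1-indexed `offset`, the `limit`, and the `L#: ` prefix; path validation and truncation are left out, and the error messages are made up.

```rust
/// Returns up to `limit` lines starting at the 1-indexed `offset`,
/// each prefixed with its line number as `L<n>: `.
/// Hypothetical helper, not the handler's actual code.
pub fn format_lines(text: &str, offset: usize, limit: usize) -> Result<Vec<String>, String> {
    if offset == 0 {
        return Err("offset must be a 1-indexed line number".to_string());
    }
    if limit == 0 {
        return Err("limit must be greater than zero".to_string());
    }

    let lines: Vec<String> = text
        .lines()
        .enumerate()
        // `enumerate` is 0-based, so line `offset` sits at index `offset - 1`.
        .skip(offset - 1)
        .take(limit)
        .map(|(index, line)| format!("L{}: {}", index + 1, line))
        .collect();

    if lines.is_empty() {
        return Err("offset exceeds file length".to_string());
    }
    Ok(lines)
}
```

For the four-line input `"a\nb\nc\nd"` with `offset = 2` and `limit = 2`, this yields `L2: b` and `L3: c`.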
## Next steps:
- Decompose `handle_container_exec_with_params`
- Add parallel tool calls
2025-10-03 13:21:06 +01:00
use crate::client_common::tools::ResponsesApiTool;
use crate::client_common::tools::ToolSpec;
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::function_tool::FunctionCallError;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
use crate::tools::registry::ToolHandler;
use crate::tools::registry::ToolKind;
use crate::tools::spec::JsonSchema;
use async_trait::async_trait;
use codex_protocol::plan_tool::UpdatePlanArgs;
use codex_protocol::protocol::EventMsg;
use std::collections::BTreeMap;
use std::sync::LazyLock;

pub struct PlanHandler;

pub static PLAN_TOOL: LazyLock<ToolSpec> = LazyLock::new(|| {
    let mut plan_item_props = BTreeMap::new();
    plan_item_props.insert("step".to_string(), JsonSchema::String { description: None });
    plan_item_props.insert(
        "status".to_string(),
        JsonSchema::String {
            description: Some("One of: pending, in_progress, completed".to_string()),
        },
    );

    let plan_items_schema = JsonSchema::Array {
        description: Some("The list of steps".to_string()),
        items: Box::new(JsonSchema::Object {
            properties: plan_item_props,
            required: Some(vec!["step".to_string(), "status".to_string()]),
            additional_properties: Some(false.into()),
        }),
    };

    let mut properties = BTreeMap::new();
    properties.insert(
        "explanation".to_string(),
        JsonSchema::String { description: None },
    );
    properties.insert("plan".to_string(), plan_items_schema);

    ToolSpec::Function(ResponsesApiTool {
        name: "update_plan".to_string(),
        description: r#"Updates the task plan.
Provide an optional explanation and a list of plan items, each with a step and status.
At most one step can be in_progress at a time.
"#
        .to_string(),
        strict: false,
        parameters: JsonSchema::Object {
            properties,
            required: Some(vec!["plan".to_string()]),
            additional_properties: Some(false.into()),
        },
    })
});

#[async_trait]
impl ToolHandler for PlanHandler {
    fn kind(&self) -> ToolKind {
        ToolKind::Function
    }

    async fn handle(&self, invocation: ToolInvocation) -> Result<ToolOutput, FunctionCallError> {
        let ToolInvocation {
            session,
            turn,
            call_id,
            payload,
            ..
        } = invocation;

        let arguments = match payload {
            ToolPayload::Function { arguments } => arguments,
            _ => {
                return Err(FunctionCallError::RespondToModel(
                    "update_plan handler received unsupported payload".to_string(),
                ));
            }
        };

        let content =
            handle_update_plan(session.as_ref(), turn.as_ref(), arguments, call_id).await?;

        Ok(ToolOutput::Function {
            content,
[MCP] Render MCP tool call result images to the model (#5600)
It's pretty amazing we have gotten here without the ability for the
model to see image content from MCP tool calls.
This PR builds off of 4391 and fixes #4819. I would like @KKcorps to get
adequate credit here, but I also want to get this fix in ASAP. I gave
him a week to update it and haven't gotten a response, so I'm going to
take it across the finish line.
This test highlights how absurd the current situation is. I asked the
model to read this image using the Chrome MCP
<img width="2378" height="674" alt="image"
src="https://github.com/user-attachments/assets/9ef52608-72a2-4423-9f5e-7ae36b2b56e0"
/>
After this change, it correctly outputs:
> Captured the page: image shows a dark terminal-style UI labeled
`OpenAI Codex (v0.0.0)` with prompt `model: gpt-5-codex medium` and
working directory `/codex/codex-rs`
(and more)
Before this change, it said:
> Took the full-page screenshot you asked for. It shows a long,
horizontally repeating pattern of stylized people in orange, light-blue,
and mustard clothing, holding hands in alternating poses against a white
background. No text or other graphics-just rows of flat illustration
stretching off to the right.
Without this change, the Figma, Playwright, Chrome, and other visual MCP
servers are pretty much entirely useless.
I tested this change with the OpenAI Responses API as well as a
third-party completions API.
2025-10-27 14:55:57 -07:00
            content_items: None,
            success: Some(true),
        })
    }
}

/// This function doesn't do anything useful. However, it gives the model a structured way to record its plan that clients can read and render.
/// So it's the _inputs_ to this function that are useful to clients, not the outputs, and neither are actually useful for the model other
/// than forcing it to come up with and document a plan (TBD how that affects performance).
pub(crate) async fn handle_update_plan(
    session: &Session,
    turn_context: &TurnContext,
    arguments: String,
    _call_id: String,
) -> Result<String, FunctionCallError> {
    let args = parse_update_plan_arguments(&arguments)?;
    session
        .send_event(turn_context, EventMsg::PlanUpdate(args))
        .await;
    Ok("Plan updated".to_string())
}

fn parse_update_plan_arguments(arguments: &str) -> Result<UpdatePlanArgs, FunctionCallError> {
    serde_json::from_str::<UpdatePlanArgs>(arguments).map_err(|e| {
        FunctionCallError::RespondToModel(format!("failed to parse function arguments: {e}"))
    })
}
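
The tool description states that at most one step can be `in_progress` at a time, but the handler above only parses the arguments and forwards them. As a sketch of how a client or a stricter handler might check that rule, the following uses hypothetical names (`StepStatus`, `check_plan`) and plain `(step, status)` string pairs instead of the real `UpdatePlanArgs` type, whose fields are not shown here.

```rust
/// Hypothetical mirror of the three status strings listed in the schema.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StepStatus {
    Pending,
    InProgress,
    Completed,
}

impl StepStatus {
    /// Parses the wire string; anything else is rejected.
    pub fn parse(raw: &str) -> Option<Self> {
        match raw {
            "pending" => Some(Self::Pending),
            "in_progress" => Some(Self::InProgress),
            "completed" => Some(Self::Completed),
            _ => None,
        }
    }
}

/// Checks a plan given as `(step, status)` pairs: every status must be
/// known, and at most one step may be `in_progress`.
pub fn check_plan(items: &[(&str, &str)]) -> Result<(), String> {
    let mut in_progress = 0usize;
    for (step, raw_status) in items {
        let status = StepStatus::parse(raw_status)
            .ok_or_else(|| format!("step {step:?} has unknown status {raw_status:?}"))?;
        if status == StepStatus::InProgress {
            in_progress += 1;
        }
    }
    if in_progress > 1 {
        return Err(format!(
            "{in_progress} steps are in_progress; at most one is allowed"
        ));
    }
    Ok(())
}
```

A plan with one `in_progress` step passes; a plan with two, or with a status outside the three listed values, is rejected with a message that could be returned to the model.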