chore: refactor tool handling (#4510)
# Tool System Refactor
- Centralizes tool definitions and execution in `core/src/tools/*`:
specs (`spec.rs`), handlers (`handlers/*`), router (`router.rs`),
registry/dispatch (`registry.rs`), and shared context (`context.rs`).
One registry now builds the model-visible tool list and binds handlers.
- Router converts model responses to tool calls; Registry dispatches
with consistent telemetry via `codex-rs/otel` and unified error
handling. Function, Local Shell, MCP, and experimental `unified_exec`
all flow through this path; legacy shell aliases still work.
- Rationale: reduce per‑tool boilerplate, keep spec/handler in sync, and
make adding tools predictable and testable.
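A minimal sketch of the registry idea described above, assuming a simplified synchronous handler interface. The names `ToolHandler`, `EchoHandler`, `ToolRegistry`, `register`, `tool_names`, and `dispatch` are hypothetical and chosen for illustration; the actual types in `core/src/tools/registry.rs` may look quite different.

```rust
use std::collections::HashMap;

/// Hypothetical handler interface (an assumption, not the real trait).
trait ToolHandler {
    fn handle(&self, arguments: &str) -> Result<String, String>;
}

/// Illustrative handler that echoes its arguments back.
struct EchoHandler;

impl ToolHandler for EchoHandler {
    fn handle(&self, arguments: &str) -> Result<String, String> {
        Ok(format!("echo: {arguments}"))
    }
}

/// One structure holds both the advertised tool names and the bound handlers,
/// so the two cannot drift apart.
#[derive(Default)]
struct ToolRegistry {
    names: Vec<String>,
    handlers: HashMap<String, Box<dyn ToolHandler>>,
}

impl ToolRegistry {
    /// Binds a handler to a name; a new name is also added to the tool list.
    fn register(&mut self, name: &str, handler: Box<dyn ToolHandler>) {
        if self.handlers.insert(name.to_string(), handler).is_none() {
            self.names.push(name.to_string());
        }
    }

    /// Tool names in registration order, as they would be shown to the model.
    fn tool_names(&self) -> &[String] {
        &self.names
    }

    /// Looks up the handler by name and reports unknown tools uniformly.
    fn dispatch(&self, name: &str, arguments: &str) -> Result<String, String> {
        match self.handlers.get(name) {
            Some(handler) => handler.handle(arguments),
            None => Err(format!("unsupported tool: {name}")),
        }
    }
}
```

Registering `EchoHandler` under `"echo"` makes `tool_names()` report that name and lets `dispatch("echo", ...)` reach the handler, while any unregistered name yields the same error shape.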
Example: `read_file`
- Spec: `core/src/tools/spec.rs` (see `create_read_file_tool`,
registered by `build_specs`).
- Handler: `core/src/tools/handlers/read_file.rs` (absolute `file_path`,
1‑indexed `offset`, `limit`, `L#: ` prefixes, safe truncation).
- E2E test: `core/tests/suite/read_file.rs` validates the tool returns
the requested lines.
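The line-selection behaviour listed for the handler can be sketched as follows, assuming a 1-indexed `offset`, a `limit` on the number of lines, and `L#: ` prefixes. The function name `slice_lines` is hypothetical, and the handler's truncation and absolute-path checks are not modeled here.

```rust
/// Returns up to `limit` lines starting at the 1-indexed `offset`,
/// each prefixed with `L<line number>: `.
/// Hypothetical helper for illustration; not the handler's real signature.
fn slice_lines(contents: &str, offset: usize, limit: usize) -> Result<Vec<String>, String> {
    if offset == 0 {
        return Err("offset must be a 1-indexed line number".to_string());
    }
    Ok(contents
        .lines()
        .enumerate()
        // `enumerate` counts from zero, so skip `offset - 1` lines.
        .skip(offset - 1)
        .take(limit)
        .map(|(index, line)| format!("L{}: {}", index + 1, line))
        .collect())
}
```

An `offset` past the end of the input produces an empty list in this sketch; whether the real handler treats that case as an error is not stated in the commit message.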
## Next steps:
- Decompose `handle_container_exec_with_params`
- Add parallel tool calls
2025-10-03 13:21:06 +01:00
#![allow(clippy::unwrap_used)]

use codex_core::CodexAuth;
use codex_core::ConversationManager;
use codex_core::ModelProviderInfo;
use codex_core::built_in_model_providers;
2025-10-14 18:50:00 +01:00
use codex_core::features::Feature;
2025-10-03 13:21:06 +01:00
use codex_core::model_family::find_family_for_model;
use codex_core::protocol::EventMsg;
use codex_core::protocol::Op;
2025-10-20 13:34:44 -07:00
use codex_protocol::user_input::UserInput;
2025-10-03 13:21:06 +01:00
use core_test_support::load_default_config_for_test;
use core_test_support::load_sse_fixture_with_id;
2025-10-07 01:56:39 -07:00
use core_test_support::responses;
2025-10-03 13:21:06 +01:00
use core_test_support::skip_if_no_network;
use core_test_support::wait_for_event;
use tempfile::TempDir;
use wiremock::MockServer;

fn sse_completed(id: &str) -> String {
    load_sse_fixture_with_id("tests/fixtures/completed_template.json", id)
}

#[allow(clippy::expect_used)]
fn tool_identifiers(body: &serde_json::Value) -> Vec<String> {
    body["tools"]
        .as_array()
        .unwrap()
        .iter()
        .map(|tool| {
            tool.get("name")
                .and_then(|v| v.as_str())
                .or_else(|| tool.get("type").and_then(|v| v.as_str()))
                .map(std::string::ToString::to_string)
                .expect("tool should have either name or type")
        })
        .collect()
}

#[allow(clippy::expect_used)]
async fn collect_tool_identifiers_for_model(model: &str) -> Vec<String> {
    let server = MockServer::start().await;

    let sse = sse_completed(model);
2025-10-07 01:56:39 -07:00
    let resp_mock = responses::mount_sse_once_match(&server, wiremock::matchers::any(), sse).await;
2025-10-03 13:21:06 +01:00

    let model_provider = ModelProviderInfo {
        base_url: Some(format!("{}/v1", server.uri())),
        ..built_in_model_providers()["openai"].clone()
    };

    let cwd = TempDir::new().unwrap();
    let codex_home = TempDir::new().unwrap();
    let mut config = load_default_config_for_test(&codex_home);
    config.cwd = cwd.path().to_path_buf();
    config.model_provider = model_provider;
    config.model = model.to_string();
    config.model_family =
        find_family_for_model(model).unwrap_or_else(|| panic!("unknown model family for {model}"));
2025-10-14 18:50:00 +01:00
    config.features.disable(Feature::ApplyPatchFreeform);
    config.features.disable(Feature::ViewImageTool);
    config.features.disable(Feature::WebSearchRequest);
    config.features.disable(Feature::StreamableShell);
    config.features.disable(Feature::UnifiedExec);
2025-10-03 13:21:06 +01:00

    let conversation_manager =
        ConversationManager::with_auth(CodexAuth::from_api_key("Test API Key"));
    let codex = conversation_manager
        .new_conversation(config)
        .await
        .expect("create new conversation")
        .conversation;

    codex
        .submit(Op::UserInput {
2025-10-20 13:34:44 -07:00
            items: vec![UserInput::Text {
2025-10-03 13:21:06 +01:00
                text: "hello tools".into(),
            }],
        })
        .await
        .unwrap();
    wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;

2025-10-07 01:56:39 -07:00
    let body = resp_mock.single_request().body_json();
2025-10-03 13:21:06 +01:00
    tool_identifiers(&body)
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn model_selects_expected_tools() {
    skip_if_no_network!();
    use pretty_assertions::assert_eq;

    let codex_tools = collect_tool_identifiers_for_model("codex-mini-latest").await;
    assert_eq!(
        codex_tools,
2025-10-16 22:05:15 -07:00
        vec![
            "local_shell".to_string(),
            "list_mcp_resources".to_string(),
            "list_mcp_resource_templates".to_string(),
2025-10-21 09:25:05 -07:00
            "read_mcp_resource".to_string(),
            "update_plan".to_string()
2025-10-16 22:05:15 -07:00
        ],
2025-10-03 13:21:06 +01:00
        "codex-mini-latest should expose the local shell tool",
    );

    let o3_tools = collect_tool_identifiers_for_model("o3").await;
    assert_eq!(
        o3_tools,
2025-10-16 22:05:15 -07:00
        vec![
            "shell".to_string(),
            "list_mcp_resources".to_string(),
            "list_mcp_resource_templates".to_string(),
2025-10-21 09:25:05 -07:00
            "read_mcp_resource".to_string(),
            "update_plan".to_string()
2025-10-16 22:05:15 -07:00
        ],
2025-10-03 13:21:06 +01:00
        "o3 should expose the generic shell tool",
    );
2025-10-03 17:58:03 +01:00

    let gpt5_codex_tools = collect_tool_identifiers_for_model("gpt-5-codex").await;
    assert_eq!(
        gpt5_codex_tools,
2025-10-16 22:05:15 -07:00
        vec![
            "shell".to_string(),
            "list_mcp_resources".to_string(),
            "list_mcp_resource_templates".to_string(),
            "read_mcp_resource".to_string(),
2025-10-21 09:25:05 -07:00
            "update_plan".to_string(),
2025-10-16 22:05:15 -07:00
            "apply_patch".to_string()
        ],
2025-10-04 22:47:26 -07:00
        "gpt-5-codex should expose the apply_patch tool",
2025-10-03 17:58:03 +01:00
    );
2025-10-03 13:21:06 +01:00
}