2025-05-02 17:25:58 -07:00
|
|
|
[package]
|
2025-07-30 18:37:00 -07:00
|
|
|
edition = "2024"
|
2025-05-02 17:25:58 -07:00
|
|
|
name = "codex-mcp-server"
|
2025-05-07 10:08:06 -07:00
|
|
|
version = { workspace = true }
|
2025-05-02 17:25:58 -07:00
|
|
|
|
2025-05-14 13:15:41 -07:00
|
|
|
[[bin]]
|
|
|
|
|
name = "codex-mcp-server"
|
|
|
|
|
path = "src/main.rs"
|
|
|
|
|
|
|
|
|
|
[lib]
|
|
|
|
|
name = "codex_mcp_server"
|
|
|
|
|
path = "src/lib.rs"
|
|
|
|
|
|
2025-05-08 09:46:18 -07:00
|
|
|
[lints]
|
|
|
|
|
workspace = true
|
|
|
|
|
|
2025-05-02 17:25:58 -07:00
|
|
|
[dependencies]
|
fix: overhaul how we spawn commands under seccomp/landlock on Linux (#1086)
Historically, we spawned the Seatbelt and Landlock sandboxes in
substantially different ways:
For **Seatbelt**, we would run `/usr/bin/sandbox-exec` with our policy
specified as an arg followed by the original command:
https://github.com/openai/codex/blob/d1de7bb383552e8fadd94be79d65d188e00fd562/codex-rs/core/src/exec.rs#L147-L219
For **Landlock/Seccomp**, we would do
`tokio::runtime::Builder::new_current_thread()`, _invoke
Landlock/Seccomp APIs to modify the permissions of that new thread_, and
then spawn the command:
https://github.com/openai/codex/blob/d1de7bb383552e8fadd94be79d65d188e00fd562/codex-rs/core/src/exec_linux.rs#L28-L49
While it is neat that Landlock/Seccomp supports applying a policy to
only one thread without having to apply it to the entire process, it
requires us to maintain two different codepaths and is a bit harder to
reason about. The tipping point was
https://github.com/openai/codex/pull/1061, in which we had to start
building up the `env` in an unexpected way for the existing
Landlock/Seccomp approach to continue to work.
This PR overhauls things so that we do similar things for Mac and Linux.
It turned out that we were already building our own "helper binary"
comparable to Mac's `sandbox-exec` as part of the `cli` crate:
https://github.com/openai/codex/blob/d1de7bb383552e8fadd94be79d65d188e00fd562/codex-rs/cli/Cargo.toml#L10-L12
We originally created this to build a small binary to include with the
Node.js version of the Codex CLI to provide support for Linux
sandboxing.
Though the sticky bit is that, at this point, we still want to deploy
the Rust version of Codex as a single, standalone binary rather than a
CLI and a supporting sandboxing binary. To satisfy this goal, we use
"the arg0 trick," in which we:
* use `std::env::current_exe()` to get the path to the CLI that is
currently running
* use the CLI as the `program` for the `Command`
* set `"codex-linux-sandbox"` as arg0 for the `Command`
A CLI that supports sandboxing should check arg0 at the start of the
program. If it is `"codex-linux-sandbox"`, it must invoke
`codex_linux_sandbox::run_main()`, which runs the CLI as if it were
`codex-linux-sandbox`. When acting as `codex-linux-sandbox`, we make the
appropriate Landlock/Seccomp API calls and then use `execvp(3)` to spawn
the original command, so do _replace_ the process rather than spawn a
subprocess. Incidentally, we do this before starting the Tokio runtime,
so the process should only have one thread when `execvp(3)` is called.
Because the `core` crate that needs to spawn the Linux sandboxing is not
a CLI in its own right, this means that every CLI that includes `core`
and relies on this behavior has to (1) implement it and (2) provide the
path to the sandboxing executable. While the path is almost always
`std::env::current_exe()`, we needed to make this configurable for
integration tests, so `Config` now has a `codex_linux_sandbox_exe:
Option<PathBuf>` property to facilitate threading this through,
introduced in https://github.com/openai/codex/pull/1089.
This common pattern is now captured in
`codex_linux_sandbox::run_with_sandbox()` and all of the `main.rs`
functions that should use it have been updated as part of this PR.
The `codex-linux-sandbox` crate added to the Cargo workspace as part of
this PR now has the bulk of the Landlock/Seccomp logic, which makes
`core` a bit simpler. Indeed, `core/src/exec_linux.rs` and
`core/src/landlock.rs` were removed/ported as part of this PR. I also
moved the unit tests for this code into an integration test,
`linux-sandbox/tests/landlock.rs`, in which I use
`env!("CARGO_BIN_EXE_codex-linux-sandbox")` as the value for
`codex_linux_sandbox_exe` since `std::env::current_exe()` is not
appropriate in that case.
2025-05-23 11:37:07 -07:00
|
|
|
anyhow = "1"
|
2025-07-28 08:31:24 -07:00
|
|
|
codex-arg0 = { path = "../arg0" }
|
2025-08-20 20:36:34 -07:00
|
|
|
codex-common = { path = "../common", features = ["cli"] }
|
2025-05-06 17:38:56 -07:00
|
|
|
codex-core = { path = "../core" }
|
2025-08-17 10:03:52 -07:00
|
|
|
codex-login = { path = "../login" }
|
2025-08-18 09:36:57 -07:00
|
|
|
codex-protocol = { path = "../protocol" }
|
2025-05-02 17:25:58 -07:00
|
|
|
mcp-types = { path = "../mcp-types" }
|
2025-05-05 07:16:19 -07:00
|
|
|
schemars = "0.8.22"
|
2025-05-02 17:25:58 -07:00
|
|
|
serde = { version = "1", features = ["derive"] }
|
|
|
|
|
serde_json = "1"
|
2025-07-19 01:32:03 -04:00
|
|
|
shlex = "1.3.0"
|
2025-07-30 18:37:00 -07:00
|
|
|
strum_macros = "0.27.2"
|
2025-05-02 17:25:58 -07:00
|
|
|
tokio = { version = "1", features = [
|
|
|
|
|
"io-std",
|
|
|
|
|
"macros",
|
|
|
|
|
"process",
|
|
|
|
|
"rt-multi-thread",
|
|
|
|
|
"signal",
|
|
|
|
|
] }
|
2025-07-30 18:37:00 -07:00
|
|
|
toml = "0.9"
|
|
|
|
|
tracing = { version = "0.1.41", features = ["log"] }
|
|
|
|
|
tracing-subscriber = { version = "0.3", features = ["env-filter", "fmt"] }
|
2025-07-21 21:01:56 -07:00
|
|
|
uuid = { version = "1", features = ["serde", "v4"] }
|
2025-05-05 07:16:19 -07:00
|
|
|
|
|
|
|
|
[dev-dependencies]
|
test: add integration test for MCP server (#1633)
This PR introduces a single integration test for `cargo mcp`, though it
also introduces a number of reusable components so that it should be
easier to introduce more integration tests going forward.
The new test is introduced in `codex-rs/mcp-server/tests/elicitation.rs`
and the reusable pieces are in `codex-rs/mcp-server/tests/common`.
The test itself verifies new functionality around elicitations
introduced in https://github.com/openai/codex/pull/1623 (and the fix
introduced in https://github.com/openai/codex/pull/1629) by doing the
following:
- starts a mock model provider with canned responses for
`/v1/chat/completions`
- starts the MCP server with a `config.toml` to use that model provider
(and `approval_policy = "untrusted"`)
- sends the `codex` tool call which causes the mock model provider to
request a shell call for `git init`
- the MCP server sends an elicitation to the client to approve the
request
- the client replies to the elicitation with `"approved"`
- the MCP server runs the command and re-samples the model, getting a
`"finish_reason": "stop"`
- in turn, the MCP server sends the final response to the original
`codex` tool call
- verifies that `git init` ran as expected
To test:
```
cargo test shell_command_approval_triggers_elicitation
```
In writing this test, I discovered that `ExecApprovalResponse` does not
conform to `ElicitResult`, so I added a TODO to fix that, since I think
that should be updated in a separate PR. As it stands, this PR does not
update any business logic, though it does make a number of members of
the `mcp-server` crate `pub` so they can be used in the test.
One additional learning from this PR is that
`std::process::Command::cargo_bin()` from the `assert_cmd` trait is only
available for `std::process::Command`, but we really want to use
`tokio::process::Command` so that everything is async and we can
leverage utilities like `tokio::time::timeout()`. The trick I came up
with was to use `cargo_bin()` to locate the program, and then to use
`std::process::Command::get_program()` when constructing the
`tokio::process::Command`.
2025-07-21 10:27:07 -07:00
|
|
|
assert_cmd = "2"
|
2025-07-24 12:19:46 -07:00
|
|
|
mcp_test_support = { path = "tests/common" }
|
2025-05-05 07:16:19 -07:00
|
|
|
pretty_assertions = "1.4.1"
|
test: add integration test for MCP server (#1633)
This PR introduces a single integration test for `cargo mcp`, though it
also introduces a number of reusable components so that it should be
easier to introduce more integration tests going forward.
The new test is introduced in `codex-rs/mcp-server/tests/elicitation.rs`
and the reusable pieces are in `codex-rs/mcp-server/tests/common`.
The test itself verifies new functionality around elicitations
introduced in https://github.com/openai/codex/pull/1623 (and the fix
introduced in https://github.com/openai/codex/pull/1629) by doing the
following:
- starts a mock model provider with canned responses for
`/v1/chat/completions`
- starts the MCP server with a `config.toml` to use that model provider
(and `approval_policy = "untrusted"`)
- sends the `codex` tool call which causes the mock model provider to
request a shell call for `git init`
- the MCP server sends an elicitation to the client to approve the
request
- the client replies to the elicitation with `"approved"`
- the MCP server runs the command and re-samples the model, getting a
`"finish_reason": "stop"`
- in turn, the MCP server sends the final response to the original
`codex` tool call
- verifies that `git init` ran as expected
To test:
```
cargo test shell_command_approval_triggers_elicitation
```
In writing this test, I discovered that `ExecApprovalResponse` does not
conform to `ElicitResult`, so I added a TODO to fix that, since I think
that should be updated in a separate PR. As it stands, this PR does not
update any business logic, though it does make a number of members of
the `mcp-server` crate `pub` so they can be used in the test.
One additional learning from this PR is that
`std::process::Command::cargo_bin()` from the `assert_cmd` trait is only
available for `std::process::Command`, but we really want to use
`tokio::process::Command` so that everything is async and we can
leverage utilities like `tokio::time::timeout()`. The trick I came up
with was to use `cargo_bin()` to locate the program, and then to use
`std::process::Command::get_program()` when constructing the
`tokio::process::Command`.
2025-07-21 10:27:07 -07:00
|
|
|
tempfile = "3"
|
|
|
|
|
tokio-test = "0.4"
|
|
|
|
|
wiremock = "0.6"
|