Previously, running Codex as an MCP server required a standalone binary
in our Cargo workspace, but this PR makes it available as a subcommand
(`mcp`) of the main CLI.
Ran this with:
```
RUST_LOG=debug npx @modelcontextprotocol/inspector cargo run --bin codex -- mcp
```
and verified it worked as expected in the inspector at
`http://127.0.0.1:6274/`.
Introduces support for slash commands like in the TypeScript CLI. We do
not support the full set of commands yet, but the core abstraction is
there now.
In particular, we have a `SlashCommand` enum and due to thoughtful use
of the [strum](https://crates.io/crates/strum) crate, it requires
minimal boilerplate to add a new command to the list.
The key new piece of UI is `CommandPopup`, though the keyboard events
are still handled by `ChatComposer`. The behavior is roughly as follows:
* if the first character in the composer is `/`, the command popup is
displayed (if you really want to send a message to Codex that starts
with a `/`, simply put a space before the `/`)
* while the popup is displayed, up/down can be used to change the
selection of the popup
* if there is a selection, hitting tab completes the command, but does
not send it
* if there is a selection, hitting enter sends the command
* if the prefix of the composer matches a command, the command will be
visible in the popup so the user can see the description (commands could
take arguments, so additional text may appear after the command name
itself)
https://github.com/user-attachments/assets/39c3e6ee-eeb7-4ef7-a911-466d8184975f
Incidentally, Codex wrote almost all the code for this PR!
* update `SessionConfigured` event to include the UUID for the session
* show the UUID in the Rust TUI
* use local timestamps in log files instead of UTC
* include timestamps in log file names for easier discovery
This introduces a much-needed "profile" concept where users can specify
a collection of options under one name and then pass that via
`--profile` to the CLI.
This PR introduces the `ConfigProfile` struct and makes it a field of
`CargoToml`. It further updates
`Config::load_from_base_config_with_overrides()` to respect
`ConfigProfile`, overriding default values where appropriate. A detailed
unit test is added at the end of `config.rs` to verify this behavior.
Details on how to use this feature have also been added to
`codex-rs/README.md`.
This adds support for saving transcripts when using the Rust CLI. Like
the TypeScript CLI, it saves the transcript to `~/.codex/sessions`,
though it uses JSONL for the file format (and `.jsonl` for the file
extension) so that even if Codex crashes, what was written to the
`.jsonl` file should generally still be valid JSONL content.
This introduces the use of the `tui-markdown` crate to parse an
assistant message as Markdown and style it using ANSI for a better user
experience. As shown in the screenshot below, it has support for syntax
highlighting for _tagged_ fenced code blocks:
<img width="907" alt="image"
src="https://github.com/user-attachments/assets/900dc229-80bb-46e8-b1bb-efee4c70ba3c"
/>
That said, `tui-markdown` is not as configurable (or stylish!) as
https://www.npmjs.com/package/marked-terminal, which is what we use in
the TypeScript CLI. In particular:
* The styles are hardcoded and `tui_markdown::from_str()` does not take
any options whatsoever. It uses "bold white" for inline code style which
does not stand out as much as the yellow used by `marked-terminal`:
65402cbda7/tui-markdown/src/lib.rs (L464)
I asked Codex to take a first pass at this and it came up with:
https://github.com/joshka/tui-markdown/pull/80
* If a fenced code block is not tagged, then it does not get
highlighted. I would rather add some logic here:
65402cbda7/tui-markdown/src/lib.rs (L262)
that uses something like https://pypi.org/project/guesslang/ to examine
the value of `text` and try to use the appropriate syntax highlighter.
* When we have a fenced code block, we do not want to show the opening
and closing triple backticks in the output.
To unblock ourselves, we might want to bundle our own fork of
`tui-markdown` temporarily until we figure out what the shape of the API
should be and then try to upstream it.
I started this PR because I wanted to share the `format_duration()`
utility function in `codex-rs/exec/src/event_processor.rs` with the TUI.
The question was: where to put it?
`core` should have as few dependencies as possible, so moving it there
would introduce a dependency on `chrono`, which seemed undesirable.
`core` already had this `cli` feature to deal with a similar situation
around sharing common utility functions, so I decided to:
* make `core` feature-free
* introduce `common`
* `common` can have as many "special interest" features as it needs,
each of which can declare their own deps
* the first two features of common are `cli` and `elapsed`
In practice, this meant updating a number of `Cargo.toml` files,
replacing this line:
```toml
codex-core = { path = "../core", features = ["cli"] }
```
with these:
```toml
codex-core = { path = "../core" }
codex-common = { path = "../common", features = ["cli"] }
```
Moving `format_duration()` into its own file gave it some "breathing
room" to add a unit test, so I had Codex generate some tests and new
support for durations over 1 minute.
This adds initial support for MCP servers in the style of Claude Desktop
and Cursor. Note this PR is the bare minimum to get things working end
to end: all configured MCP servers are launched every time Codex is run,
there is no recovery for MCP servers that crash, etc.
(Also, I took some shortcuts to change some fields of `Session` to be
`pub(crate)`, which also means there are circular deps between
`codex.rs` and `mcp_tool_call.rs`, but I will clean that up in a
subsequent PR.)
`codex-rs/README.md` is updated as part of this PR to explain how to use
this feature. There is a bit of plumbing to route the new settings from
`Config` to the business logic in `codex.rs`. The most significant
chunks for new code are in `mcp_connection_manager.rs` (which defines
the `McpConnectionManager` struct) and `mcp_tool_call.rs`, which is
responsible for tool calls.
This PR also introduces new `McpToolCallBegin` and `McpToolCallEnd`
event types to the protocol, but does not add any handlers for them.
(See https://github.com/openai/codex/pull/836 for initial usage.)
To test, I added the following to my `~/.codex/config.toml`:
```toml
# Local build of https://github.com/hideya/mcp-server-weather-js
[mcp_servers.weather]
command = "/Users/mbolin/code/mcp-server-weather-js/dist/index.js"
args = []
```
And then I ran the following:
```
codex-rs$ cargo run --bin codex exec 'what is the weather in san francisco'
[2025-05-06T22:40:05] Task started: 1
[2025-05-06T22:40:18] Agent message: Here’s the latest National Weather Service forecast for San Francisco (downtown, near 37.77° N, 122.42° W):
This Afternoon (Tue):
• Sunny, high near 69 °F
• West-southwest wind around 12 mph
Tonight:
• Partly cloudy, low around 52 °F
• SW wind 7–10 mph
...
```
Note that Codex itself is not able to make network calls, so it would
not normally be able to get live weather information like this. However,
the weather MCP is [currently] not run under the Codex sandbox, so it is
able to hit `api.weather.gov` and fetch current weather information.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/829).
* #836
* __->__ #829
Cleans up the signature for `new_stdio_client()` to more closely mirror
how MCP servers are declared in config files (`command`, `args`, `env`).
Also takes a cue from Claude Code where the MCP server is launched with
a restricted `env` so that it only includes "safe" things like `USER`
and `PATH` (see the `create_env_for_mcp_server()` function introduced in
this PR for details) by default, as it is common for developers to have
sensitive API keys present in their environment that should only be
forwarded to the MCP server when the user has explicitly configured it
to do so.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/831).
* #829
* __->__ #831
This PR introduces an initial `McpClient` that we will use to give Codex
itself programmatic access to foreign MCPs. This does not wire it up in
Codex itself yet, but the new `mcp-client` crate includes a `main.rs`
for basic testing for now.
Manually tested by sending a `tools/list` request to Codex's own MCP
server:
```
codex-rs$ cargo build
codex-rs$ cargo run --bin codex-mcp-client ./target/debug/codex-mcp-server
{
"tools": [
{
"description": "Run a Codex session. Accepts configuration parameters matching the Codex Config struct.",
"inputSchema": {
"properties": {
"approval-policy": {
"description": "Execution approval policy expressed as the kebab-case variant name (`unless-allow-listed`, `auto-edit`, `on-failure`, `never`).",
"enum": [
"auto-edit",
"unless-allow-listed",
"on-failure",
"never"
],
"type": "string"
},
"cwd": {
"description": "Working directory for the session. If relative, it is resolved against the server process's current working directory.",
"type": "string"
},
"disable-response-storage": {
"description": "Disable server-side response storage.",
"type": "boolean"
},
"model": {
"description": "Optional override for the model name (e.g. \"o3\", \"o4-mini\")",
"type": "string"
},
"prompt": {
"description": "The *initial user prompt* to start the Codex conversation.",
"type": "string"
},
"sandbox-permissions": {
"description": "Sandbox permissions using the same string values accepted by the CLI (e.g. \"disk-write-cwd\", \"network-full-access\").",
"items": {
"enum": [
"disk-full-read-access",
"disk-write-cwd",
"disk-write-platform-user-temp-folder",
"disk-write-platform-global-temp-folder",
"disk-full-write-access",
"network-full-access"
],
"type": "string"
},
"type": "array"
}
},
"required": [
"prompt"
],
"type": "object"
},
"name": "codex"
}
]
}
```
This PR replaces the placeholder `"echo"` tool call in the MCP server
with a `"codex"` tool that calls Codex. Events such as
`ExecApprovalRequest` and `ApplyPatchApprovalRequest` are not handled
properly yet, but I have `approval_policy = "never"` set in my
`~/.codex/config.toml` such that those codepaths are not exercised.
The schema for this MPC tool is defined by a new `CodexToolCallParam`
struct introduced in this PR. It is fairly similar to `ConfigOverrides`,
as the param is used to help create the `Config` used to start the Codex
session, though it also includes the `prompt` used to kick off the
session.
This PR also introduces the use of the third-party `schemars` crate to
generate the JSON schema, which is verified in the
`verify_codex_tool_json_schema()` unit test.
Events that are dispatched during the Codex session are sent back to the
MCP client as MCP notifications. This gives the client a way to monitor
progress as the tool call itself may take minutes to complete depending
on the complexity of the task requested by the user.
In the video below, I launched the server via:
```shell
mcp-server$ RUST_LOG=debug npx @modelcontextprotocol/inspector cargo run --
```
In the video, you can see the flow of:
* requesting the list of tools
* choosing the **codex** tool
* entering a value for **prompt** and then making the tool call
Note that I left the other fields blank because when unspecified, the
values in my `~/.codex/config.toml` were used:
https://github.com/user-attachments/assets/1975058c-b004-43ef-8c8d-800a953b8192
Note that while using the inspector, I did run into
https://github.com/modelcontextprotocol/inspector/issues/293, though the
tip about ensuring I had only one instance of the **MCP Inspector** tab
open in my browser seemed to fix things.
This adds our own `mcp-types` crate to our Cargo workspace. We vendor in
the
[`2025-03-26/schema.json`](05f2045136/schema/2025-03-26/schema.json)
from the MCP repo and introduce a `generate_mcp_types.py` script to
codegen the `lib.rs` from the JSON schema.
Test coverage is currently light, but I plan to refine things as we
start making use of this crate.
And yes, I am aware that
https://github.com/modelcontextprotocol/rust-sdk exists, though the
published https://crates.io/crates/rmcp appears to be a competing
effort. While things are up in the air, it seems better for us to
control our own version of this code.
Incidentally, Codex did a lot of the work for this PR. I told it to
never edit `lib.rs` directly and instead to update
`generate_mcp_types.py` and then re-run it to update `lib.rs`. It
followed these instructions and once things were working end-to-end, I
iteratively asked for changes to the tests until the API looked
reasonable (and the code worked). Codex was responsible for figuring out
what to do to `generate_mcp_types.py` to achieve the requested test/API
changes.
For now, keep things simple such that we never update the `version` in
the `Cargo.toml` for the workspace root on the `main` branch. Instead,
create a new branch for a release, push one commit that updates the
`version`, and then tag that branch to kick off a release.
To test, I ran this script and created this release job:
https://github.com/openai/codex/actions/runs/14762580641
The generated DotSlash file has URLs that refer to
`https://github.com/openai/codex/releases/`, so let's set
`prerelease:false` (but keep `draft:true` for now) so those URLs should
work.
Also updated `version` in Cargo workspace so I will kick off a build
once this lands.
@oai-ragona and I discussed it, and we feel the REPL crate has served
its purpose, so we're going to delete the code and future archaeologists
can find it in Git history.
Apparently I made two key mistakes in
https://github.com/openai/codex/pull/740 (fixed in this PR):
* I forgot to redefine `$dest` in the `Stage Linux-only artifacts` step
* I did not define the `if` check correctly in the `Stage Linux-only
artifacts` step
This fixes both of those issues and bumps the workspace version to
`0.0.2504292006` in preparation for another release attempt.
Originally, the `interactive` crate was going to be a placeholder for
building out a UX that was comparable to that of the existing TypeScript
CLI. Though after researching how Ratatui works, that seems difficult to
do because it is designed around the idea that it will redraw the full
screen buffer each time (and so any scrolling should be "internal" to
your Ratatui app) whereas the TypeScript CLI expects to render the full
history of the conversation every time(*) (which is why you can use your
terminal scrollbar to scroll it).
While it is possible to use Ratatui in a way that acts more like what
the TypeScript CLI is doing, it is awkward and seemingly results in
tedious code, so I think we should abandon that approach. As such, this
PR deletes the `interactive/` folder and the code that depended on it.
Further, since we added support for mousewheel scrolling in the TUI in
https://github.com/openai/codex/pull/641, it certainly feels much better
and the need for scroll support via the terminal scrollbar is greatly
diminished. This is now a more appropriate default UX for the
"multitool" CLI.
(*) Incidentally, I haven't verified this, but I think this results in
O(N^2) work in rendering, which seems potentially problematic for long
conversations.
In putting up https://github.com/openai/codex/pull/665, I discovered
that the `expanduser` crate does not compile on Windows. Looking into
it, we do not seem to need it because we were only using it with a value
that was passed in via a command-line flag, so the shell expands `~` for
us before we see it, anyway. (I changed the type in `Cli` from `String`
to `PathBuf`, to boot.)
If we do need this sort of functionality in the future,
https://docs.rs/shellexpand/latest/shellexpand/fn.tilde.html seems
promising.
As described in detail in `codex-rs/execpolicy/README.md` introduced in
this PR, `execpolicy` is a tool that lets you define a set of _patterns_
used to match [`execv(3)`](https://linux.die.net/man/3/execv)
invocations. When a pattern is matched, `execpolicy` returns the parsed
version in a structured form that is amenable to static analysis.
The primary use case is to define patterns match commands that should be
auto-approved by a tool such as Codex. This supports a richer pattern
matching mechanism that the sort of prefix-matching we have done to
date, e.g.:
5e40d9d221/codex-cli/src/approvals.ts (L333-L354)
Note we are still playing with the API and the `system_path` option in
particular still needs some work.
As stated in `codex-rs/README.md`:
Today, Codex CLI is written in TypeScript and requires Node.js 22+ to
run it. For a number of users, this runtime requirement inhibits
adoption: they would be better served by a standalone executable. As
maintainers, we want Codex to run efficiently in a wide range of
environments with minimal overhead. We also want to take advantage of
operating system-specific APIs to provide better sandboxing, where
possible.
To that end, we are moving forward with a Rust implementation of Codex
CLI contained in this folder, which has the following benefits:
- The CLI compiles to small, standalone, platform-specific binaries.
- Can make direct, native calls to
[seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and
[landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in
order to support sandboxing on Linux.
- No runtime garbage collection, resulting in lower memory consumption
and better, more predictable performance.
Currently, the Rust implementation is materially behind the TypeScript
implementation in functionality, so continue to use the TypeScript
implmentation for the time being. We will publish native executables via
GitHub Releases as soon as we feel the Rust version is usable.