When using the OpenAI Responses API, we now record the `usage` field for
a `"response.completed"` event, which includes metrics about the number
of tokens consumed. We also introduce `openai_model_info.rs`, which
includes current data about the most common OpenAI models available via
the API (specifically `context_window` and `max_output_tokens`). If
Codex does not recognize the model, you can set `model_context_window`
and `model_max_output_tokens` explicitly in `config.toml`.
When then introduce a new event type to `protocol.rs`, `TokenCount`,
which includes the `TokenUsage` for the most recent turn.
Finally, we update the TUI to record the running sum of tokens used so
the percentage of available context window remaining can be reported via
the placeholder text for the composer:

We could certainly get much fancier with this (such as reporting the
estimated cost of the conversation), but for now, we are just trying to
achieve feature parity with the TypeScript CLI.
Though arguably this improves upon the TypeScript CLI, as the TypeScript
CLI uses heuristics to estimate the number of tokens used rather than
using the `usage` information directly:
296996d74e/codex-cli/src/utils/approximate-tokens-used.ts (L3-L16)
Fixes https://github.com/openai/codex/issues/1242
This PR reworks `assess_command_safety()` so that the combination of
`AskForApproval::Never` and `SandboxPolicy::DangerFullAccess` ensures
that commands are run without _any_ sandbox and the user should never be
prompted. In turn, it adds support for a new
`--dangerously-bypass-approvals-and-sandbox` flag (that cannot be used
with `--approval-policy` or `--full-auto`) that sets both of those
options.
Fixes https://github.com/openai/codex/issues/1254
This is a major redesign of how sandbox configuration works and aims to
fix https://github.com/openai/codex/issues/1248. Specifically, it
replaces `sandbox_permissions` in `config.toml` (and the
`-s`/`--sandbox-permission` CLI flags) with a "table" with effectively
three variants:
```toml
# Safest option: full disk is read-only, but writes and network access are disallowed.
[sandbox]
mode = "read-only"
# The cwd of the Codex task is writable, as well as $TMPDIR on macOS.
# writable_roots can be used to specify additional writable folders.
[sandbox]
mode = "workspace-write"
writable_roots = [] # Optional, defaults to the empty list.
network_access = false # Optional, defaults to false.
# Disable sandboxing: use at your own risk!!!
[sandbox]
mode = "danger-full-access"
```
This should make sandboxing easier to reason about. While we have
dropped support for `-s`, the way it works now is:
- no flags => `read-only`
- `--full-auto` => `workspace-write`
- currently, there is no way to specify `danger-full-access` via a CLI
flag, but we will revisit that as part of
https://github.com/openai/codex/issues/1254
Outstanding issue:
- As noted in the `TODO` on `SandboxPolicy::is_unrestricted()`, we are
still conflating sandbox preferences with approval preferences in that
case, which needs to be cleaned up.
Previous to this PR, we always set `reasoning` when making a request
using the Responses API:
d7245cbbc9/codex-rs/core/src/client.rs (L108-L111)
Though if you tried to use the Rust CLI with `--model gpt-4.1`, this
would fail with:
```shell
"Unsupported parameter: 'reasoning.effort' is not supported with this model."
```
We take a cue from the TypeScript CLI, which does a check on the model
name:
d7245cbbc9/codex-cli/src/utils/agent/agent-loop.ts (L786-L789)
This PR does a similar check, though also adds support for the following
config options:
```
model_reasoning_effort = "low" | "medium" | "high" | "none"
model_reasoning_summary = "auto" | "concise" | "detailed" | "none"
```
This way, if you have a model whose name happens to start with `"o"` (or
`"codex"`?), you can set these to `"none"` to explicitly disable
reasoning, if necessary. (That said, it seems unlikely anyone would use
the Responses API with non-OpenAI models, but we provide an escape
hatch, anyway.)
This PR also updates both the TUI and `codex exec` to show `reasoning
effort` and `reasoning summaries` in the header.
This PR introduces a `hide_agent_reasoning` config option (that defaults
to `false`) that users can enable to make the output less verbose by
suppressing reasoning output.
To test, verified that this includes agent reasoning in the output:
```
echo hello | just exec
```
whereas this does not:
```
echo hello | just exec --config hide_agent_reasoning=false
```
This required changing `ts_println!()` to take `$self:ident`, which is a
bit more verbose, but the usability improvement seems worth it.
Also eliminated an unnecessary `.to_string()` while here.
Fixes:
* Instantiate `EventProcessor` earlier in `lib.rs` so
`print_config_summary()` can be an instance method of it and leverage
its various `Style` fields to ensure it honors `with_ansi` properly.
* After printing the config summary, print out user's prompt with the
heading `User instructions:`. As noted in the comment, now that we can
read the instructions via stdin as of #1178, it is helpful to the user
to ensure they know what instructions were given to Codex.
* Use same colors/bold/italic settings for headers as the TUI, making
the output a bit easier to read.
This attempts to make `codex exec` more flexible in how the prompt can
be passed:
* as before, it can be passed as a single string argument
* if `-` is passed as the value, the prompt is read from stdin
* if no argument is passed _and stdin is a tty_, prints a warning to
stderr that no prompt was specified an exits non-zero.
* if no argument is passed _and stdin is NOT a tty_, prints `Reading
prompt from stdin...` to stderr to let the user know that Codex will
wait until it reads EOF from stdin to proceed. (You can repro this case
by doing `yes | just exec` since stdin is not a TTY in that case but it
also never reaches EOF).
The output of an MCP server tool call can be one of several types, but
to date, we treated all outputs as text by showing the serialized JSON
as the "tool output" in Codex:
25a9949c49/codex-rs/mcp-types/src/lib.rs (L96-L101)
This PR adds support for the `ImageContent` variant so we can now
display an image output from an MCP tool call.
In making this change, we introduce a new
`ResponseInputItem::McpToolCallOutput` variant so that we can work with
the `mcp_types::CallToolResult` directly when the function call is made
to an MCP server.
Though arguably the more significant change is the introduction of
`HistoryCell::CompletedMcpToolCallWithImageOutput`, which is a cell that
uses `ratatui_image` to render an image into the terminal. To support
this, we introduce `ImageRenderCache`, cache a
`ratatui_image::picker::Picker`, and `ensure_image_cache()` to cache the
appropriate scaled image data and dimensions based on the current
terminal size.
To test, I created a minimal `package.json`:
```json
{
"name": "kitty-mcp",
"version": "1.0.0",
"type": "module",
"description": "MCP that returns image of kitty",
"main": "index.js",
"dependencies": {
"@modelcontextprotocol/sdk": "^1.12.0"
}
}
```
with the following `index.js` to define the MCP server:
```js
#!/usr/bin/env node
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { readFile } from "node:fs/promises";
import { join } from "node:path";
const IMAGE_URI = "image://Ada.png";
const server = new McpServer({
name: "Demo",
version: "1.0.0",
});
server.tool(
"get-cat-image",
"If you need a cat image, this tool will provide one.",
async () => ({
content: [
{ type: "image", data: await getAdaPngBase64(), mimeType: "image/png" },
],
})
);
server.resource("Ada the Cat", IMAGE_URI, async (uri) => {
const base64Image = await getAdaPngBase64();
return {
contents: [
{
uri: uri.href,
mimeType: "image/png",
blob: base64Image,
},
],
};
});
async function getAdaPngBase64() {
const __dirname = new URL(".", import.meta.url).pathname;
// From 9705ce2c59/assets/Ada.png
const filePath = join(__dirname, "Ada.png");
const imageData = await readFile(filePath);
const base64Image = imageData.toString("base64");
return base64Image;
}
const transport = new StdioServerTransport();
await server.connect(transport);
```
With the local changes from this PR, I added the following to my
`config.toml`:
```toml
[mcp_servers.kitty]
command = "node"
args = ["/Users/mbolin/code/kitty-mcp/index.js"]
```
Running the TUI from source:
```
cargo run --bin codex -- --model o3 'I need a picture of a cat'
```
I get:
<img width="732" alt="image"
src="https://github.com/user-attachments/assets/bf80b721-9ca0-4d81-aec7-77d6899e2869"
/>
Now, that said, I have only tested in iTerm and there is definitely some
funny business with getting an accurate character-to-pixel ratio
(sometimes the `CompletedMcpToolCallWithImageOutput` thinks it needs 10
rows to render instead of 4), so there is still work to be done here.
This PR introduces support for `-c`/`--config` so users can override
individual config values on the command line using `--config
name=value`. Example:
```
codex --config model=o4-mini
```
Making it possible to set arbitrary config values on the command line
results in a more flexible configuration scheme and makes it easier to
provide single-line examples that can be copy-pasted from documentation.
Effectively, it means there are four levels of configuration for some
values:
- Default value (e.g., `model` currently defaults to `o4-mini`)
- Value in `config.toml` (e.g., user could override the default to be
`model = "o3"` in their `config.toml`)
- Specifying `-c` or `--config` to override `model` (e.g., user can
include `-c model=o3` in their list of args to Codex)
- If available, a config-specific flag can be used, which takes
precedence over `-c` (e.g., user can specify `--model o3` in their list
of args to Codex)
Now that it is possible to specify anything that could be configured in
`config.toml` on the command line using `-c`, we do not need to have a
custom flag for every possible config option (which can clutter the
output of `--help`). To that end, as part of this PR, we drop support
for the `--disable-response-storage` flag, as users can now specify `-c
disable_response_storage=true` to get the equivalent functionality.
Under the hood, this works by loading the `config.toml` into a
`toml::Value`. Then for each `key=value`, we create a small synthetic
TOML file with `value` so that we can run the TOML parser to get the
equivalent `toml::Value`. We then parse `key` to determine the point in
the original `toml::Value` to do the insert/replace. Once all of the
overrides from `-c` args have been applied, the `toml::Value` is
deserialized into a `ConfigToml` and then the `ConfigOverrides` are
applied, as before.
Historically, we spawned the Seatbelt and Landlock sandboxes in
substantially different ways:
For **Seatbelt**, we would run `/usr/bin/sandbox-exec` with our policy
specified as an arg followed by the original command:
d1de7bb383/codex-rs/core/src/exec.rs (L147-L219)
For **Landlock/Seccomp**, we would do
`tokio::runtime::Builder::new_current_thread()`, _invoke
Landlock/Seccomp APIs to modify the permissions of that new thread_, and
then spawn the command:
d1de7bb383/codex-rs/core/src/exec_linux.rs (L28-L49)
While it is neat that Landlock/Seccomp supports applying a policy to
only one thread without having to apply it to the entire process, it
requires us to maintain two different codepaths and is a bit harder to
reason about. The tipping point was
https://github.com/openai/codex/pull/1061, in which we had to start
building up the `env` in an unexpected way for the existing
Landlock/Seccomp approach to continue to work.
This PR overhauls things so that we do similar things for Mac and Linux.
It turned out that we were already building our own "helper binary"
comparable to Mac's `sandbox-exec` as part of the `cli` crate:
d1de7bb383/codex-rs/cli/Cargo.toml (L10-L12)
We originally created this to build a small binary to include with the
Node.js version of the Codex CLI to provide support for Linux
sandboxing.
Though the sticky bit is that, at this point, we still want to deploy
the Rust version of Codex as a single, standalone binary rather than a
CLI and a supporting sandboxing binary. To satisfy this goal, we use
"the arg0 trick," in which we:
* use `std::env::current_exe()` to get the path to the CLI that is
currently running
* use the CLI as the `program` for the `Command`
* set `"codex-linux-sandbox"` as arg0 for the `Command`
A CLI that supports sandboxing should check arg0 at the start of the
program. If it is `"codex-linux-sandbox"`, it must invoke
`codex_linux_sandbox::run_main()`, which runs the CLI as if it were
`codex-linux-sandbox`. When acting as `codex-linux-sandbox`, we make the
appropriate Landlock/Seccomp API calls and then use `execvp(3)` to spawn
the original command, so do _replace_ the process rather than spawn a
subprocess. Incidentally, we do this before starting the Tokio runtime,
so the process should only have one thread when `execvp(3)` is called.
Because the `core` crate that needs to spawn the Linux sandboxing is not
a CLI in its own right, this means that every CLI that includes `core`
and relies on this behavior has to (1) implement it and (2) provide the
path to the sandboxing executable. While the path is almost always
`std::env::current_exe()`, we needed to make this configurable for
integration tests, so `Config` now has a `codex_linux_sandbox_exe:
Option<PathBuf>` property to facilitate threading this through,
introduced in https://github.com/openai/codex/pull/1089.
This common pattern is now captured in
`codex_linux_sandbox::run_with_sandbox()` and all of the `main.rs`
functions that should use it have been updated as part of this PR.
The `codex-linux-sandbox` crate added to the Cargo workspace as part of
this PR now has the bulk of the Landlock/Seccomp logic, which makes
`core` a bit simpler. Indeed, `core/src/exec_linux.rs` and
`core/src/landlock.rs` were removed/ported as part of this PR. I also
moved the unit tests for this code into an integration test,
`linux-sandbox/tests/landlock.rs`, in which I use
`env!("CARGO_BIN_EXE_codex-linux-sandbox")` as the value for
`codex_linux_sandbox_exe` since `std::env::current_exe()` is not
appropriate in that case.
https://github.com/openai/codex/pull/1086 is a work-in-progress to make
Linux sandboxing work more like Seatbelt where, for the command we want
to sandbox, we build up the command and then hand it, and some sandbox
configuration flags, to another command to set up the sandbox and then
run it.
In the case of Seatbelt, macOS provides this helper binary and provides
it at `/usr/bin/sandbox-exec`. For Linux, we have to build our own and
pass it through (which is what #1086 does), so this makes the new
`codex_linux_sandbox_exe` available on `Config` so that it will later be
available in `exec.rs` when we need it in #1086.
Now the `exec` output starts with something like:
```
--------
workdir: /Users/mbolin/code/codex/codex-rs
model: o3
provider: openai
approval: Never
sandbox: SandboxPolicy { permissions: [DiskFullReadAccess, DiskWritePlatformUserTempFolder, DiskWritePlatformGlobalTempFolder, DiskWriteCwd, DiskWriteFolder { folder: "/Users/mbolin/.pyenv/shims" }] }
--------
```
which makes it easier to reason about when looking at logs.
This introduces an experimental `--output-last-message` flag that can be
used to identify a file where the final message from the agent will be
written. Two use cases:
- Ultimately, we will likely add a `--quiet` option to `exec`, but even
if the user does not want any output written to the terminal, they
probably want to know what the agent did. Writing the output to a file
makes it possible to get that information in a clean way.
- Relatedly, when using `exec` in CI, it is easier to review the
transcript written "normally," (i.e., not as JSON or something with
extra escapes), but getting programmatic access to the last message is
likely helpful, so writing the last message to a file gets the best of
both worlds.
I am calling this "experimental" because it is possible that we are
overfitting and will want a more general solution to this problem that
would justify removing this flag.
When I originally wrote `elapsed.rs`, I realized we were using both
`std::time` and `chrono` with no real benefit of having both. We should
try to keep the `exec` subcommand trim (as it also buildable as a
standalone executable), so this helps tighten things up.
This is a large change to support a "history" feature like you would
expect in a shell like Bash.
History events are recorded in `$CODEX_HOME/history.jsonl`. Because it
is a JSONL file, it is straightforward to append new entries (as opposed
to the TypeScript file that uses `$CODEX_HOME/history.json`, so to be
valid JSON, each new entry entails rewriting the entire file). Because
it is possible for there to be multiple instances of Codex CLI writing
to `history.jsonl` at once, we use advisory file locking when working
with `history.jsonl` in `codex-rs/core/src/message_history.rs`.
Because we believe history is a sufficiently useful feature, we enable
it by default. Though to provide some safety, we set the file
permissions of `history.jsonl` to be `o600` so that other users on the
system cannot read the user's history. We do not yet support a default
list of `SENSITIVE_PATTERNS` as the TypeScript CLI does:
3fdf9df133/codex-cli/src/utils/storage/command-history.ts (L10-L17)
We are going to take a more conservative approach to this list in the
Rust CLI. For example, while `/\b[A-Za-z0-9-_]{20,}\b/` might exclude
sensitive information like API tokens, it would also exclude valuable
information such as references to Git commits.
As noted in the updated documentation, users can opt-out of history by
adding the following to `config.toml`:
```toml
[history]
persistence = "none"
```
Because `history.jsonl` could, in theory, be quite large, we take a[n
arguably overly pedantic] approach in reading history entries into
memory. Specifically, we start by telling the client the current number
of entries in the history file (`history_entry_count`) as well as the
inode (`history_log_id`) of `history.jsonl` (see the new fields on
`SessionConfiguredEvent`).
The client is responsible for keeping new entries in memory to create a
"local history," but if the user hits up enough times to go "past" the
end of local history, then the client should use the new
`GetHistoryEntryRequest` in the protocol to fetch older entries.
Specifically, it should pass the `history_log_id` it was given
originally and work backwards from `history_entry_count`. (It should
really fetch history in batches rather than one-at-a-time, but that is
something we can improve upon in subsequent PRs.)
The motivation behind this crazy scheme is that it is designed to defend
against:
* The `history.jsonl` being truncated during the session such that the
index into the history is no longer consistent with what had been read
up to that point. We do not yet have logic to enforce a `max_bytes` for
`history.jsonl`, but once we do, we will aspire to implement it in a way
that should result in a new inode for the file on most systems.
* New items from concurrent Codex CLI sessions amending to the history.
Because, in absence of truncation, `history.jsonl` is an append-only
log, so long as the client reads backwards from `history_entry_count`,
it should always get a consistent view of history. (That said, it will
not be able to read _new_ commands from concurrent sessions, but perhaps
we will introduce a `/` command to reload latest history or something
down the road.)
Admittedly, my testing of this feature thus far has been fairly light. I
expect we will find bugs and introduce enhancements/fixes going forward.
For now, this removes the `#[non_exhaustive]` directive on `EventMsg` so
that we are forced to handle all `EventMsg` by default. (We may revisit
this if/when we publish `core/` as a `lib` crate.) For now, it is
helpful to have this as a forcing function because we have effectively
two UIs (`tui` and `exec`) and usually when we add a new variant to
`EventMsg`, we want to be sure that we update both.
https://github.com/openai/codex/pull/922 did this for the
`SessionConfigured` enum variant, and I think it is generally helpful to
be able to work with the values as each enum variant as their own type,
so this converts the remaining variants and updates all of the
callsites.
Added a simple unit test to verify that the JSON-serialized version of
`Event` does not have any unexpected nesting.
This introduces a much-needed "profile" concept where users can specify
a collection of options under one name and then pass that via
`--profile` to the CLI.
This PR introduces the `ConfigProfile` struct and makes it a field of
`CargoToml`. It further updates
`Config::load_from_base_config_with_overrides()` to respect
`ConfigProfile`, overriding default values where appropriate. A detailed
unit test is added at the end of `config.rs` to verify this behavior.
Details on how to use this feature have also been added to
`codex-rs/README.md`.
This is a substantial PR to add support for the chat completions API,
which in turn makes it possible to use non-OpenAI model providers (just
like in the TypeScript CLI):
* It moves a number of structs from `client.rs` to `client_common.rs` so
they can be shared.
* It introduces support for the chat completions API in
`chat_completions.rs`.
* It updates `ModelProviderInfo` so that `env_key` is `Option<String>`
instead of `String` (for e.g., ollama) and adds a `wire_api` field
* It updates `client.rs` to choose between `stream_responses()` and
`stream_chat_completions()` based on the `wire_api` for the
`ModelProviderInfo`
* It updates the `exec` and TUI CLIs to no longer fail if the
`OPENAI_API_KEY` environment variable is not set
* It updates the TUI so that `EventMsg::Error` is displayed more
prominently when it occurs, particularly now that it is important to
alert users to the `CodexErr::EnvVar` variant.
* `CodexErr::EnvVar` was updated to include an optional `instructions`
field so we can preserve the behavior where we direct users to
https://platform.openai.com if `OPENAI_API_KEY` is not set.
* Cleaned up the "welcome message" in the TUI to ensure the model
provider is displayed.
* Updated the docs in `codex-rs/README.md`.
To exercise the chat completions API from OpenAI models, I added the
following to my `config.toml`:
```toml
model = "gpt-4o"
model_provider = "openai-chat-completions"
[model_providers.openai-chat-completions]
name = "OpenAI using Chat Completions"
base_url = "https://api.openai.com/v1"
env_key = "OPENAI_API_KEY"
wire_api = "chat"
```
Though to test a non-OpenAI provider, I installed ollama with mistral
locally on my Mac because ChatGPT said that would be a good match for my
hardware:
```shell
brew install ollama
ollama serve
ollama pull mistral
```
Then I added the following to my `~/.codex/config.toml`:
```toml
model = "mistral"
model_provider = "ollama"
```
Note this code could certainly use more test coverage, but I want to get
this in so folks can start playing with it.
For reference, I believe https://github.com/openai/codex/pull/247 was
roughly the comparable PR on the TypeScript side.
Sets submodules to use workspace lints. Added denying unwrap as a
workspace level lint, which found a couple of cases where we could have
propagated errors. Also manually labeled ones that were fine by my eye.
This is the first step in supporting other model providers in the Rust
CLI. Specifically, this PR adds support for the new entries in `Config`
and `ConfigOverrides` to specify a `ModelProviderInfo`, which is the
basic config needed for an LLM provider. This PR does not get us all the
way there yet because `client.rs` still categorically appends
`/responses` to the URL and expects the endpoint to support the OpenAI
Responses API. Will fix that next!
Some effects of this change:
- New formatting changes across many files. No functionality changes
should occur from that.
- Calls to `set_env` are considered unsafe, since this only happens in
tests we wrap them in `unsafe` blocks
I started this PR because I wanted to share the `format_duration()`
utility function in `codex-rs/exec/src/event_processor.rs` with the TUI.
The question was: where to put it?
`core` should have as few dependencies as possible, so moving it there
would introduce a dependency on `chrono`, which seemed undesirable.
`core` already had this `cli` feature to deal with a similar situation
around sharing common utility functions, so I decided to:
* make `core` feature-free
* introduce `common`
* `common` can have as many "special interest" features as it needs,
each of which can declare their own deps
* the first two features of common are `cli` and `elapsed`
In practice, this meant updating a number of `Cargo.toml` files,
replacing this line:
```toml
codex-core = { path = "../core", features = ["cli"] }
```
with these:
```toml
codex-core = { path = "../core" }
codex-common = { path = "../common", features = ["cli"] }
```
Moving `format_duration()` into its own file gave it some "breathing
room" to add a unit test, so I had Codex generate some tests and new
support for durations over 1 minute.
https://github.com/openai/codex/pull/800 made `cwd` a property of
`Config` and made it so the `cwd` is not necessarily
`std::env::current_dir()`. As such, `is_inside_git_repo()` should check
`Config.cwd` rather than `std::env::current_dir()`.
This PR updates `is_inside_git_repo()` to take `Config` instead of an
arbitrary `PathBuf` to force the check to operate on a `Config` where
`cwd` has been resolved to what the user specified.
In order to expose Codex via an MCP server, I realized that we should be
taking `cwd` as a parameter rather than assuming
`std::env::current_dir()` as the `cwd`. Specifically, the user may want
to start a session in a directory other than the one where the MCP
server has been started.
This PR makes `cwd: PathBuf` a required field of `Session` and threads
it all the way through, though I think there is still an issue with not
honoring `workdir` for `apply_patch`, which is something we also had to
fix in the TypeScript version: https://github.com/openai/codex/pull/556.
This also adds `-C`/`--cd` to change the cwd via the command line.
To test, I ran:
```
cargo run --bin codex -- exec -C /tmp 'show the output of ls'
```
and verified it showed the contents of my `/tmp` folder instead of
`$PWD`.
Previous to this PR, `SandboxPolicy` was a bit difficult to work with:
237f8a11e1/codex-rs/core/src/protocol.rs (L98-L108)
Specifically:
* It was an `enum` and therefore options were mutually exclusive as
opposed to additive.
* It defined things in terms of what the agent _could not_ do as opposed
to what they _could_ do. This made things hard to support because we
would prefer to build up a sandbox config by starting with something
extremely restrictive and only granting permissions for things the user
as explicitly allowed.
This PR changes things substantially by redefining the policy in terms
of two concepts:
* A `SandboxPermission` enum that defines permissions that can be
granted to the agent/sandbox.
* A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
but externally exposes a simpler API that can be used to configure
Seatbelt/Landlock.
Previous to this PR, we supported a `--sandbox` flag that effectively
mapped to an enum value in `SandboxPolicy`. Though now that
`SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
`--sandbox` flag no longer makes sense. While I could have turned it
into a flag that the user can specify multiple times, I think the
current values to use with such a flag are long and potentially messy,
so for the moment, I have dropped support for `--sandbox` altogether and
we can bring it back once we have figured out the naming thing.
Since `--sandbox` is gone, users now have to specify `--full-auto` to
get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
way to specify the equivalent of `--full-auto` in your `config.toml`
right now, so we will have to revisit that, as well.
Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
changed considerably, I had to overhaul how config loading works, as
well. There are now two distinct concepts, `ConfigToml` and `Config`:
* `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
might expect, every field is `Optional` and it is `#[derive(Deserialize,
Default)]`. Consistent use of `Optional` makes it clear what the user
has specified explicitly.
* `Config` is the "normalized config" and is produced by merging
`ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
`Option<Vec<SandboxPermission>>`, `Config` presents only the final
`SandboxPolicy`.
The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
special attention to ensure we are faithfully mapping the
`SandboxPolicy` to the Seatbelt and Landlock configs, respectively.
Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
`(allow file-read*)` has been removed from the `.sbpl` file as now this
is added to the policy in `core/src/exec.rs` when
`sandbox_policy.has_full_disk_read_access()` is `true`.
https://github.com/openai/codex/pull/642 introduced support for the
`--disable-response-storage` flag, but if you are a ZDR customer, it is
tedious to set this every time, so this PR makes it possible to set this
once in `config.toml` and be done with it.
Incidentally, this tidies things up such that now `init_codex()` takes
only one parameter: `Config`.
This changes how instantiating `Config` works and also adds
`approval_policy` and `sandbox_policy` as fields. The idea is:
* All fields of `Config` have appropriate default values.
* `Config` is initially loaded from `~/.codex/config.toml`, so values in
`config.toml` will override those defaults.
* Clients must instantiate `Config` via
`Config::load_with_overrides(ConfigOverrides)` where `ConfigOverrides`
has optional overrides that are expected to be settable based on CLI
flags.
The `Config` should be defined early in the program and then passed
down. Now functions like `init_codex()` take fewer individual parameters
because they can just take a `Config`.
Also, `Config::load()` used to fail silently if `~/.codex/config.toml`
had a parse error and fell back to the default config. This seemed
really bad because it wasn't clear why the values in my `config.toml`
weren't getting picked up. I changed things so that
`load_with_overrides()` returns `Result<Config>` and verified that the
various CLIs print a reasonable error if `config.toml` is malformed.
Finally, I also updated the TUI to show which **sandbox** value is being
used, as we do for other key values like **model** and **approval**.
This was also a reminder that the various values of `--sandbox` are
honored on Linux but not macOS today, so I added some TODOs about fixing
that.
This adds support for the `--disable-response-storage` flag across our
multiple Rust CLIs to support customers who have opted into Zero-Data
Retention (ZDR). The analogous changes to the TypeScript CLI were:
* https://github.com/openai/codex/pull/481
* https://github.com/openai/codex/pull/543
For a client using ZDR, `previous_response_id` will never be available,
so the `input` field of an API request must include the full transcript
of the conversation thus far. As such, this PR changes the type of
`Prompt.input` from `Vec<ResponseInputItem>` to `Vec<ResponseItem>`.
Practically speaking, `ResponseItem` was effectively a "superset" of
`ResponseInputItem` already. The main difference for us is that
`ResponseItem` includes the `FunctionCall` variant that we have to
include as part of the conversation history in the ZDR case.
Another key change in this PR is modifying `try_run_turn()` so that it
returns the `Vec<ResponseItem>` for the turn in addition to the
`Vec<ResponseInputItem>` produced by `try_run_turn()`. This is because
the caller of `run_turn()` needs to record the `Vec<ResponseItem>` when
ZDR is enabled.
To that end, this PR introduces `ZdrTranscript` (and adds
`zdr_transcript: Option<ZdrTranscript>` to `struct State` in `codex.rs`)
to take responsibility for maintaining the conversation transcript in
the ZDR case.
As stated in `codex-rs/README.md`:
Today, Codex CLI is written in TypeScript and requires Node.js 22+ to
run it. For a number of users, this runtime requirement inhibits
adoption: they would be better served by a standalone executable. As
maintainers, we want Codex to run efficiently in a wide range of
environments with minimal overhead. We also want to take advantage of
operating system-specific APIs to provide better sandboxing, where
possible.
To that end, we are moving forward with a Rust implementation of Codex
CLI contained in this folder, which has the following benefits:
- The CLI compiles to small, standalone, platform-specific binaries.
- Can make direct, native calls to
[seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and
[landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in
order to support sandboxing on Linux.
- No runtime garbage collection, resulting in lower memory consumption
and better, more predictable performance.
Currently, the Rust implementation is materially behind the TypeScript
implementation in functionality, so continue to use the TypeScript
implmentation for the time being. We will publish native executables via
GitHub Releases as soon as we feel the Rust version is usable.