2025-05-13 16:52:52 -07:00
|
|
|
use crate::config_profile::ConfigProfile;
|
2025-05-20 11:55:25 -07:00
|
|
|
use crate::config_types::History;
|
|
|
|
|
use crate::config_types::McpServerConfig;
|
feat: introduce support for shell_environment_policy in config.toml (#1061)
To date, when handling `shell` and `local_shell` tool calls, we were
spawning new processes using the environment inherited from the Codex
process itself. This means that the sensitive `OPENAI_API_KEY` that
Codex needs to talk to OpenAI models was made available to everything
run by `shell` and `local_shell`. While there are cases where that might
be useful, it does not seem like a good default.
This PR introduces a complex `shell_environment_policy` config option to
control the `env` used with these tool calls. It is inevitably a bit
complex so that it is possible to override individual components of the
policy so without having to restate the entire thing.
Details are in the updated `README.md` in this PR, but here is the
relevant bit that explains the individual fields of
`shell_environment_policy`:
| Field | Type | Default | Description |
| ------------------------- | -------------------------- | ------- |
-----------------------------------------------------------------------------------------------------------------------------------------------
|
| `inherit` | string | `core` | Starting template for the
environment:<br>`core` (`HOME`, `PATH`, `USER`, …), `all` (clone full
parent env), or `none` (start empty). |
| `ignore_default_excludes` | boolean | `false` | When `false`, Codex
removes any var whose **name** contains `KEY`, `SECRET`, or `TOKEN`
(case-insensitive) before other rules run. |
| `exclude` | array<string> | `[]` | Case-insensitive glob
patterns to drop after the default filter.<br>Examples: `"AWS_*"`,
`"AZURE_*"`. |
| `set` | table<string,string> | `{}` | Explicit key/value
overrides or additions – always win over inherited values. |
| `include_only` | array<string> | `[]` | If non-empty, a
whitelist of patterns; only variables that match _one_ pattern survive
the final step. (Generally used with `inherit = "all"`.) |
In particular, note that the default is `inherit = "core"`, so:
* if you have extra env variables that you want to inherit from the
parent process, use `inherit = "all"` and then specify `include_only`
* if you have extra env variables where you want to hardcode the values,
the default `inherit = "core"` will work fine, but then you need to
specify `set`
This configuration is not battle-tested, so we will probably still have
to play with it a bit. `core/src/exec_env.rs` has the critical business
logic as well as unit tests.
Though if nothing else, previous to this change:
```
$ cargo run --bin codex -- debug seatbelt -- printenv OPENAI_API_KEY
# ...prints OPENAI_API_KEY...
```
But after this change it does not print anything (as desired).
One final thing to call out about this PR is that the
`configure_command!` macro we use in `core/src/exec.rs` has to do some
complex logic with respect to how it builds up the `env` for the process
being spawned under Landlock/seccomp. Specifically, doing
`cmd.env_clear()` followed by `cmd.envs(&$env_map)` (which is arguably
the most intuitive way to do it) caused the Landlock unit tests to fail
because the processes spawned by the unit tests started failing in
unexpected ways! If we forgo `env_clear()` in favor of updating env vars
one at a time, the tests still pass. The comment in the code talks about
this a bit, and while I would like to investigate this more, I need to
move on for the moment, but I do plan to come back to it to fully
understand what is going on. For example, this suggests that we might
not be able to spawn a C program that calls `env_clear()`, which would
be...weird. We may still have to fiddle with our Landlock config if that
is the case.
2025-05-22 09:51:19 -07:00
|
|
|
use crate::config_types::ShellEnvironmentPolicy;
|
|
|
|
|
use crate::config_types::ShellEnvironmentPolicyToml;
|
2025-05-20 11:55:25 -07:00
|
|
|
use crate::config_types::Tui;
|
|
|
|
|
use crate::config_types::UriBasedFileOpener;
|
2025-04-27 21:47:50 -07:00
|
|
|
use crate::flags::OPENAI_DEFAULT_MODEL;
|
2025-05-07 17:38:28 -07:00
|
|
|
use crate::model_provider_info::ModelProviderInfo;
|
|
|
|
|
use crate::model_provider_info::built_in_model_providers;
|
2025-04-27 21:47:50 -07:00
|
|
|
use crate::protocol::AskForApproval;
|
fix: overhaul SandboxPolicy and config loading in Rust (#732)
Previous to this PR, `SandboxPolicy` was a bit difficult to work with:
https://github.com/openai/codex/blob/237f8a11e11fdcc793a09e787e48215676d9b95b/codex-rs/core/src/protocol.rs#L98-L108
Specifically:
* It was an `enum` and therefore options were mutually exclusive as
opposed to additive.
* It defined things in terms of what the agent _could not_ do as opposed
to what they _could_ do. This made things hard to support because we
would prefer to build up a sandbox config by starting with something
extremely restrictive and only granting permissions for things the user
as explicitly allowed.
This PR changes things substantially by redefining the policy in terms
of two concepts:
* A `SandboxPermission` enum that defines permissions that can be
granted to the agent/sandbox.
* A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
but externally exposes a simpler API that can be used to configure
Seatbelt/Landlock.
Previous to this PR, we supported a `--sandbox` flag that effectively
mapped to an enum value in `SandboxPolicy`. Though now that
`SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
`--sandbox` flag no longer makes sense. While I could have turned it
into a flag that the user can specify multiple times, I think the
current values to use with such a flag are long and potentially messy,
so for the moment, I have dropped support for `--sandbox` altogether and
we can bring it back once we have figured out the naming thing.
Since `--sandbox` is gone, users now have to specify `--full-auto` to
get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
way to specify the equivalent of `--full-auto` in your `config.toml`
right now, so we will have to revisit that, as well.
Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
changed considerably, I had to overhaul how config loading works, as
well. There are now two distinct concepts, `ConfigToml` and `Config`:
* `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
might expect, every field is `Optional` and it is `#[derive(Deserialize,
Default)]`. Consistent use of `Optional` makes it clear what the user
has specified explicitly.
* `Config` is the "normalized config" and is produced by merging
`ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
`Option<Vec<SandboxPermission>>`, `Config` presents only the final
`SandboxPolicy`.
The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
special attention to ensure we are faithfully mapping the
`SandboxPolicy` to the Seatbelt and Landlock configs, respectively.
Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
`(allow file-read*)` has been removed from the `.sbpl` file as now this
is added to the policy in `core/src/exec.rs` when
`sandbox_policy.has_full_disk_read_access()` is `true`.
2025-04-29 15:01:16 -07:00
|
|
|
use crate::protocol::SandboxPermission;
|
2025-04-27 21:47:50 -07:00
|
|
|
use crate::protocol::SandboxPolicy;
|
feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629)
As stated in `codex-rs/README.md`:
Today, Codex CLI is written in TypeScript and requires Node.js 22+ to
run it. For a number of users, this runtime requirement inhibits
adoption: they would be better served by a standalone executable. As
maintainers, we want Codex to run efficiently in a wide range of
environments with minimal overhead. We also want to take advantage of
operating system-specific APIs to provide better sandboxing, where
possible.
To that end, we are moving forward with a Rust implementation of Codex
CLI contained in this folder, which has the following benefits:
- The CLI compiles to small, standalone, platform-specific binaries.
- Can make direct, native calls to
[seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and
[landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in
order to support sandboxing on Linux.
- No runtime garbage collection, resulting in lower memory consumption
and better, more predictable performance.
Currently, the Rust implementation is materially behind the TypeScript
implementation in functionality, so continue to use the TypeScript
implmentation for the time being. We will publish native executables via
GitHub Releases as soon as we feel the Rust version is usable.
2025-04-24 13:31:40 -07:00
|
|
|
use dirs::home_dir;
|
|
|
|
|
use serde::Deserialize;
|
feat: support mcp_servers in config.toml (#829)
This adds initial support for MCP servers in the style of Claude Desktop
and Cursor. Note this PR is the bare minimum to get things working end
to end: all configured MCP servers are launched every time Codex is run,
there is no recovery for MCP servers that crash, etc.
(Also, I took some shortcuts to change some fields of `Session` to be
`pub(crate)`, which also means there are circular deps between
`codex.rs` and `mcp_tool_call.rs`, but I will clean that up in a
subsequent PR.)
`codex-rs/README.md` is updated as part of this PR to explain how to use
this feature. There is a bit of plumbing to route the new settings from
`Config` to the business logic in `codex.rs`. The most significant
chunks for new code are in `mcp_connection_manager.rs` (which defines
the `McpConnectionManager` struct) and `mcp_tool_call.rs`, which is
responsible for tool calls.
This PR also introduces new `McpToolCallBegin` and `McpToolCallEnd`
event types to the protocol, but does not add any handlers for them.
(See https://github.com/openai/codex/pull/836 for initial usage.)
To test, I added the following to my `~/.codex/config.toml`:
```toml
# Local build of https://github.com/hideya/mcp-server-weather-js
[mcp_servers.weather]
command = "/Users/mbolin/code/mcp-server-weather-js/dist/index.js"
args = []
```
And then I ran the following:
```
codex-rs$ cargo run --bin codex exec 'what is the weather in san francisco'
[2025-05-06T22:40:05] Task started: 1
[2025-05-06T22:40:18] Agent message: Here’s the latest National Weather Service forecast for San Francisco (downtown, near 37.77° N, 122.42° W):
This Afternoon (Tue):
• Sunny, high near 69 °F
• West-southwest wind around 12 mph
Tonight:
• Partly cloudy, low around 52 °F
• SW wind 7–10 mph
...
```
Note that Codex itself is not able to make network calls, so it would
not normally be able to get live weather information like this. However,
the weather MCP is [currently] not run under the Codex sandbox, so it is
able to hit `api.weather.gov` and fetch current weather information.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/829).
* #836
* __->__ #829
2025-05-06 15:47:59 -07:00
|
|
|
use std::collections::HashMap;
|
2025-05-13 16:52:52 -07:00
|
|
|
use std::path::Path;
|
2025-04-27 21:47:50 -07:00
|
|
|
use std::path::PathBuf;
|
feat: add support for -c/--config to override individual config items (#1137)
This PR introduces support for `-c`/`--config` so users can override
individual config values on the command line using `--config
name=value`. Example:
```
codex --config model=o4-mini
```
Making it possible to set arbitrary config values on the command line
results in a more flexible configuration scheme and makes it easier to
provide single-line examples that can be copy-pasted from documentation.
Effectively, it means there are four levels of configuration for some
values:
- Default value (e.g., `model` currently defaults to `o4-mini`)
- Value in `config.toml` (e.g., user could override the default to be
`model = "o3"` in their `config.toml`)
- Specifying `-c` or `--config` to override `model` (e.g., user can
include `-c model=o3` in their list of args to Codex)
- If available, a config-specific flag can be used, which takes
precedence over `-c` (e.g., user can specify `--model o3` in their list
of args to Codex)
Now that it is possible to specify anything that could be configured in
`config.toml` on the command line using `-c`, we do not need to have a
custom flag for every possible config option (which can clutter the
output of `--help`). To that end, as part of this PR, we drop support
for the `--disable-response-storage` flag, as users can now specify `-c
disable_response_storage=true` to get the equivalent functionality.
Under the hood, this works by loading the `config.toml` into a
`toml::Value`. Then for each `key=value`, we create a small synthetic
TOML file with `value` so that we can run the TOML parser to get the
equivalent `toml::Value`. We then parse `key` to determine the point in
the original `toml::Value` to do the insert/replace. Once all of the
overrides from `-c` args have been applied, the `toml::Value` is
deserialized into a `ConfigToml` and then the `ConfigOverrides` are
applied, as before.
2025-05-27 23:11:44 -07:00
|
|
|
use toml::Value as TomlValue;
|
feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629)
As stated in `codex-rs/README.md`:
Today, Codex CLI is written in TypeScript and requires Node.js 22+ to
run it. For a number of users, this runtime requirement inhibits
adoption: they would be better served by a standalone executable. As
maintainers, we want Codex to run efficiently in a wide range of
environments with minimal overhead. We also want to take advantage of
operating system-specific APIs to provide better sandboxing, where
possible.
To that end, we are moving forward with a Rust implementation of Codex
CLI contained in this folder, which has the following benefits:
- The CLI compiles to small, standalone, platform-specific binaries.
- Can make direct, native calls to
[seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and
[landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in
order to support sandboxing on Linux.
- No runtime garbage collection, resulting in lower memory consumption
and better, more predictable performance.
Currently, the Rust implementation is materially behind the TypeScript
implementation in functionality, so continue to use the TypeScript
implmentation for the time being. We will publish native executables via
GitHub Releases as soon as we feel the Rust version is usable.
2025-04-24 13:31:40 -07:00
|
|
|
|
2025-05-10 17:52:59 -07:00
|
|
|
/// Maximum number of bytes of the documentation that will be embedded. Larger
|
|
|
|
|
/// files are *silently truncated* to this size so we do not take up too much of
|
|
|
|
|
/// the context window.
|
|
|
|
|
pub(crate) const PROJECT_DOC_MAX_BYTES: usize = 32 * 1024; // 32 KiB
|
|
|
|
|
|
2025-04-27 21:47:50 -07:00
|
|
|
/// Application configuration loaded from disk and merged with overrides.
|
2025-05-13 16:52:52 -07:00
|
|
|
#[derive(Debug, Clone, PartialEq)]
|
feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629)
As stated in `codex-rs/README.md`:
Today, Codex CLI is written in TypeScript and requires Node.js 22+ to
run it. For a number of users, this runtime requirement inhibits
adoption: they would be better served by a standalone executable. As
maintainers, we want Codex to run efficiently in a wide range of
environments with minimal overhead. We also want to take advantage of
operating system-specific APIs to provide better sandboxing, where
possible.
To that end, we are moving forward with a Rust implementation of Codex
CLI contained in this folder, which has the following benefits:
- The CLI compiles to small, standalone, platform-specific binaries.
- Can make direct, native calls to
[seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and
[landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in
order to support sandboxing on Linux.
- No runtime garbage collection, resulting in lower memory consumption
and better, more predictable performance.
Currently, the Rust implementation is materially behind the TypeScript
implementation in functionality, so continue to use the TypeScript
implmentation for the time being. We will publish native executables via
GitHub Releases as soon as we feel the Rust version is usable.
2025-04-24 13:31:40 -07:00
|
|
|
pub struct Config {
|
2025-04-27 21:47:50 -07:00
|
|
|
/// Optional override of model selection.
|
|
|
|
|
pub model: String,
|
fix: overhaul SandboxPolicy and config loading in Rust (#732)
Previous to this PR, `SandboxPolicy` was a bit difficult to work with:
https://github.com/openai/codex/blob/237f8a11e11fdcc793a09e787e48215676d9b95b/codex-rs/core/src/protocol.rs#L98-L108
Specifically:
* It was an `enum` and therefore options were mutually exclusive as
opposed to additive.
* It defined things in terms of what the agent _could not_ do as opposed
to what they _could_ do. This made things hard to support because we
would prefer to build up a sandbox config by starting with something
extremely restrictive and only granting permissions for things the user
as explicitly allowed.
This PR changes things substantially by redefining the policy in terms
of two concepts:
* A `SandboxPermission` enum that defines permissions that can be
granted to the agent/sandbox.
* A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
but externally exposes a simpler API that can be used to configure
Seatbelt/Landlock.
Previous to this PR, we supported a `--sandbox` flag that effectively
mapped to an enum value in `SandboxPolicy`. Though now that
`SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
`--sandbox` flag no longer makes sense. While I could have turned it
into a flag that the user can specify multiple times, I think the
current values to use with such a flag are long and potentially messy,
so for the moment, I have dropped support for `--sandbox` altogether and
we can bring it back once we have figured out the naming thing.
Since `--sandbox` is gone, users now have to specify `--full-auto` to
get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
way to specify the equivalent of `--full-auto` in your `config.toml`
right now, so we will have to revisit that, as well.
Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
changed considerably, I had to overhaul how config loading works, as
well. There are now two distinct concepts, `ConfigToml` and `Config`:
* `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
might expect, every field is `Optional` and it is `#[derive(Deserialize,
Default)]`. Consistent use of `Optional` makes it clear what the user
has specified explicitly.
* `Config` is the "normalized config" and is produced by merging
`ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
`Option<Vec<SandboxPermission>>`, `Config` presents only the final
`SandboxPolicy`.
The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
special attention to ensure we are faithfully mapping the
`SandboxPolicy` to the Seatbelt and Landlock configs, respectively.
Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
`(allow file-read*)` has been removed from the `.sbpl` file as now this
is added to the policy in `core/src/exec.rs` when
`sandbox_policy.has_full_disk_read_access()` is `true`.
2025-04-29 15:01:16 -07:00
|
|
|
|
2025-05-08 21:46:06 -07:00
|
|
|
/// Key into the model_providers map that specifies which provider to use.
|
|
|
|
|
pub model_provider_id: String,
|
|
|
|
|
|
2025-05-07 17:38:28 -07:00
|
|
|
/// Info needed to make an API request to the model.
|
|
|
|
|
pub model_provider: ModelProviderInfo,
|
|
|
|
|
|
fix: overhaul SandboxPolicy and config loading in Rust (#732)
Previous to this PR, `SandboxPolicy` was a bit difficult to work with:
https://github.com/openai/codex/blob/237f8a11e11fdcc793a09e787e48215676d9b95b/codex-rs/core/src/protocol.rs#L98-L108
Specifically:
* It was an `enum` and therefore options were mutually exclusive as
opposed to additive.
* It defined things in terms of what the agent _could not_ do as opposed
to what they _could_ do. This made things hard to support because we
would prefer to build up a sandbox config by starting with something
extremely restrictive and only granting permissions for things the user
as explicitly allowed.
This PR changes things substantially by redefining the policy in terms
of two concepts:
* A `SandboxPermission` enum that defines permissions that can be
granted to the agent/sandbox.
* A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
but externally exposes a simpler API that can be used to configure
Seatbelt/Landlock.
Previous to this PR, we supported a `--sandbox` flag that effectively
mapped to an enum value in `SandboxPolicy`. Though now that
`SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
`--sandbox` flag no longer makes sense. While I could have turned it
into a flag that the user can specify multiple times, I think the
current values to use with such a flag are long and potentially messy,
so for the moment, I have dropped support for `--sandbox` altogether and
we can bring it back once we have figured out the naming thing.
Since `--sandbox` is gone, users now have to specify `--full-auto` to
get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
way to specify the equivalent of `--full-auto` in your `config.toml`
right now, so we will have to revisit that, as well.
Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
changed considerably, I had to overhaul how config loading works, as
well. There are now two distinct concepts, `ConfigToml` and `Config`:
* `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
might expect, every field is `Optional` and it is `#[derive(Deserialize,
Default)]`. Consistent use of `Optional` makes it clear what the user
has specified explicitly.
* `Config` is the "normalized config" and is produced by merging
`ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
`Option<Vec<SandboxPermission>>`, `Config` presents only the final
`SandboxPolicy`.
The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
special attention to ensure we are faithfully mapping the
`SandboxPolicy` to the Seatbelt and Landlock configs, respectively.
Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
`(allow file-read*)` has been removed from the `.sbpl` file as now this
is added to the policy in `core/src/exec.rs` when
`sandbox_policy.has_full_disk_read_access()` is `true`.
2025-04-29 15:01:16 -07:00
|
|
|
/// Approval policy for executing commands.
|
2025-04-27 21:47:50 -07:00
|
|
|
pub approval_policy: AskForApproval,
|
fix: overhaul SandboxPolicy and config loading in Rust (#732)
Previous to this PR, `SandboxPolicy` was a bit difficult to work with:
https://github.com/openai/codex/blob/237f8a11e11fdcc793a09e787e48215676d9b95b/codex-rs/core/src/protocol.rs#L98-L108
Specifically:
* It was an `enum` and therefore options were mutually exclusive as
opposed to additive.
* It defined things in terms of what the agent _could not_ do as opposed
to what they _could_ do. This made things hard to support because we
would prefer to build up a sandbox config by starting with something
extremely restrictive and only granting permissions for things the user
as explicitly allowed.
This PR changes things substantially by redefining the policy in terms
of two concepts:
* A `SandboxPermission` enum that defines permissions that can be
granted to the agent/sandbox.
* A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
but externally exposes a simpler API that can be used to configure
Seatbelt/Landlock.
Previous to this PR, we supported a `--sandbox` flag that effectively
mapped to an enum value in `SandboxPolicy`. Though now that
`SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
`--sandbox` flag no longer makes sense. While I could have turned it
into a flag that the user can specify multiple times, I think the
current values to use with such a flag are long and potentially messy,
so for the moment, I have dropped support for `--sandbox` altogether and
we can bring it back once we have figured out the naming thing.
Since `--sandbox` is gone, users now have to specify `--full-auto` to
get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
way to specify the equivalent of `--full-auto` in your `config.toml`
right now, so we will have to revisit that, as well.
Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
changed considerably, I had to overhaul how config loading works, as
well. There are now two distinct concepts, `ConfigToml` and `Config`:
* `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
might expect, every field is `Optional` and it is `#[derive(Deserialize,
Default)]`. Consistent use of `Optional` makes it clear what the user
has specified explicitly.
* `Config` is the "normalized config" and is produced by merging
`ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
`Option<Vec<SandboxPermission>>`, `Config` presents only the final
`SandboxPolicy`.
The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
special attention to ensure we are faithfully mapping the
`SandboxPolicy` to the Seatbelt and Landlock configs, respectively.
Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
`(allow file-read*)` has been removed from the `.sbpl` file as now this
is added to the policy in `core/src/exec.rs` when
`sandbox_policy.has_full_disk_read_access()` is `true`.
2025-04-29 15:01:16 -07:00
|
|
|
|
2025-04-27 21:47:50 -07:00
|
|
|
pub sandbox_policy: SandboxPolicy,
|
2025-04-28 15:39:34 -07:00
|
|
|
|
feat: introduce support for shell_environment_policy in config.toml (#1061)
To date, when handling `shell` and `local_shell` tool calls, we were
spawning new processes using the environment inherited from the Codex
process itself. This means that the sensitive `OPENAI_API_KEY` that
Codex needs to talk to OpenAI models was made available to everything
run by `shell` and `local_shell`. While there are cases where that might
be useful, it does not seem like a good default.
This PR introduces a complex `shell_environment_policy` config option to
control the `env` used with these tool calls. It is inevitably a bit
complex so that it is possible to override individual components of the
policy so without having to restate the entire thing.
Details are in the updated `README.md` in this PR, but here is the
relevant bit that explains the individual fields of
`shell_environment_policy`:
| Field | Type | Default | Description |
| ------------------------- | -------------------------- | ------- |
-----------------------------------------------------------------------------------------------------------------------------------------------
|
| `inherit` | string | `core` | Starting template for the
environment:<br>`core` (`HOME`, `PATH`, `USER`, …), `all` (clone full
parent env), or `none` (start empty). |
| `ignore_default_excludes` | boolean | `false` | When `false`, Codex
removes any var whose **name** contains `KEY`, `SECRET`, or `TOKEN`
(case-insensitive) before other rules run. |
| `exclude` | array<string> | `[]` | Case-insensitive glob
patterns to drop after the default filter.<br>Examples: `"AWS_*"`,
`"AZURE_*"`. |
| `set` | table<string,string> | `{}` | Explicit key/value
overrides or additions – always win over inherited values. |
| `include_only` | array<string> | `[]` | If non-empty, a
whitelist of patterns; only variables that match _one_ pattern survive
the final step. (Generally used with `inherit = "all"`.) |
In particular, note that the default is `inherit = "core"`, so:
* if you have extra env variables that you want to inherit from the
parent process, use `inherit = "all"` and then specify `include_only`
* if you have extra env variables where you want to hardcode the values,
the default `inherit = "core"` will work fine, but then you need to
specify `set`
This configuration is not battle-tested, so we will probably still have
to play with it a bit. `core/src/exec_env.rs` has the critical business
logic as well as unit tests.
Though if nothing else, previous to this change:
```
$ cargo run --bin codex -- debug seatbelt -- printenv OPENAI_API_KEY
# ...prints OPENAI_API_KEY...
```
But after this change it does not print anything (as desired).
One final thing to call out about this PR is that the
`configure_command!` macro we use in `core/src/exec.rs` has to do some
complex logic with respect to how it builds up the `env` for the process
being spawned under Landlock/seccomp. Specifically, doing
`cmd.env_clear()` followed by `cmd.envs(&$env_map)` (which is arguably
the most intuitive way to do it) caused the Landlock unit tests to fail
because the processes spawned by the unit tests started failing in
unexpected ways! If we forgo `env_clear()` in favor of updating env vars
one at a time, the tests still pass. The comment in the code talks about
this a bit, and while I would like to investigate this more, I need to
move on for the moment, but I do plan to come back to it to fully
understand what is going on. For example, this suggests that we might
not be able to spawn a C program that calls `env_clear()`, which would
be...weird. We may still have to fiddle with our Landlock config if that
is the case.
2025-05-22 09:51:19 -07:00
|
|
|
pub shell_environment_policy: ShellEnvironmentPolicy,
|
|
|
|
|
|
2025-04-28 15:39:34 -07:00
|
|
|
/// Disable server-side response storage (sends the full conversation
|
|
|
|
|
/// context with every request). Currently necessary for OpenAI customers
|
|
|
|
|
/// who have opted into Zero Data Retention (ZDR).
|
|
|
|
|
pub disable_response_storage: bool,
|
|
|
|
|
|
2025-05-12 17:24:44 -07:00
|
|
|
/// User-provided instructions from instructions.md.
|
feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629)
As stated in `codex-rs/README.md`:
Today, Codex CLI is written in TypeScript and requires Node.js 22+ to
run it. For a number of users, this runtime requirement inhibits
adoption: they would be better served by a standalone executable. As
maintainers, we want Codex to run efficiently in a wide range of
environments with minimal overhead. We also want to take advantage of
operating system-specific APIs to provide better sandboxing, where
possible.
To that end, we are moving forward with a Rust implementation of Codex
CLI contained in this folder, which has the following benefits:
- The CLI compiles to small, standalone, platform-specific binaries.
- Can make direct, native calls to
[seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and
[landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in
order to support sandboxing on Linux.
- No runtime garbage collection, resulting in lower memory consumption
and better, more predictable performance.
Currently, the Rust implementation is materially behind the TypeScript
implementation in functionality, so continue to use the TypeScript
implmentation for the time being. We will publish native executables via
GitHub Releases as soon as we feel the Rust version is usable.
2025-04-24 13:31:40 -07:00
|
|
|
pub instructions: Option<String>,
|
feat: configurable notifications in the Rust CLI (#793)
With this change, you can specify a program that will be executed to get
notified about events generated by Codex. The notification info will be
packaged as a JSON object. The supported notification types are defined
by the `UserNotification` enum introduced in this PR. Initially, it
contains only one variant, `AgentTurnComplete`:
```rust
pub(crate) enum UserNotification {
#[serde(rename_all = "kebab-case")]
AgentTurnComplete {
turn_id: String,
/// Messages that the user sent to the agent to initiate the turn.
input_messages: Vec<String>,
/// The last message sent by the assistant in the turn.
last_assistant_message: Option<String>,
},
}
```
This is intended to support the common case when a "turn" ends, which
often means it is now your chance to give Codex further instructions.
For example, I have the following in my `~/.codex/config.toml`:
```toml
notify = ["python3", "/Users/mbolin/.codex/notify.py"]
```
I created my own custom notifier script that calls out to
[terminal-notifier](https://github.com/julienXX/terminal-notifier) to
show a desktop push notification on macOS. Contents of `notify.py`:
```python
#!/usr/bin/env python3
import json
import subprocess
import sys
def main() -> int:
if len(sys.argv) != 2:
print("Usage: notify.py <NOTIFICATION_JSON>")
return 1
try:
notification = json.loads(sys.argv[1])
except json.JSONDecodeError:
return 1
match notification_type := notification.get("type"):
case "agent-turn-complete":
assistant_message = notification.get("last-assistant-message")
if assistant_message:
title = f"Codex: {assistant_message}"
else:
title = "Codex: Turn Complete!"
input_messages = notification.get("input_messages", [])
message = " ".join(input_messages)
title += message
case _:
print(f"not sending a push notification for: {notification_type}")
return 0
subprocess.check_output(
[
"terminal-notifier",
"-title",
title,
"-message",
message,
"-group",
"codex",
"-ignoreDnD",
"-activate",
"com.googlecode.iterm2",
]
)
return 0
if __name__ == "__main__":
sys.exit(main())
```
For reference, here are related PRs that tried to add this functionality
to the TypeScript version of the Codex CLI:
* https://github.com/openai/codex/pull/160
* https://github.com/openai/codex/pull/498
2025-05-02 19:48:13 -07:00
|
|
|
|
|
|
|
|
/// Optional external notifier command. When set, Codex will spawn this
|
|
|
|
|
/// program after each completed *turn* (i.e. when the agent finishes
|
|
|
|
|
/// processing a user submission). The value must be the full command
|
|
|
|
|
/// broken into argv tokens **without** the trailing JSON argument - Codex
|
|
|
|
|
/// appends one extra argument containing a JSON payload describing the
|
|
|
|
|
/// event.
|
|
|
|
|
///
|
|
|
|
|
/// Example `~/.codex/config.toml` snippet:
|
|
|
|
|
///
|
|
|
|
|
/// ```toml
|
|
|
|
|
/// notify = ["notify-send", "Codex"]
|
|
|
|
|
/// ```
|
|
|
|
|
///
|
|
|
|
|
/// which will be invoked as:
|
|
|
|
|
///
|
|
|
|
|
/// ```shell
|
|
|
|
|
/// notify-send Codex '{"type":"agent-turn-complete","turn-id":"12345"}'
|
|
|
|
|
/// ```
|
|
|
|
|
///
|
|
|
|
|
/// If unset the feature is disabled.
|
|
|
|
|
pub notify: Option<Vec<String>>,
|
2025-05-04 10:57:12 -07:00
|
|
|
|
|
|
|
|
/// The directory that should be treated as the current working directory
|
|
|
|
|
/// for the session. All relative paths inside the business-logic layer are
|
|
|
|
|
/// resolved against this path.
|
|
|
|
|
pub cwd: PathBuf,
|
feat: support mcp_servers in config.toml (#829)
This adds initial support for MCP servers in the style of Claude Desktop
and Cursor. Note this PR is the bare minimum to get things working end
to end: all configured MCP servers are launched every time Codex is run,
there is no recovery for MCP servers that crash, etc.
(Also, I took some shortcuts to change some fields of `Session` to be
`pub(crate)`, which also means there are circular deps between
`codex.rs` and `mcp_tool_call.rs`, but I will clean that up in a
subsequent PR.)
`codex-rs/README.md` is updated as part of this PR to explain how to use
this feature. There is a bit of plumbing to route the new settings from
`Config` to the business logic in `codex.rs`. The most significant
chunks for new code are in `mcp_connection_manager.rs` (which defines
the `McpConnectionManager` struct) and `mcp_tool_call.rs`, which is
responsible for tool calls.
This PR also introduces new `McpToolCallBegin` and `McpToolCallEnd`
event types to the protocol, but does not add any handlers for them.
(See https://github.com/openai/codex/pull/836 for initial usage.)
To test, I added the following to my `~/.codex/config.toml`:
```toml
# Local build of https://github.com/hideya/mcp-server-weather-js
[mcp_servers.weather]
command = "/Users/mbolin/code/mcp-server-weather-js/dist/index.js"
args = []
```
And then I ran the following:
```
codex-rs$ cargo run --bin codex exec 'what is the weather in san francisco'
[2025-05-06T22:40:05] Task started: 1
[2025-05-06T22:40:18] Agent message: Here’s the latest National Weather Service forecast for San Francisco (downtown, near 37.77° N, 122.42° W):
This Afternoon (Tue):
• Sunny, high near 69 °F
• West-southwest wind around 12 mph
Tonight:
• Partly cloudy, low around 52 °F
• SW wind 7–10 mph
...
```
Note that Codex itself is not able to make network calls, so it would
not normally be able to get live weather information like this. However,
the weather MCP is [currently] not run under the Codex sandbox, so it is
able to hit `api.weather.gov` and fetch current weather information.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/829).
* #836
* __->__ #829
2025-05-06 15:47:59 -07:00
|
|
|
|
|
|
|
|
/// Definition for MCP servers that Codex can reach out to for tool calls.
|
|
|
|
|
pub mcp_servers: HashMap<String, McpServerConfig>,
|
2025-05-07 17:38:28 -07:00
|
|
|
|
|
|
|
|
/// Combined provider map (defaults merged with user-defined overrides).
|
|
|
|
|
pub model_providers: HashMap<String, ModelProviderInfo>,
|
2025-05-10 17:52:59 -07:00
|
|
|
|
|
|
|
|
/// Maximum number of bytes to include from an AGENTS.md project doc file.
|
|
|
|
|
pub project_doc_max_bytes: usize,
|
2025-05-15 00:30:13 -07:00
|
|
|
|
|
|
|
|
/// Directory containing all Codex state (defaults to `~/.codex` but can be
|
|
|
|
|
/// overridden by the `CODEX_HOME` environment variable).
|
|
|
|
|
pub codex_home: PathBuf,
|
feat: record messages from user in ~/.codex/history.jsonl (#939)
This is a large change to support a "history" feature like you would
expect in a shell like Bash.
History events are recorded in `$CODEX_HOME/history.jsonl`. Because it
is a JSONL file, it is straightforward to append new entries (as opposed
to the TypeScript file that uses `$CODEX_HOME/history.json`, so to be
valid JSON, each new entry entails rewriting the entire file). Because
it is possible for there to be multiple instances of Codex CLI writing
to `history.jsonl` at once, we use advisory file locking when working
with `history.jsonl` in `codex-rs/core/src/message_history.rs`.
Because we believe history is a sufficiently useful feature, we enable
it by default. Though to provide some safety, we set the file
permissions of `history.jsonl` to be `o600` so that other users on the
system cannot read the user's history. We do not yet support a default
list of `SENSITIVE_PATTERNS` as the TypeScript CLI does:
https://github.com/openai/codex/blob/3fdf9df1335ac9501e3fb0e61715359145711e8b/codex-cli/src/utils/storage/command-history.ts#L10-L17
We are going to take a more conservative approach to this list in the
Rust CLI. For example, while `/\b[A-Za-z0-9-_]{20,}\b/` might exclude
sensitive information like API tokens, it would also exclude valuable
information such as references to Git commits.
As noted in the updated documentation, users can opt-out of history by
adding the following to `config.toml`:
```toml
[history]
persistence = "none"
```
Because `history.jsonl` could, in theory, be quite large, we take a[n
arguably overly pedantic] approach in reading history entries into
memory. Specifically, we start by telling the client the current number
of entries in the history file (`history_entry_count`) as well as the
inode (`history_log_id`) of `history.jsonl` (see the new fields on
`SessionConfiguredEvent`).
The client is responsible for keeping new entries in memory to create a
"local history," but if the user hits up enough times to go "past" the
end of local history, then the client should use the new
`GetHistoryEntryRequest` in the protocol to fetch older entries.
Specifically, it should pass the `history_log_id` it was given
originally and work backwards from `history_entry_count`. (It should
really fetch history in batches rather than one-at-a-time, but that is
something we can improve upon in subsequent PRs.)
The motivation behind this crazy scheme is that it is designed to defend
against:
* The `history.jsonl` being truncated during the session such that the
index into the history is no longer consistent with what had been read
up to that point. We do not yet have logic to enforce a `max_bytes` for
`history.jsonl`, but once we do, we will aspire to implement it in a way
that should result in a new inode for the file on most systems.
* New items from concurrent Codex CLI sessions amending to the history.
Because, in absence of truncation, `history.jsonl` is an append-only
log, so long as the client reads backwards from `history_entry_count`,
it should always get a consistent view of history. (That said, it will
not be able to read _new_ commands from concurrent sessions, but perhaps
we will introduce a `/` command to reload latest history or something
down the road.)
Admittedly, my testing of this feature thus far has been fairly light. I
expect we will find bugs and introduce enhancements/fixes going forward.
2025-05-15 16:26:23 -07:00
|
|
|
|
|
|
|
|
/// Settings that govern if and what will be written to `~/.codex/history.jsonl`.
|
|
|
|
|
pub history: History,
|
2025-05-16 11:33:08 -07:00
|
|
|
|
|
|
|
|
/// Optional URI-based file opener. If set, citations to files in the model
|
|
|
|
|
/// output will be hyperlinked using the specified URI scheme.
|
|
|
|
|
pub file_opener: UriBasedFileOpener,
|
2025-05-16 16:16:50 -07:00
|
|
|
|
|
|
|
|
/// Collection of settings that are specific to the TUI.
|
|
|
|
|
pub tui: Tui,
|
2025-05-22 21:52:28 -07:00
|
|
|
|
|
|
|
|
/// Path to the `codex-linux-sandbox` executable. This must be set if
|
|
|
|
|
/// [`crate::exec::SandboxType::LinuxSeccomp`] is used. Note that this
|
|
|
|
|
/// cannot be set in the config file: it must be set in code via
|
|
|
|
|
/// [`ConfigOverrides`].
|
|
|
|
|
///
|
|
|
|
|
/// When this program is invoked, arg0 will be set to `codex-linux-sandbox`.
|
|
|
|
|
pub codex_linux_sandbox_exe: Option<PathBuf>,
|
feat: record messages from user in ~/.codex/history.jsonl (#939)
This is a large change to support a "history" feature like you would
expect in a shell like Bash.
History events are recorded in `$CODEX_HOME/history.jsonl`. Because it
is a JSONL file, it is straightforward to append new entries (as opposed
to the TypeScript file that uses `$CODEX_HOME/history.json`, so to be
valid JSON, each new entry entails rewriting the entire file). Because
it is possible for there to be multiple instances of Codex CLI writing
to `history.jsonl` at once, we use advisory file locking when working
with `history.jsonl` in `codex-rs/core/src/message_history.rs`.
Because we believe history is a sufficiently useful feature, we enable
it by default. Though to provide some safety, we set the file
permissions of `history.jsonl` to be `o600` so that other users on the
system cannot read the user's history. We do not yet support a default
list of `SENSITIVE_PATTERNS` as the TypeScript CLI does:
https://github.com/openai/codex/blob/3fdf9df1335ac9501e3fb0e61715359145711e8b/codex-cli/src/utils/storage/command-history.ts#L10-L17
We are going to take a more conservative approach to this list in the
Rust CLI. For example, while `/\b[A-Za-z0-9-_]{20,}\b/` might exclude
sensitive information like API tokens, it would also exclude valuable
information such as references to Git commits.
As noted in the updated documentation, users can opt-out of history by
adding the following to `config.toml`:
```toml
[history]
persistence = "none"
```
Because `history.jsonl` could, in theory, be quite large, we take a[n
arguably overly pedantic] approach in reading history entries into
memory. Specifically, we start by telling the client the current number
of entries in the history file (`history_entry_count`) as well as the
inode (`history_log_id`) of `history.jsonl` (see the new fields on
`SessionConfiguredEvent`).
The client is responsible for keeping new entries in memory to create a
"local history," but if the user hits up enough times to go "past" the
end of local history, then the client should use the new
`GetHistoryEntryRequest` in the protocol to fetch older entries.
Specifically, it should pass the `history_log_id` it was given
originally and work backwards from `history_entry_count`. (It should
really fetch history in batches rather than one-at-a-time, but that is
something we can improve upon in subsequent PRs.)
The motivation behind this crazy scheme is that it is designed to defend
against:
* The `history.jsonl` being truncated during the session such that the
index into the history is no longer consistent with what had been read
up to that point. We do not yet have logic to enforce a `max_bytes` for
`history.jsonl`, but once we do, we will aspire to implement it in a way
that should result in a new inode for the file on most systems.
* New items from concurrent Codex CLI sessions amending to the history.
Because, in absence of truncation, `history.jsonl` is an append-only
log, so long as the client reads backwards from `history_entry_count`,
it should always get a consistent view of history. (That said, it will
not be able to read _new_ commands from concurrent sessions, but perhaps
we will introduce a `/` command to reload latest history or something
down the road.)
Admittedly, my testing of this feature thus far has been fairly light. I
expect we will find bugs and introduce enhancements/fixes going forward.
2025-05-15 16:26:23 -07:00
|
|
|
}
|
|
|
|
|
|
feat: add support for -c/--config to override individual config items (#1137)
This PR introduces support for `-c`/`--config` so users can override
individual config values on the command line using `--config
name=value`. Example:
```
codex --config model=o4-mini
```
Making it possible to set arbitrary config values on the command line
results in a more flexible configuration scheme and makes it easier to
provide single-line examples that can be copy-pasted from documentation.
Effectively, it means there are four levels of configuration for some
values:
- Default value (e.g., `model` currently defaults to `o4-mini`)
- Value in `config.toml` (e.g., user could override the default to be
`model = "o3"` in their `config.toml`)
- Specifying `-c` or `--config` to override `model` (e.g., user can
include `-c model=o3` in their list of args to Codex)
- If available, a config-specific flag can be used, which takes
precedence over `-c` (e.g., user can specify `--model o3` in their list
of args to Codex)
Now that it is possible to specify anything that could be configured in
`config.toml` on the command line using `-c`, we do not need to have a
custom flag for every possible config option (which can clutter the
output of `--help`). To that end, as part of this PR, we drop support
for the `--disable-response-storage` flag, as users can now specify `-c
disable_response_storage=true` to get the equivalent functionality.
Under the hood, this works by loading the `config.toml` into a
`toml::Value`. Then for each `key=value`, we create a small synthetic
TOML file with `value` so that we can run the TOML parser to get the
equivalent `toml::Value`. We then parse `key` to determine the point in
the original `toml::Value` to do the insert/replace. Once all of the
overrides from `-c` args have been applied, the `toml::Value` is
deserialized into a `ConfigToml` and then the `ConfigOverrides` are
applied, as before.
2025-05-27 23:11:44 -07:00
|
|
|
impl Config {
|
|
|
|
|
/// Load configuration with *generic* CLI overrides (`-c key=value`) applied
|
|
|
|
|
/// **in between** the values parsed from `config.toml` and the
|
|
|
|
|
/// strongly-typed overrides specified via [`ConfigOverrides`].
|
|
|
|
|
///
|
|
|
|
|
/// The precedence order is therefore: `config.toml` < `-c` overrides <
|
|
|
|
|
/// `ConfigOverrides`.
|
|
|
|
|
pub fn load_with_cli_overrides(
|
|
|
|
|
cli_overrides: Vec<(String, TomlValue)>,
|
|
|
|
|
overrides: ConfigOverrides,
|
|
|
|
|
) -> std::io::Result<Self> {
|
|
|
|
|
// Resolve the directory that stores Codex state (e.g. ~/.codex or the
|
|
|
|
|
// value of $CODEX_HOME) so we can embed it into the resulting
|
|
|
|
|
// `Config` instance.
|
|
|
|
|
let codex_home = find_codex_home()?;
|
|
|
|
|
|
|
|
|
|
// Step 1: parse `config.toml` into a generic JSON value.
|
|
|
|
|
let mut root_value = load_config_as_toml(&codex_home)?;
|
|
|
|
|
|
|
|
|
|
// Step 2: apply the `-c` overrides.
|
|
|
|
|
for (path, value) in cli_overrides.into_iter() {
|
|
|
|
|
apply_toml_override(&mut root_value, &path, value);
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// Step 3: deserialize into `ConfigToml` so that Serde can enforce the
|
|
|
|
|
// correct types.
|
|
|
|
|
let cfg: ConfigToml = root_value.try_into().map_err(|e| {
|
|
|
|
|
tracing::error!("Failed to deserialize overridden config: {e}");
|
|
|
|
|
std::io::Error::new(std::io::ErrorKind::InvalidData, e)
|
|
|
|
|
})?;
|
|
|
|
|
|
|
|
|
|
// Step 4: merge with the strongly-typed overrides.
|
|
|
|
|
Self::load_from_base_config_with_overrides(cfg, overrides, codex_home)
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/// Read `CODEX_HOME/config.toml` and return it as a generic TOML value. Returns
|
|
|
|
|
/// an empty TOML table when the file does not exist.
|
|
|
|
|
fn load_config_as_toml(codex_home: &Path) -> std::io::Result<TomlValue> {
|
|
|
|
|
let config_path = codex_home.join("config.toml");
|
|
|
|
|
match std::fs::read_to_string(&config_path) {
|
|
|
|
|
Ok(contents) => match toml::from_str::<TomlValue>(&contents) {
|
|
|
|
|
Ok(val) => Ok(val),
|
|
|
|
|
Err(e) => {
|
|
|
|
|
tracing::error!("Failed to parse config.toml: {e}");
|
|
|
|
|
Err(std::io::Error::new(std::io::ErrorKind::InvalidData, e))
|
|
|
|
|
}
|
|
|
|
|
},
|
|
|
|
|
Err(e) if e.kind() == std::io::ErrorKind::NotFound => {
|
|
|
|
|
tracing::info!("config.toml not found, using defaults");
|
|
|
|
|
Ok(TomlValue::Table(Default::default()))
|
|
|
|
|
}
|
|
|
|
|
Err(e) => {
|
|
|
|
|
tracing::error!("Failed to read config.toml: {e}");
|
|
|
|
|
Err(e)
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/// Apply a single dotted-path override onto a TOML value.
|
|
|
|
|
fn apply_toml_override(root: &mut TomlValue, path: &str, value: TomlValue) {
|
|
|
|
|
use toml::value::Table;
|
|
|
|
|
|
|
|
|
|
let segments: Vec<&str> = path.split('.').collect();
|
|
|
|
|
let mut current = root;
|
|
|
|
|
|
|
|
|
|
for (idx, segment) in segments.iter().enumerate() {
|
|
|
|
|
let is_last = idx == segments.len() - 1;
|
|
|
|
|
|
|
|
|
|
if is_last {
|
|
|
|
|
match current {
|
|
|
|
|
TomlValue::Table(table) => {
|
|
|
|
|
table.insert(segment.to_string(), value);
|
|
|
|
|
}
|
|
|
|
|
_ => {
|
|
|
|
|
let mut table = Table::new();
|
|
|
|
|
table.insert(segment.to_string(), value);
|
|
|
|
|
*current = TomlValue::Table(table);
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
return;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// Traverse or create intermediate object.
|
|
|
|
|
match current {
|
|
|
|
|
TomlValue::Table(table) => {
|
|
|
|
|
current = table
|
|
|
|
|
.entry(segment.to_string())
|
|
|
|
|
.or_insert_with(|| TomlValue::Table(Table::new()));
|
|
|
|
|
}
|
|
|
|
|
_ => {
|
|
|
|
|
*current = TomlValue::Table(Table::new());
|
|
|
|
|
if let TomlValue::Table(tbl) = current {
|
|
|
|
|
current = tbl
|
|
|
|
|
.entry(segment.to_string())
|
|
|
|
|
.or_insert_with(|| TomlValue::Table(Table::new()));
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
fix: overhaul SandboxPolicy and config loading in Rust (#732)
Previous to this PR, `SandboxPolicy` was a bit difficult to work with:
https://github.com/openai/codex/blob/237f8a11e11fdcc793a09e787e48215676d9b95b/codex-rs/core/src/protocol.rs#L98-L108
Specifically:
* It was an `enum` and therefore options were mutually exclusive as
opposed to additive.
* It defined things in terms of what the agent _could not_ do as opposed
to what they _could_ do. This made things hard to support because we
would prefer to build up a sandbox config by starting with something
extremely restrictive and only granting permissions for things the user
as explicitly allowed.
This PR changes things substantially by redefining the policy in terms
of two concepts:
* A `SandboxPermission` enum that defines permissions that can be
granted to the agent/sandbox.
* A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
but externally exposes a simpler API that can be used to configure
Seatbelt/Landlock.
Previous to this PR, we supported a `--sandbox` flag that effectively
mapped to an enum value in `SandboxPolicy`. Though now that
`SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
`--sandbox` flag no longer makes sense. While I could have turned it
into a flag that the user can specify multiple times, I think the
current values to use with such a flag are long and potentially messy,
so for the moment, I have dropped support for `--sandbox` altogether and
we can bring it back once we have figured out the naming thing.
Since `--sandbox` is gone, users now have to specify `--full-auto` to
get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
way to specify the equivalent of `--full-auto` in your `config.toml`
right now, so we will have to revisit that, as well.
Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
changed considerably, I had to overhaul how config loading works, as
well. There are now two distinct concepts, `ConfigToml` and `Config`:
* `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
might expect, every field is `Optional` and it is `#[derive(Deserialize,
Default)]`. Consistent use of `Optional` makes it clear what the user
has specified explicitly.
* `Config` is the "normalized config" and is produced by merging
`ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
`Option<Vec<SandboxPermission>>`, `Config` presents only the final
`SandboxPolicy`.
The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
special attention to ensure we are faithfully mapping the
`SandboxPolicy` to the Seatbelt and Landlock configs, respectively.
Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
`(allow file-read*)` has been removed from the `.sbpl` file as now this
is added to the policy in `core/src/exec.rs` when
`sandbox_policy.has_full_disk_read_access()` is `true`.
2025-04-29 15:01:16 -07:00
|
|
|
/// Base config deserialized from ~/.codex/config.toml.
|
|
|
|
|
#[derive(Deserialize, Debug, Clone, Default)]
|
|
|
|
|
pub struct ConfigToml {
|
|
|
|
|
/// Optional override of model selection.
|
|
|
|
|
pub model: Option<String>,
|
|
|
|
|
|
2025-05-07 17:38:28 -07:00
|
|
|
/// Provider to use from the model_providers map.
|
|
|
|
|
pub model_provider: Option<String>,
|
|
|
|
|
|
fix: overhaul SandboxPolicy and config loading in Rust (#732)
Previous to this PR, `SandboxPolicy` was a bit difficult to work with:
https://github.com/openai/codex/blob/237f8a11e11fdcc793a09e787e48215676d9b95b/codex-rs/core/src/protocol.rs#L98-L108
Specifically:
* It was an `enum` and therefore options were mutually exclusive as
opposed to additive.
* It defined things in terms of what the agent _could not_ do as opposed
to what they _could_ do. This made things hard to support because we
would prefer to build up a sandbox config by starting with something
extremely restrictive and only granting permissions for things the user
as explicitly allowed.
This PR changes things substantially by redefining the policy in terms
of two concepts:
* A `SandboxPermission` enum that defines permissions that can be
granted to the agent/sandbox.
* A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
but externally exposes a simpler API that can be used to configure
Seatbelt/Landlock.
Previous to this PR, we supported a `--sandbox` flag that effectively
mapped to an enum value in `SandboxPolicy`. Though now that
`SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
`--sandbox` flag no longer makes sense. While I could have turned it
into a flag that the user can specify multiple times, I think the
current values to use with such a flag are long and potentially messy,
so for the moment, I have dropped support for `--sandbox` altogether and
we can bring it back once we have figured out the naming thing.
Since `--sandbox` is gone, users now have to specify `--full-auto` to
get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
way to specify the equivalent of `--full-auto` in your `config.toml`
right now, so we will have to revisit that, as well.
Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
changed considerably, I had to overhaul how config loading works, as
well. There are now two distinct concepts, `ConfigToml` and `Config`:
* `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
might expect, every field is `Optional` and it is `#[derive(Deserialize,
Default)]`. Consistent use of `Optional` makes it clear what the user
has specified explicitly.
* `Config` is the "normalized config" and is produced by merging
`ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
`Option<Vec<SandboxPermission>>`, `Config` presents only the final
`SandboxPolicy`.
The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
special attention to ensure we are faithfully mapping the
`SandboxPolicy` to the Seatbelt and Landlock configs, respectively.
Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
`(allow file-read*)` has been removed from the `.sbpl` file as now this
is added to the policy in `core/src/exec.rs` when
`sandbox_policy.has_full_disk_read_access()` is `true`.
2025-04-29 15:01:16 -07:00
|
|
|
/// Default approval policy for executing commands.
|
|
|
|
|
pub approval_policy: Option<AskForApproval>,
|
|
|
|
|
|
feat: introduce support for shell_environment_policy in config.toml (#1061)
To date, when handling `shell` and `local_shell` tool calls, we were
spawning new processes using the environment inherited from the Codex
process itself. This means that the sensitive `OPENAI_API_KEY` that
Codex needs to talk to OpenAI models was made available to everything
run by `shell` and `local_shell`. While there are cases where that might
be useful, it does not seem like a good default.
This PR introduces a complex `shell_environment_policy` config option to
control the `env` used with these tool calls. It is inevitably a bit
complex so that it is possible to override individual components of the
policy so without having to restate the entire thing.
Details are in the updated `README.md` in this PR, but here is the
relevant bit that explains the individual fields of
`shell_environment_policy`:
| Field | Type | Default | Description |
| ------------------------- | -------------------------- | ------- |
-----------------------------------------------------------------------------------------------------------------------------------------------
|
| `inherit` | string | `core` | Starting template for the
environment:<br>`core` (`HOME`, `PATH`, `USER`, …), `all` (clone full
parent env), or `none` (start empty). |
| `ignore_default_excludes` | boolean | `false` | When `false`, Codex
removes any var whose **name** contains `KEY`, `SECRET`, or `TOKEN`
(case-insensitive) before other rules run. |
| `exclude` | array<string> | `[]` | Case-insensitive glob
patterns to drop after the default filter.<br>Examples: `"AWS_*"`,
`"AZURE_*"`. |
| `set` | table<string,string> | `{}` | Explicit key/value
overrides or additions – always win over inherited values. |
| `include_only` | array<string> | `[]` | If non-empty, a
whitelist of patterns; only variables that match _one_ pattern survive
the final step. (Generally used with `inherit = "all"`.) |
In particular, note that the default is `inherit = "core"`, so:
* if you have extra env variables that you want to inherit from the
parent process, use `inherit = "all"` and then specify `include_only`
* if you have extra env variables where you want to hardcode the values,
the default `inherit = "core"` will work fine, but then you need to
specify `set`
This configuration is not battle-tested, so we will probably still have
to play with it a bit. `core/src/exec_env.rs` has the critical business
logic as well as unit tests.
Though if nothing else, previous to this change:
```
$ cargo run --bin codex -- debug seatbelt -- printenv OPENAI_API_KEY
# ...prints OPENAI_API_KEY...
```
But after this change it does not print anything (as desired).
One final thing to call out about this PR is that the
`configure_command!` macro we use in `core/src/exec.rs` has to do some
complex logic with respect to how it builds up the `env` for the process
being spawned under Landlock/seccomp. Specifically, doing
`cmd.env_clear()` followed by `cmd.envs(&$env_map)` (which is arguably
the most intuitive way to do it) caused the Landlock unit tests to fail
because the processes spawned by the unit tests started failing in
unexpected ways! If we forgo `env_clear()` in favor of updating env vars
one at a time, the tests still pass. The comment in the code talks about
this a bit, and while I would like to investigate this more, I need to
move on for the moment, but I do plan to come back to it to fully
understand what is going on. For example, this suggests that we might
not be able to spawn a C program that calls `env_clear()`, which would
be...weird. We may still have to fiddle with our Landlock config if that
is the case.
2025-05-22 09:51:19 -07:00
|
|
|
#[serde(default)]
|
|
|
|
|
pub shell_environment_policy: ShellEnvironmentPolicyToml,
|
|
|
|
|
|
2025-04-29 18:42:52 -07:00
|
|
|
// The `default` attribute ensures that the field is treated as `None` when
|
|
|
|
|
// the key is omitted from the TOML. Without it, Serde treats the field as
|
|
|
|
|
// required because we supply a custom deserializer.
|
|
|
|
|
#[serde(default, deserialize_with = "deserialize_sandbox_permissions")]
|
fix: overhaul SandboxPolicy and config loading in Rust (#732)
Previous to this PR, `SandboxPolicy` was a bit difficult to work with:
https://github.com/openai/codex/blob/237f8a11e11fdcc793a09e787e48215676d9b95b/codex-rs/core/src/protocol.rs#L98-L108
Specifically:
* It was an `enum` and therefore options were mutually exclusive as
opposed to additive.
* It defined things in terms of what the agent _could not_ do as opposed
to what they _could_ do. This made things hard to support because we
would prefer to build up a sandbox config by starting with something
extremely restrictive and only granting permissions for things the user
as explicitly allowed.
This PR changes things substantially by redefining the policy in terms
of two concepts:
* A `SandboxPermission` enum that defines permissions that can be
granted to the agent/sandbox.
* A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
but externally exposes a simpler API that can be used to configure
Seatbelt/Landlock.
Previous to this PR, we supported a `--sandbox` flag that effectively
mapped to an enum value in `SandboxPolicy`. Though now that
`SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
`--sandbox` flag no longer makes sense. While I could have turned it
into a flag that the user can specify multiple times, I think the
current values to use with such a flag are long and potentially messy,
so for the moment, I have dropped support for `--sandbox` altogether and
we can bring it back once we have figured out the naming thing.
Since `--sandbox` is gone, users now have to specify `--full-auto` to
get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
way to specify the equivalent of `--full-auto` in your `config.toml`
right now, so we will have to revisit that, as well.
Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
changed considerably, I had to overhaul how config loading works, as
well. There are now two distinct concepts, `ConfigToml` and `Config`:
* `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
might expect, every field is `Optional` and it is `#[derive(Deserialize,
Default)]`. Consistent use of `Optional` makes it clear what the user
has specified explicitly.
* `Config` is the "normalized config" and is produced by merging
`ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
`Option<Vec<SandboxPermission>>`, `Config` presents only the final
`SandboxPolicy`.
The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
special attention to ensure we are faithfully mapping the
`SandboxPolicy` to the Seatbelt and Landlock configs, respectively.
Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
`(allow file-read*)` has been removed from the `.sbpl` file as now this
is added to the policy in `core/src/exec.rs` when
`sandbox_policy.has_full_disk_read_access()` is `true`.
2025-04-29 15:01:16 -07:00
|
|
|
pub sandbox_permissions: Option<Vec<SandboxPermission>>,
|
|
|
|
|
|
|
|
|
|
/// Disable server-side response storage (sends the full conversation
|
|
|
|
|
/// context with every request). Currently necessary for OpenAI customers
|
|
|
|
|
/// who have opted into Zero Data Retention (ZDR).
|
|
|
|
|
pub disable_response_storage: Option<bool>,
|
|
|
|
|
|
feat: configurable notifications in the Rust CLI (#793)
With this change, you can specify a program that will be executed to get
notified about events generated by Codex. The notification info will be
packaged as a JSON object. The supported notification types are defined
by the `UserNotification` enum introduced in this PR. Initially, it
contains only one variant, `AgentTurnComplete`:
```rust
pub(crate) enum UserNotification {
#[serde(rename_all = "kebab-case")]
AgentTurnComplete {
turn_id: String,
/// Messages that the user sent to the agent to initiate the turn.
input_messages: Vec<String>,
/// The last message sent by the assistant in the turn.
last_assistant_message: Option<String>,
},
}
```
This is intended to support the common case when a "turn" ends, which
often means it is now your chance to give Codex further instructions.
For example, I have the following in my `~/.codex/config.toml`:
```toml
notify = ["python3", "/Users/mbolin/.codex/notify.py"]
```
I created my own custom notifier script that calls out to
[terminal-notifier](https://github.com/julienXX/terminal-notifier) to
show a desktop push notification on macOS. Contents of `notify.py`:
```python
#!/usr/bin/env python3
import json
import subprocess
import sys
def main() -> int:
if len(sys.argv) != 2:
print("Usage: notify.py <NOTIFICATION_JSON>")
return 1
try:
notification = json.loads(sys.argv[1])
except json.JSONDecodeError:
return 1
match notification_type := notification.get("type"):
case "agent-turn-complete":
assistant_message = notification.get("last-assistant-message")
if assistant_message:
title = f"Codex: {assistant_message}"
else:
title = "Codex: Turn Complete!"
input_messages = notification.get("input_messages", [])
message = " ".join(input_messages)
title += message
case _:
print(f"not sending a push notification for: {notification_type}")
return 0
subprocess.check_output(
[
"terminal-notifier",
"-title",
title,
"-message",
message,
"-group",
"codex",
"-ignoreDnD",
"-activate",
"com.googlecode.iterm2",
]
)
return 0
if __name__ == "__main__":
sys.exit(main())
```
For reference, here are related PRs that tried to add this functionality
to the TypeScript version of the Codex CLI:
* https://github.com/openai/codex/pull/160
* https://github.com/openai/codex/pull/498
2025-05-02 19:48:13 -07:00
|
|
|
/// Optional external command to spawn for end-user notifications.
|
|
|
|
|
#[serde(default)]
|
|
|
|
|
pub notify: Option<Vec<String>>,
|
|
|
|
|
|
fix: overhaul SandboxPolicy and config loading in Rust (#732)
Previous to this PR, `SandboxPolicy` was a bit difficult to work with:
https://github.com/openai/codex/blob/237f8a11e11fdcc793a09e787e48215676d9b95b/codex-rs/core/src/protocol.rs#L98-L108
Specifically:
* It was an `enum` and therefore options were mutually exclusive as
opposed to additive.
* It defined things in terms of what the agent _could not_ do as opposed
to what they _could_ do. This made things hard to support because we
would prefer to build up a sandbox config by starting with something
extremely restrictive and only granting permissions for things the user
as explicitly allowed.
This PR changes things substantially by redefining the policy in terms
of two concepts:
* A `SandboxPermission` enum that defines permissions that can be
granted to the agent/sandbox.
* A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
but externally exposes a simpler API that can be used to configure
Seatbelt/Landlock.
Previous to this PR, we supported a `--sandbox` flag that effectively
mapped to an enum value in `SandboxPolicy`. Though now that
`SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
`--sandbox` flag no longer makes sense. While I could have turned it
into a flag that the user can specify multiple times, I think the
current values to use with such a flag are long and potentially messy,
so for the moment, I have dropped support for `--sandbox` altogether and
we can bring it back once we have figured out the naming thing.
Since `--sandbox` is gone, users now have to specify `--full-auto` to
get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
way to specify the equivalent of `--full-auto` in your `config.toml`
right now, so we will have to revisit that, as well.
Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
changed considerably, I had to overhaul how config loading works, as
well. There are now two distinct concepts, `ConfigToml` and `Config`:
* `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
might expect, every field is `Optional` and it is `#[derive(Deserialize,
Default)]`. Consistent use of `Optional` makes it clear what the user
has specified explicitly.
* `Config` is the "normalized config" and is produced by merging
`ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
`Option<Vec<SandboxPermission>>`, `Config` presents only the final
`SandboxPolicy`.
The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
special attention to ensure we are faithfully mapping the
`SandboxPolicy` to the Seatbelt and Landlock configs, respectively.
Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
`(allow file-read*)` has been removed from the `.sbpl` file as now this
is added to the policy in `core/src/exec.rs` when
`sandbox_policy.has_full_disk_read_access()` is `true`.
2025-04-29 15:01:16 -07:00
|
|
|
/// System instructions.
|
|
|
|
|
pub instructions: Option<String>,
|
feat: support mcp_servers in config.toml (#829)
This adds initial support for MCP servers in the style of Claude Desktop
and Cursor. Note this PR is the bare minimum to get things working end
to end: all configured MCP servers are launched every time Codex is run,
there is no recovery for MCP servers that crash, etc.
(Also, I took some shortcuts to change some fields of `Session` to be
`pub(crate)`, which also means there are circular deps between
`codex.rs` and `mcp_tool_call.rs`, but I will clean that up in a
subsequent PR.)
`codex-rs/README.md` is updated as part of this PR to explain how to use
this feature. There is a bit of plumbing to route the new settings from
`Config` to the business logic in `codex.rs`. The most significant
chunks for new code are in `mcp_connection_manager.rs` (which defines
the `McpConnectionManager` struct) and `mcp_tool_call.rs`, which is
responsible for tool calls.
This PR also introduces new `McpToolCallBegin` and `McpToolCallEnd`
event types to the protocol, but does not add any handlers for them.
(See https://github.com/openai/codex/pull/836 for initial usage.)
To test, I added the following to my `~/.codex/config.toml`:
```toml
# Local build of https://github.com/hideya/mcp-server-weather-js
[mcp_servers.weather]
command = "/Users/mbolin/code/mcp-server-weather-js/dist/index.js"
args = []
```
And then I ran the following:
```
codex-rs$ cargo run --bin codex exec 'what is the weather in san francisco'
[2025-05-06T22:40:05] Task started: 1
[2025-05-06T22:40:18] Agent message: Here’s the latest National Weather Service forecast for San Francisco (downtown, near 37.77° N, 122.42° W):
This Afternoon (Tue):
• Sunny, high near 69 °F
• West-southwest wind around 12 mph
Tonight:
• Partly cloudy, low around 52 °F
• SW wind 7–10 mph
...
```
Note that Codex itself is not able to make network calls, so it would
not normally be able to get live weather information like this. However,
the weather MCP is [currently] not run under the Codex sandbox, so it is
able to hit `api.weather.gov` and fetch current weather information.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/829).
* #836
* __->__ #829
2025-05-06 15:47:59 -07:00
|
|
|
|
|
|
|
|
/// Definition for MCP servers that Codex can reach out to for tool calls.
|
|
|
|
|
#[serde(default)]
|
|
|
|
|
pub mcp_servers: HashMap<String, McpServerConfig>,
|
2025-05-07 17:38:28 -07:00
|
|
|
|
|
|
|
|
/// User-defined provider entries that extend/override the built-in list.
|
|
|
|
|
#[serde(default)]
|
|
|
|
|
pub model_providers: HashMap<String, ModelProviderInfo>,
|
2025-05-10 17:52:59 -07:00
|
|
|
|
|
|
|
|
/// Maximum number of bytes to include from an AGENTS.md project doc file.
|
|
|
|
|
pub project_doc_max_bytes: Option<usize>,
|
2025-05-13 16:52:52 -07:00
|
|
|
|
|
|
|
|
/// Profile to use from the `profiles` map.
|
|
|
|
|
pub profile: Option<String>,
|
|
|
|
|
|
|
|
|
|
/// Named profiles to facilitate switching between different configurations.
|
|
|
|
|
#[serde(default)]
|
|
|
|
|
pub profiles: HashMap<String, ConfigProfile>,
|
feat: record messages from user in ~/.codex/history.jsonl (#939)
This is a large change to support a "history" feature like you would
expect in a shell like Bash.
History events are recorded in `$CODEX_HOME/history.jsonl`. Because it
is a JSONL file, it is straightforward to append new entries (as opposed
to the TypeScript file that uses `$CODEX_HOME/history.json`, so to be
valid JSON, each new entry entails rewriting the entire file). Because
it is possible for there to be multiple instances of Codex CLI writing
to `history.jsonl` at once, we use advisory file locking when working
with `history.jsonl` in `codex-rs/core/src/message_history.rs`.
Because we believe history is a sufficiently useful feature, we enable
it by default. Though to provide some safety, we set the file
permissions of `history.jsonl` to be `o600` so that other users on the
system cannot read the user's history. We do not yet support a default
list of `SENSITIVE_PATTERNS` as the TypeScript CLI does:
https://github.com/openai/codex/blob/3fdf9df1335ac9501e3fb0e61715359145711e8b/codex-cli/src/utils/storage/command-history.ts#L10-L17
We are going to take a more conservative approach to this list in the
Rust CLI. For example, while `/\b[A-Za-z0-9-_]{20,}\b/` might exclude
sensitive information like API tokens, it would also exclude valuable
information such as references to Git commits.
As noted in the updated documentation, users can opt-out of history by
adding the following to `config.toml`:
```toml
[history]
persistence = "none"
```
Because `history.jsonl` could, in theory, be quite large, we take a[n
arguably overly pedantic] approach in reading history entries into
memory. Specifically, we start by telling the client the current number
of entries in the history file (`history_entry_count`) as well as the
inode (`history_log_id`) of `history.jsonl` (see the new fields on
`SessionConfiguredEvent`).
The client is responsible for keeping new entries in memory to create a
"local history," but if the user hits up enough times to go "past" the
end of local history, then the client should use the new
`GetHistoryEntryRequest` in the protocol to fetch older entries.
Specifically, it should pass the `history_log_id` it was given
originally and work backwards from `history_entry_count`. (It should
really fetch history in batches rather than one-at-a-time, but that is
something we can improve upon in subsequent PRs.)
The motivation behind this crazy scheme is that it is designed to defend
against:
* The `history.jsonl` being truncated during the session such that the
index into the history is no longer consistent with what had been read
up to that point. We do not yet have logic to enforce a `max_bytes` for
`history.jsonl`, but once we do, we will aspire to implement it in a way
that should result in a new inode for the file on most systems.
* New items from concurrent Codex CLI sessions amending to the history.
Because, in absence of truncation, `history.jsonl` is an append-only
log, so long as the client reads backwards from `history_entry_count`,
it should always get a consistent view of history. (That said, it will
not be able to read _new_ commands from concurrent sessions, but perhaps
we will introduce a `/` command to reload latest history or something
down the road.)
Admittedly, my testing of this feature thus far has been fairly light. I
expect we will find bugs and introduce enhancements/fixes going forward.
2025-05-15 16:26:23 -07:00
|
|
|
|
|
|
|
|
/// Settings that govern if and what will be written to `~/.codex/history.jsonl`.
|
|
|
|
|
#[serde(default)]
|
|
|
|
|
pub history: Option<History>,
|
2025-05-16 11:33:08 -07:00
|
|
|
|
|
|
|
|
/// Optional URI-based file opener. If set, citations to files in the model
|
|
|
|
|
/// output will be hyperlinked using the specified URI scheme.
|
|
|
|
|
pub file_opener: Option<UriBasedFileOpener>,
|
2025-05-16 16:16:50 -07:00
|
|
|
|
|
|
|
|
/// Collection of settings that are specific to the TUI.
|
|
|
|
|
pub tui: Option<Tui>,
|
fix: overhaul SandboxPolicy and config loading in Rust (#732)
Previous to this PR, `SandboxPolicy` was a bit difficult to work with:
https://github.com/openai/codex/blob/237f8a11e11fdcc793a09e787e48215676d9b95b/codex-rs/core/src/protocol.rs#L98-L108
Specifically:
* It was an `enum` and therefore options were mutually exclusive as
opposed to additive.
* It defined things in terms of what the agent _could not_ do as opposed
to what they _could_ do. This made things hard to support because we
would prefer to build up a sandbox config by starting with something
extremely restrictive and only granting permissions for things the user
as explicitly allowed.
This PR changes things substantially by redefining the policy in terms
of two concepts:
* A `SandboxPermission` enum that defines permissions that can be
granted to the agent/sandbox.
* A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
but externally exposes a simpler API that can be used to configure
Seatbelt/Landlock.
Previous to this PR, we supported a `--sandbox` flag that effectively
mapped to an enum value in `SandboxPolicy`. Though now that
`SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
`--sandbox` flag no longer makes sense. While I could have turned it
into a flag that the user can specify multiple times, I think the
current values to use with such a flag are long and potentially messy,
so for the moment, I have dropped support for `--sandbox` altogether and
we can bring it back once we have figured out the naming thing.
Since `--sandbox` is gone, users now have to specify `--full-auto` to
get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
way to specify the equivalent of `--full-auto` in your `config.toml`
right now, so we will have to revisit that, as well.
Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
changed considerably, I had to overhaul how config loading works, as
well. There are now two distinct concepts, `ConfigToml` and `Config`:
* `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
might expect, every field is `Optional` and it is `#[derive(Deserialize,
Default)]`. Consistent use of `Optional` makes it clear what the user
has specified explicitly.
* `Config` is the "normalized config" and is produced by merging
`ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
`Option<Vec<SandboxPermission>>`, `Config` presents only the final
`SandboxPolicy`.
The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
special attention to ensure we are faithfully mapping the
`SandboxPolicy` to the Seatbelt and Landlock configs, respectively.
Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
`(allow file-read*)` has been removed from the `.sbpl` file as now this
is added to the policy in `core/src/exec.rs` when
`sandbox_policy.has_full_disk_read_access()` is `true`.
2025-04-29 15:01:16 -07:00
|
|
|
}
|
|
|
|
|
|
2025-04-29 18:42:52 -07:00
|
|
|
fn deserialize_sandbox_permissions<'de, D>(
|
|
|
|
|
deserializer: D,
|
|
|
|
|
) -> Result<Option<Vec<SandboxPermission>>, D::Error>
|
|
|
|
|
where
|
|
|
|
|
D: serde::Deserializer<'de>,
|
|
|
|
|
{
|
|
|
|
|
let permissions: Option<Vec<String>> = Option::deserialize(deserializer)?;
|
|
|
|
|
|
|
|
|
|
match permissions {
|
|
|
|
|
Some(raw_permissions) => {
|
2025-05-15 00:30:13 -07:00
|
|
|
let base_path = find_codex_home().map_err(serde::de::Error::custom)?;
|
2025-04-29 18:42:52 -07:00
|
|
|
|
|
|
|
|
let converted = raw_permissions
|
|
|
|
|
.into_iter()
|
|
|
|
|
.map(|raw| {
|
|
|
|
|
parse_sandbox_permission_with_base_path(&raw, base_path.clone())
|
|
|
|
|
.map_err(serde::de::Error::custom)
|
|
|
|
|
})
|
|
|
|
|
.collect::<Result<Vec<_>, D::Error>>()?;
|
|
|
|
|
|
|
|
|
|
Ok(Some(converted))
|
|
|
|
|
}
|
|
|
|
|
None => Ok(None),
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2025-04-27 21:47:50 -07:00
|
|
|
/// Optional overrides for user configuration (e.g., from CLI flags).
|
|
|
|
|
#[derive(Default, Debug, Clone)]
|
|
|
|
|
pub struct ConfigOverrides {
|
|
|
|
|
pub model: Option<String>,
|
2025-05-04 10:57:12 -07:00
|
|
|
pub cwd: Option<PathBuf>,
|
2025-04-27 21:47:50 -07:00
|
|
|
pub approval_policy: Option<AskForApproval>,
|
|
|
|
|
pub sandbox_policy: Option<SandboxPolicy>,
|
2025-05-13 16:52:52 -07:00
|
|
|
pub model_provider: Option<String>,
|
|
|
|
|
pub config_profile: Option<String>,
|
2025-05-22 21:52:28 -07:00
|
|
|
pub codex_linux_sandbox_exe: Option<PathBuf>,
|
2025-04-27 21:47:50 -07:00
|
|
|
}
|
feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629)
As stated in `codex-rs/README.md`:
Today, Codex CLI is written in TypeScript and requires Node.js 22+ to
run it. For a number of users, this runtime requirement inhibits
adoption: they would be better served by a standalone executable. As
maintainers, we want Codex to run efficiently in a wide range of
environments with minimal overhead. We also want to take advantage of
operating system-specific APIs to provide better sandboxing, where
possible.
To that end, we are moving forward with a Rust implementation of Codex
CLI contained in this folder, which has the following benefits:
- The CLI compiles to small, standalone, platform-specific binaries.
- Can make direct, native calls to
[seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and
[landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in
order to support sandboxing on Linux.
- No runtime garbage collection, resulting in lower memory consumption
and better, more predictable performance.
Currently, the Rust implementation is materially behind the TypeScript
implementation in functionality, so continue to use the TypeScript
implmentation for the time being. We will publish native executables via
GitHub Releases as soon as we feel the Rust version is usable.
2025-04-24 13:31:40 -07:00
|
|
|
|
2025-04-27 21:47:50 -07:00
|
|
|
impl Config {
|
2025-05-15 00:30:13 -07:00
|
|
|
/// Meant to be used exclusively for tests: `load_with_overrides()` should
|
|
|
|
|
/// be used in all other cases.
|
|
|
|
|
pub fn load_from_base_config_with_overrides(
|
2025-05-07 17:38:28 -07:00
|
|
|
cfg: ConfigToml,
|
|
|
|
|
overrides: ConfigOverrides,
|
2025-05-15 00:30:13 -07:00
|
|
|
codex_home: PathBuf,
|
2025-05-07 17:38:28 -07:00
|
|
|
) -> std::io::Result<Self> {
|
2025-05-15 00:30:13 -07:00
|
|
|
let instructions = Self::load_instructions(Some(&codex_home));
|
feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629)
As stated in `codex-rs/README.md`:
Today, Codex CLI is written in TypeScript and requires Node.js 22+ to
run it. For a number of users, this runtime requirement inhibits
adoption: they would be better served by a standalone executable. As
maintainers, we want Codex to run efficiently in a wide range of
environments with minimal overhead. We also want to take advantage of
operating system-specific APIs to provide better sandboxing, where
possible.
To that end, we are moving forward with a Rust implementation of Codex
CLI contained in this folder, which has the following benefits:
- The CLI compiles to small, standalone, platform-specific binaries.
- Can make direct, native calls to
[seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and
[landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in
order to support sandboxing on Linux.
- No runtime garbage collection, resulting in lower memory consumption
and better, more predictable performance.
Currently, the Rust implementation is materially behind the TypeScript
implementation in functionality, so continue to use the TypeScript
implmentation for the time being. We will publish native executables via
GitHub Releases as soon as we feel the Rust version is usable.
2025-04-24 13:31:40 -07:00
|
|
|
|
2025-04-27 21:47:50 -07:00
|
|
|
// Destructure ConfigOverrides fully to ensure all overrides are applied.
|
|
|
|
|
let ConfigOverrides {
|
|
|
|
|
model,
|
2025-05-04 10:57:12 -07:00
|
|
|
cwd,
|
2025-04-27 21:47:50 -07:00
|
|
|
approval_policy,
|
|
|
|
|
sandbox_policy,
|
2025-05-13 16:52:52 -07:00
|
|
|
model_provider,
|
|
|
|
|
config_profile: config_profile_key,
|
2025-05-22 21:52:28 -07:00
|
|
|
codex_linux_sandbox_exe,
|
2025-04-27 21:47:50 -07:00
|
|
|
} = overrides;
|
|
|
|
|
|
2025-05-13 16:52:52 -07:00
|
|
|
let config_profile = match config_profile_key.or(cfg.profile) {
|
|
|
|
|
Some(key) => cfg
|
|
|
|
|
.profiles
|
|
|
|
|
.get(&key)
|
|
|
|
|
.ok_or_else(|| {
|
|
|
|
|
std::io::Error::new(
|
|
|
|
|
std::io::ErrorKind::NotFound,
|
|
|
|
|
format!("config profile `{key}` not found"),
|
|
|
|
|
)
|
|
|
|
|
})?
|
|
|
|
|
.clone(),
|
|
|
|
|
None => ConfigProfile::default(),
|
|
|
|
|
};
|
|
|
|
|
|
fix: overhaul SandboxPolicy and config loading in Rust (#732)
Previous to this PR, `SandboxPolicy` was a bit difficult to work with:
https://github.com/openai/codex/blob/237f8a11e11fdcc793a09e787e48215676d9b95b/codex-rs/core/src/protocol.rs#L98-L108
Specifically:
* It was an `enum` and therefore options were mutually exclusive as
opposed to additive.
* It defined things in terms of what the agent _could not_ do as opposed
to what they _could_ do. This made things hard to support because we
would prefer to build up a sandbox config by starting with something
extremely restrictive and only granting permissions for things the user
as explicitly allowed.
This PR changes things substantially by redefining the policy in terms
of two concepts:
* A `SandboxPermission` enum that defines permissions that can be
granted to the agent/sandbox.
* A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
but externally exposes a simpler API that can be used to configure
Seatbelt/Landlock.
Previous to this PR, we supported a `--sandbox` flag that effectively
mapped to an enum value in `SandboxPolicy`. Though now that
`SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
`--sandbox` flag no longer makes sense. While I could have turned it
into a flag that the user can specify multiple times, I think the
current values to use with such a flag are long and potentially messy,
so for the moment, I have dropped support for `--sandbox` altogether and
we can bring it back once we have figured out the naming thing.
Since `--sandbox` is gone, users now have to specify `--full-auto` to
get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
way to specify the equivalent of `--full-auto` in your `config.toml`
right now, so we will have to revisit that, as well.
Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
changed considerably, I had to overhaul how config loading works, as
well. There are now two distinct concepts, `ConfigToml` and `Config`:
* `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
might expect, every field is `Optional` and it is `#[derive(Deserialize,
Default)]`. Consistent use of `Optional` makes it clear what the user
has specified explicitly.
* `Config` is the "normalized config" and is produced by merging
`ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
`Option<Vec<SandboxPermission>>`, `Config` presents only the final
`SandboxPolicy`.
The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
special attention to ensure we are faithfully mapping the
`SandboxPolicy` to the Seatbelt and Landlock configs, respectively.
Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
`(allow file-read*)` has been removed from the `.sbpl` file as now this
is added to the policy in `core/src/exec.rs` when
`sandbox_policy.has_full_disk_read_access()` is `true`.
2025-04-29 15:01:16 -07:00
|
|
|
let sandbox_policy = match sandbox_policy {
|
|
|
|
|
Some(sandbox_policy) => sandbox_policy,
|
|
|
|
|
None => {
|
|
|
|
|
// Derive a SandboxPolicy from the permissions in the config.
|
|
|
|
|
match cfg.sandbox_permissions {
|
|
|
|
|
// Note this means the user can explicitly set permissions
|
|
|
|
|
// to the empty list in the config file, granting it no
|
|
|
|
|
// permissions whatsoever.
|
|
|
|
|
Some(permissions) => SandboxPolicy::from(permissions),
|
|
|
|
|
// Default to read only rather than completely locked down.
|
|
|
|
|
None => SandboxPolicy::new_read_only_policy(),
|
|
|
|
|
}
|
2025-04-27 21:47:50 -07:00
|
|
|
}
|
fix: overhaul SandboxPolicy and config loading in Rust (#732)
Previous to this PR, `SandboxPolicy` was a bit difficult to work with:
https://github.com/openai/codex/blob/237f8a11e11fdcc793a09e787e48215676d9b95b/codex-rs/core/src/protocol.rs#L98-L108
Specifically:
* It was an `enum` and therefore options were mutually exclusive as
opposed to additive.
* It defined things in terms of what the agent _could not_ do as opposed
to what they _could_ do. This made things hard to support because we
would prefer to build up a sandbox config by starting with something
extremely restrictive and only granting permissions for things the user
as explicitly allowed.
This PR changes things substantially by redefining the policy in terms
of two concepts:
* A `SandboxPermission` enum that defines permissions that can be
granted to the agent/sandbox.
* A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
but externally exposes a simpler API that can be used to configure
Seatbelt/Landlock.
Previous to this PR, we supported a `--sandbox` flag that effectively
mapped to an enum value in `SandboxPolicy`. Though now that
`SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
`--sandbox` flag no longer makes sense. While I could have turned it
into a flag that the user can specify multiple times, I think the
current values to use with such a flag are long and potentially messy,
so for the moment, I have dropped support for `--sandbox` altogether and
we can bring it back once we have figured out the naming thing.
Since `--sandbox` is gone, users now have to specify `--full-auto` to
get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
way to specify the equivalent of `--full-auto` in your `config.toml`
right now, so we will have to revisit that, as well.
Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
changed considerably, I had to overhaul how config loading works, as
well. There are now two distinct concepts, `ConfigToml` and `Config`:
* `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
might expect, every field is `Optional` and it is `#[derive(Deserialize,
Default)]`. Consistent use of `Optional` makes it clear what the user
has specified explicitly.
* `Config` is the "normalized config" and is produced by merging
`ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
`Option<Vec<SandboxPermission>>`, `Config` presents only the final
`SandboxPolicy`.
The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
special attention to ensure we are faithfully mapping the
`SandboxPolicy` to the Seatbelt and Landlock configs, respectively.
Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
`(allow file-read*)` has been removed from the `.sbpl` file as now this
is added to the policy in `core/src/exec.rs` when
`sandbox_policy.has_full_disk_read_access()` is `true`.
2025-04-29 15:01:16 -07:00
|
|
|
};
|
2025-04-27 21:47:50 -07:00
|
|
|
|
2025-05-07 17:38:28 -07:00
|
|
|
let mut model_providers = built_in_model_providers();
|
|
|
|
|
// Merge user-defined providers into the built-in list.
|
|
|
|
|
for (key, provider) in cfg.model_providers.into_iter() {
|
|
|
|
|
model_providers.entry(key).or_insert(provider);
|
|
|
|
|
}
|
|
|
|
|
|
2025-05-13 16:52:52 -07:00
|
|
|
let model_provider_id = model_provider
|
|
|
|
|
.or(config_profile.model_provider)
|
2025-05-07 17:38:28 -07:00
|
|
|
.or(cfg.model_provider)
|
|
|
|
|
.unwrap_or_else(|| "openai".to_string());
|
|
|
|
|
let model_provider = model_providers
|
2025-05-08 21:46:06 -07:00
|
|
|
.get(&model_provider_id)
|
2025-05-07 17:38:28 -07:00
|
|
|
.ok_or_else(|| {
|
|
|
|
|
std::io::Error::new(
|
|
|
|
|
std::io::ErrorKind::NotFound,
|
2025-05-08 21:46:06 -07:00
|
|
|
format!("Model provider `{model_provider_id}` not found"),
|
2025-05-07 17:38:28 -07:00
|
|
|
)
|
|
|
|
|
})?
|
|
|
|
|
.clone();
|
|
|
|
|
|
feat: introduce support for shell_environment_policy in config.toml (#1061)
To date, when handling `shell` and `local_shell` tool calls, we were
spawning new processes using the environment inherited from the Codex
process itself. This means that the sensitive `OPENAI_API_KEY` that
Codex needs to talk to OpenAI models was made available to everything
run by `shell` and `local_shell`. While there are cases where that might
be useful, it does not seem like a good default.
This PR introduces a complex `shell_environment_policy` config option to
control the `env` used with these tool calls. It is inevitably a bit
complex so that it is possible to override individual components of the
policy so without having to restate the entire thing.
Details are in the updated `README.md` in this PR, but here is the
relevant bit that explains the individual fields of
`shell_environment_policy`:
| Field | Type | Default | Description |
| ------------------------- | -------------------------- | ------- |
-----------------------------------------------------------------------------------------------------------------------------------------------
|
| `inherit` | string | `core` | Starting template for the
environment:<br>`core` (`HOME`, `PATH`, `USER`, …), `all` (clone full
parent env), or `none` (start empty). |
| `ignore_default_excludes` | boolean | `false` | When `false`, Codex
removes any var whose **name** contains `KEY`, `SECRET`, or `TOKEN`
(case-insensitive) before other rules run. |
| `exclude` | array<string> | `[]` | Case-insensitive glob
patterns to drop after the default filter.<br>Examples: `"AWS_*"`,
`"AZURE_*"`. |
| `set` | table<string,string> | `{}` | Explicit key/value
overrides or additions – always win over inherited values. |
| `include_only` | array<string> | `[]` | If non-empty, a
whitelist of patterns; only variables that match _one_ pattern survive
the final step. (Generally used with `inherit = "all"`.) |
In particular, note that the default is `inherit = "core"`, so:
* if you have extra env variables that you want to inherit from the
parent process, use `inherit = "all"` and then specify `include_only`
* if you have extra env variables where you want to hardcode the values,
the default `inherit = "core"` will work fine, but then you need to
specify `set`
This configuration is not battle-tested, so we will probably still have
to play with it a bit. `core/src/exec_env.rs` has the critical business
logic as well as unit tests.
Though if nothing else, previous to this change:
```
$ cargo run --bin codex -- debug seatbelt -- printenv OPENAI_API_KEY
# ...prints OPENAI_API_KEY...
```
But after this change it does not print anything (as desired).
One final thing to call out about this PR is that the
`configure_command!` macro we use in `core/src/exec.rs` has to do some
complex logic with respect to how it builds up the `env` for the process
being spawned under Landlock/seccomp. Specifically, doing
`cmd.env_clear()` followed by `cmd.envs(&$env_map)` (which is arguably
the most intuitive way to do it) caused the Landlock unit tests to fail
because the processes spawned by the unit tests started failing in
unexpected ways! If we forgo `env_clear()` in favor of updating env vars
one at a time, the tests still pass. The comment in the code talks about
this a bit, and while I would like to investigate this more, I need to
move on for the moment, but I do plan to come back to it to fully
understand what is going on. For example, this suggests that we might
not be able to spawn a C program that calls `env_clear()`, which would
be...weird. We may still have to fiddle with our Landlock config if that
is the case.
2025-05-22 09:51:19 -07:00
|
|
|
let shell_environment_policy = cfg.shell_environment_policy.into();
|
|
|
|
|
|
2025-05-12 08:45:46 -07:00
|
|
|
let resolved_cwd = {
|
|
|
|
|
use std::env;
|
|
|
|
|
|
|
|
|
|
match cwd {
|
|
|
|
|
None => {
|
|
|
|
|
tracing::info!("cwd not set, using current dir");
|
|
|
|
|
env::current_dir()?
|
|
|
|
|
}
|
|
|
|
|
Some(p) if p.is_absolute() => p,
|
|
|
|
|
Some(p) => {
|
|
|
|
|
// Resolve relative path against the current working directory.
|
|
|
|
|
tracing::info!("cwd is relative, resolving against current dir");
|
|
|
|
|
let mut current = env::current_dir()?;
|
|
|
|
|
current.push(p);
|
|
|
|
|
current
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
};
|
|
|
|
|
|
feat: record messages from user in ~/.codex/history.jsonl (#939)
This is a large change to support a "history" feature like you would
expect in a shell like Bash.
History events are recorded in `$CODEX_HOME/history.jsonl`. Because it
is a JSONL file, it is straightforward to append new entries (as opposed
to the TypeScript file that uses `$CODEX_HOME/history.json`, so to be
valid JSON, each new entry entails rewriting the entire file). Because
it is possible for there to be multiple instances of Codex CLI writing
to `history.jsonl` at once, we use advisory file locking when working
with `history.jsonl` in `codex-rs/core/src/message_history.rs`.
Because we believe history is a sufficiently useful feature, we enable
it by default. Though to provide some safety, we set the file
permissions of `history.jsonl` to be `o600` so that other users on the
system cannot read the user's history. We do not yet support a default
list of `SENSITIVE_PATTERNS` as the TypeScript CLI does:
https://github.com/openai/codex/blob/3fdf9df1335ac9501e3fb0e61715359145711e8b/codex-cli/src/utils/storage/command-history.ts#L10-L17
We are going to take a more conservative approach to this list in the
Rust CLI. For example, while `/\b[A-Za-z0-9-_]{20,}\b/` might exclude
sensitive information like API tokens, it would also exclude valuable
information such as references to Git commits.
As noted in the updated documentation, users can opt-out of history by
adding the following to `config.toml`:
```toml
[history]
persistence = "none"
```
Because `history.jsonl` could, in theory, be quite large, we take a[n
arguably overly pedantic] approach in reading history entries into
memory. Specifically, we start by telling the client the current number
of entries in the history file (`history_entry_count`) as well as the
inode (`history_log_id`) of `history.jsonl` (see the new fields on
`SessionConfiguredEvent`).
The client is responsible for keeping new entries in memory to create a
"local history," but if the user hits up enough times to go "past" the
end of local history, then the client should use the new
`GetHistoryEntryRequest` in the protocol to fetch older entries.
Specifically, it should pass the `history_log_id` it was given
originally and work backwards from `history_entry_count`. (It should
really fetch history in batches rather than one-at-a-time, but that is
something we can improve upon in subsequent PRs.)
The motivation behind this crazy scheme is that it is designed to defend
against:
* The `history.jsonl` being truncated during the session such that the
index into the history is no longer consistent with what had been read
up to that point. We do not yet have logic to enforce a `max_bytes` for
`history.jsonl`, but once we do, we will aspire to implement it in a way
that should result in a new inode for the file on most systems.
* New items from concurrent Codex CLI sessions amending to the history.
Because, in absence of truncation, `history.jsonl` is an append-only
log, so long as the client reads backwards from `history_entry_count`,
it should always get a consistent view of history. (That said, it will
not be able to read _new_ commands from concurrent sessions, but perhaps
we will introduce a `/` command to reload latest history or something
down the road.)
Admittedly, my testing of this feature thus far has been fairly light. I
expect we will find bugs and introduce enhancements/fixes going forward.
2025-05-15 16:26:23 -07:00
|
|
|
let history = cfg.history.unwrap_or_default();
|
|
|
|
|
|
2025-05-07 17:38:28 -07:00
|
|
|
let config = Self {
|
2025-05-13 16:52:52 -07:00
|
|
|
model: model
|
|
|
|
|
.or(config_profile.model)
|
|
|
|
|
.or(cfg.model)
|
|
|
|
|
.unwrap_or_else(default_model),
|
2025-05-08 21:46:06 -07:00
|
|
|
model_provider_id,
|
2025-05-07 17:38:28 -07:00
|
|
|
model_provider,
|
2025-05-12 08:45:46 -07:00
|
|
|
cwd: resolved_cwd,
|
fix: overhaul SandboxPolicy and config loading in Rust (#732)
Previous to this PR, `SandboxPolicy` was a bit difficult to work with:
https://github.com/openai/codex/blob/237f8a11e11fdcc793a09e787e48215676d9b95b/codex-rs/core/src/protocol.rs#L98-L108
Specifically:
* It was an `enum` and therefore options were mutually exclusive as
opposed to additive.
* It defined things in terms of what the agent _could not_ do as opposed
to what they _could_ do. This made things hard to support because we
would prefer to build up a sandbox config by starting with something
extremely restrictive and only granting permissions for things the user
as explicitly allowed.
This PR changes things substantially by redefining the policy in terms
of two concepts:
* A `SandboxPermission` enum that defines permissions that can be
granted to the agent/sandbox.
* A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
but externally exposes a simpler API that can be used to configure
Seatbelt/Landlock.
Previous to this PR, we supported a `--sandbox` flag that effectively
mapped to an enum value in `SandboxPolicy`. Though now that
`SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
`--sandbox` flag no longer makes sense. While I could have turned it
into a flag that the user can specify multiple times, I think the
current values to use with such a flag are long and potentially messy,
so for the moment, I have dropped support for `--sandbox` altogether and
we can bring it back once we have figured out the naming thing.
Since `--sandbox` is gone, users now have to specify `--full-auto` to
get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
way to specify the equivalent of `--full-auto` in your `config.toml`
right now, so we will have to revisit that, as well.
Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
changed considerably, I had to overhaul how config loading works, as
well. There are now two distinct concepts, `ConfigToml` and `Config`:
* `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
might expect, every field is `Optional` and it is `#[derive(Deserialize,
Default)]`. Consistent use of `Optional` makes it clear what the user
has specified explicitly.
* `Config` is the "normalized config" and is produced by merging
`ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
`Option<Vec<SandboxPermission>>`, `Config` presents only the final
`SandboxPolicy`.
The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
special attention to ensure we are faithfully mapping the
`SandboxPolicy` to the Seatbelt and Landlock configs, respectively.
Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
`(allow file-read*)` has been removed from the `.sbpl` file as now this
is added to the policy in `core/src/exec.rs` when
`sandbox_policy.has_full_disk_read_access()` is `true`.
2025-04-29 15:01:16 -07:00
|
|
|
approval_policy: approval_policy
|
2025-05-13 16:52:52 -07:00
|
|
|
.or(config_profile.approval_policy)
|
fix: overhaul SandboxPolicy and config loading in Rust (#732)
Previous to this PR, `SandboxPolicy` was a bit difficult to work with:
https://github.com/openai/codex/blob/237f8a11e11fdcc793a09e787e48215676d9b95b/codex-rs/core/src/protocol.rs#L98-L108
Specifically:
* It was an `enum` and therefore options were mutually exclusive as
opposed to additive.
* It defined things in terms of what the agent _could not_ do as opposed
to what they _could_ do. This made things hard to support because we
would prefer to build up a sandbox config by starting with something
extremely restrictive and only granting permissions for things the user
as explicitly allowed.
This PR changes things substantially by redefining the policy in terms
of two concepts:
* A `SandboxPermission` enum that defines permissions that can be
granted to the agent/sandbox.
* A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
but externally exposes a simpler API that can be used to configure
Seatbelt/Landlock.
Previous to this PR, we supported a `--sandbox` flag that effectively
mapped to an enum value in `SandboxPolicy`. Though now that
`SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
`--sandbox` flag no longer makes sense. While I could have turned it
into a flag that the user can specify multiple times, I think the
current values to use with such a flag are long and potentially messy,
so for the moment, I have dropped support for `--sandbox` altogether and
we can bring it back once we have figured out the naming thing.
Since `--sandbox` is gone, users now have to specify `--full-auto` to
get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
way to specify the equivalent of `--full-auto` in your `config.toml`
right now, so we will have to revisit that, as well.
Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
changed considerably, I had to overhaul how config loading works, as
well. There are now two distinct concepts, `ConfigToml` and `Config`:
* `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
might expect, every field is `Optional` and it is `#[derive(Deserialize,
Default)]`. Consistent use of `Optional` makes it clear what the user
has specified explicitly.
* `Config` is the "normalized config" and is produced by merging
`ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
`Option<Vec<SandboxPermission>>`, `Config` presents only the final
`SandboxPolicy`.
The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
special attention to ensure we are faithfully mapping the
`SandboxPolicy` to the Seatbelt and Landlock configs, respectively.
Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
`(allow file-read*)` has been removed from the `.sbpl` file as now this
is added to the policy in `core/src/exec.rs` when
`sandbox_policy.has_full_disk_read_access()` is `true`.
2025-04-29 15:01:16 -07:00
|
|
|
.or(cfg.approval_policy)
|
|
|
|
|
.unwrap_or_else(AskForApproval::default),
|
|
|
|
|
sandbox_policy,
|
feat: introduce support for shell_environment_policy in config.toml (#1061)
To date, when handling `shell` and `local_shell` tool calls, we were
spawning new processes using the environment inherited from the Codex
process itself. This means that the sensitive `OPENAI_API_KEY` that
Codex needs to talk to OpenAI models was made available to everything
run by `shell` and `local_shell`. While there are cases where that might
be useful, it does not seem like a good default.
This PR introduces a complex `shell_environment_policy` config option to
control the `env` used with these tool calls. It is inevitably a bit
complex so that it is possible to override individual components of the
policy so without having to restate the entire thing.
Details are in the updated `README.md` in this PR, but here is the
relevant bit that explains the individual fields of
`shell_environment_policy`:
| Field | Type | Default | Description |
| ------------------------- | -------------------------- | ------- |
-----------------------------------------------------------------------------------------------------------------------------------------------
|
| `inherit` | string | `core` | Starting template for the
environment:<br>`core` (`HOME`, `PATH`, `USER`, …), `all` (clone full
parent env), or `none` (start empty). |
| `ignore_default_excludes` | boolean | `false` | When `false`, Codex
removes any var whose **name** contains `KEY`, `SECRET`, or `TOKEN`
(case-insensitive) before other rules run. |
| `exclude` | array<string> | `[]` | Case-insensitive glob
patterns to drop after the default filter.<br>Examples: `"AWS_*"`,
`"AZURE_*"`. |
| `set` | table<string,string> | `{}` | Explicit key/value
overrides or additions – always win over inherited values. |
| `include_only` | array<string> | `[]` | If non-empty, a
whitelist of patterns; only variables that match _one_ pattern survive
the final step. (Generally used with `inherit = "all"`.) |
In particular, note that the default is `inherit = "core"`, so:
* if you have extra env variables that you want to inherit from the
parent process, use `inherit = "all"` and then specify `include_only`
* if you have extra env variables where you want to hardcode the values,
the default `inherit = "core"` will work fine, but then you need to
specify `set`
This configuration is not battle-tested, so we will probably still have
to play with it a bit. `core/src/exec_env.rs` has the critical business
logic as well as unit tests.
Though if nothing else, previous to this change:
```
$ cargo run --bin codex -- debug seatbelt -- printenv OPENAI_API_KEY
# ...prints OPENAI_API_KEY...
```
But after this change it does not print anything (as desired).
One final thing to call out about this PR is that the
`configure_command!` macro we use in `core/src/exec.rs` has to do some
complex logic with respect to how it builds up the `env` for the process
being spawned under Landlock/seccomp. Specifically, doing
`cmd.env_clear()` followed by `cmd.envs(&$env_map)` (which is arguably
the most intuitive way to do it) caused the Landlock unit tests to fail
because the processes spawned by the unit tests started failing in
unexpected ways! If we forgo `env_clear()` in favor of updating env vars
one at a time, the tests still pass. The comment in the code talks about
this a bit, and while I would like to investigate this more, I need to
move on for the moment, but I do plan to come back to it to fully
understand what is going on. For example, this suggests that we might
not be able to spawn a C program that calls `env_clear()`, which would
be...weird. We may still have to fiddle with our Landlock config if that
is the case.
2025-05-22 09:51:19 -07:00
|
|
|
shell_environment_policy,
|
feat: add support for -c/--config to override individual config items (#1137)
This PR introduces support for `-c`/`--config` so users can override
individual config values on the command line using `--config
name=value`. Example:
```
codex --config model=o4-mini
```
Making it possible to set arbitrary config values on the command line
results in a more flexible configuration scheme and makes it easier to
provide single-line examples that can be copy-pasted from documentation.
Effectively, it means there are four levels of configuration for some
values:
- Default value (e.g., `model` currently defaults to `o4-mini`)
- Value in `config.toml` (e.g., user could override the default to be
`model = "o3"` in their `config.toml`)
- Specifying `-c` or `--config` to override `model` (e.g., user can
include `-c model=o3` in their list of args to Codex)
- If available, a config-specific flag can be used, which takes
precedence over `-c` (e.g., user can specify `--model o3` in their list
of args to Codex)
Now that it is possible to specify anything that could be configured in
`config.toml` on the command line using `-c`, we do not need to have a
custom flag for every possible config option (which can clutter the
output of `--help`). To that end, as part of this PR, we drop support
for the `--disable-response-storage` flag, as users can now specify `-c
disable_response_storage=true` to get the equivalent functionality.
Under the hood, this works by loading the `config.toml` into a
`toml::Value`. Then for each `key=value`, we create a small synthetic
TOML file with `value` so that we can run the TOML parser to get the
equivalent `toml::Value`. We then parse `key` to determine the point in
the original `toml::Value` to do the insert/replace. Once all of the
overrides from `-c` args have been applied, the `toml::Value` is
deserialized into a `ConfigToml` and then the `ConfigOverrides` are
applied, as before.
2025-05-27 23:11:44 -07:00
|
|
|
disable_response_storage: config_profile
|
|
|
|
|
.disable_response_storage
|
fix: overhaul SandboxPolicy and config loading in Rust (#732)
Previous to this PR, `SandboxPolicy` was a bit difficult to work with:
https://github.com/openai/codex/blob/237f8a11e11fdcc793a09e787e48215676d9b95b/codex-rs/core/src/protocol.rs#L98-L108
Specifically:
* It was an `enum` and therefore options were mutually exclusive as
opposed to additive.
* It defined things in terms of what the agent _could not_ do as opposed
to what they _could_ do. This made things hard to support because we
would prefer to build up a sandbox config by starting with something
extremely restrictive and only granting permissions for things the user
as explicitly allowed.
This PR changes things substantially by redefining the policy in terms
of two concepts:
* A `SandboxPermission` enum that defines permissions that can be
granted to the agent/sandbox.
* A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
but externally exposes a simpler API that can be used to configure
Seatbelt/Landlock.
Previous to this PR, we supported a `--sandbox` flag that effectively
mapped to an enum value in `SandboxPolicy`. Though now that
`SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
`--sandbox` flag no longer makes sense. While I could have turned it
into a flag that the user can specify multiple times, I think the
current values to use with such a flag are long and potentially messy,
so for the moment, I have dropped support for `--sandbox` altogether and
we can bring it back once we have figured out the naming thing.
Since `--sandbox` is gone, users now have to specify `--full-auto` to
get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
way to specify the equivalent of `--full-auto` in your `config.toml`
right now, so we will have to revisit that, as well.
Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
changed considerably, I had to overhaul how config loading works, as
well. There are now two distinct concepts, `ConfigToml` and `Config`:
* `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
might expect, every field is `Optional` and it is `#[derive(Deserialize,
Default)]`. Consistent use of `Optional` makes it clear what the user
has specified explicitly.
* `Config` is the "normalized config" and is produced by merging
`ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
`Option<Vec<SandboxPermission>>`, `Config` presents only the final
`SandboxPolicy`.
The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
special attention to ensure we are faithfully mapping the
`SandboxPolicy` to the Seatbelt and Landlock configs, respectively.
Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
`(allow file-read*)` has been removed from the `.sbpl` file as now this
is added to the policy in `core/src/exec.rs` when
`sandbox_policy.has_full_disk_read_access()` is `true`.
2025-04-29 15:01:16 -07:00
|
|
|
.or(cfg.disable_response_storage)
|
|
|
|
|
.unwrap_or(false),
|
feat: configurable notifications in the Rust CLI (#793)
With this change, you can specify a program that will be executed to get
notified about events generated by Codex. The notification info will be
packaged as a JSON object. The supported notification types are defined
by the `UserNotification` enum introduced in this PR. Initially, it
contains only one variant, `AgentTurnComplete`:
```rust
pub(crate) enum UserNotification {
#[serde(rename_all = "kebab-case")]
AgentTurnComplete {
turn_id: String,
/// Messages that the user sent to the agent to initiate the turn.
input_messages: Vec<String>,
/// The last message sent by the assistant in the turn.
last_assistant_message: Option<String>,
},
}
```
This is intended to support the common case when a "turn" ends, which
often means it is now your chance to give Codex further instructions.
For example, I have the following in my `~/.codex/config.toml`:
```toml
notify = ["python3", "/Users/mbolin/.codex/notify.py"]
```
I created my own custom notifier script that calls out to
[terminal-notifier](https://github.com/julienXX/terminal-notifier) to
show a desktop push notification on macOS. Contents of `notify.py`:
```python
#!/usr/bin/env python3
import json
import subprocess
import sys
def main() -> int:
if len(sys.argv) != 2:
print("Usage: notify.py <NOTIFICATION_JSON>")
return 1
try:
notification = json.loads(sys.argv[1])
except json.JSONDecodeError:
return 1
match notification_type := notification.get("type"):
case "agent-turn-complete":
assistant_message = notification.get("last-assistant-message")
if assistant_message:
title = f"Codex: {assistant_message}"
else:
title = "Codex: Turn Complete!"
input_messages = notification.get("input_messages", [])
message = " ".join(input_messages)
title += message
case _:
print(f"not sending a push notification for: {notification_type}")
return 0
subprocess.check_output(
[
"terminal-notifier",
"-title",
title,
"-message",
message,
"-group",
"codex",
"-ignoreDnD",
"-activate",
"com.googlecode.iterm2",
]
)
return 0
if __name__ == "__main__":
sys.exit(main())
```
For reference, here are related PRs that tried to add this functionality
to the TypeScript version of the Codex CLI:
* https://github.com/openai/codex/pull/160
* https://github.com/openai/codex/pull/498
2025-05-02 19:48:13 -07:00
|
|
|
notify: cfg.notify,
|
fix: overhaul SandboxPolicy and config loading in Rust (#732)
Previous to this PR, `SandboxPolicy` was a bit difficult to work with:
https://github.com/openai/codex/blob/237f8a11e11fdcc793a09e787e48215676d9b95b/codex-rs/core/src/protocol.rs#L98-L108
Specifically:
* It was an `enum` and therefore options were mutually exclusive as
opposed to additive.
* It defined things in terms of what the agent _could not_ do as opposed
to what they _could_ do. This made things hard to support because we
would prefer to build up a sandbox config by starting with something
extremely restrictive and only granting permissions for things the user
as explicitly allowed.
This PR changes things substantially by redefining the policy in terms
of two concepts:
* A `SandboxPermission` enum that defines permissions that can be
granted to the agent/sandbox.
* A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
but externally exposes a simpler API that can be used to configure
Seatbelt/Landlock.
Previous to this PR, we supported a `--sandbox` flag that effectively
mapped to an enum value in `SandboxPolicy`. Though now that
`SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
`--sandbox` flag no longer makes sense. While I could have turned it
into a flag that the user can specify multiple times, I think the
current values to use with such a flag are long and potentially messy,
so for the moment, I have dropped support for `--sandbox` altogether and
we can bring it back once we have figured out the naming thing.
Since `--sandbox` is gone, users now have to specify `--full-auto` to
get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
way to specify the equivalent of `--full-auto` in your `config.toml`
right now, so we will have to revisit that, as well.
Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
changed considerably, I had to overhaul how config loading works, as
well. There are now two distinct concepts, `ConfigToml` and `Config`:
* `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
might expect, every field is `Optional` and it is `#[derive(Deserialize,
Default)]`. Consistent use of `Optional` makes it clear what the user
has specified explicitly.
* `Config` is the "normalized config" and is produced by merging
`ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
`Option<Vec<SandboxPermission>>`, `Config` presents only the final
`SandboxPolicy`.
The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
special attention to ensure we are faithfully mapping the
`SandboxPolicy` to the Seatbelt and Landlock configs, respectively.
Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
`(allow file-read*)` has been removed from the `.sbpl` file as now this
is added to the policy in `core/src/exec.rs` when
`sandbox_policy.has_full_disk_read_access()` is `true`.
2025-04-29 15:01:16 -07:00
|
|
|
instructions,
|
feat: support mcp_servers in config.toml (#829)
This adds initial support for MCP servers in the style of Claude Desktop
and Cursor. Note this PR is the bare minimum to get things working end
to end: all configured MCP servers are launched every time Codex is run,
there is no recovery for MCP servers that crash, etc.
(Also, I took some shortcuts to change some fields of `Session` to be
`pub(crate)`, which also means there are circular deps between
`codex.rs` and `mcp_tool_call.rs`, but I will clean that up in a
subsequent PR.)
`codex-rs/README.md` is updated as part of this PR to explain how to use
this feature. There is a bit of plumbing to route the new settings from
`Config` to the business logic in `codex.rs`. The most significant
chunks for new code are in `mcp_connection_manager.rs` (which defines
the `McpConnectionManager` struct) and `mcp_tool_call.rs`, which is
responsible for tool calls.
This PR also introduces new `McpToolCallBegin` and `McpToolCallEnd`
event types to the protocol, but does not add any handlers for them.
(See https://github.com/openai/codex/pull/836 for initial usage.)
To test, I added the following to my `~/.codex/config.toml`:
```toml
# Local build of https://github.com/hideya/mcp-server-weather-js
[mcp_servers.weather]
command = "/Users/mbolin/code/mcp-server-weather-js/dist/index.js"
args = []
```
And then I ran the following:
```
codex-rs$ cargo run --bin codex exec 'what is the weather in san francisco'
[2025-05-06T22:40:05] Task started: 1
[2025-05-06T22:40:18] Agent message: Here’s the latest National Weather Service forecast for San Francisco (downtown, near 37.77° N, 122.42° W):
This Afternoon (Tue):
• Sunny, high near 69 °F
• West-southwest wind around 12 mph
Tonight:
• Partly cloudy, low around 52 °F
• SW wind 7–10 mph
...
```
Note that Codex itself is not able to make network calls, so it would
not normally be able to get live weather information like this. However,
the weather MCP is [currently] not run under the Codex sandbox, so it is
able to hit `api.weather.gov` and fetch current weather information.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/829).
* #836
* __->__ #829
2025-05-06 15:47:59 -07:00
|
|
|
mcp_servers: cfg.mcp_servers,
|
2025-05-07 17:38:28 -07:00
|
|
|
model_providers,
|
2025-05-10 17:52:59 -07:00
|
|
|
project_doc_max_bytes: cfg.project_doc_max_bytes.unwrap_or(PROJECT_DOC_MAX_BYTES),
|
2025-05-15 00:30:13 -07:00
|
|
|
codex_home,
|
feat: record messages from user in ~/.codex/history.jsonl (#939)
This is a large change to support a "history" feature like you would
expect in a shell like Bash.
History events are recorded in `$CODEX_HOME/history.jsonl`. Because it
is a JSONL file, it is straightforward to append new entries (as opposed
to the TypeScript file that uses `$CODEX_HOME/history.json`, so to be
valid JSON, each new entry entails rewriting the entire file). Because
it is possible for there to be multiple instances of Codex CLI writing
to `history.jsonl` at once, we use advisory file locking when working
with `history.jsonl` in `codex-rs/core/src/message_history.rs`.
Because we believe history is a sufficiently useful feature, we enable
it by default. Though to provide some safety, we set the file
permissions of `history.jsonl` to be `o600` so that other users on the
system cannot read the user's history. We do not yet support a default
list of `SENSITIVE_PATTERNS` as the TypeScript CLI does:
https://github.com/openai/codex/blob/3fdf9df1335ac9501e3fb0e61715359145711e8b/codex-cli/src/utils/storage/command-history.ts#L10-L17
We are going to take a more conservative approach to this list in the
Rust CLI. For example, while `/\b[A-Za-z0-9-_]{20,}\b/` might exclude
sensitive information like API tokens, it would also exclude valuable
information such as references to Git commits.
As noted in the updated documentation, users can opt-out of history by
adding the following to `config.toml`:
```toml
[history]
persistence = "none"
```
Because `history.jsonl` could, in theory, be quite large, we take a[n
arguably overly pedantic] approach in reading history entries into
memory. Specifically, we start by telling the client the current number
of entries in the history file (`history_entry_count`) as well as the
inode (`history_log_id`) of `history.jsonl` (see the new fields on
`SessionConfiguredEvent`).
The client is responsible for keeping new entries in memory to create a
"local history," but if the user hits up enough times to go "past" the
end of local history, then the client should use the new
`GetHistoryEntryRequest` in the protocol to fetch older entries.
Specifically, it should pass the `history_log_id` it was given
originally and work backwards from `history_entry_count`. (It should
really fetch history in batches rather than one-at-a-time, but that is
something we can improve upon in subsequent PRs.)
The motivation behind this crazy scheme is that it is designed to defend
against:
* The `history.jsonl` being truncated during the session such that the
index into the history is no longer consistent with what had been read
up to that point. We do not yet have logic to enforce a `max_bytes` for
`history.jsonl`, but once we do, we will aspire to implement it in a way
that should result in a new inode for the file on most systems.
* New items from concurrent Codex CLI sessions amending to the history.
Because, in absence of truncation, `history.jsonl` is an append-only
log, so long as the client reads backwards from `history_entry_count`,
it should always get a consistent view of history. (That said, it will
not be able to read _new_ commands from concurrent sessions, but perhaps
we will introduce a `/` command to reload latest history or something
down the road.)
Admittedly, my testing of this feature thus far has been fairly light. I
expect we will find bugs and introduce enhancements/fixes going forward.
2025-05-15 16:26:23 -07:00
|
|
|
history,
|
2025-05-16 11:33:08 -07:00
|
|
|
file_opener: cfg.file_opener.unwrap_or(UriBasedFileOpener::VsCode),
|
2025-05-16 16:16:50 -07:00
|
|
|
tui: cfg.tui.unwrap_or_default(),
|
2025-05-22 21:52:28 -07:00
|
|
|
codex_linux_sandbox_exe,
|
2025-05-07 17:38:28 -07:00
|
|
|
};
|
|
|
|
|
Ok(config)
|
feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629)
As stated in `codex-rs/README.md`:
Today, Codex CLI is written in TypeScript and requires Node.js 22+ to
run it. For a number of users, this runtime requirement inhibits
adoption: they would be better served by a standalone executable. As
maintainers, we want Codex to run efficiently in a wide range of
environments with minimal overhead. We also want to take advantage of
operating system-specific APIs to provide better sandboxing, where
possible.
To that end, we are moving forward with a Rust implementation of Codex
CLI contained in this folder, which has the following benefits:
- The CLI compiles to small, standalone, platform-specific binaries.
- Can make direct, native calls to
[seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and
[landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in
order to support sandboxing on Linux.
- No runtime garbage collection, resulting in lower memory consumption
and better, more predictable performance.
Currently, the Rust implementation is materially behind the TypeScript
implementation in functionality, so continue to use the TypeScript
implmentation for the time being. We will publish native executables via
GitHub Releases as soon as we feel the Rust version is usable.
2025-04-24 13:31:40 -07:00
|
|
|
}
|
|
|
|
|
|
2025-05-13 16:52:52 -07:00
|
|
|
fn load_instructions(codex_dir: Option<&Path>) -> Option<String> {
|
|
|
|
|
let mut p = match codex_dir {
|
|
|
|
|
Some(p) => p.to_path_buf(),
|
|
|
|
|
None => return None,
|
|
|
|
|
};
|
|
|
|
|
|
fix: write logs to ~/.codex/log instead of /tmp (#669)
Previously, the Rust TUI was writing log files to `/tmp`, which is
world-readable and not available on Windows, so that isn't great.
This PR tries to clean things up by adding a function that provides the
path to the "Codex config dir," e.g., `~/.codex` (though I suppose we
could support `$CODEX_HOME` to override this?) and then defines other
paths in terms of the result of `codex_dir()`.
For example, `log_dir()` returns the folder where log files should be
written which is defined in terms of `codex_dir()`. I updated the TUI to
use this function. On UNIX, we even go so far as to `chmod 600` the log
file by default, though as noted in a comment, it's a bit tedious to do
the equivalent on Windows, so we just let that go for now.
This also changes the default logging level to `info` for `codex_core`
and `codex_tui` when `RUST_LOG` is not specified. I'm not really sure if
we should use a more verbose default (it may be helpful when debugging
user issues), though if so, we should probably also set up log rotation?
2025-04-25 17:37:41 -07:00
|
|
|
p.push("instructions.md");
|
2025-05-12 17:24:44 -07:00
|
|
|
std::fs::read_to_string(&p).ok().and_then(|s| {
|
|
|
|
|
let s = s.trim();
|
|
|
|
|
if s.is_empty() {
|
|
|
|
|
None
|
|
|
|
|
} else {
|
|
|
|
|
Some(s.to_string())
|
|
|
|
|
}
|
|
|
|
|
})
|
feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629)
As stated in `codex-rs/README.md`:
Today, Codex CLI is written in TypeScript and requires Node.js 22+ to
run it. For a number of users, this runtime requirement inhibits
adoption: they would be better served by a standalone executable. As
maintainers, we want Codex to run efficiently in a wide range of
environments with minimal overhead. We also want to take advantage of
operating system-specific APIs to provide better sandboxing, where
possible.
To that end, we are moving forward with a Rust implementation of Codex
CLI contained in this folder, which has the following benefits:
- The CLI compiles to small, standalone, platform-specific binaries.
- Can make direct, native calls to
[seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and
[landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in
order to support sandboxing on Linux.
- No runtime garbage collection, resulting in lower memory consumption
and better, more predictable performance.
Currently, the Rust implementation is materially behind the TypeScript
implementation in functionality, so continue to use the TypeScript
implmentation for the time being. We will publish native executables via
GitHub Releases as soon as we feel the Rust version is usable.
2025-04-24 13:31:40 -07:00
|
|
|
}
|
|
|
|
|
}
|
fix: write logs to ~/.codex/log instead of /tmp (#669)
Previously, the Rust TUI was writing log files to `/tmp`, which is
world-readable and not available on Windows, so that isn't great.
This PR tries to clean things up by adding a function that provides the
path to the "Codex config dir," e.g., `~/.codex` (though I suppose we
could support `$CODEX_HOME` to override this?) and then defines other
paths in terms of the result of `codex_dir()`.
For example, `log_dir()` returns the folder where log files should be
written which is defined in terms of `codex_dir()`. I updated the TUI to
use this function. On UNIX, we even go so far as to `chmod 600` the log
file by default, though as noted in a comment, it's a bit tedious to do
the equivalent on Windows, so we just let that go for now.
This also changes the default logging level to `info` for `codex_core`
and `codex_tui` when `RUST_LOG` is not specified. I'm not really sure if
we should use a more verbose default (it may be helpful when debugging
user issues), though if so, we should probably also set up log rotation?
2025-04-25 17:37:41 -07:00
|
|
|
|
2025-04-27 21:47:50 -07:00
|
|
|
fn default_model() -> String {
|
|
|
|
|
OPENAI_DEFAULT_MODEL.to_string()
|
|
|
|
|
}
|
|
|
|
|
|
2025-05-15 00:30:13 -07:00
|
|
|
/// Returns the path to the Codex configuration directory, which can be
|
|
|
|
|
/// specified by the `CODEX_HOME` environment variable. If not set, defaults to
|
|
|
|
|
/// `~/.codex`.
|
|
|
|
|
///
|
|
|
|
|
/// - If `CODEX_HOME` is set, the value will be canonicalized and this
|
|
|
|
|
/// function will Err if the path does not exist.
|
|
|
|
|
/// - If `CODEX_HOME` is not set, this function does not verify that the
|
|
|
|
|
/// directory exists.
|
|
|
|
|
fn find_codex_home() -> std::io::Result<PathBuf> {
|
|
|
|
|
// Honor the `CODEX_HOME` environment variable when it is set to allow users
|
|
|
|
|
// (and tests) to override the default location.
|
|
|
|
|
if let Ok(val) = std::env::var("CODEX_HOME") {
|
|
|
|
|
if !val.is_empty() {
|
|
|
|
|
return PathBuf::from(val).canonicalize();
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
fix: write logs to ~/.codex/log instead of /tmp (#669)
Previously, the Rust TUI was writing log files to `/tmp`, which is
world-readable and not available on Windows, so that isn't great.
This PR tries to clean things up by adding a function that provides the
path to the "Codex config dir," e.g., `~/.codex` (though I suppose we
could support `$CODEX_HOME` to override this?) and then defines other
paths in terms of the result of `codex_dir()`.
For example, `log_dir()` returns the folder where log files should be
written which is defined in terms of `codex_dir()`. I updated the TUI to
use this function. On UNIX, we even go so far as to `chmod 600` the log
file by default, though as noted in a comment, it's a bit tedious to do
the equivalent on Windows, so we just let that go for now.
This also changes the default logging level to `info` for `codex_core`
and `codex_tui` when `RUST_LOG` is not specified. I'm not really sure if
we should use a more verbose default (it may be helpful when debugging
user issues), though if so, we should probably also set up log rotation?
2025-04-25 17:37:41 -07:00
|
|
|
let mut p = home_dir().ok_or_else(|| {
|
|
|
|
|
std::io::Error::new(
|
|
|
|
|
std::io::ErrorKind::NotFound,
|
|
|
|
|
"Could not find home directory",
|
|
|
|
|
)
|
|
|
|
|
})?;
|
|
|
|
|
p.push(".codex");
|
|
|
|
|
Ok(p)
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/// Returns the path to the folder where Codex logs are stored. Does not verify
|
|
|
|
|
/// that the directory exists.
|
2025-05-15 00:30:13 -07:00
|
|
|
pub fn log_dir(cfg: &Config) -> std::io::Result<PathBuf> {
|
|
|
|
|
let mut p = cfg.codex_home.clone();
|
fix: write logs to ~/.codex/log instead of /tmp (#669)
Previously, the Rust TUI was writing log files to `/tmp`, which is
world-readable and not available on Windows, so that isn't great.
This PR tries to clean things up by adding a function that provides the
path to the "Codex config dir," e.g., `~/.codex` (though I suppose we
could support `$CODEX_HOME` to override this?) and then defines other
paths in terms of the result of `codex_dir()`.
For example, `log_dir()` returns the folder where log files should be
written which is defined in terms of `codex_dir()`. I updated the TUI to
use this function. On UNIX, we even go so far as to `chmod 600` the log
file by default, though as noted in a comment, it's a bit tedious to do
the equivalent on Windows, so we just let that go for now.
This also changes the default logging level to `info` for `codex_core`
and `codex_tui` when `RUST_LOG` is not specified. I'm not really sure if
we should use a more verbose default (it may be helpful when debugging
user issues), though if so, we should probably also set up log rotation?
2025-04-25 17:37:41 -07:00
|
|
|
p.push("log");
|
|
|
|
|
Ok(p)
|
|
|
|
|
}
|
2025-04-29 18:42:52 -07:00
|
|
|
|
2025-05-06 17:38:56 -07:00
|
|
|
pub fn parse_sandbox_permission_with_base_path(
|
2025-05-06 12:02:49 -07:00
|
|
|
raw: &str,
|
|
|
|
|
base_path: PathBuf,
|
|
|
|
|
) -> std::io::Result<SandboxPermission> {
|
|
|
|
|
use SandboxPermission::*;
|
|
|
|
|
|
|
|
|
|
if let Some(path) = raw.strip_prefix("disk-write-folder=") {
|
|
|
|
|
return if path.is_empty() {
|
|
|
|
|
Err(std::io::Error::new(
|
|
|
|
|
std::io::ErrorKind::InvalidInput,
|
|
|
|
|
"--sandbox-permission disk-write-folder=<PATH> requires a non-empty PATH",
|
|
|
|
|
))
|
|
|
|
|
} else {
|
|
|
|
|
use path_absolutize::*;
|
|
|
|
|
|
|
|
|
|
let file = PathBuf::from(path);
|
|
|
|
|
let absolute_path = if file.is_relative() {
|
|
|
|
|
file.absolutize_from(base_path)
|
|
|
|
|
} else {
|
|
|
|
|
file.absolutize()
|
|
|
|
|
}
|
|
|
|
|
.map(|path| path.into_owned())?;
|
|
|
|
|
Ok(DiskWriteFolder {
|
|
|
|
|
folder: absolute_path,
|
|
|
|
|
})
|
|
|
|
|
};
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
match raw {
|
|
|
|
|
"disk-full-read-access" => Ok(DiskFullReadAccess),
|
|
|
|
|
"disk-write-platform-user-temp-folder" => Ok(DiskWritePlatformUserTempFolder),
|
|
|
|
|
"disk-write-platform-global-temp-folder" => Ok(DiskWritePlatformGlobalTempFolder),
|
|
|
|
|
"disk-write-cwd" => Ok(DiskWriteCwd),
|
|
|
|
|
"disk-full-write-access" => Ok(DiskFullWriteAccess),
|
|
|
|
|
"network-full-access" => Ok(NetworkFullAccess),
|
2025-05-07 08:37:48 -07:00
|
|
|
_ => Err(std::io::Error::new(
|
|
|
|
|
std::io::ErrorKind::InvalidInput,
|
|
|
|
|
format!(
|
|
|
|
|
"`{raw}` is not a recognised permission.\nRun with `--help` to see the accepted values."
|
|
|
|
|
),
|
|
|
|
|
)),
|
2025-05-06 12:02:49 -07:00
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2025-04-29 18:42:52 -07:00
|
|
|
#[cfg(test)]
|
|
|
|
|
mod tests {
|
2025-05-12 08:45:46 -07:00
|
|
|
#![allow(clippy::expect_used, clippy::unwrap_used)]
|
2025-05-20 11:55:25 -07:00
|
|
|
use crate::config_types::HistoryPersistence;
|
|
|
|
|
|
2025-04-29 18:42:52 -07:00
|
|
|
use super::*;
|
2025-05-13 16:52:52 -07:00
|
|
|
use pretty_assertions::assert_eq;
|
|
|
|
|
use tempfile::TempDir;
|
2025-04-29 18:42:52 -07:00
|
|
|
|
|
|
|
|
/// Verify that the `sandbox_permissions` field on `ConfigToml` correctly
|
|
|
|
|
/// differentiates between a value that is completely absent in the
|
|
|
|
|
/// provided TOML (i.e. `None`) and one that is explicitly specified as an
|
|
|
|
|
/// empty array (i.e. `Some(vec![])`). This ensures that downstream logic
|
|
|
|
|
/// that treats these two cases differently (default read-only policy vs a
|
|
|
|
|
/// fully locked-down sandbox) continues to function.
|
|
|
|
|
#[test]
|
|
|
|
|
fn test_sandbox_permissions_none_vs_empty_vec() {
|
|
|
|
|
// Case 1: `sandbox_permissions` key is *absent* from the TOML source.
|
|
|
|
|
let toml_source_without_key = "";
|
|
|
|
|
let cfg_without_key: ConfigToml = toml::from_str(toml_source_without_key)
|
|
|
|
|
.expect("TOML deserialization without key should succeed");
|
|
|
|
|
assert!(cfg_without_key.sandbox_permissions.is_none());
|
|
|
|
|
|
|
|
|
|
// Case 2: `sandbox_permissions` is present but set to an *empty array*.
|
|
|
|
|
let toml_source_with_empty = "sandbox_permissions = []";
|
|
|
|
|
let cfg_with_empty: ConfigToml = toml::from_str(toml_source_with_empty)
|
|
|
|
|
.expect("TOML deserialization with empty array should succeed");
|
|
|
|
|
assert_eq!(Some(vec![]), cfg_with_empty.sandbox_permissions);
|
|
|
|
|
|
|
|
|
|
// Case 3: `sandbox_permissions` contains a non-empty list of valid values.
|
|
|
|
|
let toml_source_with_values = r#"
|
|
|
|
|
sandbox_permissions = ["disk-full-read-access", "network-full-access"]
|
|
|
|
|
"#;
|
|
|
|
|
let cfg_with_values: ConfigToml = toml::from_str(toml_source_with_values)
|
|
|
|
|
.expect("TOML deserialization with valid permissions should succeed");
|
|
|
|
|
|
|
|
|
|
assert_eq!(
|
|
|
|
|
Some(vec![
|
|
|
|
|
SandboxPermission::DiskFullReadAccess,
|
|
|
|
|
SandboxPermission::NetworkFullAccess
|
|
|
|
|
]),
|
|
|
|
|
cfg_with_values.sandbox_permissions
|
|
|
|
|
);
|
|
|
|
|
}
|
|
|
|
|
|
feat: record messages from user in ~/.codex/history.jsonl (#939)
This is a large change to support a "history" feature like you would
expect in a shell like Bash.
History events are recorded in `$CODEX_HOME/history.jsonl`. Because it
is a JSONL file, it is straightforward to append new entries (as opposed
to the TypeScript file that uses `$CODEX_HOME/history.json`, so to be
valid JSON, each new entry entails rewriting the entire file). Because
it is possible for there to be multiple instances of Codex CLI writing
to `history.jsonl` at once, we use advisory file locking when working
with `history.jsonl` in `codex-rs/core/src/message_history.rs`.
Because we believe history is a sufficiently useful feature, we enable
it by default. Though to provide some safety, we set the file
permissions of `history.jsonl` to be `o600` so that other users on the
system cannot read the user's history. We do not yet support a default
list of `SENSITIVE_PATTERNS` as the TypeScript CLI does:
https://github.com/openai/codex/blob/3fdf9df1335ac9501e3fb0e61715359145711e8b/codex-cli/src/utils/storage/command-history.ts#L10-L17
We are going to take a more conservative approach to this list in the
Rust CLI. For example, while `/\b[A-Za-z0-9-_]{20,}\b/` might exclude
sensitive information like API tokens, it would also exclude valuable
information such as references to Git commits.
As noted in the updated documentation, users can opt-out of history by
adding the following to `config.toml`:
```toml
[history]
persistence = "none"
```
Because `history.jsonl` could, in theory, be quite large, we take a[n
arguably overly pedantic] approach in reading history entries into
memory. Specifically, we start by telling the client the current number
of entries in the history file (`history_entry_count`) as well as the
inode (`history_log_id`) of `history.jsonl` (see the new fields on
`SessionConfiguredEvent`).
The client is responsible for keeping new entries in memory to create a
"local history," but if the user hits up enough times to go "past" the
end of local history, then the client should use the new
`GetHistoryEntryRequest` in the protocol to fetch older entries.
Specifically, it should pass the `history_log_id` it was given
originally and work backwards from `history_entry_count`. (It should
really fetch history in batches rather than one-at-a-time, but that is
something we can improve upon in subsequent PRs.)
The motivation behind this crazy scheme is that it is designed to defend
against:
* The `history.jsonl` being truncated during the session such that the
index into the history is no longer consistent with what had been read
up to that point. We do not yet have logic to enforce a `max_bytes` for
`history.jsonl`, but once we do, we will aspire to implement it in a way
that should result in a new inode for the file on most systems.
* New items from concurrent Codex CLI sessions amending to the history.
Because, in absence of truncation, `history.jsonl` is an append-only
log, so long as the client reads backwards from `history_entry_count`,
it should always get a consistent view of history. (That said, it will
not be able to read _new_ commands from concurrent sessions, but perhaps
we will introduce a `/` command to reload latest history or something
down the road.)
Admittedly, my testing of this feature thus far has been fairly light. I
expect we will find bugs and introduce enhancements/fixes going forward.
2025-05-15 16:26:23 -07:00
|
|
|
#[test]
|
|
|
|
|
fn test_toml_parsing() {
|
|
|
|
|
let history_with_persistence = r#"
|
|
|
|
|
[history]
|
|
|
|
|
persistence = "save-all"
|
|
|
|
|
"#;
|
|
|
|
|
let history_with_persistence_cfg: ConfigToml =
|
|
|
|
|
toml::from_str::<ConfigToml>(history_with_persistence)
|
|
|
|
|
.expect("TOML deserialization should succeed");
|
|
|
|
|
assert_eq!(
|
|
|
|
|
Some(History {
|
|
|
|
|
persistence: HistoryPersistence::SaveAll,
|
|
|
|
|
max_bytes: None,
|
|
|
|
|
}),
|
|
|
|
|
history_with_persistence_cfg.history
|
|
|
|
|
);
|
|
|
|
|
|
|
|
|
|
let history_no_persistence = r#"
|
|
|
|
|
[history]
|
|
|
|
|
persistence = "none"
|
|
|
|
|
"#;
|
|
|
|
|
|
|
|
|
|
let history_no_persistence_cfg: ConfigToml =
|
|
|
|
|
toml::from_str::<ConfigToml>(history_no_persistence)
|
|
|
|
|
.expect("TOML deserialization should succeed");
|
|
|
|
|
assert_eq!(
|
|
|
|
|
Some(History {
|
|
|
|
|
persistence: HistoryPersistence::None,
|
|
|
|
|
max_bytes: None,
|
|
|
|
|
}),
|
|
|
|
|
history_no_persistence_cfg.history
|
|
|
|
|
);
|
|
|
|
|
}
|
|
|
|
|
|
2025-04-29 18:42:52 -07:00
|
|
|
/// Deserializing a TOML string containing an *invalid* permission should
|
|
|
|
|
/// fail with a helpful error rather than silently defaulting or
|
|
|
|
|
/// succeeding.
|
|
|
|
|
#[test]
|
|
|
|
|
fn test_sandbox_permissions_illegal_value() {
|
|
|
|
|
let toml_bad = r#"sandbox_permissions = ["not-a-real-permission"]"#;
|
|
|
|
|
|
|
|
|
|
let err = toml::from_str::<ConfigToml>(toml_bad)
|
|
|
|
|
.expect_err("Deserialization should fail for invalid permission");
|
|
|
|
|
|
|
|
|
|
// Make sure the error message contains the invalid value so users have
|
|
|
|
|
// useful feedback.
|
|
|
|
|
let msg = err.to_string();
|
|
|
|
|
assert!(msg.contains("not-a-real-permission"));
|
|
|
|
|
}
|
2025-05-13 16:52:52 -07:00
|
|
|
|
2025-05-15 00:30:13 -07:00
|
|
|
struct PrecedenceTestFixture {
|
|
|
|
|
cwd: TempDir,
|
|
|
|
|
codex_home: TempDir,
|
|
|
|
|
cfg: ConfigToml,
|
|
|
|
|
model_provider_map: HashMap<String, ModelProviderInfo>,
|
|
|
|
|
openai_provider: ModelProviderInfo,
|
|
|
|
|
openai_chat_completions_provider: ModelProviderInfo,
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
impl PrecedenceTestFixture {
|
|
|
|
|
fn cwd(&self) -> PathBuf {
|
|
|
|
|
self.cwd.path().to_path_buf()
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
fn codex_home(&self) -> PathBuf {
|
|
|
|
|
self.codex_home.path().to_path_buf()
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
fn create_test_fixture() -> std::io::Result<PrecedenceTestFixture> {
|
2025-05-13 16:52:52 -07:00
|
|
|
let toml = r#"
|
|
|
|
|
model = "o3"
|
|
|
|
|
approval_policy = "unless-allow-listed"
|
|
|
|
|
sandbox_permissions = ["disk-full-read-access"]
|
|
|
|
|
disable_response_storage = false
|
|
|
|
|
|
|
|
|
|
# Can be used to determine which profile to use if not specified by
|
|
|
|
|
# `ConfigOverrides`.
|
|
|
|
|
profile = "gpt3"
|
|
|
|
|
|
|
|
|
|
[model_providers.openai-chat-completions]
|
|
|
|
|
name = "OpenAI using Chat Completions"
|
|
|
|
|
base_url = "https://api.openai.com/v1"
|
|
|
|
|
env_key = "OPENAI_API_KEY"
|
|
|
|
|
wire_api = "chat"
|
|
|
|
|
|
|
|
|
|
[profiles.o3]
|
|
|
|
|
model = "o3"
|
|
|
|
|
model_provider = "openai"
|
|
|
|
|
approval_policy = "never"
|
|
|
|
|
|
|
|
|
|
[profiles.gpt3]
|
|
|
|
|
model = "gpt-3.5-turbo"
|
|
|
|
|
model_provider = "openai-chat-completions"
|
|
|
|
|
|
|
|
|
|
[profiles.zdr]
|
|
|
|
|
model = "o3"
|
|
|
|
|
model_provider = "openai"
|
|
|
|
|
approval_policy = "on-failure"
|
|
|
|
|
disable_response_storage = true
|
|
|
|
|
"#;
|
|
|
|
|
|
|
|
|
|
let cfg: ConfigToml = toml::from_str(toml).expect("TOML deserialization should succeed");
|
|
|
|
|
|
|
|
|
|
// Use a temporary directory for the cwd so it does not contain an
|
|
|
|
|
// AGENTS.md file.
|
|
|
|
|
let cwd_temp_dir = TempDir::new().unwrap();
|
|
|
|
|
let cwd = cwd_temp_dir.path().to_path_buf();
|
|
|
|
|
// Make it look like a Git repo so it does not search for AGENTS.md in
|
|
|
|
|
// a parent folder, either.
|
|
|
|
|
std::fs::write(cwd.join(".git"), "gitdir: nowhere")?;
|
|
|
|
|
|
2025-05-15 00:30:13 -07:00
|
|
|
let codex_home_temp_dir = TempDir::new().unwrap();
|
|
|
|
|
|
2025-05-13 16:52:52 -07:00
|
|
|
let openai_chat_completions_provider = ModelProviderInfo {
|
|
|
|
|
name: "OpenAI using Chat Completions".to_string(),
|
|
|
|
|
base_url: "https://api.openai.com/v1".to_string(),
|
|
|
|
|
env_key: Some("OPENAI_API_KEY".to_string()),
|
|
|
|
|
wire_api: crate::WireApi::Chat,
|
|
|
|
|
env_key_instructions: None,
|
|
|
|
|
};
|
|
|
|
|
let model_provider_map = {
|
|
|
|
|
let mut model_provider_map = built_in_model_providers();
|
|
|
|
|
model_provider_map.insert(
|
|
|
|
|
"openai-chat-completions".to_string(),
|
|
|
|
|
openai_chat_completions_provider.clone(),
|
|
|
|
|
);
|
|
|
|
|
model_provider_map
|
|
|
|
|
};
|
|
|
|
|
|
|
|
|
|
let openai_provider = model_provider_map
|
|
|
|
|
.get("openai")
|
|
|
|
|
.expect("openai provider should exist")
|
|
|
|
|
.clone();
|
|
|
|
|
|
2025-05-15 00:30:13 -07:00
|
|
|
Ok(PrecedenceTestFixture {
|
|
|
|
|
cwd: cwd_temp_dir,
|
|
|
|
|
codex_home: codex_home_temp_dir,
|
|
|
|
|
cfg,
|
|
|
|
|
model_provider_map,
|
|
|
|
|
openai_provider,
|
|
|
|
|
openai_chat_completions_provider,
|
|
|
|
|
})
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/// Users can specify config values at multiple levels that have the
|
|
|
|
|
/// following precedence:
|
|
|
|
|
///
|
|
|
|
|
/// 1. custom command-line argument, e.g. `--model o3`
|
|
|
|
|
/// 2. as part of a profile, where the `--profile` is specified via a CLI
|
|
|
|
|
/// (or in the config file itelf)
|
|
|
|
|
/// 3. as an entry in `config.toml`, e.g. `model = "o3"`
|
|
|
|
|
/// 4. the default value for a required field defined in code, e.g.,
|
|
|
|
|
/// `crate::flags::OPENAI_DEFAULT_MODEL`
|
|
|
|
|
///
|
|
|
|
|
/// Note that profiles are the recommended way to specify a group of
|
|
|
|
|
/// configuration options together.
|
|
|
|
|
#[test]
|
|
|
|
|
fn test_precedence_fixture_with_o3_profile() -> std::io::Result<()> {
|
|
|
|
|
let fixture = create_test_fixture()?;
|
|
|
|
|
|
2025-05-13 16:52:52 -07:00
|
|
|
let o3_profile_overrides = ConfigOverrides {
|
|
|
|
|
config_profile: Some("o3".to_string()),
|
2025-05-15 00:30:13 -07:00
|
|
|
cwd: Some(fixture.cwd()),
|
2025-05-13 16:52:52 -07:00
|
|
|
..Default::default()
|
|
|
|
|
};
|
2025-05-15 00:30:13 -07:00
|
|
|
let o3_profile_config: Config = Config::load_from_base_config_with_overrides(
|
|
|
|
|
fixture.cfg.clone(),
|
|
|
|
|
o3_profile_overrides,
|
|
|
|
|
fixture.codex_home(),
|
|
|
|
|
)?;
|
2025-05-13 16:52:52 -07:00
|
|
|
assert_eq!(
|
|
|
|
|
Config {
|
|
|
|
|
model: "o3".to_string(),
|
|
|
|
|
model_provider_id: "openai".to_string(),
|
2025-05-15 00:30:13 -07:00
|
|
|
model_provider: fixture.openai_provider.clone(),
|
2025-05-13 16:52:52 -07:00
|
|
|
approval_policy: AskForApproval::Never,
|
|
|
|
|
sandbox_policy: SandboxPolicy::new_read_only_policy(),
|
feat: introduce support for shell_environment_policy in config.toml (#1061)
To date, when handling `shell` and `local_shell` tool calls, we were
spawning new processes using the environment inherited from the Codex
process itself. This means that the sensitive `OPENAI_API_KEY` that
Codex needs to talk to OpenAI models was made available to everything
run by `shell` and `local_shell`. While there are cases where that might
be useful, it does not seem like a good default.
This PR introduces a complex `shell_environment_policy` config option to
control the `env` used with these tool calls. It is inevitably a bit
complex so that it is possible to override individual components of the
policy so without having to restate the entire thing.
Details are in the updated `README.md` in this PR, but here is the
relevant bit that explains the individual fields of
`shell_environment_policy`:
| Field | Type | Default | Description |
| ------------------------- | -------------------------- | ------- |
-----------------------------------------------------------------------------------------------------------------------------------------------
|
| `inherit` | string | `core` | Starting template for the
environment:<br>`core` (`HOME`, `PATH`, `USER`, …), `all` (clone full
parent env), or `none` (start empty). |
| `ignore_default_excludes` | boolean | `false` | When `false`, Codex
removes any var whose **name** contains `KEY`, `SECRET`, or `TOKEN`
(case-insensitive) before other rules run. |
| `exclude` | array<string> | `[]` | Case-insensitive glob
patterns to drop after the default filter.<br>Examples: `"AWS_*"`,
`"AZURE_*"`. |
| `set` | table<string,string> | `{}` | Explicit key/value
overrides or additions – always win over inherited values. |
| `include_only` | array<string> | `[]` | If non-empty, a
whitelist of patterns; only variables that match _one_ pattern survive
the final step. (Generally used with `inherit = "all"`.) |
In particular, note that the default is `inherit = "core"`, so:
* if you have extra env variables that you want to inherit from the
parent process, use `inherit = "all"` and then specify `include_only`
* if you have extra env variables where you want to hardcode the values,
the default `inherit = "core"` will work fine, but then you need to
specify `set`
This configuration is not battle-tested, so we will probably still have
to play with it a bit. `core/src/exec_env.rs` has the critical business
logic as well as unit tests.
Though if nothing else, previous to this change:
```
$ cargo run --bin codex -- debug seatbelt -- printenv OPENAI_API_KEY
# ...prints OPENAI_API_KEY...
```
But after this change it does not print anything (as desired).
One final thing to call out about this PR is that the
`configure_command!` macro we use in `core/src/exec.rs` has to do some
complex logic with respect to how it builds up the `env` for the process
being spawned under Landlock/seccomp. Specifically, doing
`cmd.env_clear()` followed by `cmd.envs(&$env_map)` (which is arguably
the most intuitive way to do it) caused the Landlock unit tests to fail
because the processes spawned by the unit tests started failing in
unexpected ways! If we forgo `env_clear()` in favor of updating env vars
one at a time, the tests still pass. The comment in the code talks about
this a bit, and while I would like to investigate this more, I need to
move on for the moment, but I do plan to come back to it to fully
understand what is going on. For example, this suggests that we might
not be able to spawn a C program that calls `env_clear()`, which would
be...weird. We may still have to fiddle with our Landlock config if that
is the case.
2025-05-22 09:51:19 -07:00
|
|
|
shell_environment_policy: ShellEnvironmentPolicy::default(),
|
2025-05-13 16:52:52 -07:00
|
|
|
disable_response_storage: false,
|
|
|
|
|
instructions: None,
|
|
|
|
|
notify: None,
|
2025-05-15 00:30:13 -07:00
|
|
|
cwd: fixture.cwd(),
|
2025-05-13 16:52:52 -07:00
|
|
|
mcp_servers: HashMap::new(),
|
2025-05-15 00:30:13 -07:00
|
|
|
model_providers: fixture.model_provider_map.clone(),
|
2025-05-13 16:52:52 -07:00
|
|
|
project_doc_max_bytes: PROJECT_DOC_MAX_BYTES,
|
2025-05-15 00:30:13 -07:00
|
|
|
codex_home: fixture.codex_home(),
|
feat: record messages from user in ~/.codex/history.jsonl (#939)
This is a large change to support a "history" feature like you would
expect in a shell like Bash.
History events are recorded in `$CODEX_HOME/history.jsonl`. Because it
is a JSONL file, it is straightforward to append new entries (as opposed
to the TypeScript file that uses `$CODEX_HOME/history.json`, so to be
valid JSON, each new entry entails rewriting the entire file). Because
it is possible for there to be multiple instances of Codex CLI writing
to `history.jsonl` at once, we use advisory file locking when working
with `history.jsonl` in `codex-rs/core/src/message_history.rs`.
Because we believe history is a sufficiently useful feature, we enable
it by default. Though to provide some safety, we set the file
permissions of `history.jsonl` to be `o600` so that other users on the
system cannot read the user's history. We do not yet support a default
list of `SENSITIVE_PATTERNS` as the TypeScript CLI does:
https://github.com/openai/codex/blob/3fdf9df1335ac9501e3fb0e61715359145711e8b/codex-cli/src/utils/storage/command-history.ts#L10-L17
We are going to take a more conservative approach to this list in the
Rust CLI. For example, while `/\b[A-Za-z0-9-_]{20,}\b/` might exclude
sensitive information like API tokens, it would also exclude valuable
information such as references to Git commits.
As noted in the updated documentation, users can opt-out of history by
adding the following to `config.toml`:
```toml
[history]
persistence = "none"
```
Because `history.jsonl` could, in theory, be quite large, we take a[n
arguably overly pedantic] approach in reading history entries into
memory. Specifically, we start by telling the client the current number
of entries in the history file (`history_entry_count`) as well as the
inode (`history_log_id`) of `history.jsonl` (see the new fields on
`SessionConfiguredEvent`).
The client is responsible for keeping new entries in memory to create a
"local history," but if the user hits up enough times to go "past" the
end of local history, then the client should use the new
`GetHistoryEntryRequest` in the protocol to fetch older entries.
Specifically, it should pass the `history_log_id` it was given
originally and work backwards from `history_entry_count`. (It should
really fetch history in batches rather than one-at-a-time, but that is
something we can improve upon in subsequent PRs.)
The motivation behind this crazy scheme is that it is designed to defend
against:
* The `history.jsonl` being truncated during the session such that the
index into the history is no longer consistent with what had been read
up to that point. We do not yet have logic to enforce a `max_bytes` for
`history.jsonl`, but once we do, we will aspire to implement it in a way
that should result in a new inode for the file on most systems.
* New items from concurrent Codex CLI sessions amending to the history.
Because, in absence of truncation, `history.jsonl` is an append-only
log, so long as the client reads backwards from `history_entry_count`,
it should always get a consistent view of history. (That said, it will
not be able to read _new_ commands from concurrent sessions, but perhaps
we will introduce a `/` command to reload latest history or something
down the road.)
Admittedly, my testing of this feature thus far has been fairly light. I
expect we will find bugs and introduce enhancements/fixes going forward.
2025-05-15 16:26:23 -07:00
|
|
|
history: History::default(),
|
2025-05-16 11:33:08 -07:00
|
|
|
file_opener: UriBasedFileOpener::VsCode,
|
2025-05-16 16:16:50 -07:00
|
|
|
tui: Tui::default(),
|
2025-05-22 21:52:28 -07:00
|
|
|
codex_linux_sandbox_exe: None,
|
2025-05-13 16:52:52 -07:00
|
|
|
},
|
|
|
|
|
o3_profile_config
|
|
|
|
|
);
|
2025-05-15 00:30:13 -07:00
|
|
|
Ok(())
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
#[test]
|
|
|
|
|
fn test_precedence_fixture_with_gpt3_profile() -> std::io::Result<()> {
|
|
|
|
|
let fixture = create_test_fixture()?;
|
2025-05-13 16:52:52 -07:00
|
|
|
|
|
|
|
|
let gpt3_profile_overrides = ConfigOverrides {
|
|
|
|
|
config_profile: Some("gpt3".to_string()),
|
2025-05-15 00:30:13 -07:00
|
|
|
cwd: Some(fixture.cwd()),
|
2025-05-13 16:52:52 -07:00
|
|
|
..Default::default()
|
|
|
|
|
};
|
|
|
|
|
let gpt3_profile_config = Config::load_from_base_config_with_overrides(
|
2025-05-15 00:30:13 -07:00
|
|
|
fixture.cfg.clone(),
|
2025-05-13 16:52:52 -07:00
|
|
|
gpt3_profile_overrides,
|
2025-05-15 00:30:13 -07:00
|
|
|
fixture.codex_home(),
|
2025-05-13 16:52:52 -07:00
|
|
|
)?;
|
|
|
|
|
let expected_gpt3_profile_config = Config {
|
|
|
|
|
model: "gpt-3.5-turbo".to_string(),
|
|
|
|
|
model_provider_id: "openai-chat-completions".to_string(),
|
2025-05-15 00:30:13 -07:00
|
|
|
model_provider: fixture.openai_chat_completions_provider.clone(),
|
2025-05-13 16:52:52 -07:00
|
|
|
approval_policy: AskForApproval::UnlessAllowListed,
|
|
|
|
|
sandbox_policy: SandboxPolicy::new_read_only_policy(),
|
feat: introduce support for shell_environment_policy in config.toml (#1061)
To date, when handling `shell` and `local_shell` tool calls, we were
spawning new processes using the environment inherited from the Codex
process itself. This means that the sensitive `OPENAI_API_KEY` that
Codex needs to talk to OpenAI models was made available to everything
run by `shell` and `local_shell`. While there are cases where that might
be useful, it does not seem like a good default.
This PR introduces a complex `shell_environment_policy` config option to
control the `env` used with these tool calls. It is inevitably a bit
complex so that it is possible to override individual components of the
policy so without having to restate the entire thing.
Details are in the updated `README.md` in this PR, but here is the
relevant bit that explains the individual fields of
`shell_environment_policy`:
| Field | Type | Default | Description |
| ------------------------- | -------------------------- | ------- |
-----------------------------------------------------------------------------------------------------------------------------------------------
|
| `inherit` | string | `core` | Starting template for the
environment:<br>`core` (`HOME`, `PATH`, `USER`, …), `all` (clone full
parent env), or `none` (start empty). |
| `ignore_default_excludes` | boolean | `false` | When `false`, Codex
removes any var whose **name** contains `KEY`, `SECRET`, or `TOKEN`
(case-insensitive) before other rules run. |
| `exclude` | array<string> | `[]` | Case-insensitive glob
patterns to drop after the default filter.<br>Examples: `"AWS_*"`,
`"AZURE_*"`. |
| `set` | table<string,string> | `{}` | Explicit key/value
overrides or additions – always win over inherited values. |
| `include_only` | array<string> | `[]` | If non-empty, a
whitelist of patterns; only variables that match _one_ pattern survive
the final step. (Generally used with `inherit = "all"`.) |
In particular, note that the default is `inherit = "core"`, so:
* if you have extra env variables that you want to inherit from the
parent process, use `inherit = "all"` and then specify `include_only`
* if you have extra env variables where you want to hardcode the values,
the default `inherit = "core"` will work fine, but then you need to
specify `set`
This configuration is not battle-tested, so we will probably still have
to play with it a bit. `core/src/exec_env.rs` has the critical business
logic as well as unit tests.
Though if nothing else, previous to this change:
```
$ cargo run --bin codex -- debug seatbelt -- printenv OPENAI_API_KEY
# ...prints OPENAI_API_KEY...
```
But after this change it does not print anything (as desired).
One final thing to call out about this PR is that the
`configure_command!` macro we use in `core/src/exec.rs` has to do some
complex logic with respect to how it builds up the `env` for the process
being spawned under Landlock/seccomp. Specifically, doing
`cmd.env_clear()` followed by `cmd.envs(&$env_map)` (which is arguably
the most intuitive way to do it) caused the Landlock unit tests to fail
because the processes spawned by the unit tests started failing in
unexpected ways! If we forgo `env_clear()` in favor of updating env vars
one at a time, the tests still pass. The comment in the code talks about
this a bit, and while I would like to investigate this more, I need to
move on for the moment, but I do plan to come back to it to fully
understand what is going on. For example, this suggests that we might
not be able to spawn a C program that calls `env_clear()`, which would
be...weird. We may still have to fiddle with our Landlock config if that
is the case.
2025-05-22 09:51:19 -07:00
|
|
|
shell_environment_policy: ShellEnvironmentPolicy::default(),
|
2025-05-13 16:52:52 -07:00
|
|
|
disable_response_storage: false,
|
|
|
|
|
instructions: None,
|
|
|
|
|
notify: None,
|
2025-05-15 00:30:13 -07:00
|
|
|
cwd: fixture.cwd(),
|
2025-05-13 16:52:52 -07:00
|
|
|
mcp_servers: HashMap::new(),
|
2025-05-15 00:30:13 -07:00
|
|
|
model_providers: fixture.model_provider_map.clone(),
|
2025-05-13 16:52:52 -07:00
|
|
|
project_doc_max_bytes: PROJECT_DOC_MAX_BYTES,
|
2025-05-15 00:30:13 -07:00
|
|
|
codex_home: fixture.codex_home(),
|
feat: record messages from user in ~/.codex/history.jsonl (#939)
This is a large change to support a "history" feature like you would
expect in a shell like Bash.
History events are recorded in `$CODEX_HOME/history.jsonl`. Because it
is a JSONL file, it is straightforward to append new entries (as opposed
to the TypeScript file that uses `$CODEX_HOME/history.json`, so to be
valid JSON, each new entry entails rewriting the entire file). Because
it is possible for there to be multiple instances of Codex CLI writing
to `history.jsonl` at once, we use advisory file locking when working
with `history.jsonl` in `codex-rs/core/src/message_history.rs`.
Because we believe history is a sufficiently useful feature, we enable
it by default. Though to provide some safety, we set the file
permissions of `history.jsonl` to be `o600` so that other users on the
system cannot read the user's history. We do not yet support a default
list of `SENSITIVE_PATTERNS` as the TypeScript CLI does:
https://github.com/openai/codex/blob/3fdf9df1335ac9501e3fb0e61715359145711e8b/codex-cli/src/utils/storage/command-history.ts#L10-L17
We are going to take a more conservative approach to this list in the
Rust CLI. For example, while `/\b[A-Za-z0-9-_]{20,}\b/` might exclude
sensitive information like API tokens, it would also exclude valuable
information such as references to Git commits.
As noted in the updated documentation, users can opt-out of history by
adding the following to `config.toml`:
```toml
[history]
persistence = "none"
```
Because `history.jsonl` could, in theory, be quite large, we take a[n
arguably overly pedantic] approach in reading history entries into
memory. Specifically, we start by telling the client the current number
of entries in the history file (`history_entry_count`) as well as the
inode (`history_log_id`) of `history.jsonl` (see the new fields on
`SessionConfiguredEvent`).
The client is responsible for keeping new entries in memory to create a
"local history," but if the user hits up enough times to go "past" the
end of local history, then the client should use the new
`GetHistoryEntryRequest` in the protocol to fetch older entries.
Specifically, it should pass the `history_log_id` it was given
originally and work backwards from `history_entry_count`. (It should
really fetch history in batches rather than one-at-a-time, but that is
something we can improve upon in subsequent PRs.)
The motivation behind this crazy scheme is that it is designed to defend
against:
* The `history.jsonl` being truncated during the session such that the
index into the history is no longer consistent with what had been read
up to that point. We do not yet have logic to enforce a `max_bytes` for
`history.jsonl`, but once we do, we will aspire to implement it in a way
that should result in a new inode for the file on most systems.
* New items from concurrent Codex CLI sessions amending to the history.
Because, in absence of truncation, `history.jsonl` is an append-only
log, so long as the client reads backwards from `history_entry_count`,
it should always get a consistent view of history. (That said, it will
not be able to read _new_ commands from concurrent sessions, but perhaps
we will introduce a `/` command to reload latest history or something
down the road.)
Admittedly, my testing of this feature thus far has been fairly light. I
expect we will find bugs and introduce enhancements/fixes going forward.
2025-05-15 16:26:23 -07:00
|
|
|
history: History::default(),
|
2025-05-16 11:33:08 -07:00
|
|
|
file_opener: UriBasedFileOpener::VsCode,
|
2025-05-16 16:16:50 -07:00
|
|
|
tui: Tui::default(),
|
2025-05-22 21:52:28 -07:00
|
|
|
codex_linux_sandbox_exe: None,
|
2025-05-13 16:52:52 -07:00
|
|
|
};
|
2025-05-15 00:30:13 -07:00
|
|
|
|
|
|
|
|
assert_eq!(expected_gpt3_profile_config, gpt3_profile_config);
|
2025-05-13 16:52:52 -07:00
|
|
|
|
|
|
|
|
// Verify that loading without specifying a profile in ConfigOverrides
|
2025-05-15 00:30:13 -07:00
|
|
|
// uses the default profile from the config file (which is "gpt3").
|
2025-05-13 16:52:52 -07:00
|
|
|
let default_profile_overrides = ConfigOverrides {
|
2025-05-15 00:30:13 -07:00
|
|
|
cwd: Some(fixture.cwd()),
|
2025-05-13 16:52:52 -07:00
|
|
|
..Default::default()
|
|
|
|
|
};
|
2025-05-15 00:30:13 -07:00
|
|
|
|
2025-05-13 16:52:52 -07:00
|
|
|
let default_profile_config = Config::load_from_base_config_with_overrides(
|
2025-05-15 00:30:13 -07:00
|
|
|
fixture.cfg.clone(),
|
2025-05-13 16:52:52 -07:00
|
|
|
default_profile_overrides,
|
2025-05-15 00:30:13 -07:00
|
|
|
fixture.codex_home(),
|
2025-05-13 16:52:52 -07:00
|
|
|
)?;
|
2025-05-15 00:30:13 -07:00
|
|
|
|
2025-05-13 16:52:52 -07:00
|
|
|
assert_eq!(expected_gpt3_profile_config, default_profile_config);
|
2025-05-15 00:30:13 -07:00
|
|
|
Ok(())
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
#[test]
|
|
|
|
|
fn test_precedence_fixture_with_zdr_profile() -> std::io::Result<()> {
|
|
|
|
|
let fixture = create_test_fixture()?;
|
2025-05-13 16:52:52 -07:00
|
|
|
|
|
|
|
|
let zdr_profile_overrides = ConfigOverrides {
|
|
|
|
|
config_profile: Some("zdr".to_string()),
|
2025-05-15 00:30:13 -07:00
|
|
|
cwd: Some(fixture.cwd()),
|
2025-05-13 16:52:52 -07:00
|
|
|
..Default::default()
|
|
|
|
|
};
|
2025-05-15 00:30:13 -07:00
|
|
|
let zdr_profile_config = Config::load_from_base_config_with_overrides(
|
|
|
|
|
fixture.cfg.clone(),
|
|
|
|
|
zdr_profile_overrides,
|
|
|
|
|
fixture.codex_home(),
|
|
|
|
|
)?;
|
|
|
|
|
let expected_zdr_profile_config = Config {
|
|
|
|
|
model: "o3".to_string(),
|
|
|
|
|
model_provider_id: "openai".to_string(),
|
|
|
|
|
model_provider: fixture.openai_provider.clone(),
|
|
|
|
|
approval_policy: AskForApproval::OnFailure,
|
|
|
|
|
sandbox_policy: SandboxPolicy::new_read_only_policy(),
|
feat: introduce support for shell_environment_policy in config.toml (#1061)
To date, when handling `shell` and `local_shell` tool calls, we were
spawning new processes using the environment inherited from the Codex
process itself. This means that the sensitive `OPENAI_API_KEY` that
Codex needs to talk to OpenAI models was made available to everything
run by `shell` and `local_shell`. While there are cases where that might
be useful, it does not seem like a good default.
This PR introduces a complex `shell_environment_policy` config option to
control the `env` used with these tool calls. It is inevitably a bit
complex so that it is possible to override individual components of the
policy so without having to restate the entire thing.
Details are in the updated `README.md` in this PR, but here is the
relevant bit that explains the individual fields of
`shell_environment_policy`:
| Field | Type | Default | Description |
| ------------------------- | -------------------------- | ------- |
-----------------------------------------------------------------------------------------------------------------------------------------------
|
| `inherit` | string | `core` | Starting template for the
environment:<br>`core` (`HOME`, `PATH`, `USER`, …), `all` (clone full
parent env), or `none` (start empty). |
| `ignore_default_excludes` | boolean | `false` | When `false`, Codex
removes any var whose **name** contains `KEY`, `SECRET`, or `TOKEN`
(case-insensitive) before other rules run. |
| `exclude` | array<string> | `[]` | Case-insensitive glob
patterns to drop after the default filter.<br>Examples: `"AWS_*"`,
`"AZURE_*"`. |
| `set` | table<string,string> | `{}` | Explicit key/value
overrides or additions – always win over inherited values. |
| `include_only` | array<string> | `[]` | If non-empty, a
whitelist of patterns; only variables that match _one_ pattern survive
the final step. (Generally used with `inherit = "all"`.) |
In particular, note that the default is `inherit = "core"`, so:
* if you have extra env variables that you want to inherit from the
parent process, use `inherit = "all"` and then specify `include_only`
* if you have extra env variables where you want to hardcode the values,
the default `inherit = "core"` will work fine, but then you need to
specify `set`
This configuration is not battle-tested, so we will probably still have
to play with it a bit. `core/src/exec_env.rs` has the critical business
logic as well as unit tests.
Though if nothing else, previous to this change:
```
$ cargo run --bin codex -- debug seatbelt -- printenv OPENAI_API_KEY
# ...prints OPENAI_API_KEY...
```
But after this change it does not print anything (as desired).
One final thing to call out about this PR is that the
`configure_command!` macro we use in `core/src/exec.rs` has to do some
complex logic with respect to how it builds up the `env` for the process
being spawned under Landlock/seccomp. Specifically, doing
`cmd.env_clear()` followed by `cmd.envs(&$env_map)` (which is arguably
the most intuitive way to do it) caused the Landlock unit tests to fail
because the processes spawned by the unit tests started failing in
unexpected ways! If we forgo `env_clear()` in favor of updating env vars
one at a time, the tests still pass. The comment in the code talks about
this a bit, and while I would like to investigate this more, I need to
move on for the moment, but I do plan to come back to it to fully
understand what is going on. For example, this suggests that we might
not be able to spawn a C program that calls `env_clear()`, which would
be...weird. We may still have to fiddle with our Landlock config if that
is the case.
2025-05-22 09:51:19 -07:00
|
|
|
shell_environment_policy: ShellEnvironmentPolicy::default(),
|
2025-05-15 00:30:13 -07:00
|
|
|
disable_response_storage: true,
|
|
|
|
|
instructions: None,
|
|
|
|
|
notify: None,
|
|
|
|
|
cwd: fixture.cwd(),
|
|
|
|
|
mcp_servers: HashMap::new(),
|
|
|
|
|
model_providers: fixture.model_provider_map.clone(),
|
|
|
|
|
project_doc_max_bytes: PROJECT_DOC_MAX_BYTES,
|
|
|
|
|
codex_home: fixture.codex_home(),
|
feat: record messages from user in ~/.codex/history.jsonl (#939)
This is a large change to support a "history" feature like you would
expect in a shell like Bash.
History events are recorded in `$CODEX_HOME/history.jsonl`. Because it
is a JSONL file, it is straightforward to append new entries (as opposed
to the TypeScript file that uses `$CODEX_HOME/history.json`, so to be
valid JSON, each new entry entails rewriting the entire file). Because
it is possible for there to be multiple instances of Codex CLI writing
to `history.jsonl` at once, we use advisory file locking when working
with `history.jsonl` in `codex-rs/core/src/message_history.rs`.
Because we believe history is a sufficiently useful feature, we enable
it by default. Though to provide some safety, we set the file
permissions of `history.jsonl` to be `o600` so that other users on the
system cannot read the user's history. We do not yet support a default
list of `SENSITIVE_PATTERNS` as the TypeScript CLI does:
https://github.com/openai/codex/blob/3fdf9df1335ac9501e3fb0e61715359145711e8b/codex-cli/src/utils/storage/command-history.ts#L10-L17
We are going to take a more conservative approach to this list in the
Rust CLI. For example, while `/\b[A-Za-z0-9-_]{20,}\b/` might exclude
sensitive information like API tokens, it would also exclude valuable
information such as references to Git commits.
As noted in the updated documentation, users can opt-out of history by
adding the following to `config.toml`:
```toml
[history]
persistence = "none"
```
Because `history.jsonl` could, in theory, be quite large, we take a[n
arguably overly pedantic] approach in reading history entries into
memory. Specifically, we start by telling the client the current number
of entries in the history file (`history_entry_count`) as well as the
inode (`history_log_id`) of `history.jsonl` (see the new fields on
`SessionConfiguredEvent`).
The client is responsible for keeping new entries in memory to create a
"local history," but if the user hits up enough times to go "past" the
end of local history, then the client should use the new
`GetHistoryEntryRequest` in the protocol to fetch older entries.
Specifically, it should pass the `history_log_id` it was given
originally and work backwards from `history_entry_count`. (It should
really fetch history in batches rather than one-at-a-time, but that is
something we can improve upon in subsequent PRs.)
The motivation behind this crazy scheme is that it is designed to defend
against:
* The `history.jsonl` being truncated during the session such that the
index into the history is no longer consistent with what had been read
up to that point. We do not yet have logic to enforce a `max_bytes` for
`history.jsonl`, but once we do, we will aspire to implement it in a way
that should result in a new inode for the file on most systems.
* New items from concurrent Codex CLI sessions amending to the history.
Because, in absence of truncation, `history.jsonl` is an append-only
log, so long as the client reads backwards from `history_entry_count`,
it should always get a consistent view of history. (That said, it will
not be able to read _new_ commands from concurrent sessions, but perhaps
we will introduce a `/` command to reload latest history or something
down the road.)
Admittedly, my testing of this feature thus far has been fairly light. I
expect we will find bugs and introduce enhancements/fixes going forward.
2025-05-15 16:26:23 -07:00
|
|
|
history: History::default(),
|
2025-05-16 11:33:08 -07:00
|
|
|
file_opener: UriBasedFileOpener::VsCode,
|
2025-05-16 16:16:50 -07:00
|
|
|
tui: Tui::default(),
|
2025-05-22 21:52:28 -07:00
|
|
|
codex_linux_sandbox_exe: None,
|
2025-05-15 00:30:13 -07:00
|
|
|
};
|
|
|
|
|
|
|
|
|
|
assert_eq!(expected_zdr_profile_config, zdr_profile_config);
|
2025-05-13 16:52:52 -07:00
|
|
|
|
|
|
|
|
Ok(())
|
|
|
|
|
}
|
2025-04-29 18:42:52 -07:00
|
|
|
}
|