Commit Graph

23 Commits

Author SHA1 Message Date
Michael Bolin
49d040215a fix: build all crates individually as part of CI (#833)
I discovered that `cargo build` worked for the entire workspace, but not
for the `mcp-client` or `core` crates.

* `mcp-client` failed to build because it underspecified the set of
features it needed from `tokio`.
* `core` failed to build because it was using a "feature" of its own
crate in the default, no-feature version.
 
This PR fixes the builds and adds a check in CI to defend against this
sort of thing going forward.
2025-05-06 12:02:49 -07:00
Michael Bolin
5d924d44cf fix: ensure apply_patch resolves relative paths against workdir or project cwd (#810)
https://github.com/openai/codex/pull/800 kicked off some work to be more
disciplined about honoring the `cwd` param passed in rather than
assuming `std::env::current_dir()` as the `cwd`. As part of this, we
need to ensure `apply_patch` calls honor the appropriate `cwd` as well,
which is significant if the paths in the `apply_patch` arg are not
absolute paths themselves. Failing that:

- The `apply_patch` function call can contain an optional`workdir`
param, so:
- If specified and is an absolute path, it should be used to resolve
relative paths
- If specified and is a relative path, should be resolved against
`Config.cwd` and then any relative paths will be resolved against the
result
- If `workdir` is not specified on the function call, relative paths
should be resolved against `Config.cwd`

Note that we had a similar issue in the TypeScript CLI that was fixed in
https://github.com/openai/codex/pull/556.

As part of the fix, this PR introduces `ApplyPatchAction` so clients can
deal with that instead of the raw `HashMap<PathBuf,
ApplyPatchFileChange>`. This enables us to enforce, by construction,
that all paths contained in the `ApplyPatchAction` are absolute paths.
2025-05-04 12:32:51 -07:00
Michael Bolin
a134bdde49 fix: is_inside_git_repo should take the directory as a param (#809)
https://github.com/openai/codex/pull/800 made `cwd` a property of
`Config` and made it so the `cwd` is not necessarily
`std::env::current_dir()`. As such, `is_inside_git_repo()` should check
`Config.cwd` rather than `std::env::current_dir()`.

This PR updates `is_inside_git_repo()` to take `Config` instead of an
arbitrary `PathBuf` to force the check to operate on a `Config` where
`cwd` has been resolved to what the user specified.
2025-05-04 11:39:10 -07:00
Michael Bolin
421e159888 feat: make cwd a required field of Config so we stop assuming std::env::current_dir() in a session (#800)
In order to expose Codex via an MCP server, I realized that we should be
taking `cwd` as a parameter rather than assuming
`std::env::current_dir()` as the `cwd`. Specifically, the user may want
to start a session in a directory other than the one where the MCP
server has been started.

This PR makes `cwd: PathBuf` a required field of `Session` and threads
it all the way through, though I think there is still an issue with not
honoring `workdir` for `apply_patch`, which is something we also had to
fix in the TypeScript version: https://github.com/openai/codex/pull/556.

This also adds `-C`/`--cd` to change the cwd via the command line.

To test, I ran:

```
cargo run --bin codex -- exec -C /tmp 'show the output of ls'
```

and verified it showed the contents of my `/tmp` folder instead of
`$PWD`.
2025-05-04 10:57:12 -07:00
Michael Bolin
a180ed44e8 feat: configurable notifications in the Rust CLI (#793)
With this change, you can specify a program that will be executed to get
notified about events generated by Codex. The notification info will be
packaged as a JSON object. The supported notification types are defined
by the `UserNotification` enum introduced in this PR. Initially, it
contains only one variant, `AgentTurnComplete`:

```rust
pub(crate) enum UserNotification {
    #[serde(rename_all = "kebab-case")]
    AgentTurnComplete {
        turn_id: String,

        /// Messages that the user sent to the agent to initiate the turn.
        input_messages: Vec<String>,

        /// The last message sent by the assistant in the turn.
        last_assistant_message: Option<String>,
    },
}
```

This is intended to support the common case when a "turn" ends, which
often means it is now your chance to give Codex further instructions.

For example, I have the following in my `~/.codex/config.toml`:

```toml
notify = ["python3", "/Users/mbolin/.codex/notify.py"]
```

I created my own custom notifier script that calls out to
[terminal-notifier](https://github.com/julienXX/terminal-notifier) to
show a desktop push notification on macOS. Contents of `notify.py`:

```python
#!/usr/bin/env python3

import json
import subprocess
import sys


def main() -> int:
    if len(sys.argv) != 2:
        print("Usage: notify.py <NOTIFICATION_JSON>")
        return 1

    try:
        notification = json.loads(sys.argv[1])
    except json.JSONDecodeError:
        return 1

    match notification_type := notification.get("type"):
        case "agent-turn-complete":
            assistant_message = notification.get("last-assistant-message")
            if assistant_message:
                title = f"Codex: {assistant_message}"
            else:
                title = "Codex: Turn Complete!"
            input_messages = notification.get("input_messages", [])
            message = " ".join(input_messages)
            title += message
        case _:
            print(f"not sending a push notification for: {notification_type}")
            return 0

    subprocess.check_output(
        [
            "terminal-notifier",
            "-title",
            title,
            "-message",
            message,
            "-group",
            "codex",
            "-ignoreDnD",
            "-activate",
            "com.googlecode.iterm2",
        ]
    )

    return 0


if __name__ == "__main__":
    sys.exit(main())
```

For reference, here are related PRs that tried to add this functionality
to the TypeScript version of the Codex CLI:

* https://github.com/openai/codex/pull/160
* https://github.com/openai/codex/pull/498
2025-05-02 19:48:13 -07:00
Michael Bolin
27bc4516bf feat: bring back -s option to specify sandbox permissions (#739) 2025-04-29 18:42:52 -07:00
Michael Bolin
0a00b5ed29 fix: overhaul SandboxPolicy and config loading in Rust (#732)
Previous to this PR, `SandboxPolicy` was a bit difficult to work with:


237f8a11e1/codex-rs/core/src/protocol.rs (L98-L108)

Specifically:

* It was an `enum` and therefore options were mutually exclusive as
opposed to additive.
* It defined things in terms of what the agent _could not_ do as opposed
to what they _could_ do. This made things hard to support because we
would prefer to build up a sandbox config by starting with something
extremely restrictive and only granting permissions for things the user
as explicitly allowed.

This PR changes things substantially by redefining the policy in terms
of two concepts:

* A `SandboxPermission` enum that defines permissions that can be
granted to the agent/sandbox.
* A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`,
but externally exposes a simpler API that can be used to configure
Seatbelt/Landlock.

Previous to this PR, we supported a `--sandbox` flag that effectively
mapped to an enum value in `SandboxPolicy`. Though now that
`SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single
`--sandbox` flag no longer makes sense. While I could have turned it
into a flag that the user can specify multiple times, I think the
current values to use with such a flag are long and potentially messy,
so for the moment, I have dropped support for `--sandbox` altogether and
we can bring it back once we have figured out the naming thing.

Since `--sandbox` is gone, users now have to specify `--full-auto` to
get a sandbox that allows writes in `cwd`. Admittedly, there is no clean
way to specify the equivalent of `--full-auto` in your `config.toml`
right now, so we will have to revisit that, as well.

Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy`
changed considerably, I had to overhaul how config loading works, as
well. There are now two distinct concepts, `ConfigToml` and `Config`:

* `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one
might expect, every field is `Optional` and it is `#[derive(Deserialize,
Default)]`. Consistent use of `Optional` makes it clear what the user
has specified explicitly.
* `Config` is the "normalized config" and is produced by merging
`ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw
`Option<Vec<SandboxPermission>>`, `Config` presents only the final
`SandboxPolicy`.

The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra
special attention to ensure we are faithfully mapping the
`SandboxPolicy` to the Seatbelt and Landlock configs, respectively.

Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been
renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that
`(allow file-read*)` has been removed from the `.sbpl` file as now this
is added to the policy in `core/src/exec.rs` when
`sandbox_policy.has_full_disk_read_access()` is `true`.
2025-04-29 15:01:16 -07:00
Michael Bolin
b9bba09819 fix: eliminate runtime dependency on patch(1) for apply_patch (#718)
When processing an `apply_patch` tool call, we were already computing
the new file content in order to compute the unified diff. Before this
PR, we were shelling out to `patch(1)` to apply the unified diff once
the user accepted the change, but this updates the code to just retain
the new file content and use it to write the file when the user accepts.
This simplifies deployment because it no longer assumes `patch(1)` is on
the host.

Note this change is internal to the Codex agent and does not affect
`protocol.rs`.
2025-04-28 21:15:41 -07:00
Michael Bolin
e79549f039 feat: add debug landlock subcommand comparable to debug seatbelt (#715)
This PR adds a `debug landlock` subcommand to the Codex CLI for testing
how Codex would execute a command using the specified sandbox policy.

Built and ran this code in the `rust:latest` Docker container. In the
container, hitting the network with vanilla `curl` succeeds:

```
$ curl google.com
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>
```

whereas this fails, as expected:

```
$ cargo run -- debug landlock -s network-restricted -- curl google.com
curl: (6) getaddrinfo() thread failed to start
```
2025-04-28 16:37:05 -07:00
Michael Bolin
e7ad9449ea feat: make it possible to set disable_response_storage = true in config.toml (#714)
https://github.com/openai/codex/pull/642 introduced support for the
`--disable-response-storage` flag, but if you are a ZDR customer, it is
tedious to set this every time, so this PR makes it possible to set this
once in `config.toml` and be done with it.

Incidentally, this tidies things up such that now `init_codex()` takes
only one parameter: `Config`.
2025-04-28 15:39:34 -07:00
Michael Bolin
40460faf2a fix: tighten up check for /usr/bin/sandbox-exec (#710)
* In both TypeScript and Rust, we now invoke `/usr/bin/sandbox-exec`
explicitly rather than whatever `sandbox-exec` happens to be on the
`PATH`.
* Changed `isSandboxExecAvailable` to use `access()` rather than
`command -v` so that:
  *  We only do the check once over the lifetime of the Codex process.
  * The check is specific to `/usr/bin/sandbox-exec`.
* We now do a syscall rather than incur the overhead of spawning a
process, dealing with timeouts, etc.

I think there is still room for improvement here where we should move
the `isSandboxExecAvailable` check earlier in the CLI, ideally right
after we do arg parsing to verify that we can provide the Seatbelt
sandbox if that is what the user has requested.
2025-04-28 13:42:04 -07:00
Michael Bolin
38575ed8aa fix: increase timeout of test_writable_root (#713)
Although we made some promising fixes in
https://github.com/openai/codex/pull/662, we are still seeing some
flakiness in `test_writable_root()`. If this continues to flake with the
more generous timeout, we should try something other than simply
increasing the timeout.
2025-04-28 13:09:27 -07:00
Michael Bolin
4eda4dd772 feat: load defaults into Config and introduce ConfigOverrides (#677)
This changes how instantiating `Config` works and also adds
`approval_policy` and `sandbox_policy` as fields. The idea is:

* All fields of `Config` have appropriate default values.
* `Config` is initially loaded from `~/.codex/config.toml`, so values in
`config.toml` will override those defaults.
* Clients must instantiate `Config` via
`Config::load_with_overrides(ConfigOverrides)` where `ConfigOverrides`
has optional overrides that are expected to be settable based on CLI
flags.

The `Config` should be defined early in the program and then passed
down. Now functions like `init_codex()` take fewer individual parameters
because they can just take a `Config`.

Also, `Config::load()` used to fail silently if `~/.codex/config.toml`
had a parse error and fell back to the default config. This seemed
really bad because it wasn't clear why the values in my `config.toml`
weren't getting picked up. I changed things so that
`load_with_overrides()` returns `Result<Config>` and verified that the
various CLIs print a reasonable error if `config.toml` is malformed.

Finally, I also updated the TUI to show which **sandbox** value is being
used, as we do for other key values like **model** and **approval**.
This was also a reminder that the various values of `--sandbox` are
honored on Linux but not macOS today, so I added some TODOs about fixing
that.
2025-04-27 21:47:50 -07:00
Michael Bolin
b0ba65a936 fix: write logs to ~/.codex/log instead of /tmp (#669)
Previously, the Rust TUI was writing log files to `/tmp`, which is
world-readable and not available on Windows, so that isn't great.

This PR tries to clean things up by adding a function that provides the
path to the "Codex config dir," e.g., `~/.codex` (though I suppose we
could support `$CODEX_HOME` to override this?) and then defines other
paths in terms of the result of `codex_dir()`.

For example, `log_dir()` returns the folder where log files should be
written which is defined in terms of `codex_dir()`. I updated the TUI to
use this function. On UNIX, we even go so far as to `chmod 600` the log
file by default, though as noted in a comment, it's a bit tedious to do
the equivalent on Windows, so we just let that go for now.

This also changes the default logging level to `info` for `codex_core`
and `codex_tui` when `RUST_LOG` is not specified. I'm not really sure if
we should use a more verbose default (it may be helpful when debugging
user issues), though if so, we should probably also set up log rotation?
2025-04-25 17:37:41 -07:00
Michael Bolin
c18f1689a9 fix: small fixes so Codex compiles on Windows (#673)
Small fixes required:

* `ExitStatusExt` differs because UNIX expects exit code to be `i32`
whereas Windows does `u32`
* Marking a file "executable only by owner" is a bit more involved on
Windows. We just do something approximate for now (and add a TODO) to
get things compiling.

I created this PR on my personal Windows machine and `cargo test` and
`cargo clippy` succeed. Once this is in, I'll rebase
https://github.com/openai/codex/pull/665 on top so Windows stays fixed!
2025-04-25 15:58:44 -07:00
Michael Bolin
ebd2ae4abd fix: remove dependency on expanduser crate (#667)
In putting up https://github.com/openai/codex/pull/665, I discovered
that the `expanduser` crate does not compile on Windows. Looking into
it, we do not seem to need it because we were only using it with a value
that was passed in via a command-line flag, so the shell expands `~` for
us before we see it, anyway. (I changed the type in `Cli` from `String`
to `PathBuf`, to boot.)

If we do need this sort of functionality in the future,
https://docs.rs/shellexpand/latest/shellexpand/fn.tilde.html seems
promising.
2025-04-25 14:20:21 -07:00
Michael Bolin
9c3ebac3b7 fix: flipped the sense of Prompt.store in #642 (#663)
I got the sense of this wrong in
https://github.com/openai/codex/pull/642. In that PR, I made
`--disable-response-storage` work, but broke the default case.

With this fix, both cases work and I think the code is a bit cleaner.
2025-04-25 13:41:17 -07:00
Parker Thompson
7d9de34bc7 [codex-rs] Improve linux sandbox timeouts (#662)
* Fixes flaking rust unit test
* Adds explicit sandbox exec timeout handling
2025-04-25 12:56:20 -07:00
Michael Bolin
b323d10ea7 feat: add ZDR support to Rust implementation (#642)
This adds support for the `--disable-response-storage` flag across our
multiple Rust CLIs to support customers who have opted into Zero-Data
Retention (ZDR). The analogous changes to the TypeScript CLI were:

* https://github.com/openai/codex/pull/481
* https://github.com/openai/codex/pull/543

For a client using ZDR, `previous_response_id` will never be available,
so the `input` field of an API request must include the full transcript
of the conversation thus far. As such, this PR changes the type of
`Prompt.input` from `Vec<ResponseInputItem>` to `Vec<ResponseItem>`.

Practically speaking, `ResponseItem` was effectively a "superset" of
`ResponseInputItem` already. The main difference for us is that
`ResponseItem` includes the `FunctionCall` variant that we have to
include as part of the conversation history in the ZDR case.

Another key change in this PR is modifying `try_run_turn()` so that it
returns the `Vec<ResponseItem>` for the turn in addition to the
`Vec<ResponseInputItem>` produced by `try_run_turn()`. This is because
the caller of `run_turn()` needs to record the `Vec<ResponseItem>` when
ZDR is enabled.

To that end, this PR introduces `ZdrTranscript` (and adds
`zdr_transcript: Option<ZdrTranscript>` to `struct State` in `codex.rs`)
to take responsibility for maintaining the conversation transcript in
the ZDR case.
2025-04-25 12:08:18 -07:00
oai-ragona
d7a40195e6 [codex-rs] Reliability pass on networking (#658)
We currently see a behavior that looks like this:
```
2025-04-25T16:52:24.552789Z  WARN codex_core::codex: stream disconnected - retrying turn (1/10 in 232ms)...
codex> event: BackgroundEvent { message: "stream error: stream disconnected before completion: Transport error: error decoding response body; retrying 1/10 in 232ms…" }
2025-04-25T16:52:54.789885Z  WARN codex_core::codex: stream disconnected - retrying turn (2/10 in 418ms)...
codex> event: BackgroundEvent { message: "stream error: stream disconnected before completion: Transport error: error decoding response body; retrying 2/10 in 418ms…" }
```

This PR contains a few different fixes that attempt to resolve/improve
this:
1. **Remove overall client timeout.** I think
[this](https://github.com/openai/codex/pull/658/files#diff-c39945d3c42f29b506ff54b7fa2be0795b06d7ad97f1bf33956f60e3c6f19c19L173)
is perhaps the big fix -- it looks to me like this was actually timing
out even if events were still coming through, and that was causing a
disconnect right in the middle of a healthy stream.
2. **Cap response sizes.** We were frequently sending MUCH larger
responses than the upstream typescript `codex`, and that was definitely
not helping. [Fix
here](https://github.com/openai/codex/pull/658/files#diff-d792bef59aa3ee8cb0cbad8b176dbfefe451c227ac89919da7c3e536a9d6cdc0R21-R26)
for that one.
3. **Much higher idle timeout.** Our idle timeout value was much lower
than typescript.
4. **Sub-linear backoff.** We were much too aggressively backing off,
[this](https://github.com/openai/codex/pull/658/files#diff-5d5959b95c6239e6188516da5c6b7eb78154cd9cfedfb9f753d30a7b6d6b8b06R30-R33)
makes it sub-exponential but maintains the jitter and such.

I was seeing that `stream error: stream disconnected` behavior
constantly, and anecdotally I can no longer reproduce. It feels much
snappier.
2025-04-25 11:44:22 -07:00
Michael Bolin
bfe6fac463 fix: close stdin when running an exec tool call (#636)
We were already doing this in the TypeScript version, but forgot to
bring this over to Rust:


c38c2a59c7/codex-cli/src/utils/agent/sandbox/raw-exec.ts (L76-L78)
2025-04-24 18:06:08 -07:00
oai-ragona
b34ed2ab83 [codex-rs] More fine-grained sandbox flag support on Linux (#632)
##### What/Why
This PR makes it so that in Linux we actually respect the different
types of `--sandbox` flag, such that users can apply network and
filesystem restrictions in combination (currently the only supported
behavior), or just pick one or the other.

We should add similar support for OSX in a future PR.

##### Testing
From Linux devbox, updated tests to use more specific flags:
```
test linux::tests_linux::sandbox_blocks_ping ... ok
test linux::tests_linux::sandbox_blocks_getent ... ok
test linux::tests_linux::test_root_read ... ok
test linux::tests_linux::test_dev_null_write ... ok
test linux::tests_linux::sandbox_blocks_dev_tcp_redirection ... ok
test linux::tests_linux::sandbox_blocks_ssh ... ok
test linux::tests_linux::test_writable_root ... ok
test linux::tests_linux::sandbox_blocks_curl ... ok
test linux::tests_linux::sandbox_blocks_wget ... ok
test linux::tests_linux::sandbox_blocks_nc ... ok
test linux::tests_linux::test_root_write - should panic ... ok
```

##### Todo
- [ ] Add negative tests (e.g. confirm you can hit the network if you
configure filesystem only restrictions)
2025-04-24 15:33:45 -07:00
Michael Bolin
31d0d7a305 feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629)
As stated in `codex-rs/README.md`:

Today, Codex CLI is written in TypeScript and requires Node.js 22+ to
run it. For a number of users, this runtime requirement inhibits
adoption: they would be better served by a standalone executable. As
maintainers, we want Codex to run efficiently in a wide range of
environments with minimal overhead. We also want to take advantage of
operating system-specific APIs to provide better sandboxing, where
possible.

To that end, we are moving forward with a Rust implementation of Codex
CLI contained in this folder, which has the following benefits:

- The CLI compiles to small, standalone, platform-specific binaries.
- Can make direct, native calls to
[seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and
[landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in
order to support sandboxing on Linux.
- No runtime garbage collection, resulting in lower memory consumption
and better, more predictable performance.

Currently, the Rust implementation is materially behind the TypeScript
implementation in functionality, so continue to use the TypeScript
implmentation for the time being. We will publish native executables via
GitHub Releases as soon as we feel the Rust version is usable.
2025-04-24 13:31:40 -07:00