codex-rs/cli/src/lib.rs

mod exit_status;
#[cfg(unix)]
pub mod landlock;
pub mod proto;
pub mod seatbelt;

use clap::Parser;
use codex_common::SandboxPermissionOption;
use codex_core::protocol::SandboxPolicy;

#[derive(Debug, Parser)]
pub struct SeatbeltCommand {
    /// Convenience alias for low-friction sandboxed automatic execution (network-disabled sandbox that can write to cwd and TMPDIR)
    #[arg(long = "full-auto", default_value_t = false)]
    pub full_auto: bool,

    #[clap(flatten)]
    pub sandbox: SandboxPermissionOption,

    /// Full command args to run under seatbelt.
    #[arg(trailing_var_arg = true)]
    pub command: Vec<String>,
}

#[derive(Debug, Parser)]
pub struct LandlockCommand {
    /// Convenience alias for low-friction sandboxed automatic execution (network-disabled sandbox that can write to cwd and TMPDIR)
    #[arg(long = "full-auto", default_value_t = false)]
    pub full_auto: bool,

    #[clap(flatten)]
    pub sandbox: SandboxPermissionOption,

    /// Full command args to run under landlock.
    #[arg(trailing_var_arg = true)]
    pub command: Vec<String>,
}

pub fn create_sandbox_policy(full_auto: bool, sandbox: SandboxPermissionOption) -> SandboxPolicy {
    if full_auto {
        SandboxPolicy::new_full_auto_policy()
    } else {
        match sandbox.permissions.map(Into::into) {
            Some(sandbox_policy) => sandbox_policy,
            None => SandboxPolicy::new_read_only_policy(),
        }
    }
}
feat: experimental env var: CODEX_SANDBOX_NETWORK_DISABLED (#879) When using Codex to develop Codex itself, I noticed that sometimes it would try to add `#[ignore]` to the following tests: ``` keeps_previous_response_id_between_tasks() retries_on_early_close() ``` Both of these tests start a `MockServer` that launches an HTTP server on an ephemeral port and requires network access to hit it, which the Seatbelt policy associated with `--full-auto` correctly denies. If I wasn't paying attention to the code that Codex was generating, one of these `#[ignore]` annotations could have slipped into the codebase, effectively disabling the test for everyone. To that end, this PR enables an experimental environment variable named `CODEX_SANDBOX_NETWORK_DISABLED` that is set to `1` if the `SandboxPolicy` used to spawn the process does not have full network access. I say it is "experimental" because I'm not convinced this API is quite right, but we need to start somewhere. (It might be more appropriate to have an env var like `CODEX_SANDBOX=full-auto`, but the challenge is that our newer `SandboxPolicy` abstraction does not map to a simple set of enums like in the TypeScript CLI.) We leverage this new functionality by adding the following code to the aforementioned tests as a way to "dynamically disable" them: ```rust if std::env::var(CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR).is_ok() { println!( "Skipping test because it cannot execute when network is disabled in a Codex sandbox." ); return; } ``` We can use the `debug seatbelt --full-auto` command to verify that `cargo test` fails when run under Seatbelt prior to this change: ``` $ cargo run --bin codex -- debug seatbelt --full-auto -- cargo test ---- keeps_previous_response_id_between_tasks stdout ---- thread 'keeps_previous_response_id_between_tasks' panicked at /Users/mbolin/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/wiremock-0.6.3/src/mock_server/builder.rs:107:46: Failed to bind an OS port for a mock server.: Os { code: 1, kind: PermissionDenied, message: "Operation not permitted" } note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace failures: keeps_previous_response_id_between_tasks test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s error: test failed, to rerun pass `-p codex-core --test previous_response_id` ``` Though after this change, the above command succeeds! This means that, going forward, when Codex operates on Codex itself, when it runs `cargo test`, only "real failures" should cause the command to fail. As part of this change, I decided to tighten up the codepaths for running `exec()` for shell tool calls. In particular, we do it in `core` for the main Codex business logic itself, but we also expose this logic via `debug` subcommands in the CLI in the `cli` crate. The logic for the `debug` subcommands was not quite as faithful to the true business logic as I liked, so I: * refactored a bit of the Linux code, splitting `linux.rs` into `linux_exec.rs` and `landlock.rs` in the `core` crate. * gating less code behind `#[cfg(target_os = "linux")]` because such code does not get built by default when I develop on Mac, which means I either have to build the code in Docker or wait for CI signal * introduced `macro_rules! configure_command` in `exec.rs` so we can have both sync and async versions of this code. The synchronous version seems more appropriate for straight threads or potentially fork/exec. 2025-05-09 18:29:34 -07:00			`mod exit_status;`
			`#[cfg(unix)]`
feat: codex-linux-sandbox standalone executable (#740) This introduces a standalone executable that run the equivalent of the `codex debug landlock` subcommand and updates `rust-release.yml` to include it in the release. The idea is that we will include this small binary with the TypeScript CLI to provide support for Linux sandboxing. 2025-04-29 19:21:26 -07:00			`pub mod landlock;`
			`pub mod proto;`
			`pub mod seatbelt;`

			`use clap::Parser;`
chore: introduce codex-common crate (#843) I started this PR because I wanted to share the `format_duration()` utility function in `codex-rs/exec/src/event_processor.rs` with the TUI. The question was: where to put it? `core` should have as few dependencies as possible, so moving it there would introduce a dependency on `chrono`, which seemed undesirable. `core` already had this `cli` feature to deal with a similar situation around sharing common utility functions, so I decided to: * make `core` feature-free * introduce `common` * `common` can have as many "special interest" features as it needs, each of which can declare their own deps * the first two features of common are `cli` and `elapsed` In practice, this meant updating a number of `Cargo.toml` files, replacing this line: ```toml codex-core = { path = "../core", features = ["cli"] } ``` with these: ```toml codex-core = { path = "../core" } codex-common = { path = "../common", features = ["cli"] } ``` Moving `format_duration()` into its own file gave it some "breathing room" to add a unit test, so I had Codex generate some tests and new support for durations over 1 minute. 2025-05-06 17:38:56 -07:00			`use codex_common::SandboxPermissionOption;`
feat: codex-linux-sandbox standalone executable (#740) This introduces a standalone executable that run the equivalent of the `codex debug landlock` subcommand and updates `rust-release.yml` to include it in the release. The idea is that we will include this small binary with the TypeScript CLI to provide support for Linux sandboxing. 2025-04-29 19:21:26 -07:00			`use codex_core::protocol::SandboxPolicy;`

			`#[derive(Debug, Parser)]`
			`pub struct SeatbeltCommand {`
			`/// Convenience alias for low-friction sandboxed automatic execution (network-disabled sandbox that can write to cwd and TMPDIR)`
			`#[arg(long = "full-auto", default_value_t = false)]`
			`pub full_auto: bool,`

			`#[clap(flatten)]`
			`pub sandbox: SandboxPermissionOption,`

			`/// Full command args to run under seatbelt.`
			`#[arg(trailing_var_arg = true)]`
			`pub command: Vec<String>,`
			`}`

			`#[derive(Debug, Parser)]`
			`pub struct LandlockCommand {`
			`/// Convenience alias for low-friction sandboxed automatic execution (network-disabled sandbox that can write to cwd and TMPDIR)`
			`#[arg(long = "full-auto", default_value_t = false)]`
			`pub full_auto: bool,`

			`#[clap(flatten)]`
			`pub sandbox: SandboxPermissionOption,`

			`/// Full command args to run under landlock.`
			`#[arg(trailing_var_arg = true)]`
			`pub command: Vec<String>,`
			`}`

			`pub fn create_sandbox_policy(full_auto: bool, sandbox: SandboxPermissionOption) -> SandboxPolicy {`
			`if full_auto {`
			`SandboxPolicy::new_full_auto_policy()`
			`} else {`
			`match sandbox.permissions.map(Into::into) {`
			`Some(sandbox_policy) => sandbox_policy,`
			`None => SandboxPolicy::new_read_only_policy(),`
			`}`
			`}`
			`}`