fix: overhaul how we spawn commands under seccomp/landlock on Linux (#1086)

Historically, we spawned the Seatbelt and Landlock sandboxes in substantially different ways:

For **Seatbelt**, we would run `/usr/bin/sandbox-exec` with our policy specified as an arg followed by the original command:

d1de7bb383/codex-rs/core/src/exec.rs (L147-L219)

For **Landlock/Seccomp**, we would do `tokio::runtime::Builder::new_current_thread()`, _invoke Landlock/Seccomp APIs to modify the permissions of that new thread_, and then spawn the command:

d1de7bb383/codex-rs/core/src/exec_linux.rs (L28-L49)

While it is neat that Landlock/Seccomp supports applying a policy to only one thread without having to apply it to the entire process, it requires us to maintain two different codepaths and is a bit harder to reason about. The tipping point was https://github.com/openai/codex/pull/1061, in which we had to start building up the `env` in an unexpected way for the existing Landlock/Seccomp approach to continue to work.

This PR overhauls things so that we do similar things for Mac and Linux. It turned out that we were already building our own "helper binary" comparable to Mac's `sandbox-exec` as part of the `cli` crate:

d1de7bb383/codex-rs/cli/Cargo.toml (L10-L12)

We originally created this to build a small binary to include with the Node.js version of the Codex CLI to provide support for Linux sandboxing.

The catch is that, at this point, we still want to deploy the Rust version of Codex as a single, standalone binary rather than a CLI and a supporting sandboxing binary. To satisfy this goal, we use "the arg0 trick," in which we:

* use `std::env::current_exe()` to get the path to the CLI that is currently running
* use the CLI as the `program` for the `Command`
* set `"codex-linux-sandbox"` as arg0 for the `Command`

A CLI that supports sandboxing should check arg0 at the start of the program. If it is `"codex-linux-sandbox"`, it must invoke `codex_linux_sandbox::run_main()`, which runs the CLI as if it were `codex-linux-sandbox`.

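
As a rough illustration of the arg0 trick described above, here is a minimal, std-only sketch. The helper names `invoked_as_sandbox` and `sandbox_command` are hypothetical and do not appear in this PR. The `println!` stands in for the real call to `codex_linux_sandbox::run_main()`, and the `echo hello` arguments are placeholders.

```rust
use std::env;
use std::path::Path;
use std::process::Command;

/// The alias the CLI looks for in arg0 (taken from the PR description).
const SANDBOX_ARG0: &str = "codex-linux-sandbox";

/// Hypothetical helper: true when the file name of argv[0] is the alias.
fn invoked_as_sandbox(argv0: &str) -> bool {
    Path::new(argv0).file_name().and_then(|name| name.to_str()) == Some(SANDBOX_ARG0)
}

/// Hypothetical helper: builds a `Command` that runs `exe` (normally the
/// current executable) but reports `codex-linux-sandbox` as its arg0.
#[cfg(unix)]
fn sandbox_command(exe: &Path, args: &[&str]) -> Command {
    use std::os::unix::process::CommandExt;
    let mut cmd = Command::new(exe);
    cmd.arg0(SANDBOX_ARG0).args(args);
    cmd
}

fn main() -> std::io::Result<()> {
    let argv0 = env::args().next().unwrap_or_default();
    if invoked_as_sandbox(&argv0) {
        // Placeholder for `codex_linux_sandbox::run_main()` in the real CLI.
        let forwarded: Vec<String> = env::args().skip(1).collect();
        println!("acting as sandbox helper for {forwarded:?}");
        return Ok(());
    }

    // Normal invocation: re-run this same binary under the alias.
    #[cfg(unix)]
    {
        let exe = env::current_exe()?;
        let status = sandbox_command(&exe, &["echo", "hello"]).status()?;
        println!("helper exited with: {status}");
    }
    Ok(())
}
```

Because the check only looks at the file name of arg0, the same code path is taken whether the alias comes from `Command::arg0` or from a link named `codex-linux-sandbox`.
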
When acting as `codex-linux-sandbox`, we make the appropriate Landlock/Seccomp API calls and then use `execvp(3)` to spawn the original command, so we do _replace_ the process rather than spawn a subprocess. Incidentally, we do this before starting the Tokio runtime, so the process should only have one thread when `execvp(3)` is called.

Because the `core` crate that needs to spawn the Linux sandbox is not a CLI in its own right, every CLI that includes `core` and relies on this behavior has to (1) implement it and (2) provide the path to the sandboxing executable. While the path is almost always `std::env::current_exe()`, we needed to make this configurable for integration tests, so `Config` now has a `codex_linux_sandbox_exe: Option<PathBuf>` property to facilitate threading this through, introduced in https://github.com/openai/codex/pull/1089.

This common pattern is now captured in `codex_linux_sandbox::run_with_sandbox()` and all of the `main.rs` functions that should use it have been updated as part of this PR.

The `codex-linux-sandbox` crate added to the Cargo workspace as part of this PR now has the bulk of the Landlock/Seccomp logic, which makes `core` a bit simpler. Indeed, `core/src/exec_linux.rs` and `core/src/landlock.rs` were removed/ported as part of this PR. I also moved the unit tests for this code into an integration test, `linux-sandbox/tests/landlock.rs`, in which I use `env!("CARGO_BIN_EXE_codex-linux-sandbox")` as the value for `codex_linux_sandbox_exe` since `std::env::current_exe()` is not appropriate in that case.

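
To make the "replace the process rather than spawn a subprocess" step concrete, here is a small sketch that uses only the standard library. It swaps the PR's direct `libc::execvp` call for `std::os::unix::process::CommandExt::exec`, which the standard library documents as calling `execvp` after its own setup. The function names `split_command` and `exec_replacing_process` are hypothetical, and the sandbox setup itself is only marked by a comment.

```rust
use std::io;
use std::process::Command;

/// Hypothetical helper: splits argv into (program, remaining args).
/// Returns `None` when no command was given.
fn split_command(command: &[String]) -> Option<(&String, &[String])> {
    command.split_first()
}

/// Hypothetical helper: replaces the current process image with `command`.
/// On success this never returns; the returned error describes the failure.
#[cfg(unix)]
fn exec_replacing_process(command: &[String]) -> io::Error {
    use std::os::unix::process::CommandExt;
    match split_command(command) {
        Some((program, args)) => Command::new(program).args(args).exec(),
        None => io::Error::new(io::ErrorKind::InvalidInput, "no command specified"),
    }
}

fn main() {
    let command: Vec<String> = std::env::args().skip(1).collect();
    if command.is_empty() {
        eprintln!("usage: <this-binary> <program> [args...]");
        return;
    }

    // Assumption: the Landlock/Seccomp restrictions would be applied here,
    // while the process is still single-threaded (no async runtime yet).

    #[cfg(unix)]
    {
        // Only reached past this call if the exec failed.
        let err = exec_replacing_process(&command);
        eprintln!("failed to exec {:?}: {err}", command[0]);
        std::process::exit(1);
    }
}
```

Since the restricted process becomes the target command, the caller waits on a single child rather than on a helper that in turn supervises a grandchild.
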
2025-05-23 11:37:07 -07:00
parent d1de7bb383
commit 89ef4efdcf
29 changed files with 862 additions and 729 deletions
--- a/codex-rs/linux-sandbox/src/landlock.rs
+++ b/codex-rs/linux-sandbox/src/landlock.rs
@@ -0,0 +1,139 @@
+use std::collections::BTreeMap;
+use std::path::Path;
+use std::path::PathBuf;
+
+use codex_core::error::CodexErr;
+use codex_core::error::Result;
+use codex_core::error::SandboxErr;
+use codex_core::protocol::SandboxPolicy;
+
+use landlock::ABI;
+use landlock::Access;
+use landlock::AccessFs;
+use landlock::CompatLevel;
+use landlock::Compatible;
+use landlock::Ruleset;
+use landlock::RulesetAttr;
+use landlock::RulesetCreatedAttr;
+use seccompiler::BpfProgram;
+use seccompiler::SeccompAction;
+use seccompiler::SeccompCmpArgLen;
+use seccompiler::SeccompCmpOp;
+use seccompiler::SeccompCondition;
+use seccompiler::SeccompFilter;
+use seccompiler::SeccompRule;
+use seccompiler::TargetArch;
+use seccompiler::apply_filter;
+
+/// Apply sandbox policies to the current thread. The restrictions are
+/// inherited by whatever this thread subsequently spawns or exec's.
+pub(crate) fn apply_sandbox_policy_to_current_thread(
+    sandbox_policy: &SandboxPolicy,
+    cwd: &Path,
+) -> Result<()> {
+    if !sandbox_policy.has_full_network_access() {
+        install_network_seccomp_filter_on_current_thread()?;
+    }
+
+    if !sandbox_policy.has_full_disk_write_access() {
+        let writable_roots = sandbox_policy.get_writable_roots_with_cwd(cwd);
+        install_filesystem_landlock_rules_on_current_thread(writable_roots)?;
+    }
+
+    // TODO(ragona): Add appropriate restrictions if
+    // `sandbox_policy.has_full_disk_read_access()` is `false`.
+
+    Ok(())
+}
+
+/// Installs Landlock file-system rules on the current thread allowing read
+/// access to the entire file-system while restricting write access to
+/// `/dev/null` and the provided list of `writable_roots`.
+///
+/// # Errors
+/// Returns [`CodexErr::Sandbox`] variants when the ruleset fails to apply.
+fn install_filesystem_landlock_rules_on_current_thread(writable_roots: Vec<PathBuf>) -> Result<()> {
+    let abi = ABI::V5;
+    let access_rw = AccessFs::from_all(abi);
+    let access_ro = AccessFs::from_read(abi);
+
+    let mut ruleset = Ruleset::default()
+        .set_compatibility(CompatLevel::BestEffort)
+        .handle_access(access_rw)?
+        .create()?
+        .add_rules(landlock::path_beneath_rules(&["/"], access_ro))?
+        .add_rules(landlock::path_beneath_rules(&["/dev/null"], access_rw))?
+        .set_no_new_privs(true);
+
+    if !writable_roots.is_empty() {
+        ruleset = ruleset.add_rules(landlock::path_beneath_rules(&writable_roots, access_rw))?;
+    }
+
+    let status = ruleset.restrict_self()?;
+
+    if status.ruleset == landlock::RulesetStatus::NotEnforced {
+        return Err(CodexErr::Sandbox(SandboxErr::LandlockRestrict));
+    }
+
+    Ok(())
+}
+
+/// Installs a seccomp filter that blocks outbound network access except for
+/// AF_UNIX domain sockets.
+fn install_network_seccomp_filter_on_current_thread() -> std::result::Result<(), SandboxErr> {
+    // Build rule map.
+    let mut rules: BTreeMap<i64, Vec<SeccompRule>> = BTreeMap::new();
+
+    // Helper – insert unconditional deny rule for syscall number.
+    let mut deny_syscall = |nr: i64| {
+        rules.insert(nr, vec![]); // empty rule vec = unconditional match
+    };
+
+    deny_syscall(libc::SYS_connect);
+    deny_syscall(libc::SYS_accept);
+    deny_syscall(libc::SYS_accept4);
+    deny_syscall(libc::SYS_bind);
+    deny_syscall(libc::SYS_listen);
+    deny_syscall(libc::SYS_getpeername);
+    deny_syscall(libc::SYS_getsockname);
+    deny_syscall(libc::SYS_shutdown);
+    deny_syscall(libc::SYS_sendto);
+    deny_syscall(libc::SYS_sendmsg);
+    deny_syscall(libc::SYS_sendmmsg);
+    deny_syscall(libc::SYS_recvfrom);
+    deny_syscall(libc::SYS_recvmsg);
+    deny_syscall(libc::SYS_recvmmsg);
+    deny_syscall(libc::SYS_getsockopt);
+    deny_syscall(libc::SYS_setsockopt);
+    deny_syscall(libc::SYS_ptrace);
+
+    // For `socket`, deny (EPERM) unless the domain (arg0) is AF_UNIX.
+    let unix_only_rule = SeccompRule::new(vec![SeccompCondition::new(
+        0, // first argument (domain)
+        SeccompCmpArgLen::Dword,
+        SeccompCmpOp::Ne, // a matching rule triggers the EPERM action below
+        libc::AF_UNIX as u64,
+    )?])?;
+
+    rules.insert(libc::SYS_socket, vec![unix_only_rule]);
+    rules.insert(libc::SYS_socketpair, vec![]); // always deny (AF_UNIX could be allowed here; open question)
+
+    let filter = SeccompFilter::new(
+        rules,
+        SeccompAction::Allow,                     // default – allow
+        SeccompAction::Errno(libc::EPERM as u32), // when rule matches – return EPERM
+        if cfg!(target_arch = "x86_64") {
+            TargetArch::x86_64
+        } else if cfg!(target_arch = "aarch64") {
+            TargetArch::aarch64
+        } else {
+            unimplemented!("unsupported architecture for seccomp filter");
+        },
+    )?;
+
+    let prog: BpfProgram = filter.try_into()?;
+
+    apply_filter(&prog)?;
+
+    Ok(())
+}
--- a/codex-rs/linux-sandbox/src/lib.rs
+++ b/codex-rs/linux-sandbox/src/lib.rs
@@ -0,0 +1,63 @@
+#[cfg(target_os = "linux")]
+mod landlock;
+#[cfg(target_os = "linux")]
+mod linux_run_main;
+
+#[cfg(target_os = "linux")]
+pub use linux_run_main::run_main;
+
+use std::future::Future;
+use std::path::PathBuf;
+
+/// Helper that consolidates the common boilerplate found in several Codex
+/// binaries (`codex`, `codex-exec`, `codex-tui`) around dispatching to the
+/// `codex-linux-sandbox` sub-command.
+///
+/// When the current executable is invoked through the hard-link or alias
+/// named `codex-linux-sandbox` we *directly* execute [`run_main`](crate::run_main)
+/// (which never returns). Otherwise we:
+/// 1.  Construct a Tokio multi-thread runtime.
+/// 2.  Derive the path to the current executable (so children can re-invoke
+///     the sandbox) when running on Linux.
+/// 3.  Execute the provided async `main_fn` inside that runtime, forwarding
+///     any error.
+///
+/// This function eliminates duplicated code across the various `main.rs`
+/// entry-points.
+pub fn run_with_sandbox<F, Fut>(main_fn: F) -> anyhow::Result<()>
+where
+    F: FnOnce(Option<PathBuf>) -> Fut,
+    Fut: Future<Output = anyhow::Result<()>>,
+{
+    use std::path::Path;
+
+    // Determine if we were invoked via the special alias.
+    let argv0 = std::env::args().next().unwrap_or_default();
+    let exe_name = Path::new(&argv0)
+        .file_name()
+        .and_then(|s| s.to_str())
+        .unwrap_or("");
+
+    if exe_name == "codex-linux-sandbox" {
+        // Note: [`run_main`] never returns, so execution does not continue below.
+        crate::run_main();
+    }
+
+    // Regular invocation – create a Tokio runtime and execute the provided
+    // async entry-point.
+    let runtime = tokio::runtime::Runtime::new()?;
+    runtime.block_on(async move {
+        let codex_linux_sandbox_exe: Option<PathBuf> = if cfg!(target_os = "linux") {
+            std::env::current_exe().ok()
+        } else {
+            None
+        };
+
+        main_fn(codex_linux_sandbox_exe).await
+    })
+}
+
+#[cfg(not(target_os = "linux"))]
+pub fn run_main() -> ! {
+    panic!("codex-linux-sandbox is only supported on Linux");
+}
--- a/codex-rs/linux-sandbox/src/linux_run_main.rs
+++ b/codex-rs/linux-sandbox/src/linux_run_main.rs
@@ -0,0 +1,59 @@
+use clap::Parser;
+use codex_common::SandboxPermissionOption;
+use std::ffi::CString;
+
+use crate::landlock::apply_sandbox_policy_to_current_thread;
+
+#[derive(Debug, Parser)]
+pub struct LandlockCommand {
+    #[clap(flatten)]
+    pub sandbox: SandboxPermissionOption,
+
+    /// Full command args to run under landlock.
+    #[arg(trailing_var_arg = true)]
+    pub command: Vec<String>,
+}
+
+pub fn run_main() -> ! {
+    let LandlockCommand { sandbox, command } = LandlockCommand::parse();
+
+    let sandbox_policy = match sandbox.permissions.map(Into::into) {
+        Some(sandbox_policy) => sandbox_policy,
+        None => codex_core::protocol::SandboxPolicy::new_read_only_policy(),
+    };
+
+    let cwd = match std::env::current_dir() {
+        Ok(cwd) => cwd,
+        Err(e) => {
+            panic!("failed to getcwd(): {e:?}");
+        }
+    };
+
+    if let Err(e) = apply_sandbox_policy_to_current_thread(&sandbox_policy, &cwd) {
+        panic!("error running landlock: {e:?}");
+    }
+
+    if command.is_empty() {
+        panic!("No command specified to execute.");
+    }
+
+    #[expect(clippy::expect_used)]
+    let c_command =
+        CString::new(command[0].as_str()).expect("Failed to convert command to CString");
+    #[expect(clippy::expect_used)]
+    let c_args: Vec<CString> = command
+        .iter()
+        .map(|arg| CString::new(arg.as_str()).expect("Failed to convert arg to CString"))
+        .collect();
+
+    let mut c_args_ptrs: Vec<*const libc::c_char> = c_args.iter().map(|arg| arg.as_ptr()).collect();
+    c_args_ptrs.push(std::ptr::null());
+
+    unsafe {
+        libc::execvp(c_command.as_ptr(), c_args_ptrs.as_ptr());
+    }
+
+    // If execvp returns, there was an error.
+    let err = std::io::Error::last_os_error();
+    panic!("Failed to execvp {}: {err}", command[0].as_str());
+}
--- a/codex-rs/linux-sandbox/src/main.rs
+++ b/codex-rs/linux-sandbox/src/main.rs
@@ -0,0 +1,6 @@
+/// Note that the cwd, env, and command args are preserved in the ultimate call
+/// to `execvp`, so the caller is responsible for ensuring those values are
+/// correct.
+fn main() -> ! {
+    codex_linux_sandbox::run_main()
+}