fix: overhaul how we spawn commands under seccomp/landlock on Linux (#1086)

Historically, we spawned the Seatbelt and Landlock sandboxes in
substantially different ways:

For **Seatbelt**, we would run `/usr/bin/sandbox-exec` with our policy
specified as an arg followed by the original command:


d1de7bb383/codex-rs/core/src/exec.rs (L147-L219)

For **Landlock/Seccomp**, we would do
`tokio::runtime::Builder::new_current_thread()`, _invoke
Landlock/Seccomp APIs to modify the permissions of that new thread_, and
then spawn the command:


d1de7bb383/codex-rs/core/src/exec_linux.rs (L28-L49)

While it is neat that Landlock/Seccomp supports applying a policy to
only one thread without having to apply it to the entire process, it
requires us to maintain two different codepaths and is a bit harder to
reason about. The tipping point was
https://github.com/openai/codex/pull/1061, in which we had to start
building up the `env` in an unexpected way for the existing
Landlock/Seccomp approach to continue to work.

This PR overhauls things so that we do similar things for Mac and Linux.
It turned out that we were already building our own "helper binary"
comparable to Mac's `sandbox-exec` as part of the `cli` crate:


d1de7bb383/codex-rs/cli/Cargo.toml (L10-L12)

We originally created this to build a small binary to include with the
Node.js version of the Codex CLI to provide support for Linux
sandboxing.

Though the sticky bit is that, at this point, we still want to deploy
the Rust version of Codex as a single, standalone binary rather than a
CLI and a supporting sandboxing binary. To satisfy this goal, we use
"the arg0 trick," in which we:

* use `std::env::current_exe()` to get the path to the CLI that is
currently running
* use the CLI as the `program` for the `Command`
* set `"codex-linux-sandbox"` as arg0 for the `Command`

A CLI that supports sandboxing should check arg0 at the start of the
program. If it is `"codex-linux-sandbox"`, it must invoke
`codex_linux_sandbox::run_main()`, which runs the CLI as if it were
`codex-linux-sandbox`. When acting as `codex-linux-sandbox`, we make the
appropriate Landlock/Seccomp API calls and then use `execvp(3)` to spawn
the original command, so do _replace_ the process rather than spawn a
subprocess. Incidentally, we do this before starting the Tokio runtime,
so the process should only have one thread when `execvp(3)` is called.

Because the `core` crate that needs to spawn the Linux sandboxing is not
a CLI in its own right, this means that every CLI that includes `core`
and relies on this behavior has to (1) implement it and (2) provide the
path to the sandboxing executable. While the path is almost always
`std::env::current_exe()`, we needed to make this configurable for
integration tests, so `Config` now has a `codex_linux_sandbox_exe:
Option<PathBuf>` property to facilitate threading this through,
introduced in https://github.com/openai/codex/pull/1089.

This common pattern is now captured in
`codex_linux_sandbox::run_with_sandbox()` and all of the `main.rs`
functions that should use it have been updated as part of this PR.

The `codex-linux-sandbox` crate added to the Cargo workspace as part of
this PR now has the bulk of the Landlock/Seccomp logic, which makes
`core` a bit simpler. Indeed, `core/src/exec_linux.rs` and
`core/src/landlock.rs` were removed/ported as part of this PR. I also
moved the unit tests for this code into an integration test,
`linux-sandbox/tests/landlock.rs`, in which I use
`env!("CARGO_BIN_EXE_codex-linux-sandbox")` as the value for
`codex_linux_sandbox_exe` since `std::env::current_exe()` is not
appropriate in that case.
This commit is contained in:
Michael Bolin
2025-05-23 11:37:07 -07:00
committed by GitHub
parent d1de7bb383
commit 89ef4efdcf
29 changed files with 862 additions and 729 deletions

View File

@@ -0,0 +1,139 @@
use std::collections::BTreeMap;
use std::path::Path;
use std::path::PathBuf;
use codex_core::error::CodexErr;
use codex_core::error::Result;
use codex_core::error::SandboxErr;
use codex_core::protocol::SandboxPolicy;
use landlock::ABI;
use landlock::Access;
use landlock::AccessFs;
use landlock::CompatLevel;
use landlock::Compatible;
use landlock::Ruleset;
use landlock::RulesetAttr;
use landlock::RulesetCreatedAttr;
use seccompiler::BpfProgram;
use seccompiler::SeccompAction;
use seccompiler::SeccompCmpArgLen;
use seccompiler::SeccompCmpOp;
use seccompiler::SeccompCondition;
use seccompiler::SeccompFilter;
use seccompiler::SeccompRule;
use seccompiler::TargetArch;
use seccompiler::apply_filter;
/// Apply sandbox policies inside this thread so only the child inherits
/// them, not the entire CLI process.
pub(crate) fn apply_sandbox_policy_to_current_thread(
sandbox_policy: &SandboxPolicy,
cwd: &Path,
) -> Result<()> {
if !sandbox_policy.has_full_network_access() {
install_network_seccomp_filter_on_current_thread()?;
}
if !sandbox_policy.has_full_disk_write_access() {
let writable_roots = sandbox_policy.get_writable_roots_with_cwd(cwd);
install_filesystem_landlock_rules_on_current_thread(writable_roots)?;
}
// TODO(ragona): Add appropriate restrictions if
// `sandbox_policy.has_full_disk_read_access()` is `false`.
Ok(())
}
/// Installs Landlock file-system rules on the current thread allowing read
/// access to the entire file-system while restricting write access to
/// `/dev/null` and the provided list of `writable_roots`.
///
/// # Errors
/// Returns [`CodexErr::Sandbox`] variants when the ruleset fails to apply.
fn install_filesystem_landlock_rules_on_current_thread(writable_roots: Vec<PathBuf>) -> Result<()> {
let abi = ABI::V5;
let access_rw = AccessFs::from_all(abi);
let access_ro = AccessFs::from_read(abi);
let mut ruleset = Ruleset::default()
.set_compatibility(CompatLevel::BestEffort)
.handle_access(access_rw)?
.create()?
.add_rules(landlock::path_beneath_rules(&["/"], access_ro))?
.add_rules(landlock::path_beneath_rules(&["/dev/null"], access_rw))?
.set_no_new_privs(true);
if !writable_roots.is_empty() {
ruleset = ruleset.add_rules(landlock::path_beneath_rules(&writable_roots, access_rw))?;
}
let status = ruleset.restrict_self()?;
if status.ruleset == landlock::RulesetStatus::NotEnforced {
return Err(CodexErr::Sandbox(SandboxErr::LandlockRestrict));
}
Ok(())
}
/// Installs a seccomp filter that blocks outbound network access except for
/// AF_UNIX domain sockets.
fn install_network_seccomp_filter_on_current_thread() -> std::result::Result<(), SandboxErr> {
// Build rule map.
let mut rules: BTreeMap<i64, Vec<SeccompRule>> = BTreeMap::new();
// Helper insert unconditional deny rule for syscall number.
let mut deny_syscall = |nr: i64| {
rules.insert(nr, vec![]); // empty rule vec = unconditional match
};
deny_syscall(libc::SYS_connect);
deny_syscall(libc::SYS_accept);
deny_syscall(libc::SYS_accept4);
deny_syscall(libc::SYS_bind);
deny_syscall(libc::SYS_listen);
deny_syscall(libc::SYS_getpeername);
deny_syscall(libc::SYS_getsockname);
deny_syscall(libc::SYS_shutdown);
deny_syscall(libc::SYS_sendto);
deny_syscall(libc::SYS_sendmsg);
deny_syscall(libc::SYS_sendmmsg);
deny_syscall(libc::SYS_recvfrom);
deny_syscall(libc::SYS_recvmsg);
deny_syscall(libc::SYS_recvmmsg);
deny_syscall(libc::SYS_getsockopt);
deny_syscall(libc::SYS_setsockopt);
deny_syscall(libc::SYS_ptrace);
// For `socket` we allow AF_UNIX (arg0 == AF_UNIX) and deny everything else.
let unix_only_rule = SeccompRule::new(vec![SeccompCondition::new(
0, // first argument (domain)
SeccompCmpArgLen::Dword,
SeccompCmpOp::Eq,
libc::AF_UNIX as u64,
)?])?;
rules.insert(libc::SYS_socket, vec![unix_only_rule]);
rules.insert(libc::SYS_socketpair, vec![]); // always deny (Unix can use socketpair but fine, keep open?)
let filter = SeccompFilter::new(
rules,
SeccompAction::Allow, // default allow
SeccompAction::Errno(libc::EPERM as u32), // when rule matches return EPERM
if cfg!(target_arch = "x86_64") {
TargetArch::x86_64
} else if cfg!(target_arch = "aarch64") {
TargetArch::aarch64
} else {
unimplemented!("unsupported architecture for seccomp filter");
},
)?;
let prog: BpfProgram = filter.try_into()?;
apply_filter(&prog)?;
Ok(())
}

View File

@@ -0,0 +1,63 @@
#[cfg(target_os = "linux")]
mod landlock;
#[cfg(target_os = "linux")]
mod linux_run_main;
#[cfg(target_os = "linux")]
pub use linux_run_main::run_main;
use std::future::Future;
use std::path::PathBuf;
/// Helper that consolidates the common boilerplate found in several Codex
/// binaries (`codex`, `codex-exec`, `codex-tui`) around dispatching to the
/// `codex-linux-sandbox` sub-command.
///
/// When the current executable is invoked through the hard-link or alias
/// named `codex-linux-sandbox` we *directly* execute [`run_main`](crate::run_main)
/// (which never returns). Otherwise we:
/// 1. Construct a Tokio multi-thread runtime.
/// 2. Derive the path to the current executable (so children can re-invoke
/// the sandbox) when running on Linux.
/// 3. Execute the provided async `main_fn` inside that runtime, forwarding
/// any error.
///
/// This function eliminates duplicated code across the various `main.rs`
/// entry-points.
pub fn run_with_sandbox<F, Fut>(main_fn: F) -> anyhow::Result<()>
where
F: FnOnce(Option<PathBuf>) -> Fut,
Fut: Future<Output = anyhow::Result<()>>,
{
use std::path::Path;
// Determine if we were invoked via the special alias.
let argv0 = std::env::args().next().unwrap_or_default();
let exe_name = Path::new(&argv0)
.file_name()
.and_then(|s| s.to_str())
.unwrap_or("");
if exe_name == "codex-linux-sandbox" {
// Safety: [`run_main`] never returns.
crate::run_main();
}
// Regular invocation create a Tokio runtime and execute the provided
// async entry-point.
let runtime = tokio::runtime::Runtime::new()?;
runtime.block_on(async move {
let codex_linux_sandbox_exe: Option<PathBuf> = if cfg!(target_os = "linux") {
std::env::current_exe().ok()
} else {
None
};
main_fn(codex_linux_sandbox_exe).await
})
}
#[cfg(not(target_os = "linux"))]
pub fn run_main() -> ! {
panic!("codex-linux-sandbox is only supported on Linux");
}

View File

@@ -0,0 +1,59 @@
use clap::Parser;
use codex_common::SandboxPermissionOption;
use std::ffi::CString;
use crate::landlock::apply_sandbox_policy_to_current_thread;
#[derive(Debug, Parser)]
pub struct LandlockCommand {
#[clap(flatten)]
pub sandbox: SandboxPermissionOption,
/// Full command args to run under landlock.
#[arg(trailing_var_arg = true)]
pub command: Vec<String>,
}
pub fn run_main() -> ! {
let LandlockCommand { sandbox, command } = LandlockCommand::parse();
let sandbox_policy = match sandbox.permissions.map(Into::into) {
Some(sandbox_policy) => sandbox_policy,
None => codex_core::protocol::SandboxPolicy::new_read_only_policy(),
};
let cwd = match std::env::current_dir() {
Ok(cwd) => cwd,
Err(e) => {
panic!("failed to getcwd(): {e:?}");
}
};
if let Err(e) = apply_sandbox_policy_to_current_thread(&sandbox_policy, &cwd) {
panic!("error running landlock: {e:?}");
}
if command.is_empty() {
panic!("No command specified to execute.");
}
#[expect(clippy::expect_used)]
let c_command =
CString::new(command[0].as_str()).expect("Failed to convert command to CString");
#[expect(clippy::expect_used)]
let c_args: Vec<CString> = command
.iter()
.map(|arg| CString::new(arg.as_str()).expect("Failed to convert arg to CString"))
.collect();
let mut c_args_ptrs: Vec<*const libc::c_char> = c_args.iter().map(|arg| arg.as_ptr()).collect();
c_args_ptrs.push(std::ptr::null());
unsafe {
libc::execvp(c_command.as_ptr(), c_args_ptrs.as_ptr());
}
// If execvp returns, there was an error.
let err = std::io::Error::last_os_error();
panic!("Failed to execvp {}: {err}", command[0].as_str());
}

View File

@@ -0,0 +1,6 @@
/// Note that the cwd, env, and command args are preserved in the ultimate call
/// to `execv`, so the caller is responsible for ensuring those values are
/// correct.
fn main() -> ! {
codex_linux_sandbox::run_main()
}