valknar/llmx - llmx - dev.pivoine.art

Author	SHA1	Message	Date
Michael Bolin	421e159888	feat: make cwd a required field of Config so we stop assuming std::env::current_dir() in a session (#800 ) In order to expose Codex via an MCP server, I realized that we should be taking `cwd` as a parameter rather than assuming `std::env::current_dir()` as the `cwd`. Specifically, the user may want to start a session in a directory other than the one where the MCP server has been started. This PR makes `cwd: PathBuf` a required field of `Session` and threads it all the way through, though I think there is still an issue with not honoring `workdir` for `apply_patch`, which is something we also had to fix in the TypeScript version: https://github.com/openai/codex/pull/556. This also adds `-C`/`--cd` to change the cwd via the command line. To test, I ran: ``` cargo run --bin codex -- exec -C /tmp 'show the output of ls' ``` and verified it showed the contents of my `/tmp` folder instead of `$PWD`.	2025-05-04 10:57:12 -07:00
Michael Bolin	0a00b5ed29	fix: overhaul SandboxPolicy and config loading in Rust (#732 ) Previous to this PR, `SandboxPolicy` was a bit difficult to work with: `237f8a11e1/codex-rs/core/src/protocol.rs (L98-L108)` Specifically: * It was an `enum` and therefore options were mutually exclusive as opposed to additive. * It defined things in terms of what the agent _could not_ do as opposed to what they _could_ do. This made things hard to support because we would prefer to build up a sandbox config by starting with something extremely restrictive and only granting permissions for things the user as explicitly allowed. This PR changes things substantially by redefining the policy in terms of two concepts: * A `SandboxPermission` enum that defines permissions that can be granted to the agent/sandbox. * A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`, but externally exposes a simpler API that can be used to configure Seatbelt/Landlock. Previous to this PR, we supported a `--sandbox` flag that effectively mapped to an enum value in `SandboxPolicy`. Though now that `SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single `--sandbox` flag no longer makes sense. While I could have turned it into a flag that the user can specify multiple times, I think the current values to use with such a flag are long and potentially messy, so for the moment, I have dropped support for `--sandbox` altogether and we can bring it back once we have figured out the naming thing. Since `--sandbox` is gone, users now have to specify `--full-auto` to get a sandbox that allows writes in `cwd`. Admittedly, there is no clean way to specify the equivalent of `--full-auto` in your `config.toml` right now, so we will have to revisit that, as well. Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy` changed considerably, I had to overhaul how config loading works, as well. There are now two distinct concepts, `ConfigToml` and `Config`: * `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one might expect, every field is `Optional` and it is `#[derive(Deserialize, Default)]`. Consistent use of `Optional` makes it clear what the user has specified explicitly. * `Config` is the "normalized config" and is produced by merging `ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw `Option<Vec<SandboxPermission>>`, `Config` presents only the final `SandboxPolicy`. The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra special attention to ensure we are faithfully mapping the `SandboxPolicy` to the Seatbelt and Landlock configs, respectively. Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that `(allow file-read*)` has been removed from the `.sbpl` file as now this is added to the policy in `core/src/exec.rs` when `sandbox_policy.has_full_disk_read_access()` is `true`.	2025-04-29 15:01:16 -07:00
Michael Bolin	e79549f039	feat: add `debug landlock` subcommand comparable to `debug seatbelt` (#715 ) This PR adds a `debug landlock` subcommand to the Codex CLI for testing how Codex would execute a command using the specified sandbox policy. Built and ran this code in the `rust:latest` Docker container. In the container, hitting the network with vanilla `curl` succeeds: ``` $ curl google.com <HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8"> <TITLE>301 Moved</TITLE></HEAD><BODY> <H1>301 Moved</H1> The document has moved <A HREF="http://www.google.com/">here</A>. </BODY></HTML> ``` whereas this fails, as expected: ``` $ cargo run -- debug landlock -s network-restricted -- curl google.com curl: (6) getaddrinfo() thread failed to start ```	2025-04-28 16:37:05 -07:00
Michael Bolin	38575ed8aa	fix: increase timeout of test_writable_root (#713 ) Although we made some promising fixes in https://github.com/openai/codex/pull/662, we are still seeing some flakiness in `test_writable_root()`. If this continues to flake with the more generous timeout, we should try something other than simply increasing the timeout.	2025-04-28 13:09:27 -07:00
Parker Thompson	7d9de34bc7	[codex-rs] Improve linux sandbox timeouts (#662 ) * Fixes flaking rust unit test * Adds explicit sandbox exec timeout handling	2025-04-25 12:56:20 -07:00
oai-ragona	b34ed2ab83	[codex-rs] More fine-grained sandbox flag support on Linux (#632 ) ##### What/Why This PR makes it so that in Linux we actually respect the different types of `--sandbox` flag, such that users can apply network and filesystem restrictions in combination (currently the only supported behavior), or just pick one or the other. We should add similar support for OSX in a future PR. ##### Testing From Linux devbox, updated tests to use more specific flags: ``` test linux::tests_linux::sandbox_blocks_ping ... ok test linux::tests_linux::sandbox_blocks_getent ... ok test linux::tests_linux::test_root_read ... ok test linux::tests_linux::test_dev_null_write ... ok test linux::tests_linux::sandbox_blocks_dev_tcp_redirection ... ok test linux::tests_linux::sandbox_blocks_ssh ... ok test linux::tests_linux::test_writable_root ... ok test linux::tests_linux::sandbox_blocks_curl ... ok test linux::tests_linux::sandbox_blocks_wget ... ok test linux::tests_linux::sandbox_blocks_nc ... ok test linux::tests_linux::test_root_write - should panic ... ok ``` ##### Todo - [ ] Add negative tests (e.g. confirm you can hit the network if you configure filesystem only restrictions)	2025-04-24 15:33:45 -07:00
Michael Bolin	31d0d7a305	feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629 ) As stated in `codex-rs/README.md`: Today, Codex CLI is written in TypeScript and requires Node.js 22+ to run it. For a number of users, this runtime requirement inhibits adoption: they would be better served by a standalone executable. As maintainers, we want Codex to run efficiently in a wide range of environments with minimal overhead. We also want to take advantage of operating system-specific APIs to provide better sandboxing, where possible. To that end, we are moving forward with a Rust implementation of Codex CLI contained in this folder, which has the following benefits: - The CLI compiles to small, standalone, platform-specific binaries. - Can make direct, native calls to [seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and [landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in order to support sandboxing on Linux. - No runtime garbage collection, resulting in lower memory consumption and better, more predictable performance. Currently, the Rust implementation is materially behind the TypeScript implementation in functionality, so continue to use the TypeScript implmentation for the time being. We will publish native executables via GitHub Releases as soon as we feel the Rust version is usable.	2025-04-24 13:31:40 -07:00

7 Commits