valknar/llmx - llmx - dev.pivoine.art

Author	SHA1	Message	Date
Michael Bolin	0a00b5ed29	fix: overhaul SandboxPolicy and config loading in Rust (#732 ) Previous to this PR, `SandboxPolicy` was a bit difficult to work with: `237f8a11e1/codex-rs/core/src/protocol.rs (L98-L108)` Specifically: * It was an `enum` and therefore options were mutually exclusive as opposed to additive. * It defined things in terms of what the agent _could not_ do as opposed to what they _could_ do. This made things hard to support because we would prefer to build up a sandbox config by starting with something extremely restrictive and only granting permissions for things the user as explicitly allowed. This PR changes things substantially by redefining the policy in terms of two concepts: * A `SandboxPermission` enum that defines permissions that can be granted to the agent/sandbox. * A `SandboxPolicy` that internally stores a `Vec<SandboxPermission>`, but externally exposes a simpler API that can be used to configure Seatbelt/Landlock. Previous to this PR, we supported a `--sandbox` flag that effectively mapped to an enum value in `SandboxPolicy`. Though now that `SandboxPolicy` is a wrapper around `Vec<SandboxPermission>`, the single `--sandbox` flag no longer makes sense. While I could have turned it into a flag that the user can specify multiple times, I think the current values to use with such a flag are long and potentially messy, so for the moment, I have dropped support for `--sandbox` altogether and we can bring it back once we have figured out the naming thing. Since `--sandbox` is gone, users now have to specify `--full-auto` to get a sandbox that allows writes in `cwd`. Admittedly, there is no clean way to specify the equivalent of `--full-auto` in your `config.toml` right now, so we will have to revisit that, as well. Because `Config` presents a `SandboxPolicy` field and `SandboxPolicy` changed considerably, I had to overhaul how config loading works, as well. There are now two distinct concepts, `ConfigToml` and `Config`: * `ConfigToml` is the deserialization of `~/.codex/config.toml`. As one might expect, every field is `Optional` and it is `#[derive(Deserialize, Default)]`. Consistent use of `Optional` makes it clear what the user has specified explicitly. * `Config` is the "normalized config" and is produced by merging `ConfigToml` with `ConfigOverrides`. Where `ConfigToml` contains a raw `Option<Vec<SandboxPermission>>`, `Config` presents only the final `SandboxPolicy`. The changes to `core/src/exec.rs` and `core/src/linux.rs` merit extra special attention to ensure we are faithfully mapping the `SandboxPolicy` to the Seatbelt and Landlock configs, respectively. Also, take note that `core/src/seatbelt_readonly_policy.sbpl` has been renamed to `codex-rs/core/src/seatbelt_base_policy.sbpl` and that `(allow file-read*)` has been removed from the `.sbpl` file as now this is added to the policy in `core/src/exec.rs` when `sandbox_policy.has_full_disk_read_access()` is `true`.	2025-04-29 15:01:16 -07:00
Michael Bolin	e79549f039	feat: add `debug landlock` subcommand comparable to `debug seatbelt` (#715 ) This PR adds a `debug landlock` subcommand to the Codex CLI for testing how Codex would execute a command using the specified sandbox policy. Built and ran this code in the `rust:latest` Docker container. In the container, hitting the network with vanilla `curl` succeeds: ``` $ curl google.com <HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8"> <TITLE>301 Moved</TITLE></HEAD><BODY> <H1>301 Moved</H1> The document has moved <A HREF="http://www.google.com/">here</A>. </BODY></HTML> ``` whereas this fails, as expected: ``` $ cargo run -- debug landlock -s network-restricted -- curl google.com curl: (6) getaddrinfo() thread failed to start ```	2025-04-28 16:37:05 -07:00
Michael Bolin	cca1122ddc	fix: make the TUI the default/"interactive" CLI in Rust (#711 ) Originally, the `interactive` crate was going to be a placeholder for building out a UX that was comparable to that of the existing TypeScript CLI. Though after researching how Ratatui works, that seems difficult to do because it is designed around the idea that it will redraw the full screen buffer each time (and so any scrolling should be "internal" to your Ratatui app) whereas the TypeScript CLI expects to render the full history of the conversation every time() (which is why you can use your terminal scrollbar to scroll it). While it is possible to use Ratatui in a way that acts more like what the TypeScript CLI is doing, it is awkward and seemingly results in tedious code, so I think we should abandon that approach. As such, this PR deletes the `interactive/` folder and the code that depended on it. Further, since we added support for mousewheel scrolling in the TUI in https://github.com/openai/codex/pull/641, it certainly feels much better and the need for scroll support via the terminal scrollbar is greatly diminished. This is now a more appropriate default UX for the "multitool" CLI. () Incidentally, I haven't verified this, but I think this results in O(N^2) work in rendering, which seems potentially problematic for long conversations.	2025-04-28 13:46:22 -07:00
Michael Bolin	4eda4dd772	feat: load defaults into Config and introduce ConfigOverrides (#677 ) This changes how instantiating `Config` works and also adds `approval_policy` and `sandbox_policy` as fields. The idea is: * All fields of `Config` have appropriate default values. * `Config` is initially loaded from `~/.codex/config.toml`, so values in `config.toml` will override those defaults. * Clients must instantiate `Config` via `Config::load_with_overrides(ConfigOverrides)` where `ConfigOverrides` has optional overrides that are expected to be settable based on CLI flags. The `Config` should be defined early in the program and then passed down. Now functions like `init_codex()` take fewer individual parameters because they can just take a `Config`. Also, `Config::load()` used to fail silently if `~/.codex/config.toml` had a parse error and fell back to the default config. This seemed really bad because it wasn't clear why the values in my `config.toml` weren't getting picked up. I changed things so that `load_with_overrides()` returns `Result<Config>` and verified that the various CLIs print a reasonable error if `config.toml` is malformed. Finally, I also updated the TUI to show which sandbox value is being used, as we do for other key values like model and approval. This was also a reminder that the various values of `--sandbox` are honored on Linux but not macOS today, so I added some TODOs about fixing that.	2025-04-27 21:47:50 -07:00
Michael Bolin	31d0d7a305	feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629 ) As stated in `codex-rs/README.md`: Today, Codex CLI is written in TypeScript and requires Node.js 22+ to run it. For a number of users, this runtime requirement inhibits adoption: they would be better served by a standalone executable. As maintainers, we want Codex to run efficiently in a wide range of environments with minimal overhead. We also want to take advantage of operating system-specific APIs to provide better sandboxing, where possible. To that end, we are moving forward with a Rust implementation of Codex CLI contained in this folder, which has the following benefits: - The CLI compiles to small, standalone, platform-specific binaries. - Can make direct, native calls to [seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and [landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in order to support sandboxing on Linux. - No runtime garbage collection, resulting in lower memory consumption and better, more predictable performance. Currently, the Rust implementation is materially behind the TypeScript implementation in functionality, so continue to use the TypeScript implmentation for the time being. We will publish native executables via GitHub Releases as soon as we feel the Rust version is usable.	2025-04-24 13:31:40 -07:00

5 Commits