valknar/llmx - llmx - dev.pivoine.art

Author	SHA1	Message	Date
Michael Bolin	e2efe8da9c	feat: introduce --compute-indices flag to codex-file-search (#1419 ) This is a small quality-of-life feature, the addition of `--compute-indices` to the CLI, which, if enabled, will compute and set the `indices` field for each `FileMatch` returned by `run()`. Note we only bother to compute `indices` once we have the top N results because there could be a lot of intermediate "top N" results during the search that are ultimately discarded. When set, the indices are included in the JSON output when `--json` is specified and the matching indices are displayed in bold when `--json` is not specified.	2025-06-28 14:39:29 -07:00
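The "compute `indices` only for the final top N" idea from the commit above can be sketched as follows. This is a minimal illustration, not the code from the PR: the `FileMatch` struct here is a hypothetical stand-in, the score is a toy (shorter paths rank higher), and the greedy subsequence matcher replaces the real fuzzy matcher.

```rust
/// Hypothetical stand-in for the `FileMatch` described above.
#[derive(Debug, Clone, PartialEq)]
pub struct FileMatch {
    pub score: u32,
    pub path: String,
    /// Character positions in `path` that matched the query; filled in lazily.
    pub indices: Option<Vec<usize>>,
}

/// Cheap check used while ranking: is `query` a case-insensitive subsequence of `path`?
fn is_match(query: &str, path: &str) -> bool {
    let mut path_chars = path.chars().map(|c| c.to_ascii_lowercase());
    query.chars().all(|q| {
        let q = q.to_ascii_lowercase();
        path_chars.any(|c| c == q)
    })
}

/// More expensive pass: greedily records which character positions matched.
pub fn match_indices(query: &str, path: &str) -> Option<Vec<usize>> {
    let query: Vec<char> = query.chars().map(|c| c.to_ascii_lowercase()).collect();
    let mut next = 0;
    let mut indices = Vec::new();
    for (i, c) in path.chars().enumerate() {
        if next == query.len() {
            break;
        }
        if c.to_ascii_lowercase() == query[next] {
            indices.push(i);
            next += 1;
        }
    }
    if next == query.len() { Some(indices) } else { None }
}

/// Ranks every candidate, keeps the best `limit`, and only then computes
/// `indices` for the survivors when `compute_indices` is set.
pub fn top_matches(query: &str, paths: &[&str], limit: usize, compute_indices: bool) -> Vec<FileMatch> {
    let mut matches: Vec<FileMatch> = paths
        .iter()
        .filter(|path| is_match(query, path))
        .map(|path| FileMatch {
            // Toy score (an assumption for this sketch): shorter paths rank higher.
            score: u32::MAX - path.len() as u32,
            path: path.to_string(),
            indices: None,
        })
        .collect();
    matches.sort_by(|a, b| b.score.cmp(&a.score).then_with(|| a.path.cmp(&b.path)));
    matches.truncate(limit);
    if compute_indices {
        for m in &mut matches {
            m.indices = match_indices(query, &m.path);
        }
    }
    matches
}
```

Candidates that are discarded during ranking never pay for the index computation, which is the point the commit message makes.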
Michael Bolin	5a0f236ca4	feat: add support for @ to do file search (#1401 ) Introduces support for `@` to trigger a fuzzy-filename search in the composer. Under the hood, this leverages https://crates.io/crates/nucleo-matcher to do the fuzzy matching and https://crates.io/crates/ignore to build up the list of file candidates (so that it respects `.gitignore`). For simplicity (at least for now), we do not do any caching between searches like VS Code does for its file search: `1d89ed699b/src/vs/workbench/services/search/node/rawSearchService.ts (L212-L218)` Because we do not do any caching, I saw queries take up to three seconds on large repositories with hundreds of thousands of files. To that end, we do not perform searches synchronously on each keystroke, but instead dispatch an event to do the search on a background thread that asynchronously reports back to the UI when the results are available. This is largely handled by the `FileSearchManager` introduced in this PR, which also has logic for debouncing requests so there is at most one search in flight at a time. While we could potentially polish and tune this feature further, it may already be overengineered for how it will be used, in practice, so we can improve things going forward if it turns out that this is not "good enough" in the wild. Note this feature does not work like `@` in the TypeScript CLI, which was more like directory-based tab completion. In the Rust CLI, `@` triggers a full-repo fuzzy-filename search. Fixes https://github.com/openai/codex/issues/1261.	2025-06-28 13:47:42 -07:00
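The "at most one search in flight" behaviour described in the commit above can be sketched as a small state machine. This is not the actual `FileSearchManager`; `SearchScheduler` and its method names are hypothetical, and the background thread itself is left out so that only the scheduling decision is shown.

```rust
/// Hypothetical scheduler state; tracks whether a search is running and
/// which query (if any) should run next.
#[derive(Debug, Default)]
pub struct SearchScheduler {
    in_flight: bool,
    pending: Option<String>,
}

impl SearchScheduler {
    /// Called on each keystroke. Returns the query to start now, if any.
    pub fn on_query(&mut self, query: &str) -> Option<String> {
        if self.in_flight {
            // Remember only the most recent query; older pending ones are dropped.
            self.pending = Some(query.to_string());
            None
        } else {
            self.in_flight = true;
            Some(query.to_string())
        }
    }

    /// Called when the background search reports back.
    /// Returns the next query to start, if one arrived in the meantime.
    pub fn on_search_complete(&mut self) -> Option<String> {
        match self.pending.take() {
            // A newer query is waiting: start it and stay "in flight".
            Some(next) => Some(next),
            None => {
                self.in_flight = false;
                None
            }
        }
    }
}
```

Typing `m`, `ma`, `mai` quickly would start a search for `m` only; when it finishes, the scheduler skips `ma` and starts `mai`.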
Michael Bolin	296996d74e	feat: standalone file search CLI (#1386 ) Standalone fuzzy filename search library that should be helpful in addressing https://github.com/openai/codex/issues/1261.	2025-06-25 13:29:03 -07:00
Michael Bolin	0776d78357	feat: redesign sandbox config (#1373 ) This is a major redesign of how sandbox configuration works and aims to fix https://github.com/openai/codex/issues/1248. Specifically, it replaces `sandbox_permissions` in `config.toml` (and the `-s`/`--sandbox-permission` CLI flags) with a "table" with effectively three variants: ```toml # Safest option: full disk is read-only, but writes and network access are disallowed. [sandbox] mode = "read-only" # The cwd of the Codex task is writable, as well as $TMPDIR on macOS. # writable_roots can be used to specify additional writable folders. [sandbox] mode = "workspace-write" writable_roots = [] # Optional, defaults to the empty list. network_access = false # Optional, defaults to false. # Disable sandboxing: use at your own risk!!! [sandbox] mode = "danger-full-access" ``` This should make sandboxing easier to reason about. While we have dropped support for `-s`, the way it works now is: - no flags => `read-only` - `--full-auto` => `workspace-write` - currently, there is no way to specify `danger-full-access` via a CLI flag, but we will revisit that as part of https://github.com/openai/codex/issues/1254 Outstanding issue: - As noted in the `TODO` on `SandboxPolicy::is_unrestricted()`, we are still conflating sandbox preferences with approval preferences in that case, which needs to be cleaned up.	2025-06-24 16:59:47 -07:00
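The three sandbox variants and the flag mapping from the commit above can be modelled roughly like this. It is a sketch under stated assumptions: `SandboxMode`, `mode_from_flags()` and `can_write()` are hypothetical names, and the macOS `$TMPDIR` rule is omitted.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical model of the `[sandbox]` table's `mode` values.
#[derive(Debug, Clone, PartialEq)]
pub enum SandboxMode {
    /// mode = "read-only"
    ReadOnly,
    /// mode = "workspace-write"
    WorkspaceWrite {
        writable_roots: Vec<PathBuf>,
        network_access: bool,
    },
    /// mode = "danger-full-access"
    DangerFullAccess,
}

/// Flag mapping as described above: no flags => read-only,
/// `--full-auto` => workspace-write with the documented defaults.
pub fn mode_from_flags(full_auto: bool) -> SandboxMode {
    if full_auto {
        SandboxMode::WorkspaceWrite {
            writable_roots: Vec::new(),
            network_access: false,
        }
    } else {
        SandboxMode::ReadOnly
    }
}

impl SandboxMode {
    /// Whether `path` would be writable for a task running in `cwd`.
    /// (Simplification: ignores $TMPDIR on macOS.)
    pub fn can_write(&self, path: &Path, cwd: &Path) -> bool {
        match self {
            SandboxMode::ReadOnly => false,
            SandboxMode::WorkspaceWrite { writable_roots, .. } => {
                path.starts_with(cwd) || writable_roots.iter().any(|root| path.starts_with(root))
            }
            SandboxMode::DangerFullAccess => true,
        }
    }
}
```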
Michael Bolin	515b6331bd	feat: add support for login with ChatGPT (#1212) This does not implement the full Login with ChatGPT experience, but it should unblock people. What works * The `codex` multitool now has a `login` subcommand, so you can run `codex login`, which should write `CODEX_HOME/auth.json` if you complete the flow successfully. The TUI will now read the `OPENAI_API_KEY` from `auth.json`. * The TUI should refresh the token if it has expired and the necessary information is in `auth.json`. * There is a `LoginScreen` in the TUI that tells you to run `codex login` if both (1) your model provider expects to use `OPENAI_API_KEY` as its env var, and (2) `OPENAI_API_KEY` is not set. What does not work * The `LoginScreen` does not support the login flow from within the TUI. Instead, it tells you to quit, run `codex login`, and then run `codex` again. * `codex exec` does not read from `auth.json` yet, nor does it direct the user to go through the login flow if `OPENAI_API_KEY` is not found. * The `maybeRedeemCredits()` function from `get-api-key.tsx` has not been ported from TypeScript to `login_with_chatgpt.py` yet: `a67a67f325/codex-cli/src/utils/get-api-key.tsx (L84-L89)` Implementation Currently, the OAuth flow requires running a local webserver on `127.0.0.1:1455`. It seemed wasteful to incur the additional binary cost of a webserver dependency in the Rust CLI just to support login, so instead we implement this logic in Python, as Python has a `http.server` module as part of its standard library. Specifically, we bundle the contents of a single Python file as a string in the Rust CLI and then use it to spawn a subprocess as `python3 -c {{SOURCE_FOR_PYTHON_SERVER}}`. 
As such, the most significant files in this PR are: ``` codex-rs/login/src/login_with_chatgpt.py codex-rs/login/src/lib.rs ``` Now that the CLI may load `OPENAI_API_KEY` from the environment _or_ `CODEX_HOME/auth.json`, we need a new abstraction for reading/writing this variable, so we introduce: ``` codex-rs/core/src/openai_api_key.rs ``` Note that `std::env::set_var()` is [rightfully] `unsafe` in Rust 2024, so we use a LazyLock<RwLock<Option<String>>> to store `OPENAI_API_KEY` so it is read in a thread-safe manner. Ultimately, it should be possible to go through the entire login flow from the TUI. This PR introduces a placeholder `LoginScreen` UI for that right now, though the new `codex login` subcommand introduced in this PR should be a viable workaround until the UI is ready. Testing Because the login flow is currently implemented in a standalone Python file, you can test it without building any Rust code as follows: ``` rm -rf /tmp/codex_home && mkdir /tmp/codex_home CODEX_HOME=/tmp/codex_home python3 codex-rs/login/src/login_with_chatgpt.py ``` For reference: * the original TypeScript implementation was introduced in https://github.com/openai/codex/pull/963 * support for redeeming credits was later added in https://github.com/openai/codex/pull/974	2025-06-04 08:44:17 -07:00
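The `LazyLock<RwLock<Option<String>>>` approach mentioned in the login commit above might look roughly like this. It is a sketch, not the contents of `openai_api_key.rs`: the function names are hypothetical, ignoring empty values is an assumption, and `std::sync::LazyLock` requires Rust 1.80 or newer.

```rust
use std::sync::{LazyLock, RwLock};

/// Process-wide copy of the key, seeded from the environment on first access.
/// Later updates go through the lock instead of `std::env::set_var()`.
static OPENAI_API_KEY: LazyLock<RwLock<Option<String>>> = LazyLock::new(|| {
    let from_env = std::env::var("OPENAI_API_KEY")
        .ok()
        .filter(|key| !key.is_empty());
    RwLock::new(from_env)
});

/// Returns a clone of the current key, if one is known.
pub fn get_openai_api_key() -> Option<String> {
    OPENAI_API_KEY.read().unwrap().clone()
}

/// Stores a key obtained elsewhere (for example, read from `auth.json`).
/// Assumption for this sketch: empty strings are ignored.
pub fn set_openai_api_key(value: String) {
    if !value.is_empty() {
        *OPENAI_API_KEY.write().unwrap() = Some(value);
    }
}
```

Readers take a shared lock and writers an exclusive one, so concurrent threads see either the old key or the new one, never a partially written value.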
Reilly Wood	a67a67f325	codex-rs: make tool calls prettier (#1211 ) This PR overhauls how active tool calls and completed tool calls are displayed: 1. More use of colour to indicate success/failure and distinguish between components like tool name+arguments 2. Previously, the entire `CallToolResult` was serialized to JSON and pretty-printed. Now, we extract each individual `CallToolResultContent` and print those 1. The previous solution was wasting space by unnecessarily showing details of the `CallToolResult` struct to users, without formatting the actual tool call results nicely 2. We're now able to show users more information from tool results in less space, with nicer formatting when tools return JSON results ### Before: <img width="1251" alt="Screenshot 2025-06-03 at 11 24 26" src="https://github.com/user-attachments/assets/5a58f222-219c-4c53-ace7-d887194e30cf" /> ### After: <img width="1265" alt="image" src="https://github.com/user-attachments/assets/99fe54d0-9ebe-406a-855b-7aa529b91274" /> ## Future Work 1. Integrate image tool result handling better. We should be able to display images even if they're not the first `CallToolResultContent` 2. Users should have some way to view the full version of truncated tool results 3. It would be nice to add some left padding for tool results, make it more clear that they are results. This is doable, just a little fiddly due to the way `first_visible_line` scrolling works 4. There's almost certainly a better way to format JSON than "all on 1 line with spaces to make Ratatui wrapping work". But I think that works OK for now.	2025-06-03 14:29:26 -07:00
Michael Bolin	5a5aa89914	chore: replace regex with regex-lite, where appropriate (#1200 ) As explained on https://crates.io/crates/regex-lite, `regex-lite` is a lighter alternative to `regex` and seems to be sufficient for our purposes.	2025-06-02 17:11:45 -07:00
Michael Bolin	0f3cc8f842	feat: make reasoning effort/summaries configurable (#1199) Prior to this PR, we always set `reasoning` when making a request using the Responses API: `d7245cbbc9/codex-rs/core/src/client.rs (L108-L111)` Though if you tried to use the Rust CLI with `--model gpt-4.1`, this would fail with: ```shell "Unsupported parameter: 'reasoning.effort' is not supported with this model." ``` We take a cue from the TypeScript CLI, which does a check on the model name: `d7245cbbc9/codex-cli/src/utils/agent/agent-loop.ts (L786-L789)` This PR does a similar check, though also adds support for the following config options: ``` model_reasoning_effort = "low" | "medium" | "high" | "none" model_reasoning_summary = "auto" | "concise" | "detailed" | "none" ``` This way, if you have a model whose name happens to start with `"o"` (or `"codex"`?), you can set these to `"none"` to explicitly disable reasoning, if necessary. (That said, it seems unlikely anyone would use the Responses API with non-OpenAI models, but we provide an escape hatch, anyway.) This PR also updates both the TUI and `codex exec` to show `reasoning effort` and `reasoning summaries` in the header.	2025-06-02 16:01:34 -07:00
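The model-name check plus the `"none"` escape hatch from the commit above could be sketched as follows. The prefix test on `"o"` and `"codex"` is inferred from the commit message (which itself hedges on `"codex"`), and `ReasoningEffort` / `effective_effort()` are hypothetical names.

```rust
/// Mirrors the `model_reasoning_effort` values listed above.
/// `Disabled` corresponds to the `"none"` setting.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ReasoningEffort {
    Low,
    Medium,
    High,
    Disabled,
}

/// Returns the effort to send with the request, or `None` when the
/// `reasoning` parameter should be omitted entirely.
pub fn effective_effort(model: &str, configured: ReasoningEffort) -> Option<ReasoningEffort> {
    // Assumption: only models whose names start with "o" or "codex" accept `reasoning`.
    let model_supports_reasoning = model.starts_with('o') || model.starts_with("codex");
    if !model_supports_reasoning || configured == ReasoningEffort::Disabled {
        None
    } else {
        Some(configured)
    }
}
```

With this shape, `--model gpt-4.1` never sends `reasoning.effort`, and a user can still opt out for a model that merely happens to match the prefix.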
Michael Bolin	a768a6a41d	fix: introduce ResponseInputItem::McpToolCallOutput variant (#1151 ) The output of an MCP server tool call can be one of several types, but to date, we treated all outputs as text by showing the serialized JSON as the "tool output" in Codex: `25a9949c49/codex-rs/mcp-types/src/lib.rs (L96-L101)` This PR adds support for the `ImageContent` variant so we can now display an image output from an MCP tool call. In making this change, we introduce a new `ResponseInputItem::McpToolCallOutput` variant so that we can work with the `mcp_types::CallToolResult` directly when the function call is made to an MCP server. Though arguably the more significant change is the introduction of `HistoryCell::CompletedMcpToolCallWithImageOutput`, which is a cell that uses `ratatui_image` to render an image into the terminal. To support this, we introduce `ImageRenderCache`, cache a `ratatui_image::picker::Picker`, and `ensure_image_cache()` to cache the appropriate scaled image data and dimensions based on the current terminal size. 
To test, I created a minimal `package.json`: ```json { "name": "kitty-mcp", "version": "1.0.0", "type": "module", "description": "MCP that returns image of kitty", "main": "index.js", "dependencies": { "@modelcontextprotocol/sdk": "^1.12.0" } } ``` with the following `index.js` to define the MCP server: ```js #!/usr/bin/env node import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js"; import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js"; import { readFile } from "node:fs/promises"; import { join } from "node:path"; const IMAGE_URI = "image://Ada.png"; const server = new McpServer({ name: "Demo", version: "1.0.0", }); server.tool( "get-cat-image", "If you need a cat image, this tool will provide one.", async () => ({ content: [ { type: "image", data: await getAdaPngBase64(), mimeType: "image/png" }, ], }) ); server.resource("Ada the Cat", IMAGE_URI, async (uri) => { const base64Image = await getAdaPngBase64(); return { contents: [ { uri: uri.href, mimeType: "image/png", blob: base64Image, }, ], }; }); async function getAdaPngBase64() { const __dirname = new URL(".", import.meta.url).pathname; // From `9705ce2c59/assets/Ada.png` const filePath = join(__dirname, "Ada.png"); const imageData = await readFile(filePath); const base64Image = imageData.toString("base64"); return base64Image; } const transport = new StdioServerTransport(); await server.connect(transport); ``` With the local changes from this PR, I added the following to my `config.toml`: ```toml [mcp_servers.kitty] command = "node" args = ["/Users/mbolin/code/kitty-mcp/index.js"] ``` Running the TUI from source: ``` cargo run --bin codex -- --model o3 'I need a picture of a cat' ``` I get: <img width="732" alt="image" src="https://github.com/user-attachments/assets/bf80b721-9ca0-4d81-aec7-77d6899e2869" /> Now, that said, I have only tested in iTerm and there is definitely some funny business with getting an accurate character-to-pixel ratio (sometimes the 
`CompletedMcpToolCallWithImageOutput` thinks it needs 10 rows to render instead of 4), so there is still work to be done here.	2025-05-28 19:03:17 -07:00
Michael Bolin	d60f350cf8	feat: add support for -c/--config to override individual config items (#1137 ) This PR introduces support for `-c`/`--config` so users can override individual config values on the command line using `--config name=value`. Example: ``` codex --config model=o4-mini ``` Making it possible to set arbitrary config values on the command line results in a more flexible configuration scheme and makes it easier to provide single-line examples that can be copy-pasted from documentation. Effectively, it means there are four levels of configuration for some values: - Default value (e.g., `model` currently defaults to `o4-mini`) - Value in `config.toml` (e.g., user could override the default to be `model = "o3"` in their `config.toml`) - Specifying `-c` or `--config` to override `model` (e.g., user can include `-c model=o3` in their list of args to Codex) - If available, a config-specific flag can be used, which takes precedence over `-c` (e.g., user can specify `--model o3` in their list of args to Codex) Now that it is possible to specify anything that could be configured in `config.toml` on the command line using `-c`, we do not need to have a custom flag for every possible config option (which can clutter the output of `--help`). To that end, as part of this PR, we drop support for the `--disable-response-storage` flag, as users can now specify `-c disable_response_storage=true` to get the equivalent functionality. Under the hood, this works by loading the `config.toml` into a `toml::Value`. Then for each `key=value`, we create a small synthetic TOML file with `value` so that we can run the TOML parser to get the equivalent `toml::Value`. We then parse `key` to determine the point in the original `toml::Value` to do the insert/replace. Once all of the overrides from `-c` args have been applied, the `toml::Value` is deserialized into a `ConfigToml` and then the `ConfigOverrides` are applied, as before.	2025-05-27 23:11:44 -07:00
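The insert/replace step for `-c key=value` described in the commit above can be sketched without the `toml` crate. Everything here is an illustration under assumptions: `Value` is a tiny hypothetical stand-in for `toml::Value`, the value is parsed by hand rather than by running a TOML parser on a synthetic file, and the dotted-key syntax (`sandbox.mode`) is assumed.

```rust
use std::collections::BTreeMap;

/// Tiny stand-in for `toml::Value` (only the variants this sketch needs).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    String(String),
    Boolean(bool),
    Table(BTreeMap<String, Value>),
}

/// Splits `key=value` at the first `=` and breaks the key into path segments.
/// Simplification: only booleans and (optionally quoted) strings are recognised.
pub fn parse_override(arg: &str) -> Option<(Vec<String>, Value)> {
    let (key, raw) = arg.split_once('=')?;
    if key.is_empty() {
        return None;
    }
    let value = match raw {
        "true" => Value::Boolean(true),
        "false" => Value::Boolean(false),
        other => Value::String(other.trim_matches('"').to_string()),
    };
    Some((key.split('.').map(str::to_string).collect(), value))
}

/// Walks `path` from `root`, creating intermediate tables as needed,
/// and inserts or replaces the value at the final segment.
pub fn apply_override(root: &mut Value, path: &[String], value: Value) {
    let Some((first, rest)) = path.split_first() else {
        return;
    };
    // A non-table node on the path is replaced by an empty table.
    if !matches!(*root, Value::Table(_)) {
        *root = Value::Table(BTreeMap::new());
    }
    let Value::Table(table) = root else {
        unreachable!("root was just made a table");
    };
    if rest.is_empty() {
        table.insert(first.clone(), value);
    } else {
        let child = table
            .entry(first.clone())
            .or_insert_with(|| Value::Table(BTreeMap::new()));
        apply_override(child, rest, value);
    }
}
```

Applying the overrides to the loaded config before deserializing it means the rest of the pipeline does not need to know whether a value came from `config.toml` or from the command line.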
Michael Bolin	89ef4efdcf	fix: overhaul how we spawn commands under seccomp/landlock on Linux (#1086 ) Historically, we spawned the Seatbelt and Landlock sandboxes in substantially different ways: For Seatbelt, we would run `/usr/bin/sandbox-exec` with our policy specified as an arg followed by the original command: `d1de7bb383/codex-rs/core/src/exec.rs (L147-L219)` For Landlock/Seccomp, we would do `tokio::runtime::Builder::new_current_thread()`, _invoke Landlock/Seccomp APIs to modify the permissions of that new thread_, and then spawn the command: `d1de7bb383/codex-rs/core/src/exec_linux.rs (L28-L49)` While it is neat that Landlock/Seccomp supports applying a policy to only one thread without having to apply it to the entire process, it requires us to maintain two different codepaths and is a bit harder to reason about. The tipping point was https://github.com/openai/codex/pull/1061, in which we had to start building up the `env` in an unexpected way for the existing Landlock/Seccomp approach to continue to work. This PR overhauls things so that we do similar things for Mac and Linux. It turned out that we were already building our own "helper binary" comparable to Mac's `sandbox-exec` as part of the `cli` crate: `d1de7bb383/codex-rs/cli/Cargo.toml (L10-L12)` We originally created this to build a small binary to include with the Node.js version of the Codex CLI to provide support for Linux sandboxing. Though the sticky bit is that, at this point, we still want to deploy the Rust version of Codex as a single, standalone binary rather than a CLI and a supporting sandboxing binary. To satisfy this goal, we use "the arg0 trick," in which we: * use `std::env::current_exe()` to get the path to the CLI that is currently running * use the CLI as the `program` for the `Command` * set `"codex-linux-sandbox"` as arg0 for the `Command` A CLI that supports sandboxing should check arg0 at the start of the program. 
If it is `"codex-linux-sandbox"`, it must invoke `codex_linux_sandbox::run_main()`, which runs the CLI as if it were `codex-linux-sandbox`. When acting as `codex-linux-sandbox`, we make the appropriate Landlock/Seccomp API calls and then use `execvp(3)` to spawn the original command, so we do _replace_ the process rather than spawn a subprocess. Incidentally, we do this before starting the Tokio runtime, so the process should only have one thread when `execvp(3)` is called. Because the `core` crate that needs to spawn the Linux sandboxing is not a CLI in its own right, this means that every CLI that includes `core` and relies on this behavior has to (1) implement it and (2) provide the path to the sandboxing executable. While the path is almost always `std::env::current_exe()`, we needed to make this configurable for integration tests, so `Config` now has a `codex_linux_sandbox_exe: Option<PathBuf>` property to facilitate threading this through, introduced in https://github.com/openai/codex/pull/1089. This common pattern is now captured in `codex_linux_sandbox::run_with_sandbox()` and all of the `main.rs` functions that should use it have been updated as part of this PR. The `codex-linux-sandbox` crate added to the Cargo workspace as part of this PR now has the bulk of the Landlock/Seccomp logic, which makes `core` a bit simpler. Indeed, `core/src/exec_linux.rs` and `core/src/landlock.rs` were removed/ported as part of this PR. I also moved the unit tests for this code into an integration test, `linux-sandbox/tests/landlock.rs`, in which I use `env!("CARGO_BIN_EXE_codex-linux-sandbox")` as the value for `codex_linux_sandbox_exe` since `std::env::current_exe()` is not appropriate in that case.	2025-05-23 11:37:07 -07:00
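"The arg0 trick" from the commit above can be sketched with the standard library alone. This is an illustration, not the code in the `codex-linux-sandbox` crate: `invoked_as_sandbox()` and `sandbox_command()` are hypothetical helpers, and comparing only the file-name component of arg0 is an assumption. `CommandExt::arg0` is the real Unix-only std API for overriding arg0.

```rust
use std::ffi::OsStr;
use std::path::Path;
use std::process::Command;

const SANDBOX_ARG0: &str = "codex-linux-sandbox";

/// Dispatch check to run at the very start of `main()`, before any runtime
/// or extra threads are created: were we launched under the sandbox name?
pub fn invoked_as_sandbox(arg0: &OsStr) -> bool {
    Path::new(arg0).file_name() == Some(OsStr::new(SANDBOX_ARG0))
}

/// Builds a command that re-executes `exe` (normally `std::env::current_exe()`)
/// with arg0 overridden, passing the original command line as arguments.
#[cfg(unix)]
pub fn sandbox_command(exe: &Path, original: &[&str]) -> Command {
    use std::os::unix::process::CommandExt;
    let mut cmd = Command::new(exe);
    cmd.arg0(SANDBOX_ARG0).args(original);
    cmd
}
```

A `main()` using this would read `std::env::args_os().next()`, call `invoked_as_sandbox()` on it, and branch into the sandbox entry point when it returns `true`; this lets a single binary play both roles.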
Michael Bolin	cb379d7797	feat: introduce support for shell_environment_policy in config.toml (#1061) To date, when handling `shell` and `local_shell` tool calls, we were spawning new processes using the environment inherited from the Codex process itself. This means that the sensitive `OPENAI_API_KEY` that Codex needs to talk to OpenAI models was made available to everything run by `shell` and `local_shell`. While there are cases where that might be useful, it does not seem like a good default. This PR introduces a `shell_environment_policy` config option to control the `env` used with these tool calls. It is inevitably a bit complex so that individual components of the policy can be overridden without restating the entire thing. Details are in the updated `README.md` in this PR, but here is the relevant bit that explains the individual fields of `shell_environment_policy`:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `inherit` | string | `core` | Starting template for the environment:<br>`core` (`HOME`, `PATH`, `USER`, …), `all` (clone full parent env), or `none` (start empty). |
| `ignore_default_excludes` | boolean | `false` | When `false`, Codex removes any var whose name contains `KEY`, `SECRET`, or `TOKEN` (case-insensitive) before other rules run. |
| `exclude` | array<string> | `[]` | Case-insensitive glob patterns to drop after the default filter.<br>Examples: `"AWS_"`, `"AZURE_"`. |
| `set` | table<string,string> | `{}` | Explicit key/value overrides or additions – always win over inherited values. |
| `include_only` | array<string> | `[]` | If non-empty, a whitelist of patterns; only variables that match _one_ pattern survive the final step. (Generally used with `inherit = "all"`.) |

In particular, note that the default is `inherit = "core"`, so: * if you have extra env variables that you want to inherit from the parent process, use `inherit = "all"` and then specify `include_only` * if you have extra env variables where you want to hardcode the values, the default `inherit = "core"` will work fine, but then you need to specify `set` This configuration is not battle-tested, so we will probably still have to play with it a bit. `core/src/exec_env.rs` has the critical business logic as well as unit tests. Though if nothing else, previous to this change: ``` $ cargo run --bin codex -- debug seatbelt -- printenv OPENAI_API_KEY # ...prints OPENAI_API_KEY... ``` But after this change it does not print anything (as desired). One final thing to call out about this PR is that the `configure_command!` macro we use in `core/src/exec.rs` has to do some complex logic with respect to how it builds up the `env` for the process being spawned under Landlock/seccomp. Specifically, doing `cmd.env_clear()` followed by `cmd.envs(&$env_map)` (which is arguably the most intuitive way to do it) caused the Landlock unit tests to fail because the processes spawned by the unit tests started failing in unexpected ways! If we forgo `env_clear()` in favor of updating env vars one at a time, the tests still pass. The comment in the code talks about this a bit, and while I would like to investigate this more, I need to move on for the moment, but I do plan to come back to it to fully understand what is going on. For example, this suggests that we might not be able to spawn a C program that calls `env_clear()`, which would be...weird. We may still have to fiddle with our Landlock config if that is the case.	2025-05-22 09:51:19 -07:00
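The order in which the `shell_environment_policy` fields are applied, as described in the commit above, can be sketched like this. It is not the logic from `core/src/exec_env.rs`: the struct and function names are hypothetical, the `core` list contains only the three variables the commit names (the source elides the rest), and pattern matching is reduced to a hand-written `*` wildcard.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Inherit { Core, All, None }

/// Hypothetical mirror of the config fields described above.
pub struct ShellEnvironmentPolicy {
    pub inherit: Inherit,
    pub ignore_default_excludes: bool,
    pub exclude: Vec<String>,
    pub set: HashMap<String, String>,
    pub include_only: Vec<String>,
}

// Assumption: only the names spelled out in the commit message.
const CORE_VARS: [&str; 3] = ["HOME", "PATH", "USER"];

/// Minimal glob: `*` matches any run of bytes, everything else is literal.
fn glob(pattern: &[u8], name: &[u8]) -> bool {
    match pattern.split_first() {
        None => name.is_empty(),
        Some((&b'*', rest)) => (0..=name.len()).any(|i| glob(rest, &name[i..])),
        Some((c, rest)) => name.first() == Some(c) && glob(rest, &name[1..]),
    }
}

/// Case-insensitive match of `name` against any of `patterns`.
fn matches_any(name: &str, patterns: &[String]) -> bool {
    let name = name.to_ascii_uppercase();
    patterns
        .iter()
        .any(|p| glob(p.to_ascii_uppercase().as_bytes(), name.as_bytes()))
}

pub fn build_env(
    parent: &HashMap<String, String>,
    policy: &ShellEnvironmentPolicy,
) -> HashMap<String, String> {
    // 1. Starting template.
    let mut env: HashMap<String, String> = match policy.inherit {
        Inherit::All => parent.clone(),
        Inherit::None => HashMap::new(),
        Inherit::Core => parent
            .iter()
            .filter(|(k, _)| CORE_VARS.contains(&k.as_str()))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect(),
    };
    // 2. Default excludes: names containing KEY, SECRET, or TOKEN.
    if !policy.ignore_default_excludes {
        env.retain(|k, _| {
            let upper = k.to_ascii_uppercase();
            !["KEY", "SECRET", "TOKEN"].iter().any(|needle| upper.contains(*needle))
        });
    }
    // 3. User-supplied exclude patterns.
    env.retain(|k, _| !matches_any(k, &policy.exclude));
    // 4. Explicit values always win over inherited ones.
    env.extend(policy.set.iter().map(|(k, v)| (k.clone(), v.clone())));
    // 5. Optional whitelist applied last.
    if !policy.include_only.is_empty() {
        env.retain(|k, _| matches_any(k, &policy.include_only));
    }
    env
}
```

With the default policy, a parent environment containing `OPENAI_API_KEY` yields a child environment without it, which matches the `printenv` result reported in the commit.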
Michael Bolin	1e39189393	feat: add support for file_opener option in Rust, similar to #911 (#957) This ports the enhancement introduced in https://github.com/openai/codex/pull/911 (and the fixes in https://github.com/openai/codex/pull/919) for the TypeScript CLI to the Rust one.	2025-05-16 11:33:08 -07:00
Michael Bolin	30cbfdfa87	chore: update exec crate to use std::time instead of chrono (#952) When I originally wrote `elapsed.rs`, I realized we were using both `std::time` and `chrono` with no real benefit of having both. We should try to keep the `exec` subcommand trim (as it is also buildable as a standalone executable), so this helps tighten things up.	2025-05-16 08:14:50 -07:00
Michael Bolin	ce2ecbe72f	feat: record messages from user in ~/.codex/history.jsonl (#939) This is a large change to support a "history" feature like you would expect in a shell like Bash. History events are recorded in `$CODEX_HOME/history.jsonl`. Because it is a JSONL file, it is straightforward to append new entries (as opposed to the TypeScript file that uses `$CODEX_HOME/history.json`, so to be valid JSON, each new entry entails rewriting the entire file). Because it is possible for there to be multiple instances of Codex CLI writing to `history.jsonl` at once, we use advisory file locking when working with `history.jsonl` in `codex-rs/core/src/message_history.rs`. Because we believe history is a sufficiently useful feature, we enable it by default. Though to provide some safety, we set the file permissions of `history.jsonl` to be `0o600` so that other users on the system cannot read the user's history. We do not yet support a default list of `SENSITIVE_PATTERNS` as the TypeScript CLI does: `3fdf9df133/codex-cli/src/utils/storage/command-history.ts (L10-L17)` We are going to take a more conservative approach to this list in the Rust CLI. For example, while `/\b[A-Za-z0-9-_]{20,}\b/` might exclude sensitive information like API tokens, it would also exclude valuable information such as references to Git commits. As noted in the updated documentation, users can opt out of history by adding the following to `config.toml`: ```toml [history] persistence = "none" ``` Because `history.jsonl` could, in theory, be quite large, we take a[n arguably overly pedantic] approach in reading history entries into memory. Specifically, we start by telling the client the current number of entries in the history file (`history_entry_count`) as well as the inode (`history_log_id`) of `history.jsonl` (see the new fields on `SessionConfiguredEvent`). 
The client is responsible for keeping new entries in memory to create a "local history," but if the user hits up enough times to go "past" the end of local history, then the client should use the new `GetHistoryEntryRequest` in the protocol to fetch older entries. Specifically, it should pass the `history_log_id` it was given originally and work backwards from `history_entry_count`. (It should really fetch history in batches rather than one-at-a-time, but that is something we can improve upon in subsequent PRs.) The motivation behind this crazy scheme is that it is designed to defend against: * The `history.jsonl` being truncated during the session such that the index into the history is no longer consistent with what had been read up to that point. We do not yet have logic to enforce a `max_bytes` for `history.jsonl`, but once we do, we will aspire to implement it in a way that should result in a new inode for the file on most systems. * New items from concurrent Codex CLI sessions appending to the history. Because, in the absence of truncation, `history.jsonl` is an append-only log, so long as the client reads backwards from `history_entry_count`, it should always get a consistent view of history. (That said, it will not be able to read _new_ commands from concurrent sessions, but perhaps we will introduce a `/` command to reload latest history or something down the road.) Admittedly, my testing of this feature thus far has been fairly light. I expect we will find bugs and introduce enhancements/fixes going forward.	2025-05-15 16:26:23 -07:00
Michael Bolin	497c5396c0	feat: add mcp subcommand to CLI to run Codex as an MCP server (#934 ) Previously, running Codex as an MCP server required a standalone binary in our Cargo workspace, but this PR makes it available as a subcommand (`mcp`) of the main CLI. Ran this with: ``` RUST_LOG=debug npx @modelcontextprotocol/inspector cargo run --bin codex -- mcp ``` and verified it worked as expected in the inspector at `http://127.0.0.1:6274/`.	2025-05-14 13:15:41 -07:00
Michael Bolin	a12e4b0b31	feat: add support for commands in the Rust TUI (#935 ) Introduces support for slash commands like in the TypeScript CLI. We do not support the full set of commands yet, but the core abstraction is there now. In particular, we have a `SlashCommand` enum and due to thoughtful use of the [strum](https://crates.io/crates/strum) crate, it requires minimal boilerplate to add a new command to the list. The key new piece of UI is `CommandPopup`, though the keyboard events are still handled by `ChatComposer`. The behavior is roughly as follows: * if the first character in the composer is `/`, the command popup is displayed (if you really want to send a message to Codex that starts with a `/`, simply put a space before the `/`) * while the popup is displayed, up/down can be used to change the selection of the popup * if there is a selection, hitting tab completes the command, but does not send it * if there is a selection, hitting enter sends the command * if the prefix of the composer matches a command, the command will be visible in the popup so the user can see the description (commands could take arguments, so additional text may appear after the command name itself) https://github.com/user-attachments/assets/39c3e6ee-eeb7-4ef7-a911-466d8184975f Incidentally, Codex wrote almost all the code for this PR!	2025-05-14 12:55:49 -07:00
Michael Bolin	e6c206d19d	fix: tighten up some logic around session timestamps and ids (#922) * update `SessionConfigured` event to include the UUID for the session * show the UUID in the Rust TUI * use local timestamps in log files instead of UTC * include timestamps in log file names for easier discovery	2025-05-13 19:22:16 -07:00
Michael Bolin	3c03c25e56	feat: introduce --profile for Rust CLI (#921) This introduces a much-needed "profile" concept where users can specify a collection of options under one name and then pass that via `--profile` to the CLI. This PR introduces the `ConfigProfile` struct and makes it a field of `ConfigToml`. It further updates `Config::load_from_base_config_with_overrides()` to respect `ConfigProfile`, overriding default values where appropriate. A detailed unit test is added at the end of `config.rs` to verify this behavior. Details on how to use this feature have also been added to `codex-rs/README.md`.	2025-05-13 16:52:52 -07:00
Michael Bolin	9fdf2fa066	fix: remove clap dependency from core crate (#860)	2025-05-07 19:33:09 -07:00
Michael Bolin	42617f8726	feat: save session transcripts when using Rust CLI (#845) This adds support for saving transcripts when using the Rust CLI. Like the TypeScript CLI, it saves the transcript to `~/.codex/sessions`, though it uses JSONL for the file format (and `.jsonl` for the file extension) so that even if Codex crashes, what was written to the `.jsonl` file should generally still be valid JSONL content.	2025-05-07 13:49:15 -07:00
Michael Bolin	0360b4d0d7	feat: introduce the use of tui-markdown (#851 ) This introduces the use of the `tui-markdown` crate to parse an assistant message as Markdown and style it using ANSI for a better user experience. As shown in the screenshot below, it has support for syntax highlighting for _tagged_ fenced code blocks: <img width="907" alt="image" src="https://github.com/user-attachments/assets/900dc229-80bb-46e8-b1bb-efee4c70ba3c" /> That said, `tui-markdown` is not as configurable (or stylish!) as https://www.npmjs.com/package/marked-terminal, which is what we use in the TypeScript CLI. In particular: * The styles are hardcoded and `tui_markdown::from_str()` does not take any options whatsoever. It uses "bold white" for inline code style which does not stand out as much as the yellow used by `marked-terminal`: `65402cbda7/tui-markdown/src/lib.rs (L464)` I asked Codex to take a first pass at this and it came up with: https://github.com/joshka/tui-markdown/pull/80 * If a fenced code block is not tagged, then it does not get highlighted. I would rather add some logic here: `65402cbda7/tui-markdown/src/lib.rs (L262)` that uses something like https://pypi.org/project/guesslang/ to examine the value of `text` and try to use the appropriate syntax highlighter. * When we have a fenced code block, we do not want to show the opening and closing triple backticks in the output. To unblock ourselves, we might want to bundle our own fork of `tui-markdown` temporarily until we figure out what the shape of the API should be and then try to upstream it.	2025-05-07 10:46:32 -07:00
jcoens-openai	a080d7b0fd	Update submodules version to come from the workspace (#850 ) Tie the version of submodules to the workspace version.	2025-05-07 10:08:06 -07:00
Michael Bolin	c577e94b67	chore: introduce codex-common crate (#843 ) I started this PR because I wanted to share the `format_duration()` utility function in `codex-rs/exec/src/event_processor.rs` with the TUI. The question was: where to put it? `core` should have as few dependencies as possible, so moving it there would introduce a dependency on `chrono`, which seemed undesirable. `core` already had this `cli` feature to deal with a similar situation around sharing common utility functions, so I decided to: * make `core` feature-free * introduce `common` * `common` can have as many "special interest" features as it needs, each of which can declare their own deps * the first two features of common are `cli` and `elapsed` In practice, this meant updating a number of `Cargo.toml` files, replacing this line: ```toml codex-core = { path = "../core", features = ["cli"] } ``` with these: ```toml codex-core = { path = "../core" } codex-common = { path = "../common", features = ["cli"] } ``` Moving `format_duration()` into its own file gave it some "breathing room" to add a unit test, so I had Codex generate some tests and new support for durations over 1 minute.	2025-05-06 17:38:56 -07:00
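A `format_duration()` helper with support for durations over one minute, as mentioned in the commit above, might look something like this. The exact output format used in `codex-common` is not shown in the commit message, so the `ms` / `s` / `m..s` shapes below are assumptions made for illustration only.

```rust
use std::time::Duration;

/// Formats an elapsed duration for display.
/// Assumed output shapes: "250ms", "1.50s", "1m15s".
pub fn format_duration(duration: Duration) -> String {
    let millis = duration.as_millis();
    if millis < 1_000 {
        // Sub-second: whole milliseconds.
        format!("{millis}ms")
    } else if millis < 60_000 {
        // Under a minute: seconds with two decimal places.
        format!("{:.2}s", millis as f64 / 1_000.0)
    } else {
        // A minute or more: whole minutes plus zero-padded whole seconds.
        let minutes = millis / 60_000;
        let seconds = (millis % 60_000) / 1_000;
        format!("{minutes}m{seconds:02}s")
    }
}
```

Because it depends only on `std::time`, a helper like this can live in a shared crate without pulling `chrono` into `core`.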
Michael Bolin	7d8b38b37b	feat: show MCP tool calls in `codex exec` subcommand (#841 ) This is analogous to the change for the TUI in https://github.com/openai/codex/pull/836, but for `codex exec`. To test, I ran: ``` cargo run --bin codex-exec -- 'what is the weather in wellesley ma tomorrow' ``` and saw: ![image](https://github.com/user-attachments/assets/5714e07f-88c7-4dd9-aa0d-be54c1670533)	2025-05-06 16:52:43 -07:00
Michael Bolin	88e7ca5f2b	feat: show MCP tool calls in TUI (#836 ) Adds logic for the `McpToolCallBegin` and `McpToolCallEnd` events in `codex-rs/tui/src/chatwidget.rs` so they get entries in the conversation history in the TUI. Building on top of https://github.com/openai/codex/pull/829, here is the result of running: ``` cargo run --bin codex -- 'what is the weather in san francisco tomorrow' ``` ![image](https://github.com/user-attachments/assets/db4a79bb-4988-46cb-acb2-446d5ba9e058)	2025-05-06 16:12:15 -07:00
Michael Bolin	147a940449	feat: support mcp_servers in config.toml (#829 ) This adds initial support for MCP servers in the style of Claude Desktop and Cursor. Note this PR is the bare minimum to get things working end to end: all configured MCP servers are launched every time Codex is run, there is no recovery for MCP servers that crash, etc. (Also, I took some shortcuts to change some fields of `Session` to be `pub(crate)`, which also means there are circular deps between `codex.rs` and `mcp_tool_call.rs`, but I will clean that up in a subsequent PR.) `codex-rs/README.md` is updated as part of this PR to explain how to use this feature. There is a bit of plumbing to route the new settings from `Config` to the business logic in `codex.rs`. The most significant chunks for new code are in `mcp_connection_manager.rs` (which defines the `McpConnectionManager` struct) and `mcp_tool_call.rs`, which is responsible for tool calls. This PR also introduces new `McpToolCallBegin` and `McpToolCallEnd` event types to the protocol, but does not add any handlers for them. (See https://github.com/openai/codex/pull/836 for initial usage.) To test, I added the following to my `~/.codex/config.toml`: ```toml # Local build of https://github.com/hideya/mcp-server-weather-js [mcp_servers.weather] command = "/Users/mbolin/code/mcp-server-weather-js/dist/index.js" args = [] ``` And then I ran the following: ``` codex-rs$ cargo run --bin codex exec 'what is the weather in san francisco' [2025-05-06T22:40:05] Task started: 1 [2025-05-06T22:40:18] Agent message: Here’s the latest National Weather Service forecast for San Francisco (downtown, near 37.77° N, 122.42° W): This Afternoon (Tue): • Sunny, high near 69 °F • West-southwest wind around 12 mph Tonight: • Partly cloudy, low around 52 °F • SW wind 7–10 mph ... ``` Note that Codex itself is not able to make network calls, so it would not normally be able to get live weather information like this. 
However, the weather MCP is [currently] not run under the Codex sandbox, so it is able to hit `api.weather.gov` and fetch current weather information. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/829). * #836 * __->__ #829	2025-05-06 15:47:59 -07:00
Michael Bolin	5f1b8f707c	feat: update McpClient::new_stdio_client() to accept an env (#831 ) Cleans up the signature for `new_stdio_client()` to more closely mirror how MCP servers are declared in config files (`command`, `args`, `env`). Also takes a cue from Claude Code where the MCP server is launched with a restricted `env` so that it only includes "safe" things like `USER` and `PATH` (see the `create_env_for_mcp_server()` function introduced in this PR for details) by default, as it is common for developers to have sensitive API keys present in their environment that should only be forwarded to the MCP server when the user has explicitly configured it to do so. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/831). * #829 * __->__ #831	2025-05-06 11:14:47 -07:00
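The restricted-environment idea in #831 can be sketched as an allowlist filter over the parent environment, with explicitly configured variables layered on top. The function name `build_restricted_env` and the allowlist contents below are hypothetical; the real `create_env_for_mcp_server()` may choose a different set of variables.

```rust
use std::collections::HashMap;

/// Assumed allowlist of "safe" variables; the real list may differ.
const DEFAULT_ENV_VARS: &[&str] = &["HOME", "LANG", "PATH", "SHELL", "USER"];

/// Hypothetical helper: builds the environment for a child MCP server.
/// Only allowlisted variables are copied from `parent`; anything in
/// `extra` (the user's explicit `env` config) is then added or overrides.
fn build_restricted_env(
    parent: &HashMap<String, String>,
    extra: &HashMap<String, String>,
) -> HashMap<String, String> {
    let mut env: HashMap<String, String> = DEFAULT_ENV_VARS
        .iter()
        .filter_map(|name| {
            parent
                .get(*name)
                .map(|value| ((*name).to_string(), value.clone()))
        })
        .collect();
    // Explicitly configured variables are always forwarded.
    env.extend(extra.iter().map(|(k, v)| (k.clone(), v.clone())));
    env
}
```

The resulting map could then be applied to a `std::process::Command` with `env_clear()` followed by `envs(...)`, so that API keys present in the parent environment are not inherited unless the user opted in.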
Michael Bolin	2cf7aeeeb6	feat: initial McpClient for Rust (#822 ) This PR introduces an initial `McpClient` that we will use to give Codex itself programmatic access to foreign MCPs. This does not wire it up in Codex itself yet, but the new `mcp-client` crate includes a `main.rs` for basic testing for now. Manually tested by sending a `tools/list` request to Codex's own MCP server: ``` codex-rs$ cargo build codex-rs$ cargo run --bin codex-mcp-client ./target/debug/codex-mcp-server { "tools": [ { "description": "Run a Codex session. Accepts configuration parameters matching the Codex Config struct.", "inputSchema": { "properties": { "approval-policy": { "description": "Execution approval policy expressed as the kebab-case variant name (`unless-allow-listed`, `auto-edit`, `on-failure`, `never`).", "enum": [ "auto-edit", "unless-allow-listed", "on-failure", "never" ], "type": "string" }, "cwd": { "description": "Working directory for the session. If relative, it is resolved against the server process's current working directory.", "type": "string" }, "disable-response-storage": { "description": "Disable server-side response storage.", "type": "boolean" }, "model": { "description": "Optional override for the model name (e.g. \"o3\", \"o4-mini\")", "type": "string" }, "prompt": { "description": "The initial user prompt to start the Codex conversation.", "type": "string" }, "sandbox-permissions": { "description": "Sandbox permissions using the same string values accepted by the CLI (e.g. \"disk-write-cwd\", \"network-full-access\").", "items": { "enum": [ "disk-full-read-access", "disk-write-cwd", "disk-write-platform-user-temp-folder", "disk-write-platform-global-temp-folder", "disk-full-write-access", "network-full-access" ], "type": "string" }, "type": "array" } }, "required": [ "prompt" ], "type": "object" }, "name": "codex" } ] } ```	2025-05-05 12:52:55 -07:00
Michael Bolin	2b72d05c5e	feat: make Codex available as a tool when running it as an MCP server (#811 ) This PR replaces the placeholder `"echo"` tool call in the MCP server with a `"codex"` tool that calls Codex. Events such as `ExecApprovalRequest` and `ApplyPatchApprovalRequest` are not handled properly yet, but I have `approval_policy = "never"` set in my `~/.codex/config.toml` such that those codepaths are not exercised. The schema for this MCP tool is defined by a new `CodexToolCallParam` struct introduced in this PR. It is fairly similar to `ConfigOverrides`, as the param is used to help create the `Config` used to start the Codex session, though it also includes the `prompt` used to kick off the session. This PR also introduces the use of the third-party `schemars` crate to generate the JSON schema, which is verified in the `verify_codex_tool_json_schema()` unit test. Events that are dispatched during the Codex session are sent back to the MCP client as MCP notifications. This gives the client a way to monitor progress as the tool call itself may take minutes to complete depending on the complexity of the task requested by the user. In the video below, I launched the server via: ```shell mcp-server$ RUST_LOG=debug npx @modelcontextprotocol/inspector cargo run -- ``` In the video, you can see the flow of: * requesting the list of tools * choosing the codex tool * entering a value for prompt and then making the tool call Note that I left the other fields blank because when unspecified, the values in my `~/.codex/config.toml` were used: https://github.com/user-attachments/assets/1975058c-b004-43ef-8c8d-800a953b8192 Note that while using the inspector, I did run into https://github.com/modelcontextprotocol/inspector/issues/293, though the tip about ensuring I had only one instance of the MCP Inspector tab open in my browser seemed to fix things.	2025-05-05 07:16:19 -07:00
Michael Bolin	21cd953dbd	feat: introduce mcp-server crate (#792 ) This introduces the `mcp-server` crate, which contains a barebones MCP server that provides an `echo` tool that echoes the user's request back to them. To test it out, I launched [modelcontextprotocol/inspector](https://github.com/modelcontextprotocol/inspector) like so: ``` mcp-server$ npx @modelcontextprotocol/inspector cargo run -- ``` and opened up `http://127.0.0.1:6274` in my browser: ![image](https://github.com/user-attachments/assets/83fc55d4-25c2-4497-80cd-e9702283ff93) I also had to make a small fix to `mcp-types`, adding `#[serde(untagged)]` to a number of `enum`s.	2025-05-02 17:25:58 -07:00
Michael Bolin	83961e0299	feat: introduce mcp-types crate (#787 ) This adds our own `mcp-types` crate to our Cargo workspace. We vendor in the [`2025-03-26/schema.json`](`05f2045136/schema/2025-03-26/schema.json`) from the MCP repo and introduce a `generate_mcp_types.py` script to codegen the `lib.rs` from the JSON schema. Test coverage is currently light, but I plan to refine things as we start making use of this crate. And yes, I am aware that https://github.com/modelcontextprotocol/rust-sdk exists, though the published https://crates.io/crates/rmcp appears to be a competing effort. While things are up in the air, it seems better for us to control our own version of this code. Incidentally, Codex did a lot of the work for this PR. I told it to never edit `lib.rs` directly and instead to update `generate_mcp_types.py` and then re-run it to update `lib.rs`. It followed these instructions and once things were working end-to-end, I iteratively asked for changes to the tests until the API looked reasonable (and the code worked). Codex was responsible for figuring out what to do to `generate_mcp_types.py` to achieve the requested test/API changes.	2025-05-02 13:33:14 -07:00
Michael Bolin	b571249867	chore: script to create a Rust release (#759 ) For now, keep things simple such that we never update the `version` in the `Cargo.toml` for the workspace root on the `main` branch. Instead, create a new branch for a release, push one commit that updates the `version`, and then tag that branch to kick off a release. To test, I ran this script and created this release job: https://github.com/openai/codex/actions/runs/14762580641	2025-04-30 12:39:03 -07:00
Michael Bolin	8f7a54501c	chore: Rust release, set prerelease:false and version=0.0.2504301132 (#755 ) The generated DotSlash file has URLs that refer to `https://github.com/openai/codex/releases/`, so let's set `prerelease:false` (but keep `draft:true` for now) so those URLs should work. Also updated `version` in Cargo workspace so I will kick off a build once this lands.	2025-04-30 11:53:03 -07:00
Michael Bolin	c432d9ef81	chore: remove the REPL crate/subcommand (#754 ) @oai-ragona and I discussed it, and we feel the REPL crate has served its purpose, so we're going to delete the code and future archaeologists can find it in Git history.	2025-04-30 10:15:50 -07:00
Michael Bolin	e42dacbdc8	fix: add another place where $dest was missing in rust-release.yml (#747 ) I thought https://github.com/openai/codex/pull/745 was the last fix I needed, but apparently not.	2025-04-29 20:23:54 -07:00
Michael Bolin	5122fe647f	chore: fix errors in .github/workflows/rust-release.yml and prep 0.0.2504292006 release (#745 ) Apparently I made two key mistakes in https://github.com/openai/codex/pull/740 (fixed in this PR): * I forgot to redefine `$dest` in the `Stage Linux-only artifacts` step * I did not define the `if` check correctly in the `Stage Linux-only artifacts` step This fixes both of those issues and bumps the workspace version to `0.0.2504292006` in preparation for another release attempt.	2025-04-29 20:12:23 -07:00
Michael Bolin	1a39568e03	chore: set Cargo workspace to version 0.0.2504291954 to create a scratch release (#744 )	2025-04-29 19:56:30 -07:00
Michael Bolin	85999d7277	chore: set Cargo workspace to version 0.0.2504291926 to create a scratch release (#741 ) Needed to exercise the new release process in https://github.com/openai/codex/pull/671.	2025-04-29 19:35:37 -07:00
Michael Bolin	27bc4516bf	feat: bring back -s option to specify sandbox permissions (#739 )	2025-04-29 18:42:52 -07:00
Michael Bolin	3b39964f81	feat: improve output of exec subcommand (#719 )	2025-04-29 09:59:35 -07:00
Fouad Matin	19928bc257	[codex-rs] fix: exit code 1 if no api key (#697 )	2025-04-28 21:42:06 -07:00
Michael Bolin	cca1122ddc	fix: make the TUI the default/"interactive" CLI in Rust (#711 ) Originally, the `interactive` crate was going to be a placeholder for building out a UX that was comparable to that of the existing TypeScript CLI. Though after researching how Ratatui works, that seems difficult to do because it is designed around the idea that it will redraw the full screen buffer each time (and so any scrolling should be "internal" to your Ratatui app) whereas the TypeScript CLI expects to render the full history of the conversation every time(*) (which is why you can use your terminal scrollbar to scroll it). While it is possible to use Ratatui in a way that acts more like what the TypeScript CLI is doing, it is awkward and seemingly results in tedious code, so I think we should abandon that approach. As such, this PR deletes the `interactive/` folder and the code that depended on it. Further, since we added support for mousewheel scrolling in the TUI in https://github.com/openai/codex/pull/641, it certainly feels much better and the need for scroll support via the terminal scrollbar is greatly diminished. This is now a more appropriate default UX for the "multitool" CLI. (*) Incidentally, I haven't verified this, but I think this results in O(N^2) work in rendering, which seems potentially problematic for long conversations.	2025-04-28 13:46:22 -07:00
Michael Bolin	ebd2ae4abd	fix: remove dependency on expanduser crate (#667 ) In putting up https://github.com/openai/codex/pull/665, I discovered that the `expanduser` crate does not compile on Windows. Looking into it, we do not seem to need it because we were only using it with a value that was passed in via a command-line flag, so the shell expands `~` for us before we see it, anyway. (I changed the type in `Cli` from `String` to `PathBuf`, to boot.) If we do need this sort of functionality in the future, https://docs.rs/shellexpand/latest/shellexpand/fn.tilde.html seems promising.	2025-04-25 14:20:21 -07:00
Michael Bolin	58f0e5ab74	feat: introduce codex_execpolicy crate for defining "safe" commands (#634 ) As described in detail in `codex-rs/execpolicy/README.md` introduced in this PR, `execpolicy` is a tool that lets you define a set of _patterns_ used to match [`execv(3)`](https://linux.die.net/man/3/execv) invocations. When a pattern is matched, `execpolicy` returns the parsed version in a structured form that is amenable to static analysis. The primary use case is to define patterns that match commands that should be auto-approved by a tool such as Codex. This supports a richer pattern matching mechanism than the sort of prefix-matching we have done to date, e.g.: `5e40d9d221/codex-cli/src/approvals.ts (L333-L354)` Note we are still playing with the API and the `system_path` option in particular still needs some work.	2025-04-24 17:14:47 -07:00
Michael Bolin	31d0d7a305	feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629 ) As stated in `codex-rs/README.md`: Today, Codex CLI is written in TypeScript and requires Node.js 22+ to run it. For a number of users, this runtime requirement inhibits adoption: they would be better served by a standalone executable. As maintainers, we want Codex to run efficiently in a wide range of environments with minimal overhead. We also want to take advantage of operating system-specific APIs to provide better sandboxing, where possible. To that end, we are moving forward with a Rust implementation of Codex CLI contained in this folder, which has the following benefits: - The CLI compiles to small, standalone, platform-specific binaries. - Can make direct, native calls to [seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and [landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in order to support sandboxing on Linux. - No runtime garbage collection, resulting in lower memory consumption and better, more predictable performance. Currently, the Rust implementation is materially behind the TypeScript implementation in functionality, so continue to use the TypeScript implementation for the time being. We will publish native executables via GitHub Releases as soon as we feel the Rust version is usable.	2025-04-24 13:31:40 -07:00

46 Commits