valknar/llmx - llmx - dev.pivoine.art

Author	SHA1	Message	Date
Michael Bolin	265fd89e31	fix: try to fix flakiness in test_shell_command_approval_triggers_elicitation (#2344 ) I still see flakiness in `test_shell_command_approval_triggers_elicitation()` on occasion where `MockServer` claims it has not received all of its expected requests. I recently introduced a similar type of test in #2264, `test_codex_jsonrpc_conversation_flow()`, which I have not seen flake (yet!), so this PR pulls over two things I did in that test: - increased `worker_threads` from `2` to `4` - added an assertion to make sure the `task_complete` notification is received Honestly, I'm still not sure why `MockServer` claims it sometimes does not receive all its expected requests given that we assert that the final `JSONRPCResponse` is read on the stream, but let's give this a shot. Assuming this fixes things, my hypothesis is that the increase in `worker_threads` helps because perhaps there are async tasks in `MockServer` that do not reliably complete fully when there are not enough threads available? If that is correct, it seems like the test would still be flaky, though perhaps with lower frequency?	2025-08-15 09:17:20 -07:00
Michael Bolin	6730592433	fix: introduce MutexExt::lock_unchecked() so we stop ignoring unwrap() throughout codex.rs (#2340 ) This way we are sure a dangerous `unwrap()` does not sneak in! --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2340). * #2345 * #2329 * #2343 * __->__ #2340 * #2338	2025-08-15 09:14:44 -07:00
Michael Bolin	26c8373821	fix: tighten up checks against writable folders for SandboxPolicy (#2338 ) I was looking at the implementation of `Session::get_writable_roots()`, which did not seem right, as it was a copy of writable roots, which is not guaranteed to be in sync with the `sandbox_policy` field. I looked at who was calling `get_writable_roots()` and its only call site was `apply_patch()` in `codex-rs/core/src/apply_patch.rs`, which took the roots and forwarded them to `assess_patch_safety()` in `safety.rs`. I updated `assess_patch_safety()` to take `sandbox_policy: &SandboxPolicy` instead of `writable_roots: &[PathBuf]` (and replaced `Session::get_writable_roots()` with `Session::get_sandbox_policy()`). Within `safety.rs`, it was fairly easy to update `is_write_patch_constrained_to_writable_paths()` to work with `SandboxPolicy`, and in particular, it is far more accurate because, for better or worse, `SandboxPolicy::get_writable_roots_with_cwd()` _returns an empty vec_ for `SandboxPolicy::DangerFullAccess`, suggesting that _nothing_ is writable when in reality _everything_ is writable. With this PR, `is_write_patch_constrained_to_writable_paths()` now does the right thing for each variant of `SandboxPolicy`. I thought this would be the end of the story, but it turned out that `test_writable_roots_constraint()` in `safety.rs` needed to be updated, as well. In particular, the test was writing to `std::env::current_dir()` instead of a `TempDir`, which I suspect was a holdover from earlier when `SandboxPolicy::WorkspaceWrite` would always make `TMPDIR` writable on macOS, which made it hard to write tests to verify `SandboxPolicy` in `TMPDIR`. Fortunately, we now have `exclude_tmpdir_env_var` as an option on `SandboxPolicy::WorkspaceWrite`, so I was able to update the test to preserve the existing behavior, but to no longer write to `std::env::current_dir()`. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). 
Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2338). * #2345 * #2329 * #2343 * #2340 * __->__ #2338	2025-08-15 09:06:15 -07:00
Dylan	6df8e35314	[tools] Add apply_patch tool (#2303 ) ## Summary We've been seeing a number of issues and reports with our synthetic `apply_patch` tool, e.g. #802. Let's make this a real tool - in my anecdotal testing, it's critical for GPT-OSS models, but I'd like to make it the standard across GPT-5 and codex models as well. ## Testing - [x] Tested locally - [x] Integration test	2025-08-15 11:55:53 -04:00
Jeremy Rose	917e29803b	tui: include optional full command line in history display (#2334 ) Add env var to show the raw, unparsed command line under parsed commands. When we have transcript mode we should show the full command there, but this is useful for debugging.	2025-08-14 22:06:42 -07:00
pakrym-oai	5552688621	Format multiline commands (#2333 ) <img width="966" height="729" alt="image" src="https://github.com/user-attachments/assets/fa45b7e1-cd46-427f-b2bc-8501e9e4760b" /> <img width="797" height="530" alt="image" src="https://github.com/user-attachments/assets/6993eec5-e157-4df7-b558-15643ad10d64" />	2025-08-14 19:49:42 -07:00
pakrym-oai	76df07350a	Cleanup rust login server a bit more (#2331 ) Remove some extra abstractions. --------- Co-authored-by: easong-openai <easong@openai.com>	2025-08-14 19:42:14 -07:00
easong-openai	d0b907d399	re-implement session id in status (#2332 ) Basically the same thing as https://github.com/openai/codex/pull/2297	2025-08-15 02:14:46 +00:00
Parker Thompson	a075424437	Added `allow-expect-in-tests` / `allow-unwrap-in-tests` (#2328 ) This PR: * Added the clippy.toml to configure allowable expect / unwrap usage in tests * Removed as many expect/allow lines as possible from tests * Moved a bunch of allows to expects where possible Note: in integration tests, non-`#[test]` helper functions are not covered by this, so we had to leave a few lingering `expect(expect_used)` checks around	2025-08-14 17:59:01 -07:00
Jeremy Rose	8bdb4521c9	AGENTS.md more strongly suggests running targeted tests first (#2306 )	2025-08-15 00:51:32 +00:00
Michael Bolin	dd63d61a59	fix: trying to simplify rust-ci.yml (#2327 ) It turns out that https://github.com/openai/codex/pull/2324 did not quite work as intended. Chat's new idea is to have this catch-all "CI results" job and update our branch protection rules to require this instead.	2025-08-14 17:44:10 -07:00
Parker Thompson	c26d42ab69	Fix AF_UNIX, sockpair, recvfrom in linux sandbox (#2309 ) When using codex-tui on a linux system I was unable to run `cargo clippy` inside of codex due to: ``` [pid 3548377] socketpair(AF_UNIX, SOCK_SEQPACKET|SOCK_CLOEXEC, 0, <unfinished ...> [pid 3548370] close(8 <unfinished ...> [pid 3548377] <... socketpair resumed>0x7ffb97f4ed60) = -1 EPERM (Operation not permitted) ``` And ``` 3611300 <... recvfrom resumed>0x708b8b5cffe0, 8, 0, NULL, NULL) = -1 EPERM (Operation not permitted) ``` This PR: * Fixes a bug that disallowed AF_UNIX on `socket()`, so it is now allowed * Adds recvfrom() to the syscall allow list; this should be fine since we disable opening new sockets, but we should validate that there is not an open socket inheritance issue. * Allows socketpair to be called for AF_UNIX * Adds tests for AF_UNIX components * All of which allows running `cargo clippy` within the sandbox on linux, and possibly other tooling using a fork server model + AF_UNIX comms.	2025-08-14 17:12:41 -07:00
easong-openai	e9b597cfa3	Port login server to rust (#2294 ) Port the login server to rust. --------- Co-authored-by: pakrym-oai <pakrym@openai.com>	2025-08-14 17:11:26 -07:00
Jeremy Rose	afc377bae5	clear running commands in various places (#2325 ) we have a very unclear lifecycle for the chatwidget—this should only have to be added in one place! but this fixes the "hanging commands" issue where the active_exec_cell wasn't correctly cleared when commands finished. To repro w/o this PR: 1. prompt "run sleep 10" 2. once the command starts running, press <kbd>Esc</kbd> 3. prompt "run echo hi" Expected: ``` ✓ Completed └ ⌨️ echo hi codex hi ``` Actual: ``` ⚙︎ Working └ ⌨️ echo hi ▌ Ask Codex to do anything ``` i.e. the "Working" never changes to "Completed". The bug is fixed with this PR.	2025-08-15 00:01:19 +00:00
Michael Bolin	333803ed04	fix: ensure rust-ci always "runs" when a PR is submitted (#2324 ) Our existing path filters for `rust-ci.yml`: `235987843c/.github/workflows/rust-ci.yml (L1-L11)` made it so that PRs that touch only `README.md` would not trigger those builds, which is a problem because our branch protection rules are set as follows: <img width="1569" height="1883" alt="Screenshot 2025-08-14 at 4 45 59 PM" src="https://github.com/user-attachments/assets/5a61f8cc-cdaf-4341-abda-7faa7b46dbd4" /> With the existing setup, a change to `README.md` would get stuck in limbo because not all the CI jobs required to merge would get run. It turns out that we need to "run" all the jobs, but make them no-ops when the `codex-rs` and `.github` folders are untouched to get the best of both worlds. I asked chat how to fix this, as we want CI to be fast for documentation-only changes. It had two suggestions: - Use https://github.com/dorny/paths-filter or some other third-party action. - Write an inline Bash script to avoid a third-party dependency. This PR takes the latter approach so that we are clear about what we're running in CI.	2025-08-14 17:00:19 -07:00
Jeremy Rose	235987843c	add a timer to running exec commands (#2321 ) sometimes i switch back to codex and i don't know how long a command has been running. <img width="744" height="462" alt="Screenshot 2025-08-14 at 3 30 07 PM" src="https://github.com/user-attachments/assets/bd80947f-5a47-43e6-ad19-69c2995a2a29" />	2025-08-14 19:32:45 -04:00
Michael Bolin	6a0f709cff	fix: add call_id to ApprovalParams in mcp-server/src/wire_format.rs (#2322 ) Clients still need this field.	2025-08-14 16:09:12 -07:00
Michael Bolin	2ecca79663	fix: run python_multiprocessing_lock_works integration test on Mac and Linux (#2318 ) The high-order bit on this PR is that it makes it so `sandbox.rs` tests both Mac and Linux, as we introduce a general `spawn_command_under_sandbox()` function with platform-specific implementations for testing. An important, and interesting, discovery in porting the test to Linux is that (for reasons cited in the code comments), `/dev/shm` has to be added to `writable_roots` on Linux in order for `multiprocessing.Lock` to work there. Granting write access to `/dev/shm` comes with some degree of risk, so we do not make this the default for Codex CLI. Piggybacking on top of #2317, this moves the `python_multiprocessing_lock_works` test yet again, moving `codex-rs/core/tests/sandbox.rs` to `codex-rs/exec/tests/sandbox.rs` because in `codex-rs/exec/tests` we can use `cargo_bin()` like so: ``` let codex_linux_sandbox_exe = assert_cmd::cargo::cargo_bin("codex-exec"); ``` which is necessary so we can use `codex_linux_sandbox_exe` and therefore `spawn_command_under_linux_sandbox` in an integration test. This also moves `spawn_command_under_linux_sandbox()` out of `exec.rs` and into `landlock.rs`, which makes things more consistent with `seatbelt.rs` in `codex-core`. For reference, https://github.com/openai/codex/pull/1808 is the PR that made the change to Seatbelt to get this test to pass on Mac.	2025-08-14 15:47:48 -07:00
Michael Bolin	a8c7f5391c	fix: move general sandbox tests to codex-rs/core/tests/sandbox.rs (#2317 ) Previous to this PR, `codex-rs/core/tests/sandbox.rs` contained integration tests that were specific to Seatbelt. This PR moves those tests to `codex-rs/core/src/seatbelt.rs` and designates `codex-rs/core/tests/sandbox.rs` to be used as the home for cross-platform (well, Mac and Linux...) sandbox tests. To start, this migrates `python_multiprocessing_lock_works_under_seatbelt()` from #1823 to the new `sandbox.rs` because this is the type of thing that should work on both Mac _and_ Linux, though I still need to do some work to clean up the test so it works on both platforms.	2025-08-14 14:48:38 -07:00
David Z Hao	992e81d9b5	test(core): add seatbelt sem lock tests (#1823 ) ## Summary - add a unit test to ensure the macOS seatbelt policy allows POSIX semaphores - add a macOS-only test that runs a Python multiprocessing Lock under Seatbelt ## Testing - `cargo test -p codex_core seatbelt_base_policy_allows_ipc_posix_sem --no-fail-fast` (failed: failed to download from `https://static.crates.io/crates/tokio-stream/0.1.17/download`) - `cargo test -p codex_core seatbelt_base_policy_allows_ipc_posix_sem --no-fail-fast --offline` (failed: attempting to make an HTTP request, but --offline was specified) - `cargo test --all-features --no-fail-fast --offline` (failed: attempting to make an HTTP request, but --offline was specified) - `just fmt` (failed: command not found: just) - `just fix` (failed: command not found: just) Ran tests locally to confirm it passes on master and failed before my previous change ------ https://chatgpt.com/codex/tasks/task_i_6890f221e0a4833381cfb53e11499bcc	2025-08-14 14:23:06 -07:00
Jeremy Rose	7038827bf4	fix bash commands being incorrectly quoted in display (#2313 ) The "display format" of commands was sometimes producing incorrect quoting like `echo foo '>' bar`, which is importantly different from the actual command that was being run. This refactors ParsedCommand to have a string in `cmd` instead of a vec, as a `vec` can't accurately capture a full command.	2025-08-14 17:08:29 -04:00
Jeremy Rose	20cd61e2a4	use a central animation loop (#2268 ) instead of each shimmer needing to have its own animation thread, have render_ref schedule a new frame if it wants one and coalesce to the earliest next frame. this also makes the animations frame-timing-independent, based on start time instead of frame count.	2025-08-14 16:59:47 -04:00
Jeremy Rose	fd2b059504	text elements in textarea for pasted content (#2302 ) This improves handling of pasted content in the textarea. It's no longer possible to partially delete a placeholder (e.g. by ^W or ^D), nor is it possible to place the cursor inside a placeholder. Also, we now render placeholders in a different color to make them more clearly differentiated. https://github.com/user-attachments/assets/2051b3c3-963d-4781-a610-3afee522ae29	2025-08-14 20:58:51 +00:00
Michael Bolin	c25f3ea53e	fix: do not allow dotenv to create/modify environment variables starting with CODEX_ (#2308 ) This ensures Codex cannot drop a `.env` file with a value of `CODEX_HOME` that points to a folder that Codex can control.	2025-08-14 13:57:15 -07:00
Michael Bolin	8f11652458	fix: parallelize logic in Session::new() (#2305 ) #2291 made it so that `Session::new()` is on the critical path to `Codex::spawn()`, which means it is on the hot path to CLI startup. This refactors `Session::new()` to run a number of async tasks in parallel that were previously run serially to try to reduce latency.	2025-08-14 13:29:58 -07:00
aibrahim-oai	b62c2d9552	remove logs from composer by default (#2307 ) Currently the composer shows `handle_codex_event:<event name>` by default which feels confusing. Let's make it appear in trace.	2025-08-14 13:01:15 -07:00
Jeremy Rose	475ba13479	remove the · animation (#2271 ) the pulsing dot felt too noisy to me next to the shimmering "Working" text. we'll bring it back for streaming response text perhaps?	2025-08-14 19:30:41 +00:00
Dylan	544980c008	[context] Store context messages in rollouts (#2243 ) ## Summary Currently, we use request-time logic to determine the user_instructions and environment_context messages. This means that neither of these values can change over time as conversations go on. We want to add in additional details here, so we're migrating these to save these messages to the rollout file instead. This is simpler for the client, and allows us to append additional environment_context messages to each turn if we want ## Testing - [x] Integration test coverage - [x] Tested locally with a few turns, confirmed model could reference environment context and cached token metrics were reasonably high	2025-08-14 14:51:13 -04:00
Jeremy Rose	b42e679227	remove "status text" in bottom line (#2279 ) this used to hold the most recent log line, but it was kinda broken and not that useful.	2025-08-14 14:10:21 -04:00
Jeremy Rose	585f7b0679	HistoryCell is a trait (#2283 ) refactors HistoryCell to be a trait instead of an enum. Also collapse the many "degenerate" HistoryCell enums which were just a store of lines into a single PlainHistoryCell type. The goal here is to allow more ways of rendering history cells (e.g. expanded/collapsed/"live"), and I expect we will return to more varied types of HistoryCell as we develop this area.	2025-08-14 14:10:05 -04:00
Gabriel Peal	cdd33b2c04	Tag InputItem (#2304 ) Instead of: ``` { Text: { text: string } } ``` It is now: ``` { type: "text", data: { text: string } } ``` which makes for cleaner discriminated unions	2025-08-14 17:58:04 +00:00
Michael Bolin	cf7a7e63a3	exploration: create Session as part of Codex::spawn() (#2291 ) Historically, `Codex::spawn()` would create the instance of `Codex` and enforce, by construction, that `Op::ConfigureSession` was the first `Op` submitted via `submit()`. Then over in `submission_loop()`, it would handle the case for taking the parameters of `Op::ConfigureSession` and turning it into a `Session`. This approach has two challenges from a state management perspective: `f968a1327a/codex-rs/core/src/codex.rs (L718)` - The local `sess` variable in `submission_loop()` has to be `mut` and `Option<Arc<Session>>` because it is not invariant that a `Session` is present for the lifetime of the loop, so there is a lot of logic to deal with the case where `sess` is `None` (e.g., the `send_no_session_event` function and all of its callsites). - `submission_loop()` is written in such a way that `Op::ConfigureSession` could be observed multiple times, but in practice, it is only observed exactly once at the start of the loop. In this PR, we try to simplify the state management by _removing_ the `Op::ConfigureSession` enum variant and constructing the `Session` as part of `Codex::spawn()` so that it can be passed to `submission_loop()` as `Arc<Session>`. The original logic from the `Op::ConfigureSession` has largely been moved to the new `Session::new()` constructor. 
--- Incidentally, I also noticed that the handling of `Op::ConfigureSession` can result in events being dispatched in addition to `EventMsg::SessionConfigured`, as an `EventMsg::Error` is created for every MCP initialization error, so it is important to preserve that behavior: `f968a1327a/codex-rs/core/src/codex.rs (L901-L916)` Though admittedly, I believe this does not play nice with #2264, as these error messages will likely be dispatched before the client has a chance to call `addConversationListener`, so we likely need to make it so `newConversation` automatically creates the subscription, but we must also guarantee that the "ack" from `newConversation` is returned before any other conversation-related notifications are sent so the client knows what `conversation_id` to match on.	2025-08-14 09:55:28 -07:00
Michael Bolin	f968a1327a	feat: add support for an InterruptConversation request (#2287 ) This adds `ClientRequest::InterruptConversation`, which effectively maps directly to `Op::Interrupt`. --- * __->__ #2287 * #2286 * #2285	2025-08-13 23:12:03 -07:00
Michael Bolin	539f4b290e	fix: add support for exec and apply_patch approvals in the new wire format (#2286 ) Now when `CodexMessageProcessor` receives either a `EventMsg::ApplyPatchApprovalRequest` or a `EventMsg::ExecApprovalRequest`, it sends the appropriate request from the server to the client. When it gets a response, it forwards it on to the `CodexConversation`. Note this takes a lot of code from: https://github.com/openai/codex/blob/main/codex-rs/mcp-server/src/conversation_loop.rs https://github.com/openai/codex/blob/main/codex-rs/mcp-server/src/exec_approval.rs https://github.com/openai/codex/blob/main/codex-rs/mcp-server/src/patch_approval.rs I am copy/pasting for now because I am trying to consolidate around the new `wire_format.rs`, so I plan to delete these other files soon. Now that we have requests going both from client-to-server and server-to-client, I renamed `CodexRequest` to `ClientRequest`. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2286). * #2287 * __->__ #2286 * #2285	2025-08-13 23:00:50 -07:00
Michael Bolin	085f166707	fix: make all fields of Session private (#2285 ) As `Session` needs a bit of work, it will make things easier to move around if we can start by reducing the extent of its public API. This makes all the fields private, though adds three `pub(crate)` getters. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2285). * #2287 * #2286 * __->__ #2285	2025-08-13 22:53:54 -07:00
Kazuhiro Sera	6d0eb9128e	Use enhancement tag for feature requests (#2282 )	2025-08-14 12:08:35 +09:00
Gabriel Peal	e8ffecd632	Clarify PR/Contribution guidelines and issue templates (#2281 ) Co-authored-by: Dylan <dylan.hurd@openai.com>	2025-08-13 21:56:29 -04:00
pakrym-oai	f1be7978cf	Parse reasoning text content (#2277 ) Sometimes COT is returned as text content instead of `ReasoningText`. We should parse it but not serialize it back on requests. --------- Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>	2025-08-13 18:39:58 -07:00
Michael Bolin	a62510e0ae	fix: verify notifications are sent with the conversationId set (#2278 ) This updates `CodexMessageProcessor` so that each notification it sends for an `EventMsg` from a `CodexConversation` satisfies the following: - The `params` always has an appropriate `conversationId` field. - The `method` now includes the name of the `EventMsg` type rather than using `codex/event` as the `method` for all notifications. (We currently prefix the method name with `codex/event/`, but I think that should go away once we formalize the notification schema in `wire_format.rs`.) As part of this, we update `test_codex_jsonrpc_conversation_flow()` to verify that the `task_finished` notification has made it through the system instead of sleeping for 5s and "hoping" the server finished processing the task. Note we have seen some flakiness in some of our other, similar integration tests, and I expect adding a similar check would help in those cases, as well.	2025-08-13 17:54:12 -07:00
Michael Bolin	e7bad650ff	feat: support traditional JSON-RPC request/response in MCP server (#2264 ) This introduces a new set of request types that our `codex mcp` supports. Note that these do not conform to MCP tool calls so that instead of having to send something like this: ```json { "jsonrpc": "2.0", "method": "tools/call", "id": 42, "params": { "name": "newConversation", "arguments": { "model": "gpt-5", "approvalPolicy": "on-request" } } } ``` we can send something like this: ```json { "jsonrpc": "2.0", "method": "newConversation", "id": 42, "params": { "model": "gpt-5", "approvalPolicy": "on-request" } } ``` Admittedly, this new format is not a valid MCP tool call, but we are OK with that right now. (That is, not everything we might want to request of `codex mcp` is something that is appropriate for an autonomous agent to do.) To start, this introduces four request types: - `newConversation` - `sendUserMessage` - `addConversationListener` - `removeConversationListener` The new `mcp-server/tests/codex_message_processor_flow.rs` shows how these can be used. The types are defined on the `CodexRequest` enum, so we introduce a new `CodexMessageProcessor` that is responsible for dealing with requests from this enum. The top-level `MessageProcessor` has been updated so that when `process_request()` is called, it first checks whether the request conforms to `CodexRequest` and dispatches it to `CodexMessageProcessor` if so. Note that I also decided to use `camelCase` for the on-the-wire format, as that seems to be the convention for MCP. For the moment, the new protocol is defined in `wire_format.rs` within the `mcp-server` crate, but in a subsequent PR, I will probably move it to its own crate to ensure the protocol has minimal dependencies and that we can codegen a schema from it. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2264). 
* #2278 * __->__ #2264	2025-08-13 17:36:29 -07:00
pakrym-oai	de2c6a2ce7	Enable reasoning for codex-prefixed models (#2275 ) ## Summary - enable reasoning for any model slug starting with `codex-` - provide default model info for `codex-` slugs - test that codex models are detected and support reasoning ## Testing - `just fmt` - `just fix` (fails: E0658 `let` expressions in this position are unstable)* - `cargo test --all-features` (fails: E0658 `let` expressions in this position are unstable) ------ https://chatgpt.com/codex/tasks/task_i_689d13f8705483208a6ed21c076868e1	2025-08-13 17:02:50 -07:00
Michael Bolin	3a0656df63	fix: skip `cargo test` for release builds on ordinary CI because it is slow, particularly with --all-features set (#2276 ) I put this PR together because I noticed I have to wait quite a bit longer on my PRs since we added https://github.com/openai/codex/pull/2242 to catch more build issues. I think we should think about reining in our use of crate features, but this should be good enough to speed things up for now.	2025-08-13 16:27:20 -07:00
Jeremy Rose	bb9ce3cb78	tui: standardize tree prefix glyphs to └ (#2274 ) Replace mixed `⎿` and `L` prefixes with `└` in TUI rendering. <img width="454" height="659" alt="Screenshot 2025-08-13 at 4 02 03 PM" src="https://github.com/user-attachments/assets/61c9c7da-830b-4040-bb79-a91be90870ca" />	2025-08-13 19:14:03 -04:00
aibrahim-oai	cbf972007a	use modifier dim instead of gray and .dim (#2273 ) gray color doesn't work very well with white terminals. `.dim` doesn't have an effect for some reason. after: <img width="1080" height="149" alt="image" src="https://github.com/user-attachments/assets/26c0f8bb-550d-4d71-bd06-11b3189bc1d7" /> Before <img width="1077" height="186" alt="image" src="https://github.com/user-attachments/assets/b1fba0c7-bc4d-4da1-9754-6c0a105e8cd1" />	2025-08-13 22:50:50 +00:00
pakrym-oai	41eb59a07d	Wait for requested delay in rate limit errors (#2266 ) Fixes: https://github.com/openai/codex/issues/2131 Response doesn't have the delay in a separate field (yet) so parse the message.	2025-08-13 15:43:54 -07:00
Michael Bolin	37fc4185ef	fix: update `OutgoingMessageSender::send_response()` to take `Serialize` (#2263 ) This makes `send_response()` easier to work with.	2025-08-13 14:29:13 -07:00
aibrahim-oai	d4533a0bb3	TUI: change the diff preview to have color fg not bg (#2270 ) <img width="328" height="95" alt="image" src="https://github.com/user-attachments/assets/70e1e6c2-a88f-4058-8763-85c3a02eedb4" />	2025-08-13 14:21:24 -07:00
Dylan	99a242ef41	[codex-cli] Add ripgrep as a dependency for node environment (#2237 ) ## Summary Ripgrep is our preferred tool for file search. When users install via `brew install codex`, it's automatically installed as a dependency. We want to ensure that users running via an npm install also have this tool! Microsoft has already solved this problem for VS Code - let's not reinvent the wheel. This approach of appending to the PATH directly might be a bit heavy-handed, but feels reasonably robust to a variety of environment concerns. Open to thoughts on better approaches here! ## Testing - [x] confirmed this import approach works with `node -e "const { rgPath } = require('@vscode/ripgrep'); require('child_process').spawn(rgPath, ['--version'], { stdio: 'inherit' })"` - [x] Ran codex.js locally with `rg` uninstalled, asked it to run `which rg`. Output below: ``` ⚡ Ran command which rg; echo $? ⎿ /Users/dylan.hurd/code/dh--npm-rg/node_modules/@vscode/ripgrep/bin/rg 0 codex Re-running to confirm the path and exit code. - Path: `/Users/dylan.hurd/code/dh--npm-rg/node_modules/@vscode/ripgrep/bin/rg` - Exit code: `0` ```	2025-08-13 13:49:27 -07:00
Michael Bolin	08ed618f72	chore: introduce ConversationManager as a clearinghouse for all conversations (#2240 ) This PR does two things because after I got deep into the first one I started pulling on the thread to the second: - Makes `ConversationManager` the place where all in-memory conversations are created and stored. Previously, `MessageProcessor` in the `codex-mcp-server` crate was doing this via its `session_map`, but this is something that should be done in `codex-core`. - It unwinds the `ctrl_c: tokio::sync::Notify` that was threaded throughout our code. I think this made sense at one time, but now that we handle Ctrl-C within the TUI and have a proper `Op::Interrupt` event, I don't think this was quite right, so I removed it. For `codex exec` and `codex proto`, we now use `tokio::signal::ctrl_c()` directly, but we no longer make `Notify` a field of `Codex` or `CodexConversation`. Changes of note: - Adds the files `conversation_manager.rs` and `codex_conversation.rs` to `codex-core`. - `Codex` and `CodexSpawnOk` are no longer exported from `codex-core`: other crates must use `CodexConversation` instead (which is created via `ConversationManager`). - `core/src/codex_wrapper.rs` has been deleted in favor of `ConversationManager`. - `ConversationManager::new_conversation()` returns `NewConversation`, which is in line with the `new_conversation` tool we want to add to the MCP server. Note `NewConversation` includes `SessionConfiguredEvent`, so we eliminate checks in cases like `codex-rs/core/tests/client.rs` to verify `SessionConfiguredEvent` is the first event because that is now internal to `ConversationManager`. - Quite a bit of code was deleted from `codex-rs/mcp-server/src/message_processor.rs` since it no longer has to manage multiple conversations itself: it goes through `ConversationManager` instead. 
- `core/tests/live_agent.rs` has been deleted because I had to update a bunch of tests and all the tests in here were ignored, and I don't think anyone ever ran them, so this was just technical debt, at this point. - Removed `notify_on_sigint()` from `util.rs` (and in a follow-up, I hope to refactor the blandly-named `util.rs` into more descriptive files). - In general, I started replacing local variables named `codex` with `conversation`, where appropriate, though admittedly I didn't do it through all the integration tests because that would have added a lot of noise to this PR. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2240). * #2264 * #2263 * __->__ #2240	2025-08-13 13:38:18 -07:00
ae	30ee24521b	fix: remove behavioral prompting from update_plan tool def (#2261 ) - Moved some of the content to the main prompt.	2025-08-13 19:05:13 +00:00


825 Commits
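
Note on #2340: the log names `MutexExt::lock_unchecked()` but does not show its body. The following is a minimal sketch of what such an extension trait could look like, assuming it simply gives the `unwrap()` on a poisoned `std::sync::Mutex` one audited home. The trait body and the `main` function are hypothetical illustrations, not the code from the commit.

```rust
use std::sync::{Mutex, MutexGuard};

/// Hypothetical extension trait; only the name `lock_unchecked` comes from the log.
pub trait MutexExt<T> {
    /// Locks the mutex and panics if it is poisoned.
    fn lock_unchecked(&self) -> MutexGuard<'_, T>;
}

impl<T> MutexExt<T> for Mutex<T> {
    // Assumption: this is the one place where the unwrap lint is silenced,
    // so any other `unwrap()` in the crate is still flagged by clippy.
    #[allow(clippy::unwrap_used)]
    fn lock_unchecked(&self) -> MutexGuard<'_, T> {
        self.lock().unwrap()
    }
}

fn main() {
    let counter = Mutex::new(0_u32);
    // The guard is a temporary here, so the lock is released at the end of the statement.
    *counter.lock_unchecked() += 1;
    println!("counter = {}", *counter.lock_unchecked());
}
```

With a helper like this, call sites read `state.lock_unchecked()` instead of `state.lock().unwrap()`, and a blanket lint exemption for the file is no longer needed.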