valknar/llmx - llmx - dev.pivoine.art

Author	SHA1	Message	Date
Michael Bolin	970e466ab3	fix: switch to unbounded channel (#2874 ) #2747 encouraged me to audit our codebase for similar issues, as now I am particularly suspicious that our flaky tests are due to a racy deadlock. I asked Codex to audit our code, and one of its suggestions was this: > High-Risk Patterns > > All `send_` methods await on a bounded `mpsc::Sender<OutgoingMessage>`. If the writer blocks, the channel fills and the processor task blocks on send, stops draining incoming requests, and stdin reader eventually blocks on its send. This creates a backpressure deadlock cycle across the three tasks. > > Recommendations* > * Server outgoing path: break the backpressure cycle > * Option A (minimal risk): Change `OutgoingMessageSender` to use an unbounded channel to decouple producer from stdout. Add rate logging so floods are visible. > * Option B (bounded + drop policy): Change `send_` to try_send and drop messages (or coalesce) when the queue is full, logging a warning. This prevents processor stalls at the cost of losing messages under extreme backpressure. > Option C (two-stage buffer): Keep bounded channel, but have a dedicated “egress” task that drains an unbounded internal queue, writing to stdout with retries and a shutdown timeout. This centralizes backpressure policy. So this PR is Option A. Indeed, we previously used a bounded channel with a capacity of `128`, but as we discovered recently with #2776, there are certainly cases where we can get flooded with events. That said, `test_shell_command_approval_triggers_elicitation` just failed on one build when I put up this PR, so clearly we are not out of the woods yet... Update: I think I found the true source of the deadlock! See https://github.com/openai/codex/pull/2876	2025-08-28 22:20:10 -07:00
Michael Bolin	5d2d3002ef	fix: specify --profile to `cargo clippy` in CI (#2871 ) Today we had a breakage in the release build that went unnoticed by CI. Here is what happened: - https://github.com/openai/codex/pull/2242 originally added some logic to do release builds to prevent this from happening - https://github.com/openai/codex/pull/2276 undid that change to try to speed things up by removing the step to build all the individual crates in release mode, assuming the `cargo check` call was sufficient coverage, which it would have been, had it specified `--profile` This PR adds `--profile` to the `cargo check` step so we should get the desired coverage from our build matrix. Indeed, enabling this in our CI uncovered a warning that is only present in release mode that was going unnoticed.	2025-08-28 21:43:40 -07:00
dedrisian-oai	3f8184034f	Fix CI release build (#2864 )	2025-08-29 03:06:10 +00:00
unship	f7cb2f87a0	Bug fix: clone of incoming_tx can lead to deadlock (#2747 ) POC code ```rust use tokio::sync::mpsc; use std::time::Duration; #[tokio::main] async fn main() { println!("=== Test 1: Simulating original MCP server pattern ==="); test_original_pattern().await; } async fn test_original_pattern() { println!("Testing the original pattern from MCP server..."); // Create channel - this simulates the original incoming_tx/incoming_rx let (tx, mut rx) = mpsc::channel::<String>(10); // Task 1: Simulates stdin reader that will naturally terminate let stdin_task = tokio::spawn({ let tx_clone = tx.clone(); async move { println!(" stdin_task: Started, will send 3 messages then exit"); for i in 0..3 { let msg = format!("Message {}", i); if tx_clone.send(msg.clone()).await.is_err() { println!(" stdin_task: Receiver dropped, exiting"); break; } println!(" stdin_task: Sent {}", msg); tokio::time::sleep(Duration::from_millis(300)).await; } println!(" stdin_task: Finished (simulating EOF)"); // tx_clone is dropped here } }); // Task 2: Simulates message processor let processor_task = tokio::spawn(async move { println!(" processor_task: Started, waiting for messages"); while let Some(msg) = rx.recv().await { println!(" processor_task: Processing {}", msg); tokio::time::sleep(Duration::from_millis(100)).await; } println!(" processor_task: Finished (channel closed)"); }); // Task 3: Simulates stdout writer or other background task let background_task = tokio::spawn(async move { for i in 0..2 { tokio::time::sleep(Duration::from_millis(500)).await; println!(" background_task: Tick {}", i); } println!(" background_task: Finished"); }); println!(" main: Original tx is still alive here"); println!(" main: About to call tokio::join! - will this deadlock?"); // This is the pattern from the original code let _ = tokio::join!(stdin_task, processor_task, background_task); } ``` --------- Co-authored-by: Michael Bolin <bolinfest@gmail.com>	2025-08-28 19:28:17 -07:00
Ahmed Ibrahim	9dbe7284d2	Following up on #2371 post commit feedback (#2852 ) - Introduce websearch end to complement the begin - Moves the logic of adding the websearch tool to create_tools_json_for_responses_api - Making it the client responsibility to toggle the tool on or off - Other misc in #2371 post commit feedback - Show the query: <img width="1392" height="151" alt="image" src="https://github.com/user-attachments/assets/8457f1a6-f851-44cf-bcca-0d4fe460ce89" />	2025-08-28 19:24:38 -07:00
dedrisian-oai	b8e8454b3f	Custom /prompts (#2696 ) Adds custom `/prompts` to `~/.codex/prompts/<command>.md`. <img width="239" height="107" alt="Screenshot 2025-08-25 at 6 22 42 PM" src="https://github.com/user-attachments/assets/fe6ebbaa-1bf6-49d3-95f9-fdc53b752679" /> --- Details: 1. Adds `Op::ListCustomPrompts` to core. 2. Returns `ListCustomPromptsResponse` with list of `CustomPrompt` (name, content). 3. TUI calls the operation on load, and populates the custom prompts (excluding prompts that collide with builtins). 4. Selecting the custom prompt automatically sends the prompt to the agent.	2025-08-29 02:16:39 +00:00
HaxagonusD	bbcfd63aba	UI: Make slash commands bold in welcome message (#2762 ) ## What Make slash commands (/init, /status, /approvals, /model) bold and white in the welcome message for better visibility. <img width="990" height="286" alt="image" src="https://github.com/user-attachments/assets/13f90e96-b84a-4659-aab4-576d84a31af7" /> ## Why The current welcome message displays all text in a dimmed style, making the slash commands less prominent. Users need to quickly identify available commands when starting Codex. ## How Modified `tui/src/history_cell.rs` in the `new_session_info` function to: - Split each command line into separate spans - Apply bold white styling to command text (`/init`, `/status`, etc.) - Keep descriptions dimmed for visual contrast - Maintain existing layout and spacing ## Test plan - [ ] Run the TUI and verify commands appear bold in the welcome message - [ ] Ensure descriptions remain dimmed for readability - [ ] Confirm all existing tests pass	2025-08-28 18:12:41 -07:00
Eric Traut	6209d49520	Changed OAuth success screen to use the string "Codex" rather than "Codex CLI" (#2737 )	2025-08-28 21:21:10 +00:00
Ahmed Ibrahim	c9ca63dc1e	burst paste edge cases (#2683 ) This PR fixes two edge cases in managing burst paste (mainly on PowerShell). Bugs: - Needs an event key after paste to render the pasted items > ChatComposer::flush_paste_burst_if_due() flushes on timeout. Called: > - Pre-render in App on TuiEvent::Draw. > - Via a delayed frame > BottomPane::request_redraw_in(ChatComposer::recommended_paste_flush_delay()). - Parses two key events separately before starting parsing burst paste > When threshold is crossed, pull preceding burst chars out of the textarea and prepend to paste_burst_buffer, then keep buffering. - Integrates with #2567 to bring image pasting to Windows.	2025-08-28 12:54:12 -07:00
Ahmed Ibrahim	ed06f90fb3	Race condition in compact (#2746 ) This fixes the flakiness in `summarize_context_three_requests_and_instructions` because we should trim history before sending task complete.	2025-08-28 12:53:00 -07:00
Michael Bolin	f09170b574	chore: print stderr from MCP server to test output using eprintln! (#2849 ) Related to https://github.com/openai/codex/pull/2848, I don't see the stderr from `codex mcp` colocated with the other stderr from `test_shell_command_approval_triggers_elicitation()` when it fails even though we have `RUST_LOG=debug` set when we spawn `codex mcp`: `1e9e703b96/codex-rs/mcp-server/tests/common/mcp_process.rs (L65)` Let's try this new logic which should be more explicit.	2025-08-28 12:43:13 -07:00
Michael Bolin	1e9e703b96	chore: try to make it easier to debug the flakiness of test_shell_command_approval_triggers_elicitation (#2848 ) `test_shell_command_approval_triggers_elicitation()` is one of a number of integration tests that we have observed to be flaky on GitHub CI, so this PR tries to reduce the flakiness _and_ to provide us with more information when it flakes. Specifically: - Changed the command that we use to trigger the elicitation from `git init` to `python3 -c 'import pathlib; pathlib.Path(r"{}").touch()'` because running `git` seems more likely to invite variance. - Increased the timeout to wait for the task response from 10s to 20s. - Added more logging.	2025-08-28 12:33:33 -07:00
Michael Bolin	74d2741729	chore: require uninlined_format_args from clippy (#2845 ) - added `uninlined_format_args` to `[workspace.lints.clippy]` in the `Cargo.toml` for the workspace - ran `cargo clippy --tests --fix` - ran `just fmt`	2025-08-28 11:25:23 -07:00
Jeremy Rose	e5611aab07	disallow some slash commands while a task is running (#2792 ) /new, /init, /models, /approvals, etc. don't work correctly during a turn. disable them.	2025-08-28 10:15:59 -07:00
dedrisian-oai	4e9ad23864	Add "View Image" tool (#2723 ) Adds a "View Image" tool so Codex can find and see images by itself: <img width="1772" height="420" alt="Screenshot 2025-08-26 at 10 40 04 AM" src="https://github.com/user-attachments/assets/7a459c7b-0b86-4125-82d9-05fbb35ade03" />	2025-08-27 17:41:23 -07:00
Jeremy Rose	3e309805ae	fix cursor after suspend (#2690 ) This was supposed to be fixed by #2569, but I think the actual fix got lost in the refactoring. Intended behavior: pressing ^Z moves the cursor below the viewport before suspending.	2025-08-27 14:17:10 -07:00
Jeremy Rose	488a40211a	fix (most) doubled lines and hanging list markers (#2789 ) This was mostly written by codex under heavy guidance via test cases drawn from logged session data and fuzzing. It also uncovered some bugs in tui_markdown, which will in some cases split a list marker from the list item content. We're not addressing those bugs for now.	2025-08-27 13:55:59 -07:00
Reuben Narad	6e4c9d5243	Added back codex-rs/config.md to link to new location (#2778 ) Quick fix: point old config.md to new location	2025-08-27 18:37:41 +00:00
Reuben Narad	459363e17b	README / docs refactor (#2724 ) This PR cleans up the monolithic README by breaking it into a set of navigable pages under docs/ (install, getting started, configuration, authentication, sandboxing and approvals, platform details, FAQ, ZDR, contributing, license). The top‑level README is now more concise and intuitive (with corrected screenshots). It also consolidates overlapping content from codex-rs/README.md into the top‑level docs and updates links accordingly. The codex-rs README remains in place for now as a pointer and for continuity. Finally, added an extensive config reference table at the bottom of docs/config.md. --------- Co-authored-by: easong-openai <easong@openai.com>	2025-08-27 10:30:39 -07:00
Michael Bolin	ffe585387b	fix: for now, limit the number of deltas sent back to the UI (#2776 ) This is a stopgap solution, but today, we are seeing the client get flooded with events. Since we already truncate the output we send to the model, it feels reasonable to limit how many deltas we send to the client.	2025-08-27 10:23:25 -07:00
Dylan	0cec0770e2	[mcp-server] Add GetConfig endpoint (#2725 ) ## Summary Adds a GetConfig request to the MCP Protocol, so MCP clients can evaluate the resolved config.toml settings which the harness is using. ## Testing - [x] Added an end to end test of the endpoint	2025-08-27 09:59:03 -07:00
Ahmed Ibrahim	2d2f66f9c5	Bug fix: deduplicate assistant messages (#2758 ) We are treating assistant messages in a different way than other messages which resulted in a duplicated history. See #2698	2025-08-27 01:29:16 -07:00
Ahmed Ibrahim	d0e06f74e2	send context window with task started (#2752 ) - Send context window with task started - Accounting for changing the model per turn	2025-08-27 00:04:21 -07:00
Gabriel Peal	4b6c6ce98f	Make git_diff_against_sha more robust (#2749 ) 1. Ignore custom git diff drivers users may have set 2. Allow diffing against filenames that start with a dash	2025-08-27 01:53:00 -04:00
easong-openai	5df04c8a13	Cache transcript wraps (#2739 ) Previously long transcripts would become unusable.	2025-08-26 22:20:09 -07:00
ae	3d8bca7814	feat: decrease testing when running interactively (#2707 )	2025-08-26 19:57:04 -07:00
Ahmed Ibrahim	3eb11c10d0	Don't send Exec deltas on apply patch (#2742 ) We are now sending exec deltas on apply patch which doesn't make sense.	2025-08-26 19:16:51 -07:00
mattsu	bd65c4db87	Fix crash when backspacing placeholders adjacent to multibyte text (#2674 ) Prevented panics when deleting placeholders near multibyte characters by clamping the cursor to a valid boundary and using get-based slicing Added a regression test to ensure backspacing after multibyte text leaves placeholders intact without crashing --------- Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>	2025-08-26 18:31:49 -07:00
Jeremy Rose	b367790d9b	fix emoji spacing (#2735 ) before: <img width="295" height="266" alt="Screenshot 2025-08-26 at 5 05 03 PM" src="https://github.com/user-attachments/assets/3e876f08-26d0-407e-a995-28fd072e288f" /> after: <img width="295" height="129" alt="Screenshot 2025-08-26 at 5 05 30 PM" src="https://github.com/user-attachments/assets/2a019d52-19ed-40ef-8155-4f02c400796a" />	2025-08-26 17:34:24 -07:00
Jeremy Rose	435154ce93	fix transcript lines being added to diff view (#2721 ) This fixes a bug where if you ran /diff while a turn was running, transcript lines would be added to the end of the diff view. Also, refactor to make this kind of issue less likely in future.	2025-08-27 00:03:11 +00:00
vinaybantupalli	fb3f6456cf	fix issue #2713 : adding support for alt+ctrl+h to delete backward word (#2717 ) This pr addresses the fix for https://github.com/openai/codex/issues/2713 ### Changes: - Added key handler for `Alt+Ctrl+H` → `delete_backward_word()` - Added test coverage in `delete_backward_word_alt_keys()` that verifies both: - Standard `Alt+Backspace` binding continues to work - New `Alt+Ctrl+H` binding works correctly for backward word deletion ### Testing: The test ensures both key combinations produce identical behavior: - Delete the previous word from "hello world" → "hello " - Cursor positioned correctly after deletion ### Backward Compatibility: This change is backward compatible - existing `Alt+Backspace` functionality remains unchanged while adding support for the terminal-specific `Alt+Ctrl+H` variant	2025-08-26 16:37:46 -07:00
Jeremy Rose	f2603a4e50	Esc while there are queued messages drops the messages back into the composer (#2687 ) https://github.com/user-attachments/assets/bbb427c4-cdc7-4997-a4ef-8156e8170742	2025-08-26 16:26:50 -07:00
Jeremy Rose	eb161116f0	tui: render keyboard icon with emoji variation selector (⌨️) (#2728 ) Use emoji variation selector (VS16) for the keyboard icon so it consistently renders as emoji (⌨️) rather than text (⌨) across terminals. Touches TUI command rendering for unknown parsed commands. No behavior change beyond display.	2025-08-26 16:11:21 -07:00
Wang	c229a67312	feat(core): Add `remove_conversation` to `ConversationManager` for ma… (#2613 ) ### What this PR does This PR introduces a new public method, remove_conversation(conversation_id: Uuid), to the ConversationManager. This allows consumers of the codex-core library to manually remove a conversation from the manager's in-memory storage. ### Why this change is needed I am currently adapting the Codex client to run as a long-lived server application. In this server environment, ConversationManager instances persist for extended periods, and new conversations are created for each incoming user request. The current implementation of ConversationManager stores all created conversations in a HashMap indefinitely, with no mechanism for removal. This leads to unbounded memory growth in a server context, as every new conversation permanently occupies memory. While an automatic TTL-based cleanup mechanism could be one solution, a simpler, more direct remove_conversation method provides the necessary control for my use case. It allows my server application to explicitly manage the lifecycle of conversations, such as cleaning them up after a request is fully processed or after a period of inactivity is detected at the application level. This change provides a minimal, non-intrusive way to address the memory management issue for server-like applications built on top of codex-core, giving developers the flexibility to implement their own cleanup logic. Signed-off-by: M4n5ter <m4n5terrr@gmail.com> Co-authored-by: Michael Bolin <mbolin@openai.com>	2025-08-26 15:16:43 -07:00
Jeremy Rose	db98d2ce25	enable alternate scroll in transcript mode (#2686 ) this allows the mouse wheel to scroll the transcript / diff views.	2025-08-26 11:47:00 -07:00
ae	274d9b413f	[feat] Simplify command approval UI (#2708 ) - Removed the plain "No" option, which confused the model, since we already have the "No, provide feedback" option, which works better. # Before <img width="476" height="168" alt="image" src="https://github.com/user-attachments/assets/6e783d9f-dec9-4610-9cad-8442eb377a90" /> # After <img width="553" height="175" alt="image" src="https://github.com/user-attachments/assets/3cdae582-3366-47bc-9753-288930df2324" />	2025-08-26 10:08:06 -07:00
Eric Traut	d32e4f25cf	Added caps on retry config settings (#2701 ) The CLI supports config settings `stream_max_retries` and `request_max_retries` that allow users to override the default retry counts (4 and 5, respectively). However, there's currently no cap placed on these values. In theory, a user could configure an effectively infinite retry count which could hammer the server. This PR adds a reasonable cap (currently 100) to both of these values.	2025-08-25 22:51:01 -07:00
ae	a4d34235bc	[fix] emoji padding (#2702 ) - We use emojis as bullet icons of sorts, and in some common terminals like Terminal or iTerm, these can render with insufficient padding between the emoji and following text. - This PR makes emoji look better in Terminal and iTerm, at the expense of Ghostty. (All default fonts.) # Terminal <img width="420" height="123" alt="image" src="https://github.com/user-attachments/assets/93590703-e35a-4781-a697-881d7ec95598" /> # iTerm <img width="465" height="163" alt="image" src="https://github.com/user-attachments/assets/f11e6558-d2db-4727-bb7e-2b61eed0a3b1" /> # Ghostty <img width="485" height="142" alt="image" src="https://github.com/user-attachments/assets/7a7b021f-5238-4672-8066-16cd1da32dc6" />	2025-08-25 22:49:19 -07:00
ae	d085f73a2a	[feat] reduce bottom padding to 1 line (#2704 )	2025-08-25 22:47:26 -07:00
Eric Traut	ab9250e714	Improved user message for rate-limit errors (#2695 ) This PR improves the error message presented to the user when logged in with ChatGPT and a rate-limit error occurs. In particular, it provides the user with information about when the rate limit will be reset. It removes older code that attempted to do the same but relied on parsing of error messages that are not generated by the ChatGPT endpoint. The new code uses newly-added error fields.	2025-08-25 21:42:10 -07:00
Jeremy Rose	e5283b6126	single control flow for both Esc and Ctrl+C (#2691 ) Esc and Ctrl+C while a task is running should do the same thing. There were some cases where pressing Esc would leave a "stuck" widget in the history; this fixes that and cleans up the logic so there's just one path for interrupting the task. Also clean up some subtly mishandled key events (e.g. Ctrl+D would quit the app while an approval modal was showing if the textarea was empty). --------- Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>	2025-08-25 20:15:38 -07:00
Eric Traut	d63e44ae29	Fixed a bug that causes token refresh to not work in a seamless manner (#2699 ) This PR fixes a bug in the token refresh logic. Token refresh is performed in a retry loop so if we receive a 401 error, we refresh the token, then we go around the loop again and reissue the fetch with a fresh token. The bug is that we're not using the updated token on the second and subsequent times through the loop. The result is that we'll try to refresh the token a few more times until we hit the retry limit (default of 4). The 401 error is then passed back up to the caller. Subsequent calls will use the refreshed token, so the problem clears itself up. The fix is straightforward — make sure we use the updated auth information each time through the retry loop.	2025-08-25 19:18:16 -07:00
Jeremy Rose	17e5077507	do not show timeouts as "sandbox error"s (#2587 ) 🙅🫸 ``` ✗ Failed (exit -1) └ 🧪 cargo test --all-features -q sandbox error: command timed out ``` 😌👉 ``` ✗ Failed (exit -1) └ 🧪 cargo test --all-features -q error: command timed out ```	2025-08-25 17:52:23 -07:00
Jeremy Rose	b1079187e4	queued messages rendered italic (#2693 ) <img width="416" height="215" alt="Screenshot 2025-08-25 at 5 29 53 PM" src="https://github.com/user-attachments/assets/0f4178c9-6997-4e7a-bb30-0817b98d9748" />	2025-08-26 00:36:05 +00:00
Jeremy Rose	ae8f772ef2	do not schedule frames for Tui::Draw events in backtrack (#2692 ) this was causing continuous rerendering when a transcript overlay was present	2025-08-26 00:29:24 +00:00
dedrisian-oai	468a8b4c38	Copying / Dragging image files (MacOS Terminal + iTerm) (#2567 ) In this PR: - [x] Add support for dragging / copying image files into chat. - [x] Don't remove image placeholders when submitting. - [x] Add tests. Works for: - Image Files - Dragging MacOS Screenshots (Terminal, iTerm) Todos: - [ ] In some terminals (VSCode, Windows PowerShell, and remote SSH-ing), copy-pasting a file streams the escaped filepath as individual key events rather than a single Paste event. We'll need to have a function (in a separate PR) for detecting these paste events.	2025-08-25 16:39:42 -07:00
Gabriel Peal	cb32f9c64e	Add auth to send_user_turn (#2688 ) It is there for send_user_message but was omitted from send_user_turn. Presumably this was a mistake	2025-08-25 18:57:20 -04:00
Ahmed Ibrahim	907afc9425	Fix esc (#2661 ) Esc should have other functionalities when it's not used in a backtracking situation. i.e. to cancel pop up menu when selecting model/approvals or to interrupt an active turn.	2025-08-25 15:38:46 -07:00
Dylan	7f7d1e30f3	[exec] Clean up apply-patch tests (#2648 ) ## Summary These tests were getting a bit unwieldy, and they're starting to become load-bearing. Let's clean them up, and get them working solidly so we can easily expand this harness with new tests. ## Test Plan - [x] Tests continue to pass	2025-08-25 15:08:01 -07:00
Michael Bolin	568d6f819f	fix: use backslash as path separator on Windows (#2684 ) I noticed that when running `/status` on Windows, I saw something like: ``` Path: ~/src\codex ``` so now it should be: ``` Path: ~\src\codex ``` Admittedly, `~` is understood by PowerShell but not on Windows, in general, but it's much less verbose than `%USERPROFILE%`.	2025-08-25 14:47:17 -07:00
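
Commit 970e466ab3 in the log above replaces a bounded channel of capacity `128` with an unbounded one, so that a stalled writer can no longer block the producer. The difference can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard library's `std::sync::mpsc` rather than the async `mpsc` channel the commit message refers to, and the helper names `bounded_accepts` and `unbounded_accepts` are hypothetical, not taken from the codebase.

```rust
use std::sync::mpsc::{channel, sync_channel, TrySendError};

/// Hypothetical helper: counts how many messages a producer can enqueue on a
/// bounded channel while nothing drains the receiver.
fn bounded_accepts(capacity: usize, attempts: usize) -> usize {
    // `_rx` stays alive for the whole function but is never read,
    // which stands in for a stalled writer task.
    let (tx, _rx) = sync_channel::<usize>(capacity);
    let mut accepted = 0;
    for i in 0..attempts {
        match tx.try_send(i) {
            Ok(()) => accepted += 1,
            // The queue is full: a blocking `send` would stall at this point
            // until the reader drained something.
            Err(TrySendError::Full(_)) => break,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    accepted
}

/// Hypothetical helper: the same producer on an unbounded channel never
/// waits; messages simply accumulate until the receiver reads them.
fn unbounded_accepts(attempts: usize) -> usize {
    let (tx, rx) = channel::<usize>();
    for i in 0..attempts {
        tx.send(i).expect("receiver is still alive");
    }
    // Dropping the only sender closes the channel so the iterator terminates.
    drop(tx);
    rx.iter().count()
}

fn main() {
    println!("bounded(128) accepted {} of 200", bounded_accepts(128, 200));
    println!("unbounded accepted {} of 200", unbounded_accepts(200));
}
```

The trade-off is the one the commit message names: the unbounded queue removes the backpressure cycle, but a flood of events now grows memory instead of stalling the sender.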

656 Commits