## Summary
We've been seeing a number of issues and reports with our synthetic
`apply_patch` tool, e.g. #802. Let's make this a real tool - in my
anecdotal testing, it's critical for GPT-OSS models, but I'd like to
make it the standard across GPT-5 and codex models as well.
## Testing
- [x] Tested locally
- [x] Integration test
Add env var to show the raw, unparsed command line under parsed
commands. When we have transcript mode we should show the full command
there, but this is useful for debugging.
This PR:
* Added the clippy.toml to configure allowable expect / unwrap usage in
tests
* Removed as many expect/allow lines as possible from tests
* moved a bunch of allows to expects where possible
Note: in integration tests, non `#[test]` helper functions are not
covered by this so we had to leave a few lingering `expect(expect_used`
checks around
we have a very unclear lifecycle for the chatwidget—this should only
have to be added in one place! but this fixes the "hanging commands"
issue where the active_exec_cell wasn't correctly cleared when commands
finished.
To repro w/o this PR:
1. prompt "run sleep 10"
2. once the command starts running, press <kbd>Esc</kbd>
3. prompt "run echo hi"
Expected:
```
✓ Completed
└ ⌨️ echo hi
codex
hi
```
Actual:
```
⚙︎ Working
└ ⌨️ echo hi
▌ Ask Codex to do anything
```
i.e. the "Working" never changes to "Completed".
The bug is fixed with this PR.
The "display format" of commands was sometimes producing incorrect
quoting like `echo foo '>' bar`, which is importantly different from the
actual command that was being run. This refactors ParsedCommand to have
a string in `cmd` instead of a vec, as a `vec` can't accurately capture
a full command.
instead of each shimmer needing to have its own animation thread, have
render_ref schedule a new frame if it wants one and coalesce to the
earliest next frame. this also makes the animations
frame-timing-independent, based on start time instead of frame count.
This improves handling of pasted content in the textarea. It's no longer
possible to partially delete a placeholder (e.g. by ^W or ^D), nor is it
possible to place the cursor inside a placeholder. Also, we now render
placeholders in a different color to make them more clearly
differentiated.
https://github.com/user-attachments/assets/2051b3c3-963d-4781-a610-3afee522ae29
refactors HistoryCell to be a trait instead of an enum. Also collapse
the many "degenerate" HistoryCell enums which were just a store of lines
into a single PlainHistoryCell type.
The goal here is to allow more ways of rendering history cells (e.g.
expanded/collapsed/"live"), and I expect we will return to more varied
types of HistoryCell as we develop this area.
Sometimes COT is returns as text content instead of `ReasoningText`. We
should parse it but not serialize back on requests.
---------
Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
This PR does two things because after I got deep into the first one I
started pulling on the thread to the second:
- Makes `ConversationManager` the place where all in-memory
conversations are created and stored. Previously, `MessageProcessor` in
the `codex-mcp-server` crate was doing this via its `session_map`, but
this is something that should be done in `codex-core`.
- It unwinds the `ctrl_c: tokio::sync::Notify` that was threaded
throughout our code. I think this made sense at one time, but now that
we handle Ctrl-C within the TUI and have a proper `Op::Interrupt` event,
I don't think this was quite right, so I removed it. For `codex exec`
and `codex proto`, we now use `tokio::signal::ctrl_c()` directly, but we
no longer make `Notify` a field of `Codex` or `CodexConversation`.
Changes of note:
- Adds the files `conversation_manager.rs` and `codex_conversation.rs`
to `codex-core`.
- `Codex` and `CodexSpawnOk` are no longer exported from `codex-core`:
other crates must use `CodexConversation` instead (which is created via
`ConversationManager`).
- `core/src/codex_wrapper.rs` has been deleted in favor of
`ConversationManager`.
- `ConversationManager::new_conversation()` returns `NewConversation`,
which is in line with the `new_conversation` tool we want to add to the
MCP server. Note `NewConversation` includes `SessionConfiguredEvent`, so
we eliminate checks in cases like `codex-rs/core/tests/client.rs` to
verify `SessionConfiguredEvent` is the first event because that is now
internal to `ConversationManager`.
- Quite a bit of code was deleted from
`codex-rs/mcp-server/src/message_processor.rs` since it no longer has to
manage multiple conversations itself: it goes through
`ConversationManager` instead.
- `core/tests/live_agent.rs` has been deleted because I had to update a
bunch of tests and all the tests in here were ignored, and I don't think
anyone ever ran them, so this was just technical debt, at this point.
- Removed `notify_on_sigint()` from `util.rs` (and in a follow-up, I
hope to refactor the blandly-named `util.rs` into more descriptive
files).
- In general, I started replacing local variables named `codex` as
`conversation`, where appropriate, though admittedly I didn't do it
through all the integration tests because that would have added a lot of
noise to this PR.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2240).
* #2264
* #2263
* __->__ #2240
## Summary
- support Ctrl-b and Ctrl-f to move the cursor left and right in the
chat composer text area
- test Ctrl-b/Ctrl-f cursor movements
## Testing
- `just fmt`
- `just fix` *(fails: `let` expressions in this position are unstable)*
- `cargo test --all-features` *(fails: `let` expressions in this
position are unstable)*
------
https://chatgpt.com/codex/tasks/task_i_689cbd1d7968832e876fff169891e486
Wait for newlines, then render markdown on a line by line basis. Word wrap it for the current terminal size and then spit it out line by line into the UI. Also adds tests and fixes some UI regressions.
## Summary
- Display "Update plan" instead of "Update to do" when the plan is
updated in the TUI
## Testing
- `just fmt`
- `just fix` *(fails: E0658 `let` expressions in this position are
unstable)*
- `cargo test --all-features` *(fails: E0658 `let` expressions in this
position are unstable)*
------
https://chatgpt.com/codex/tasks/task_i_6897f78fc5908322be488f02db42a5b9
Moves `use codex_core::protocol::EventMsg` inside the block annotated
with `#[cfg(debug_assertions)]` since that was the only place in the
file that was using it.
This eliminates the `warning: unused import:` when building with `cargo
build --release` in `cargo-rs/tui`.
Note this was not breaking CI because we do not build release builds on
CI since we're impatient :P
Right now, every time an exec ends, we emit it to history which makes it
immutable. In order to be able to update or merge successive tool calls
(which will be useful after https://github.com/openai/codex/pull/2095),
we need to retain it as the active cell.
This also changes the cell to contain the metadata necessary to render
it so it can be updated rather than baking in the final text lines when
the cell is created.
Part 1: https://github.com/openai/codex/pull/2095
Part 3: https://github.com/openai/codex/pull/2110
# Note for reviewers
The bulk of this PR is in in the new file, `parse_command.rs`. This file
is designed to be written TDD and implemented with Codex. Do not worry
about reviewing the code, just review the unit tests (if you want). If
any cases are missing, we'll add more tests and have Codex fix them.
I think the best approach will be to land and iterate. I have some
follow-ups I want to do after this lands. The next PR after this will
let us merge (and dedupe) multiple sequential cells of the same such as
multiple read commands. The deduping will also be important because the
model often reads the same file multiple times in a row in chunks
===
This PR formats common commands like reading, formatting, testing, etc
more nicely:
It tries to extract things like file names, tests and falls back to the
cmd if it doesn't. It also only shows stdout/err if the command failed.
<img width="770" height="238" alt="CleanShot 2025-08-09 at 16 05 15"
src="https://github.com/user-attachments/assets/0ead179a-8910-486b-aa3d-7d26264d751e"
/>
<img width="348" height="158" alt="CleanShot 2025-08-09 at 16 05 32"
src="https://github.com/user-attachments/assets/4302681b-5e87-4ff3-85b4-0252c6c485a9"
/>
<img width="834" height="324" alt="CleanShot 2025-08-09 at 16 05 56 2"
src="https://github.com/user-attachments/assets/09fb3517-7bd6-40f6-a126-4172106b700f"
/>
Part 2: https://github.com/openai/codex/pull/2097
Part 3: https://github.com/openai/codex/pull/2110
This PR updates ChatWidget to ensure that when AgentMessage,
AgentReasoning, or AgentReasoningRawContent events arrive without any
streamed deltas, the final text from the event is rendered before the
stream is finalized. Previously, these handlers ignored the event text
in such cases, relying solely on prior deltas.
<img width="603" height="189" alt="image"
src="https://github.com/user-attachments/assets/868516f2-7963-4603-9af4-adb1b1eda61e"
/>
- I had a recent conversation where the one-liner showed using 11M
tokens! But looking into it 10M were cached. So I looked into it and I
think we had a regression here. ->
- Use blended total tokens for chat composer usage display
- Compute remaining context using tokens_in_context_window helper
------
https://chatgpt.com/codex/tasks/task_i_68981a16c0a4832cbf416017390930e5
## Summary
- allow Esc to interrupt the current session when a task is running
- document Esc as an interrupt key in status indicator
## Testing
- `just fmt`
- `just fix` *(fails: E0658 `let` expressions in this position are
unstable)*
- `cargo test --all-features` *(fails: E0658 `let` expressions in this
position are unstable)*
------
https://chatgpt.com/codex/tasks/task_i_689698cf605883208f57b0317ff6a303
This pull request implements a fix from #2000, as well as fixed an
additional problem with path lengths on windows that prevents the login
from displaying.
---------
Co-authored-by: Michael Bolin <bolinfest@gmail.com>
Co-authored-by: Michael Bolin <mbolin@openai.com>
We wait until we have an entire newline, then format it with markdown and stream in to the UI. This reduces time to first token but is the right thing to do with our current rendering model IMO. Also lets us add word wrapping!