Reported height was `20` instead of `21`, so `area.height >=
MIN_ANIMATION_HEIGHT` was `false` and therefore `show_animation` was
`false`, so the animation never displayed.
Changes:
- skip the welcome animation when the terminal area is below 60x21
- skip the model upgrade animation when the terminal area is below 60x24
to avoid clipping
---------
Co-authored-by: Michael Bolin <mbolin@openai.com>
Summary
- common: use exact equality for Swiftfox exclusion to avoid hiding
future slugs that merely contain the substring
- core: treat missing internal_storage.json as expected (debug), warn
only on real IO/parse errors
- tui: drop DEBUG_HIGH gate; always consider showing rollout prompt, but
suppress under ApiKey auth mode
Adding the ability to resume conversations.
we have one verb `resume`.
Behavior:
`tui`:
`codex resume`: opens session picker
`codex resume --last`: continue last message
`codex resume <session id>`: continue conversation with `session id`
`exec`:
`codex resume --last`: continue last conversation
`codex resume <session id>`: continue conversation with `session id`
Implementation:
- I added a function to find the path in `~/.codex/sessions/` with a
`UUID`. This is helpful in resuming with session id.
- Added the above mentioned flags
- Added lots of testing
No (intended) functional change.
This refactors the transcript view to hold a list of HistoryCells
instead of a list of Lines. This simplifies and makes much of the logic
more robust, as well as laying the groundwork for future changes, e.g.
live-updating history cells in the transcript.
Similar to #2879 in goal. Fixes#2755.
## 📝 Review Mode -- Core
This PR introduces the Core implementation for Review mode:
- New op `Op::Review { prompt: String }:` spawns a child review task
with isolated context, a review‑specific system prompt, and a
`Config.review_model`.
- `EnteredReviewMode`: emitted when the child review session starts.
Every event from this point onwards reflects the review session.
- `ExitedReviewMode(Option<ReviewOutputEvent>)`: emitted when the review
finishes or is interrupted, with optional structured findings:
```json
{
"findings": [
{
"title": "<≤ 80 chars, imperative>",
"body": "<valid Markdown explaining *why* this is a problem; cite files/lines/functions>",
"confidence_score": <float 0.0-1.0>,
"priority": <int 0-3>,
"code_location": {
"absolute_file_path": "<file path>",
"line_range": {"start": <int>, "end": <int>}
}
}
],
"overall_correctness": "patch is correct" | "patch is incorrect",
"overall_explanation": "<1-3 sentence explanation justifying the overall_correctness verdict>",
"overall_confidence_score": <float 0.0-1.0>
}
```
## Questions
### Why separate out its own message history?
We want the review thread to match the training of our review models as
much as possible -- that means using a custom prompt, removing user
instructions, and starting a clean chat history.
We also want to make sure the review thread doesn't leak into the parent
thread.
### Why do this as a mode, vs. sub-agents?
1. We want review to be a synchronous task, so it's fine for now to do a
bespoke implementation.
2. We're still unclear about the final structure for sub-agents. We'd
prefer to land this quickly and then refactor into sub-agents without
rushing that implementation.
`ClientRequest::NewConversation` picks up the reasoning level from the user's defaults in `config.toml`, so it should be reported in `NewConversationResponse`.
Created this PR by:
- adding `redundant_clone` to `[workspace.lints.clippy]` in
`cargo-rs/Cargol.toml`
- running `cargo clippy --tests --fix`
- running `just fmt`
Though I had to clean up one instance of the following that resulted:
```rust
let codex = codex;
```
This PR does the following:
* Adds the ability to paste or type an API key.
* Removes the `preferred_auth_method` config option. The last login
method is always persisted in auth.json, so this isn't needed.
* If OPENAI_API_KEY env variable is defined, the value is used to
prepopulate the new UI. The env variable is otherwise ignored by the
CLI.
* Adds a new MCP server entry point "login_api_key" so we can implement
this same API key behavior for the VS Code extension.
<img width="473" height="140" alt="Screenshot 2025-09-04 at 3 51 04 PM"
src="https://github.com/user-attachments/assets/c11bbd5b-8a4d-4d71-90fd-34130460f9d9"
/>
<img width="726" height="254" alt="Screenshot 2025-09-04 at 3 51 32 PM"
src="https://github.com/user-attachments/assets/6cc76b34-309a-4387-acbc-15ee5c756db9"
/>
This PR changes get history op to get path. Then, forking will use a
path. This will help us have one unified codepath for resuming/forking
conversations. Will also help in having rollout history in order. It
also fixes a bug where you won't see the UI when resuming after forking.
Also, simplify the streaming behavior.
This fixes a number of display issues with streaming markdown, and paves
the way for better markdown features (e.g. customizable styles, syntax
highlighting, markdown-aware wrapping).
Not currently supported:
- footnotes
- tables
- reference-style links
This PR adds an `images` field to the existing `UserMessageEvent` so we
can encode zero or more images associated with a user message. This
allows images to be restored when conversations are restored.
The previous config approach had a few issues:
1. It is part of the config but not designed to be used externally
2. It had to be wired through many places (look at the +/- on this PR
3. It wasn't guaranteed to be set consistently everywhere because we
don't have a super well defined way that configs stack. For example, the
extension would configure during newConversation but anything that
happened outside of that (like login) wouldn't get it.
This env var approach is cleaner and also creates one less thing we have
to deal with when coming up with a better holistic story around configs.
One downside is that I removed the unit test testing for the override
because I don't want to deal with setting the global env or spawning
child processes and figuring out how to introspect their originator
header. The new code is sufficiently simple and I tested it e2e that I
feel as if this is still worth it.