Commit Graph

326 Commits

Author SHA1 Message Date
dependabot[bot]
9ad2e726fc chore(deps): bump thiserror from 2.0.12 to 2.0.16 in /codex-rs (#2667)
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 2.0.12 to
2.0.16.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/dtolnay/thiserror/releases">thiserror's
releases</a>.</em></p>
<blockquote>
<h2>2.0.16</h2>
<ul>
<li>Add to &quot;no-std&quot; crates.io category (<a
href="https://redirect.github.com/dtolnay/thiserror/issues/429">#429</a>)</li>
</ul>
<h2>2.0.15</h2>
<ul>
<li>Prevent <code>Error::provide</code> API becoming unavailable from a
future new compiler lint (<a
href="https://redirect.github.com/dtolnay/thiserror/issues/427">#427</a>)</li>
</ul>
<h2>2.0.14</h2>
<ul>
<li>Allow build-script cleanup failure with NFSv3 output directory to be
non-fatal (<a
href="https://redirect.github.com/dtolnay/thiserror/issues/426">#426</a>)</li>
</ul>
<h2>2.0.13</h2>
<ul>
<li>Documentation improvements</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="40b58536cc"><code>40b5853</code></a>
Release 2.0.16</li>
<li><a
href="83dfb5f99b"><code>83dfb5f</code></a>
Merge pull request <a
href="https://redirect.github.com/dtolnay/thiserror/issues/429">#429</a>
from dtolnay/nostd</li>
<li><a
href="9b4a99fb90"><code>9b4a99f</code></a>
Add to &quot;no-std&quot; crates.io category</li>
<li><a
href="f6145ebe84"><code>f6145eb</code></a>
Release 2.0.15</li>
<li><a
href="2717177976"><code>2717177</code></a>
Merge pull request <a
href="https://redirect.github.com/dtolnay/thiserror/issues/427">#427</a>
from dtolnay/caplints</li>
<li><a
href="2cd13e6767"><code>2cd13e6</code></a>
Make error_generic_member_access compatible with -Dwarnings</li>
<li><a
href="eea6799e2d"><code>eea6799</code></a>
Release 2.0.14</li>
<li><a
href="a2aa6d7a57"><code>a2aa6d7</code></a>
Merge pull request <a
href="https://redirect.github.com/dtolnay/thiserror/issues/426">#426</a>
from dtolnay/enotempty</li>
<li><a
href="f00ebc57be"><code>f00ebc5</code></a>
Allow build-script cleanup failure with NFSv3 output directory to be
non-fatal</li>
<li><a
href="61f28da3df"><code>61f28da</code></a>
Release 2.0.13</li>
<li>Additional commits viewable in <a
href="https://github.com/dtolnay/thiserror/compare/2.0.12...2.0.16">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=thiserror&package-manager=cargo&previous-version=2.0.12&new-version=2.0.16)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-02 23:50:53 -07:00
pchuri
6aa306c584 feat: add stable file locking using std::fs APIs (#2894)
## Summary

This PR implements advisory file locking for the message history using
Rust 1.89+ stabilized std::fs::File locking APIs, eliminating the need
for external dependencies.

## Key Changes

- **Stable API Usage**: Uses std::fs::File::try_lock() and
try_lock_shared() APIs stabilized in Rust 1.89
- **Cross-Platform Compatibility**: 
  - Unix systems use try_lock_shared() for advisory read locks
  - Windows systems use try_lock() due to different lock semantics
- **Retry Logic**: Maintains existing retry behavior for concurrent
access scenarios
- **No External Dependencies**: Removes need for external file locking
crates

## Technical Details

The implementation provides advisory file locking to prevent corruption
when multiple Codex processes attempt to write to the message history
file simultaneously. The locking is platform-aware to handle differences
in Windows vs Unix file locking behavior.

## Testing

-  Builds successfully on all platforms
-  Existing message history tests pass
-  File locking retry logic verified

Related to discussion in #2773 about using stabilized Rust APIs instead
of external dependencies.

---------

Co-authored-by: Michael Bolin <bolinfest@gmail.com>
2025-09-02 23:46:27 -07:00
Jeremy Rose
53413c728e parse cd foo && ... for exec and apply_patch (#3083)
sometimes the model likes to run "cd foo && ..." instead of using the
workdir parameter of exec. handle them roughly the same.
2025-09-03 05:26:06 +00:00
Dominik Kundel
b127a3643f Improve gpt-oss compatibility (#2461)
The gpt-oss models require reasoning with subsequent Chat Completions
requests because otherwise the model forgets why the tools were called.
This change fixes that and also adds some additional missing
documentation around how to handle context windows in Ollama and how to
show the CoT if you desire to.
2025-09-02 19:49:03 -07:00
Anton Panasenko
a93a907c7e [feat] use experimental reasoning summary (#3071)
<img width="1512" height="442" alt="Screenshot 2025-09-02 at 3 49 46 PM"
src="https://github.com/user-attachments/assets/26c3c1cf-b7ed-4520-a12a-8d38a8e0c318"
/>
2025-09-02 18:47:14 -07:00
pakrym-oai
03e2796ca4 Move CodexAuth and AuthManager to the core crate (#3074)
Fix a long standing layering issue.
2025-09-02 18:36:19 -07:00
Eric Traut
051f185ce3 Added back the logic to handle rate-limit errors when using API key (#3070)
A previous PR removed this when adding rate-limit errors for the ChatGPT
auth path.
2025-09-02 17:50:15 -07:00
Dylan
6f75114695 [apply-patch] Fix lark grammar (#2651)
## Summary
Fixes an issue with the lark grammar definition for the apply_patch
freeform tool. This does NOT change the defaults, merely patches the
root cause of the issue we were seeing with empty lines, and an issue
with config flowing through correctly.

Specifically, the following requires that a line is non-empty:
```
add_line: "+" /(.+)/ LF -> line
```
but many changes _should_ involve creating/updating empty lines. The new
definition is:
```
add_line: "+" /(.*)/ LF -> line
```

## Testing
- [x] Tested locally, reproduced the issue without the update and
confirmed that the model will produce empty lines wiht the new lark
grammar
2025-09-02 17:38:19 -07:00
Ahmed Ibrahim
431a10fc50 chore: unify history loading (#2736)
We have two ways of loading conversation with a previous history. Fork
conversation and the experimental resume that we had before. In this PR,
I am unifying their code path. The path is getting the history items and
recording them in a brand new conversation. This PR also constraint the
rollout recorder responsibilities to be only recording to the disk and
loading from the disk.

The PR also fixes a current bug when we have two forking in a row:
History 1:
<Environment Context>
UserMessage_1
UserMessage_2
UserMessage_3

**Fork with n = 1 (only remove one element)**
History 2:
<Environment Context>
UserMessage_1
UserMessage_2
<Environment Context>

**Fork with n = 1 (only remove one element)**
History 2:
<Environment Context>
UserMessage_1
UserMessage_2
**<Environment Context>**

This shouldn't happen but because we were appending the `<Environment
Context>` after each spawning and it's considered as _user message_.
Now, we don't add this message if restoring and old conversation.
2025-09-02 22:44:29 +00:00
Jeremy Rose
e442ecedab rework message styling (#2877)
https://github.com/user-attachments/assets/cf07f62b-1895-44bb-b9c3-7a12032eb371
2025-09-02 17:29:58 +00:00
dependabot[bot]
1cc6b97227 chore(deps): bump regex-lite from 0.1.6 to 0.1.7 in /codex-rs (#3010)
Bumps [regex-lite](https://github.com/rust-lang/regex) from 0.1.6 to
0.1.7.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/rust-lang/regex/blob/master/CHANGELOG.md">regex-lite's
changelog</a>.</em></p>
<blockquote>
<h1>0.1.79</h1>
<ul>
<li>Require regex-syntax 0.3.8.</li>
</ul>
<h1>0.1.78</h1>
<ul>
<li>[PR <a
href="https://redirect.github.com/rust-lang/regex/issues/290">#290</a>](<a
href="https://redirect.github.com/rust-lang/regex/pull/290">rust-lang/regex#290</a>):
Fixes bug <a
href="https://redirect.github.com/rust-lang/regex/issues/289">#289</a>,
which caused some regexes with a certain combination
of literals to match incorrectly.</li>
</ul>
<h1>0.1.77</h1>
<ul>
<li>[PR <a
href="https://redirect.github.com/rust-lang/regex/issues/281">#281</a>](<a
href="https://redirect.github.com/rust-lang/regex/pull/281">rust-lang/regex#281</a>):
Fixes bug <a
href="https://redirect.github.com/rust-lang/regex/issues/280">#280</a>
by disabling all literal optimizations when a pattern
is partially anchored.</li>
</ul>
<h1>0.1.76</h1>
<ul>
<li>Tweak criteria for using the Teddy literal matcher.</li>
</ul>
<h1>0.1.75</h1>
<ul>
<li>[PR <a
href="https://redirect.github.com/rust-lang/regex/issues/275">#275</a>](<a
href="https://redirect.github.com/rust-lang/regex/pull/275">rust-lang/regex#275</a>):
Improves match verification performance in the Teddy SIMD searcher.</li>
<li>[PR <a
href="https://redirect.github.com/rust-lang/regex/issues/278">#278</a>](<a
href="https://redirect.github.com/rust-lang/regex/pull/278">rust-lang/regex#278</a>):
Replaces slow substring loop in the Teddy SIMD searcher with
Aho-Corasick.</li>
<li>Implemented DoubleEndedIterator on regex set match iterators.</li>
</ul>
<h1>0.1.74</h1>
<ul>
<li>Release regex-syntax 0.3.5 with a minor bug fix.</li>
<li>Fix bug <a
href="https://redirect.github.com/rust-lang/regex/issues/272">#272</a>.</li>
<li>Fix bug <a
href="https://redirect.github.com/rust-lang/regex/issues/277">#277</a>.</li>
<li>[PR <a
href="https://redirect.github.com/rust-lang/regex/issues/270">#270</a>](<a
href="https://redirect.github.com/rust-lang/regex/pull/270">rust-lang/regex#270</a>):
Fixes bugs <a
href="https://redirect.github.com/rust-lang/regex/issues/264">#264</a>,
<a
href="https://redirect.github.com/rust-lang/regex/issues/268">#268</a>
and an unreported where the DFA cache size could be
drastically underestimated in some cases (leading to high unexpected
memory
usage).</li>
</ul>
<h1>0.1.73</h1>
<ul>
<li>Release <code>regex-syntax 0.3.4</code>.</li>
<li>Bump <code>regex-syntax</code> dependency version for
<code>regex</code> to <code>0.3.4</code>.</li>
</ul>
<h1>0.1.72</h1>
<ul>
<li>[PR <a
href="https://redirect.github.com/rust-lang/regex/issues/262">#262</a>](<a
href="https://redirect.github.com/rust-lang/regex/pull/262">rust-lang/regex#262</a>):
Fixes a number of small bugs caught by fuzz testing (AFL).</li>
</ul>
<h1>0.1.71</h1>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="45c3da7681"><code>45c3da7</code></a>
regex-lite-0.1.7</li>
<li><a
href="873ed800c5"><code>873ed80</code></a>
regex-automata-0.4.10</li>
<li><a
href="ea834f8e1f"><code>ea834f8</code></a>
regex-syntax-0.8.6</li>
<li><a
href="86836fbe84"><code>86836fb</code></a>
changelog: 1.11.2</li>
<li><a
href="63a26c1a7f"><code>63a26c1</code></a>
cargo: ensure that 'perf' doesn't enable 'std' implicitly (<a
href="https://redirect.github.com/rust-lang/regex/issues/1150">#1150</a>)</li>
<li><a
href="dd96592be2"><code>dd96592</code></a>
doc: clarify CRLF mode effect</li>
<li><a
href="931dae0192"><code>931dae0</code></a>
cargo: point <code>repository</code> metadata to clonable URLs</li>
<li><a
href="a66fde6e80"><code>a66fde6</code></a>
doc: remove references to non-existent parameters</li>
<li><a
href="1873e96a7b"><code>1873e96</code></a>
automata: add <code>DFA::set_prefilter</code> method to the DFA
types</li>
<li><a
href="89ff15310b"><code>89ff153</code></a>
doc: fix misspelling typo</li>
<li>Additional commits viewable in <a
href="https://github.com/rust-lang/regex/compare/regex-lite-0.1.6...regex-lite-0.1.7">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=regex-lite&package-manager=cargo&previous-version=0.1.6&new-version=0.1.7)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-02 09:09:17 -07:00
Michael Bolin
c988ce28fe fix: drop Mutex before calling tx_approve.send() (#2876) 2025-08-28 22:49:29 -07:00
Ahmed Ibrahim
9dbe7284d2 Following up on #2371 post commit feedback (#2852)
- Introduce websearch end to complement the begin 
- Moves the logic of adding the sebsearch tool to
create_tools_json_for_responses_api
- Making it the client responsibility to toggle the tool on or off 
- Other misc in #2371 post commit feedback
- Show the query:

<img width="1392" height="151" alt="image"
src="https://github.com/user-attachments/assets/8457f1a6-f851-44cf-bcca-0d4fe460ce89"
/>
2025-08-28 19:24:38 -07:00
dedrisian-oai
b8e8454b3f Custom /prompts (#2696)
Adds custom `/prompts` to `~/.codex/prompts/<command>.md`.

<img width="239" height="107" alt="Screenshot 2025-08-25 at 6 22 42 PM"
src="https://github.com/user-attachments/assets/fe6ebbaa-1bf6-49d3-95f9-fdc53b752679"
/>

---

Details:

1. Adds `Op::ListCustomPrompts` to core.
2. Returns `ListCustomPromptsResponse` with list of `CustomPrompt`
(name, content).
3. TUI calls the operation on load, and populates the custom prompts
(excluding prompts that collide with builtins).
4. Selecting the custom prompt automatically sends the prompt to the
agent.
2025-08-29 02:16:39 +00:00
Ahmed Ibrahim
c9ca63dc1e burst paste edge cases (#2683)
This PR fixes two edge cases in managing burst paste (mainly on power
shell).
Bugs:
- Needs an event key after paste to render the pasted items

> ChatComposer::flush_paste_burst_if_due() flushes on timeout. Called:
>     - Pre-render in App on TuiEvent::Draw.
>     - Via a delayed frame
>
BottomPane::request_redraw_in(ChatComposer::recommended_paste_flush_delay()).

- Parses two key events separately before starting parsing burst paste

> When threshold is crossed, pull preceding burst chars out of the
textarea and prepend to paste_burst_buffer, then keep buffering.

- Integrates with #2567 to bring image pasting to windows.
2025-08-28 12:54:12 -07:00
Ahmed Ibrahim
ed06f90fb3 Race condition in compact (#2746)
This fixes the flakiness in
`summarize_context_three_requests_and_instructions` because we should
trim history before sending task complete.
2025-08-28 12:53:00 -07:00
Michael Bolin
74d2741729 chore: require uninlined_format_args from clippy (#2845)
- added `uninlined_format_args` to `[workspace.lints.clippy]` in the
`Cargo.toml` for the workspace
- ran `cargo clippy --tests --fix`
- ran `just fmt`
2025-08-28 11:25:23 -07:00
dedrisian-oai
4e9ad23864 Add "View Image" tool (#2723)
Adds a "View Image" tool so Codex can find and see images by itself:

<img width="1772" height="420" alt="Screenshot 2025-08-26 at 10 40
04 AM"
src="https://github.com/user-attachments/assets/7a459c7b-0b86-4125-82d9-05fbb35ade03"
/>
2025-08-27 17:41:23 -07:00
Michael Bolin
ffe585387b fix: for now, limit the number of deltas sent back to the UI (#2776)
This is a stopgap solution, but today, we are seeing the client get
flooded with events. Since we already truncate the output we send to the
model, it feels reasonable to limit how many deltas we send to the
client.
2025-08-27 10:23:25 -07:00
Ahmed Ibrahim
2d2f66f9c5 Bug fix: deduplicate assistant messages (#2758)
We are treating assistant messages in a different way than other
messages which resulted in a duplicated history.

See #2698
2025-08-27 01:29:16 -07:00
Ahmed Ibrahim
d0e06f74e2 send context window with task started (#2752)
- Send context window with task started
- Accounting for changing the model per turn
2025-08-27 00:04:21 -07:00
Gabriel Peal
4b6c6ce98f Make git_diff_against_sha more robust (#2749)
1. Ignore custom git diff drivers users may have set
2. Allow diffing against filenames that start with a dash
2025-08-27 01:53:00 -04:00
ae
3d8bca7814 feat: decrease testing when running interactively (#2707) 2025-08-26 19:57:04 -07:00
Ahmed Ibrahim
3eb11c10d0 Don't send Exec deltas on apply patch (#2742)
We are now sending exec deltas on apply patch which doesn't make sense.
2025-08-26 19:16:51 -07:00
Wang
c229a67312 feat(core): Add remove_conversation to ConversationManager for ma… (#2613)
### What this PR does

This PR introduces a new public method,
remove_conversation(conversation_id: Uuid), to the ConversationManager.
This allows consumers of the codex-core library to manually remove a
conversation from the manager's in-memory storage.

### Why this change is needed
I am currently adapting the Codex client to run as a long-lived server
application. In this server environment, ConversationManager instances
persist for extended periods, and new conversations are created for each
incoming user request.

The current implementation of ConversationManager stores all created
conversations in a HashMap indefinitely, with no mechanism for removal.
This leads to unbounded memory growth in a server context, as every new
conversation permanently occupies memory.

While an automatic TTL-based cleanup mechanism could be one solution, a
simpler, more direct remove_conversation method provides the necessary
control for my use case. It allows my server application to explicitly
manage the lifecycle of conversations, such as cleaning them up after a
request is fully processed or after a period of inactivity is detected
at the application level.

This change provides a minimal, non-intrusive way to address the memory
management issue for server-like applications built on top of
codex-core, giving developers the flexibility to implement their own
cleanup logic.

Signed-off-by: M4n5ter <m4n5terrr@gmail.com>
Co-authored-by: Michael Bolin <mbolin@openai.com>
2025-08-26 15:16:43 -07:00
Eric Traut
d32e4f25cf Added caps on retry config settings (#2701)
The CLI supports config settings `stream_max_retries` and
`request_max_retries` that allow users to override the default retry
counts (4 and 5, respectively). However, there's currently no cap placed
on these values. In theory, a user could configure an effectively
infinite retry count which could hammer the server. This PR adds a
reasonable cap (currently 100) to both of these values.
2025-08-25 22:51:01 -07:00
Eric Traut
ab9250e714 Improved user message for rate-limit errors (#2695)
This PR improves the error message presented to the user when logged in
with ChatGPT and a rate-limit error occurs. In particular, it provides
the user with information about when the rate limit will be reset. It
removes older code that attempted to do the same but relied on parsing
of error messages that are not generated by the ChatGPT endpoint. The
new code uses newly-added error fields.
2025-08-25 21:42:10 -07:00
Eric Traut
d63e44ae29 Fixed a bug that causes token refresh to not work in a seamless manner (#2699)
This PR fixes a bug in the token refresh logic. Token refresh is
performed in a retry loop so if we receive a 401 error, we refresh the
token, then we go around the loop again and reissue the fetch with a
fresh token. The bug is that we're not using the updated token on the
second and subsequent times through the loop. The result is that we'll
try to refresh the token a few more times until we hit the retry limit
(default of 4). The 401 error is then passed back up to the caller.
Subsequent calls will use the refreshed token, so the problem clears
itself up.

The fix is straightforward — make sure we use the updated auth
information each time through the retry loop.
2025-08-25 19:18:16 -07:00
Jeremy Rose
17e5077507 do not show timeouts as "sandbox error"s (#2587)
🙅🫸
```
✗ Failed (exit -1)
  └ 🧪 cargo test --all-features -q
    sandbox error: command timed out
```

😌👉
```
✗ Failed (exit -1)
  └ 🧪 cargo test --all-features -q
    error: command timed out
```
2025-08-25 17:52:23 -07:00
Gabriel Peal
cb32f9c64e Add auth to send_user_turn (#2688)
It is there for send_user_message but was omitted from send_user_turn.
Presumably this was a mistake
2025-08-25 18:57:20 -04:00
Odysseas Yiakoumis
a6c346b9e1 avoid error when /compact response has no token_usage (#2417) (#2640)
**Context**  
When running `/compact`, `drain_to_completed` would throw an error if
`token_usage` was `None` in `ResponseEvent::Completed`. This made the
command fail even though everything else had succeeded.

**What changed**  
- Instead of erroring, we now just check `if let Some(token_usage)`
before sending the event.
- If it’s missing, we skip it and move on.  

**Why**  
This makes `AgentTask::compact()` behave in the same way as
`AgentTask::spawn()`, which also doesn’t error out when `token_usage`
isn’t available. Keeps things consistent and avoids unnecessary
failures.

**Fixes**  
Closes #2417

---------

Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
2025-08-25 18:42:22 +00:00
dependabot[bot]
7d67e54628 chore(deps): bump toml_edit from 0.23.3 to 0.23.4 in /codex-rs (#2665) 2025-08-25 08:20:30 -07:00
Michael Bolin
295ca27e98 fix: Scope ExecSessionManager to Session instead of using global singleton (#2664)
The `SessionManager` in `exec_command` owns a number of
`ExecCommandSession` objects where `ExecCommandSession` has a
non-trivial implementation of `Drop`, so we want to be able to drop an
individual `SessionManager` to help ensure things get cleaned up in a
timely fashion. To that end, we should have one `SessionManager` per
session rather than one global one for the lifetime of the CLI process.
2025-08-24 22:52:49 -07:00
Michael Bolin
7b20db942a fix: build is broken on main; introduce ToolsConfigParams to help fix (#2663)
`ToolsConfig::new()` taking a large number of boolean params was hard to
manage and it finally bit us (see
https://github.com/openai/codex/pull/2660). This changes
`ToolsConfig::new()` so that it takes a struct (and also reduces the
visibility of some members, where possible).
2025-08-24 22:43:42 -07:00
Uhyeon Park
ee2ccb5cb6 Fix cache hit rate by making MCP tools order deterministic (#2611)
Fixes https://github.com/openai/codex/issues/2610

This PR sorts the tools in `get_openai_tools` by name to ensure a
consistent MCP tool order.

Currently, MCP servers are stored in a HashMap, which does not guarantee
ordering. As a result, the tool order changes across turns, effectively
breaking prompt caching in multi-turn sessions.

An alternative solution would be to replace the HashMap with an ordered
structure, but that would require a much larger code change. Given that
it is unrealistic to have so many MCP tools that sorting would cause
performance issues, this lightweight fix is chosen instead.

By ensuring deterministic tool order, this change should significantly
improve cache hit rates and prevent users from hitting usage limits too
quickly. (For reference, my own sessions last week reached the limit
unusually fast, with cache hit rates falling below 1%.)

## Result

After this fix, sessions with MCP servers now show caching behavior
almost identical to sessions without MCP servers.
Without MCP             |  With MCP
:-------------------------:|:-------------------------:
<img width="1368" height="1634" alt="image"
src="https://github.com/user-attachments/assets/26edab45-7be8-4d6a-b471-558016615fc8"
/> | <img width="1356" height="1632" alt="image"
src="https://github.com/user-attachments/assets/5f3634e0-3888-420b-9aaf-deefd9397b40"
/>
2025-08-24 19:56:24 -07:00
ae
8b49346657 fix: update gpt-5 stats (#2649)
- To match what's on <https://platform.openai.com/docs/models/gpt-5>.
2025-08-24 16:45:41 -07:00
dependabot[bot]
e49116a4c5 chore(deps): bump whoami from 1.6.0 to 1.6.1 in /codex-rs (#2497)
Bumps [whoami](https://github.com/ardaku/whoami) from 1.6.0 to 1.6.1.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/ardaku/whoami/commits">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=whoami&package-manager=cargo&previous-version=1.6.0&new-version=1.6.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-24 14:38:30 -07:00
Dylan
4157788310 [apply_patch] disable default freeform tool (#2643)
## Summary
We're seeing some issues in the freeform tool - let's disable by default
until it stabilizes.

## Testing
- [x] Ran locally, confirmed codex-cli could make edits
2025-08-24 11:12:37 -07:00
Jeremy Rose
32bbbbad61 test: faster test execution in codex-core (#2633)
this dramatically improves time to run `cargo test -p codex-core` (~25x
speedup).

before:
```
cargo test -p codex-core  35.96s user 68.63s system 19% cpu 8:49.80 total
```

after:
```
cargo test -p codex-core  5.51s user 8.16s system 63% cpu 21.407 total
```

both tests measured "hot", i.e. on a 2nd run with no filesystem changes,
to exclude compile times.

approach inspired by [Delete Cargo Integration
Tests](https://matklad.github.io/2021/02/27/delete-cargo-integration-tests.html),
we move all test cases in tests/ into a single suite in order to have a
single binary, as there is significant overhead for each test binary
executed, and because test execution is only parallelized with a single
binary.
2025-08-24 11:10:53 -07:00
Reuben Narad
363636f5eb Add web search tool (#2371)
Adds web_search tool, enabling the model to use Responses API web_search
tool.
- Disabled by default, enabled by --search flag
- When --search is passed, exposes web_search_request function tool to
the model, which triggers user approval. When approved, the model can
use the web_search tool for the remainder of the turn
<img width="1033" height="294" alt="image"
src="https://github.com/user-attachments/assets/62ac6563-b946-465c-ba5d-9325af28b28f"
/>

---------

Co-authored-by: easong-openai <easong@openai.com>
2025-08-23 22:58:56 -07:00
Ahmed Ibrahim
957d44918d send-aggregated output (#2364)
We want to send an aggregated output of stderr and stdout so we don't
have to aggregate it stderr+stdout as we lose order sometimes.

---------

Co-authored-by: Gabriel Peal <gpeal@users.noreply.github.com>
2025-08-23 16:54:31 +00:00
Michael Bolin
e3b03eaccb feat: StreamableShell with exec_command and write_stdin tools (#2574) 2025-08-22 18:10:55 -07:00
Ahmed Ibrahim
311ad0ce26 fork conversation from a previous message (#2575)
This can be the underlying logic in order to start a conversation from a
previous message. will need some love in the UI.

Base for building this: #2588
2025-08-22 17:06:09 -07:00
Jeremy Rose
d994019f3f tui: coalesce command output; show unabridged commands in transcript (#2590)
https://github.com/user-attachments/assets/effec7c7-732a-4b61-a2ae-3cb297b6b19b
2025-08-22 16:32:31 -07:00
Ahmed Ibrahim
097782c775 Move models.rs to protocol (#2595)
Moving models.rs to protocol so we can use them in `Codex` operations
2025-08-22 22:18:54 +00:00
Michael Bolin
8ba8089592 fix: prefer sending MCP structuredContent as the function call response, if available (#2594)
Prior to this change, when we got a `CallToolResult` from an MCP server,
we JSON-serialized its `content` field as the `content` to send back to
the model as part of the function call output that we send back to the
model. This meant that we were dropping the `structuredContent` on the
floor.

Though reading
https://modelcontextprotocol.io/specification/2025-06-18/schema#tool, it
appears that if `outputSchema` is specified, then `structuredContent`
should be set, which seems to be a "higher-fidelity" response to the
function call. This PR updates our handling of `CallToolResult` to
prefer using the JSON-serialization of `structuredContent`, if present,
using `content` as a fallback.

Also, it appears that the sense of `success` was inverted prior to this
PR!
2025-08-22 14:10:18 -07:00
Jeremy Rose
57c498159a test: simplify tests in config.rs (#2586)
this is much easier to read, thanks @bolinfest for the suggestion.
2025-08-22 14:04:21 -07:00
Dylan
6f0b499594 [config] Detect git worktrees for project trust (#2585)
## Summary
When resolving our current directory as a project, we want to be a
little bit more clever:
1. If we're in a sub-directory of a git repo, resolve our project
against the root of the git repo
2. If we're in a git worktree, resolve the project against the root of
the git repo

## Testing
- [x] Added unit tests
- [x] Confirmed locally with a git worktree (the one i was using for
this feature)
2025-08-22 13:54:51 -07:00
Dylan
236c4f76a6 [apply_patch] freeform apply_patch tool (#2576)
## Summary
GPT-5 introduced the concept of [custom
tools](https://platform.openai.com/docs/guides/function-calling#custom-tools),
which allow the model to send a raw string result back, simplifying
json-escape issues. We are migrating gpt-5 to use this by default.

However, gpt-oss models do not support custom tools, only normal
functions. So we keep both tool definitions, and provide whichever one
the model family supports.

## Testing
- [x] Tested locally with various models
- [x] Unit tests pass
2025-08-22 13:42:34 -07:00
Eric Traut
dc42ec0eb4 Add AuthManager and enhance GetAuthStatus command (#2577)
This PR adds a central `AuthManager` struct that manages the auth
information used across conversations and the MCP server. Prior to this,
each conversation and the MCP server got their own private snapshots of
the auth information, and changes to one (such as a logout or token
refresh) were not seen by others.

This is especially problematic when multiple instances of the CLI are
run. For example, consider the case where you start CLI 1 and log in to
ChatGPT account X and then start CLI 2 and log out and then log in to
ChatGPT account Y. The conversation in CLI 1 is still using account X,
but if you create a new conversation, it will suddenly (and
unexpectedly) switch to account Y.

With the `AuthManager`, auth information is read from disk at the time
the `ConversationManager` is constructed, and it is cached in memory.
All new conversations use this same auth information, as do any token
refreshes.

The `AuthManager` is also used by the MCP server's GetAuthStatus
command, which now returns the auth method currently used by the MCP
server.

This PR also includes an enhancement to the GetAuthStatus command. It
now accepts two new (optional) input parameters: `include_token` and
`refresh_token`. Callers can use this to request the in-use auth token
and can optionally request to refresh the token.

The PR also adds tests for the login and auth APIs that I recently added
to the MCP server.
2025-08-22 13:10:11 -07:00