Commit Graph

447 Commits

Author SHA1 Message Date
jif-oai
e5fe50d3ce chore: unify cargo versions (#4044)
Unify cargo versions at root
2025-09-22 16:47:01 +00:00
pakrym-oai
14a115d488 Add non_sandbox_test helper (#3880)
Makes tests shorter
2025-09-22 14:50:41 +00:00
dedrisian-oai
5996ee0e5f feat: Add more /review options (#3961)
Adds the following options:

1. Review current changes
2. Review a specific commit
3. Review against a base branch (PR style)
4. Custom instructions

<img width="487" height="330" alt="Screenshot 2025-09-20 at 2 11 36 PM"
src="https://github.com/user-attachments/assets/edb0aaa5-5747-47fa-881f-cc4c4f7fe8bc"
/>

---

\+ Adds the following UI helpers:

1. Makes list selection searchable
2. Adds navigation to the bottom pane, so you could add a stack of
popups
3. Basic custom prompt view
2025-09-21 20:18:35 -07:00
Ahmed Ibrahim
04504d8218 Forward Rate limits to the UI (#3965)
We currently get information about rate limits in the response headers.
We want to forward them to the clients to have better transparency.
UI/UX plans have been discussed and this information is needed.
2025-09-20 21:26:16 -07:00
pakrym-oai
9b18875a42 Use helpers instead of fixtures (#3888)
Move to using test helper method everywhere.
2025-09-19 06:46:25 -07:00
pakrym-oai
881c7978f1 Move responses mocking helpers to a shared lib (#3878)
These are generally useful
2025-09-18 17:53:14 -07:00
Ahmed Ibrahim
a7fda70053 Use a unified shell tell to not break cache (#3814)
Currently, we change the tool description according to the sandbox
policy and approval policy. This breaks the cache when the user hits
`/approvals`. This PR does the following:
- Always use the shell with escalation parameter:
- removes `create_shell_tool_for_sandbox` and always uses unified tool
via `create_shell_tool`
- Reject the func call when the model uses escalation parameter when it
cannot.
2025-09-19 00:08:28 +00:00
Michael Bolin
de64f5f007 fix: update try_parse_word_only_commands_sequence() to return commands in order (#3881)
Incidentally, we had a test for this in
`accepts_multiple_commands_with_allowed_operators()`, but it was
verifying the bad behavior. Oops!
2025-09-18 16:07:38 -07:00
Michael Bolin
8595237505 fix: ensure cwd for conversation and sandbox are separate concerns (#3874)
Previous to this PR, both of these functions take a single `cwd`:


71038381aa/codex-rs/core/src/seatbelt.rs (L19-L25)


71038381aa/codex-rs/core/src/landlock.rs (L16-L23)

whereas `cwd` and `sandbox_cwd` should be set independently (fixed in
this PR).

Added `sandbox_distinguishes_command_and_policy_cwds()` to
`codex-rs/exec/tests/suite/sandbox.rs` to verify this.
2025-09-18 14:37:06 -07:00
dedrisian-oai
62258df92f feat: /review (#3774)
Adds `/review` action in TUI

<img width="637" height="370" alt="Screenshot 2025-09-17 at 12 41 19 AM"
src="https://github.com/user-attachments/assets/b1979a6e-844a-4b97-ab20-107c185aec1d"
/>
2025-09-18 14:14:16 -07:00
Jeremy Rose
b34e906396 Reland "refactor transcript view to handle HistoryCells" (#3753)
Reland of #3538
2025-09-18 20:55:53 +00:00
Jeremy Rose
71038381aa fix error on missing notifications in [tui] (#3867)
Fixes #3811.
2025-09-18 11:25:09 -07:00
jif-oai
277fc6254e chore: use tokio mutex and async function to prevent blocking a worker (#3850)
### Why Use `tokio::sync::Mutex`

`std::sync::Mutex` are not _async-aware_. As a result, they will block
the entire thread instead of just yielding the task. Furthermore they
can be poisoned which is not the case of `tokio` Mutex.
This allows the Tokio runtime to continue running other tasks while
waiting for the lock, preventing deadlocks and performance bottlenecks.

In general, this is preferred in async environment
2025-09-18 18:21:52 +01:00
jif-oai
992b531180 fix: some nit Rust reference issues (#3849)
Fix some small references issue. No behavioural change. Just making the
code cleaner
2025-09-18 18:18:06 +01:00
jif-oai
4a5d6f7c71 Make ESC button work when auto-compaction (#3857)
Only emit a task finished when the compaction comes from a `/compact`
2025-09-18 15:34:16 +00:00
pakrym-oai
d4aba772cb Switch to uuid_v7 and tighten ConversationId usage (#3819)
Make sure conversations have a timestamp.
2025-09-18 14:37:03 +00:00
jif-oai
4c97eeb32a bug: Ignore tests for now (#3777)
Ignore flaky / long tests for now
2025-09-18 10:43:45 +01:00
Thibault Sottiaux
c9505488a1 chore: update "Codex CLI harness, sandboxing, and approvals" section (#3822) 2025-09-17 16:48:20 -07:00
dedrisian-oai
72733e34c4 Add dev message upon review out (#3758)
Proposal: We want to record a dev message like so:

```
{
      "type": "message",
      "role": "user",
      "content": [
        {
          "type": "input_text",
          "text": "<user_action>
  <context>User initiated a review task. Here's the full review output from reviewer model. User may select one or more comments to resolve.</context>
  <action>review</action>
  <results>
  {findings_str}
  </results>
</user_action>"
        }
      ]
    },
```

Without showing in the chat transcript.

Rough idea, but it fixes issue where the user finishes a review thread,
and asks the parent "fix the rest of the review issues" thinking that
the parent knows about it.

### Question: Why not a tool call?

Because the agent didn't make the call, it was a human. + we haven't
implemented sub-agents yet, and we'll need to think about the way we
represent these human-led tool calls for the agent.
2025-09-16 18:43:32 -07:00
dedrisian-oai
7fe4021f95 Review mode core updates (#3701)
1. Adds the environment prompt (including cwd) to review thread
2. Prepends the review prompt as a user message (temporary fix so the
instructions are not replaced on backend)
3. Sets reasoning to low
4. Sets default review model to `gpt-5-codex`
2025-09-16 13:36:51 -07:00
Dylan
11285655c4 fix: Record EnvironmentContext in SendUserTurn (#3678)
## Summary
SendUserTurn has not been correctly handling updates to policies. While
the tui protocol handles this in `Op::OverrideTurnContext`, the
SendUserTurn should be appending `EnvironmentContext` messages when the
sandbox settings change. MCP client behavior should match the cli
behavior, so we update `SendUserTurn` message to match.

## Testing
- [x] Added prompt caching tests
2025-09-16 11:32:20 -07:00
Ahmed Ibrahim
244687303b Persist search items (#3745)
Let's record the search items because they are part of the history.
2025-09-16 18:02:15 +00:00
Dylan
a8026d3846 fix: read-only escalations (#3673)
## Summary
Splitting out this smaller fix from #2694 - fixes the sandbox
permissions so Chat / read-only mode tool definition matches
expectations

## Testing 
- [x] Tested locally

<img width="1271" height="629" alt="Screenshot 2025-09-15 at 2 51 19 PM"
src="https://github.com/user-attachments/assets/fcb247e4-30b6-4199-80d7-a2876d79ad7d"
/>
2025-09-15 19:01:10 -07:00
dependabot[bot]
404c126fc3 chore(deps): bump wildmatch from 2.4.0 to 2.5.0 in /codex-rs (#3619)
Bumps [wildmatch](https://github.com/becheran/wildmatch) from 2.4.0 to
2.5.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/becheran/wildmatch/releases">wildmatch's
releases</a>.</em></p>
<blockquote>
<h2>v2.5.0</h2>
<p><a
href="https://redirect.github.com/becheran/wildmatch/pull/27">becheran/wildmatch#27</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="b39902c120"><code>b39902c</code></a>
chore: Release wildmatch version 2.5.0</li>
<li><a
href="87a8cf4c80"><code>87a8cf4</code></a>
Merge pull request <a
href="https://redirect.github.com/becheran/wildmatch/issues/28">#28</a>
from smichaku/micha/fix-unicode-case-insensitive-matching</li>
<li><a
href="a3ab4903f5"><code>a3ab490</code></a>
fix: Fix unicode matching for non-ASCII characters</li>
<li>See full diff in <a
href="https://github.com/becheran/wildmatch/compare/v2.4.0...v2.5.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=wildmatch&package-manager=cargo&previous-version=2.4.0&new-version=2.5.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-15 12:57:17 -07:00
Jeremy Rose
0560079c41 notifications on approvals and turn end (#3329)
uses OSC 9 to notify when a turn ends or approval is required. won't
work in vs code or terminal.app but iterm2/kitty/wezterm supports it :)
2025-09-15 10:22:02 -07:00
ae
5c583fe89b feat: tweak onboarding strings (#3650) 2025-09-15 08:49:37 -07:00
pakrym-oai
b1c291e2bb Add file reference guidelines to gpt-5 prompt (#3651) 2025-09-15 08:35:30 -07:00
Michael Bolin
f037b2fd56 chore: rename (#3648) 2025-09-15 08:17:13 -07:00
Thibault Sottiaux
d60cbed691 fix: add references (#3633) 2025-09-15 07:48:22 -07:00
jimmyfraiture2
d555b68469 fix: race condition unified exec (#3644)
Fix race condition without storing an rx in the session
2025-09-15 06:52:39 -07:00
Thibault Sottiaux
6039f8a126 feat: tighten preset filter, tame storage load logs, enable rollout prompt by default (#3628)
Summary
- common: use exact equality for Swiftfox exclusion to avoid hiding
future slugs that merely contain the substring
- core: treat missing internal_storage.json as expected (debug), warn
only on real IO/parse errors
- tui: drop DEBUG_HIGH gate; always consider showing rollout prompt, but
suppress under ApiKey auth mode
2025-09-14 23:05:41 -07:00
Ahmed Ibrahim
50262a44ce Show abort in the resume (#3629)
Show abort error when resuming a session
2025-09-15 05:24:30 +00:00
easong-openai
6a8e743d57 initial mcp add interface (#3543)
Adds `codex mcp add`, `codex mcp list`, `codex mcp remove`. Currently writes to global config.
2025-09-15 04:30:56 +00:00
Thibault Sottiaux
a797051921 chore: update swiftfox_prompt.md (#3624) 2025-09-15 04:10:35 +00:00
Ahmed Ibrahim
26f1246a89 Revert "refactor transcript view to handle HistoryCells" (#3614)
Reverts openai/codex#3538
It panics on forking first message. It also calculates the index in a
wrong way.
2025-09-15 03:39:36 +00:00
Eric Traut
900bb01486 When logging in using ChatGPT, make sure to overwrite API key (#3611)
When logging in using ChatGPT using the `codex login` command, a
successful login should write a new `auth.json` file with the ChatGPT
token information. The old code attempted to retain the API key and
merge the token information into the existing `auth.json` file. With the
new simplified login mechanism, `auth.json` should have auth information
for only ChatGPT or API Key, not both.

The `codex login --api-key <key>` code path was already doing the right
thing here, but the `codex login` command was incorrect. This PR fixes
the problem and adds test cases for both commands.
2025-09-14 19:48:18 -07:00
Dylan
b6673838e8 fix: model family and apply_patch consistency (#3603)
## Summary
Resolves a merge conflict between #3597 and #3560, and adds tests to
double check our apply_patch configuration.

## Testing
- [x] Added unit tests

---------

Co-authored-by: dedrisian-oai <dedrisian@openai.com>
2025-09-14 18:20:37 -07:00
Fouad Matin
5185d69f13 fix(core): flaky test completed_commands_do_not_persist_sessions (#3596)
Fix flaky test:
```
        FAIL [   2.641s] codex-core unified_exec::tests::completed_commands_do_not_persist_sessions
  stdout ───

    running 1 test
    test unified_exec::tests::completed_commands_do_not_persist_sessions ... FAILED

    failures:

    failures:
        unified_exec::tests::completed_commands_do_not_persist_sessions

    test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 235 filtered out; finished in 2.63s
    
  stderr ───

    thread 'unified_exec::tests::completed_commands_do_not_persist_sessions' panicked at core/src/unified_exec/mod.rs:582:9:
    assertion failed: result.output.contains("codex")
```
2025-09-14 18:04:05 -07:00
dedrisian-oai
2aa84b8891 Fix EventMsg Optional (#3604) 2025-09-15 00:34:33 +00:00
pakrym-oai
9177bdae5e Only one branch for swiftfox (#3601)
Make each model family have a single branch.
2025-09-14 16:56:22 -07:00
Ahmed Ibrahim
a30e5e40ee enable-resume (#3537)
Adding the ability to resume conversations.
we have one verb `resume`. 

Behavior:

`tui`:
`codex resume`: opens session picker
`codex resume --last`: continue last message
`codex resume <session id>`: continue conversation with `session id`

`exec`:
`codex resume --last`: continue last conversation
`codex resume <session id>`: continue conversation with `session id`

Implementation:
- I added a function to find the path in `~/.codex/sessions/` with a
`UUID`. This is helpful in resuming with session id.
- Added the above mentioned flags
- Added lots of testing
2025-09-14 19:33:19 -04:00
dedrisian-oai
b2f6fc3b9a Fix flaky windows test (#3564)
There are exactly 4 types of flaky tests in Windows x86 right now:

1. `review_input_isolated_from_parent_history` => Times out waiting for
closing events
2. `review_does_not_emit_agent_message_on_structured_output` => Times
out waiting for closing events
3. `auto_compact_runs_after_token_limit_hit` => Times out waiting for
closing events
4. `auto_compact_runs_after_token_limit_hit` => Also has a problem where
auto compact should add a third request, but receives 4 requests.

1, 2, and 3 seem to be solved with increasing threads on windows runner
from 2 -> 4.

Don't know yet why # 4 is happening, but probably also because of
WireMock issues on windows causing races.
2025-09-14 23:20:25 +00:00
pakrym-oai
51f88fd04a Fix swiftfox model selector (#3598)
The model shouldn't be saved with a suffix. The effort is a separate
field.
2025-09-14 23:12:21 +00:00
pakrym-oai
916fdc2a37 Add per-model-family prompts (#3597)
Allows more flexibility in defining prompts.
2025-09-14 22:45:15 +00:00
pakrym-oai
863d9c237e Include command output when sending timeout to model (#3576)
Being able to see the output helps the model decide how to handle the
timeout.
2025-09-14 14:38:26 -07:00
Ahmed Ibrahim
bbea6bbf7e Handle resuming/forking after compact (#3533)
We need to construct the history different when compact happens. For
this, we need to just consider the history after compact and convert
compact to a response item.

This needs to change and use `build_compact_history` when this #3446 is
merged.
2025-09-14 13:23:31 +00:00
Jeremy Rose
4891ee29c5 refactor transcript view to handle HistoryCells (#3538)
No (intended) functional change.

This refactors the transcript view to hold a list of HistoryCells
instead of a list of Lines. This simplifies and makes much of the logic
more robust, as well as laying the groundwork for future changes, e.g.
live-updating history cells in the transcript.

Similar to #2879 in goal. Fixes #2755.
2025-09-13 19:23:14 -07:00
Thibault Sottiaux
bac8a427f3 chore: default swiftfox models to experimental reasoning summaries (#3560) 2025-09-13 23:40:54 +00:00
Thibault Sottiaux
14ab1063a7 chore: rename 2025-09-12 23:17:41 -07:00
Thibault Sottiaux
19b4ed3c96 w 2025-09-12 22:44:05 -07:00