Commit Graph

854 Commits

Author SHA1 Message Date
dependabot[bot]
f9d3dde478 chore(deps): bump anyhow from 1.0.98 to 1.0.99 in /codex-rs (#2405)
Bumps [anyhow](https://github.com/dtolnay/anyhow) from 1.0.98 to 1.0.99.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/dtolnay/anyhow/releases">anyhow's
releases</a>.</em></p>
<blockquote>
<h2>1.0.99</h2>
<ul>
<li>Allow build-script cleanup failure with NFSv3 output directory to be
non-fatal (<a
href="https://redirect.github.com/dtolnay/anyhow/issues/420">#420</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="f2b963a759"><code>f2b963a</code></a>
Release 1.0.99</li>
<li><a
href="2c64c15e75"><code>2c64c15</code></a>
Merge pull request <a
href="https://redirect.github.com/dtolnay/anyhow/issues/420">#420</a>
from dtolnay/enotempty</li>
<li><a
href="8cf66f7936"><code>8cf66f7</code></a>
Allow build-script cleanup failure with NFSv3 output directory to be
non-fatal</li>
<li><a
href="f5e145c683"><code>f5e145c</code></a>
Revert &quot;Pin nightly toolchain used for miri job&quot;</li>
<li><a
href="1d7ef1db54"><code>1d7ef1d</code></a>
Update ui test suite to nightly-2025-06-30</li>
<li><a
href="69295727ce"><code>6929572</code></a>
Update ui test suite to nightly-2025-06-18</li>
<li><a
href="37224e3142"><code>37224e3</code></a>
Ignore mismatched_lifetime_syntaxes lint</li>
<li><a
href="11f0e81aaf"><code>11f0e81</code></a>
Pin nightly toolchain used for miri job</li>
<li><a
href="d04c999d63"><code>d04c999</code></a>
Raise required compiler for backtrace feature to rust 1.82</li>
<li><a
href="219d16330d"><code>219d163</code></a>
Update test suite to nightly-2025-05-01</li>
<li>See full diff in <a
href="https://github.com/dtolnay/anyhow/compare/1.0.98...1.0.99">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=anyhow&package-manager=cargo&previous-version=1.0.98&new-version=1.0.99)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-18 16:01:58 -07:00
Kazuhiro Sera
db30a6f5d8 Fix #2391 Add Ctrl+H as backspace keyboard shortcut (#2412)
This pull request resolves #2391. ctrl + h is not assigned to any other
operations at this moment, and this feature request sounds valid to me.
If we don't prefer having this, please feel free to close this.
2025-08-18 16:00:29 -07:00
Ahmed Ibrahim
ecb388045c Add cache tests for UserTurn (#2432) 2025-08-18 21:28:09 +00:00
Michael Bolin
fc6cfd5ecc protocol-ts (#2425) 2025-08-18 13:08:53 -07:00
Ahmed Ibrahim
c283f9f6ce Add an operation to override current task context (#2431)
- Added an operation to override current task context
- Added a test to check that cache stays the same
2025-08-18 19:59:19 +00:00
Ahmed Ibrahim
c9963b52e9 consolidate reasoning enums into one (#2428)
We have three enums for each of reasoning summaries and reasoning effort
with same values. They can be consolidated into one.
2025-08-18 11:50:17 -07:00
Michael Bolin
a4f76bd75a chore: add TS annotation to generated mcp-types (#2424)
Adds the `TS` annotation from https://crates.io/crates/ts-rs to all
types to facilitate codegen.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2424).
* __->__ #2424
* #2423
2025-08-18 09:38:47 -07:00
Michael Bolin
712bfa04ac chore: move mcp-server/src/wire_format.rs to protocol/src/mcp_protocol.rs (#2423)
The existing `wire_format.rs` should share more types with the
`codex-protocol` crate (like `AskForApproval` instead of maintaining a
parallel `CodexToolCallApprovalPolicy` enum), so this PR moves
`wire_format.rs` into `codex-protocol`, renaming it as
`mcp-protocol.rs`. We also de-dupe types, where appropriate.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2423).
* #2424
* __->__ #2423
2025-08-18 09:36:57 -07:00
ae
da69d50c60 fix: stop using ANSI blue (#2421)
- One less color.
- Replaced with cyan which looks better next to other cyan components.
2025-08-18 16:02:25 +00:00
dependabot[bot]
be6a4faa45 chore(deps-dev): bump @types/node from 24.2.1 to 24.3.0 in /.github/actions/codex (#2411)
Bumps
[@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node)
from 24.2.1 to 24.3.0.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=@types/node&package-manager=bun&previous-version=24.2.1&new-version=24.3.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-18 08:33:17 -07:00
ae
5bce369c4d fix: clean up styles & colors and define in styles.md (#2401)
New style guide:

  # Headers, primary, and secondary text
  
- **Headers:** Use `bold`. For markdown with various header levels,
leave in the `#` signs.
  - **Primary text:** Default.
  - **Secondary text:** Use `dim`.
  
  # Foreground colors
  
- **Default:** Most of the time, just use the default foreground color.
`reset` can help get it back.
- **Selection:** Use ANSI `blue`. (Ed & AE want to make this cyan too,
but we'll do that in a followup since it's riskier in different themes.)
  - **User input tips and status indicators:** Use ANSI `cyan`.
  - **Success and additions:** Use ANSI `green`.
  - **Errors, failures and deletions:** Use ANSI `red`.
  - **Codex:** Use ANSI `magenta`.
  
  # Avoid
  
- Avoid custom colors because there's no guarantee that they'll contrast
well or look good on various terminal color themes.
- Avoid ANSI `black`, `white`, `yellow` as foreground colors because the
terminal theme will do a better job. (Use `reset` if you need to in
order to get those.) The exception is if you need contrast rendering
over a manually colored background.
  
  (There are some rules to try to catch this in `clippy.toml`.)

# Testing

Tested in a variety of light and dark color themes in Terminal, iTerm2, and Ghostty.
2025-08-18 08:26:29 -07:00
Michael Bolin
a269754668 remove mcp-server/src/mcp_protocol.rs and the code that depends on it (#2360) 2025-08-18 00:29:18 -07:00
Michael Bolin
b581498882 fix: introduce EventMsg::TurnAborted (#2365)
Introduces `EventMsg::TurnAborted` that should be sent in response to
`Op::Interrupt`.

In the MCP server, updates the handling of a
`ClientRequest::InterruptConversation` request such that it sends the
`Op::Interrupt` but does not respond to the request until it sees an
`EventMsg::TurnAborted`.
2025-08-17 21:40:31 -07:00
Michael Bolin
71cae06e66 fix: refactor login/src/server.rs so process_request() is a separate function (#2388) 2025-08-17 12:32:56 -07:00
Eric Traut
350b00d54b Added MCP server command to enable authentication using ChatGPT (#2373)
This PR adds two new APIs for the MCP server: 1) loginChatGpt, and 2)
cancelLoginChatGpt. The first starts a login server and returns a local
URL that allows for browser-based authentication, and the second
provides a way to cancel the login attempt. If the login attempt
succeeds, a notification (in the form of an event) is sent to a
subscriber.

I also added a timeout mechanism for the existing login server. The
loginChatGpt code path uses a 10-minute timeout by default, so if the
user fails to complete the login flow in that timeframe, the login
server automatically shuts down. I tested the timeout code by manually
setting the timeout to a much lower number and confirming that it works
as expected when used e2e.
2025-08-17 10:03:52 -07:00
Eric Traut
1930ee720a Added launch profile for attaching to a running codex CLI process (#2372) 2025-08-15 23:35:01 -07:00
Jeremy Rose
7a80d3c96c replace /prompts with a rotating placeholder (#2314) 2025-08-15 19:37:10 -07:00
aibrahim-oai
d3078b9adc Show progress indicator for /diff command (#2245)
## Summary
- Show a temporary Working on diff state in the bottom pan 
- Add `DiffResult` app event and dispatch git diff asynchronously

## Testing
- `just fmt`
- `just fix` *(fails: `let` expressions in this position are unstable)*
- `cargo test --all-features` *(fails: `let` expressions in this
position are unstable)*

------
https://chatgpt.com/codex/tasks/task_i_689a839f32b88321840a893551d5fbef
2025-08-15 15:32:41 -07:00
Michael Bolin
379106d3eb fix: include an entry for windows-x86_64 in the generated DotSlash file (#2361)
Now that we are improving our Windows support, we should be including an
entry for it in the DotSlash file.

Though anyone using the DotSlash file with Windows should use the new
Windows shim introduced in https://github.com/facebook/dotslash/pull/46.
For more info, see https://dotslash-cli.com/docs/windows/.
2025-08-15 14:47:36 -07:00
LongYinan
b31c5033a9 chore: remove duplicated lockfile (#2336) 2025-08-15 13:54:47 -07:00
Jeremy Rose
1ad8ae2579 color the status letter in apply patch summary (#2337)
<img width="440" height="77" alt="Screenshot 2025-08-14 at 8 30 30 PM"
src="https://github.com/user-attachments/assets/c6169a3a-2e98-4ace-b7ee-918cf4368b7a"
/>
2025-08-15 20:25:48 +00:00
pakrym-oai
c1156a878b Remove duplicated "Successfully logged in message" (#2357) 2025-08-15 13:01:27 -07:00
Kazuhiro Sera
dcfdd2faf5 Fix #2296 Add "minimal" reasoning effort for GPT 5 models (#2326)
This pull request resolves #2296; I've confirmed if it works by:

1. Add settings to ~/.codex/config.toml:
```toml
model_reasoning_effort = "minimal"
```

2. Run the CLI:
```
cd codex-rs
cargo build && RUST_LOG=trace cargo run --bin codex
/status
tail -f ~/.codex/log/codex-tui.log
```

Co-authored-by: pakrym-oai <pakrym@openai.com>
2025-08-15 12:59:52 -07:00
Michael Bolin
d262244725 fix: introduce codex-protocol crate (#2355) 2025-08-15 12:44:40 -07:00
Jeremy Rose
7c26c8e091 tui: skip identical consecutive entries in local composer history (#2352)
This PR avoids inserting duplicate consecutive messages into the Chat
Composer's local history.
2025-08-15 10:55:44 -07:00
Michael Bolin
eda50d8372 feat: introduce ClientRequest::SendUserTurn (#2345)
This adds a new request type, `SendUserTurn`, that makes it possible to
submit a `Op::UserTurn` operation (introduced in #2329) to a
conversation. This PR also adds a new integration test that verifies
that changing from `AskForApproval::UnlessTrusted` to
`AskForApproval::Never` mid-conversation ensures that an elicitation is
no longer sent for running `python3 -c print(42)`.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2345).
* __->__ #2345
* #2329
* #2343
* #2340
* #2338
2025-08-15 10:05:58 -07:00
Michael Bolin
17aa394ae7 feat: introduce Op:UserTurn (#2329)
This introduces `Op::UserTurn`, which makes it possible to override many
of the fields that were set when the `Session` was originally created
when creating a new conversation turn. This is one way we could support
changing things like `model` or `cwd` in the middle of the conversation,
though we may want to consider making each field optional, or
alternatively having a separate `Op` that mutates the `TurnContext`
associated with a `submission_loop()`.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2329).
* #2345
* __->__ #2329
* #2343
* #2340
* #2338
2025-08-15 09:56:05 -07:00
Michael Bolin
13ed67cfc1 feat: introduce TurnContext (#2343)
This PR introduces `TurnContext`, which is designed to hold a set of
fields that should be constant for a turn of a conversation. Note that
the fields of `TurnContext` were previously governed by `Session`.

Ultimately, we want to enable users to change these values between turns
(changing model, approval policy, etc.), though in the current
implementation, the `TurnContext` is constant for the entire
conversation.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2345).
* #2345
* #2329
* __->__ #2343
* #2340
* #2338
2025-08-15 09:40:02 -07:00
Jeremy Rose
45d6c74682 tui: align diff display by always showing sign char and keeping fixed gutter (#2353)
diff lines without a sign char were misaligned.
2025-08-15 09:32:45 -07:00
Michael Bolin
265fd89e31 fix: try to fix flakiness in test_shell_command_approval_triggers_elicitation (#2344)
I still see flakiness in
`test_shell_command_approval_triggers_elicitation()` on occasion where
`MockServer` claims it has not received all of its expected requests.

I recently introduced a similar type of test in #2264,
`test_codex_jsonrpc_conversation_flow()`, which I have not seen flake
(yet!), so this PR pulls over two things I did in that test:

- increased `worker_threads` from `2` to `4`
- added an assertion to make sure the `task_complete` notification is
received

Honestly, I'm still not sure why `MockServer` claims it sometimes does
not receive all its expected requests given that we assert that the
final `JSONRPCResponse` is read on the stream, but let's give this a
shot.

Assuming this fixes things, my hypothesis is that the increase in
`worker_threads` helps because perhaps there are async tasks in
`MockServer` that do not reliably complete fully when there are not
enough threads available? If that is correct, it seems like the test
would still be flaky, though perhaps with lower frequency?
2025-08-15 09:17:20 -07:00
Michael Bolin
6730592433 fix: introduce MutexExt::lock_unchecked() so we stop ignoring unwrap() throughout codex.rs (#2340)
This way we are sure a dangerous `unwrap()` does not sneak in!

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2340).
* #2345
* #2329
* #2343
* __->__ #2340
* #2338
2025-08-15 09:14:44 -07:00
Michael Bolin
26c8373821 fix: tighten up checks against writable folders for SandboxPolicy (#2338)
I was looking at the implementation of `Session::get_writable_roots()`,
which did not seem right, as it was a copy of writable roots, which is
not guaranteed to be in sync with the `sandbox_policy` field.

I looked at who was calling `get_writable_roots()` and its only call
site was `apply_patch()` in `codex-rs/core/src/apply_patch.rs`, which
took the roots and forwarded them to `assess_patch_safety()` in
`safety.rs`. I updated `assess_patch_safety()` to take `sandbox_policy:
&SandboxPolicy` instead of `writable_roots: &[PathBuf]` (and replaced
`Session::get_writable_roots()` with `Session::get_sandbox_policy()`).

Within `safety.rs`, it was fairly easy to update
`is_write_patch_constrained_to_writable_paths()` to work with
`SandboxPolicy`, and in particular, it is far more accurate because, for
better or worse, `SandboxPolicy::get_writable_roots_with_cwd()` _returns
an empty vec_ for `SandboxPolicy::DangerFullAccess`, suggesting that
_nothing_ is writable when in reality _everything_ is writable. With
this PR, `is_write_patch_constrained_to_writable_paths()` now does the
right thing for each variant of `SandboxPolicy`.

I thought this would be the end of the story, but it turned out that
`test_writable_roots_constraint()` in `safety.rs` needed to be updated,
as well. In particular, the test was writing to
`std::env::current_dir()` instead of a `TempDir`, which I suspect was a
holdover from earlier when `SandboxPolicy::WorkspaceWrite` would always
make `TMPDIR` writable on macOS, which made it hard to write tests to
verify `SandboxPolicy` in `TMPDIR`. Fortunately, we now have
`exclude_tmpdir_env_var` as an option on
`SandboxPolicy::WorkspaceWrite`, so I was able to update the test to
preserve the existing behavior, but to no longer write to
`std::env::current_dir()`.







---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2338).
* #2345
* #2329
* #2343
* #2340
* __->__ #2338
2025-08-15 09:06:15 -07:00
Dylan
6df8e35314 [tools] Add apply_patch tool (#2303)
## Summary
We've been seeing a number of issues and reports with our synthetic
`apply_patch` tool, e.g. #802. Let's make this a real tool - in my
anecdotal testing, it's critical for GPT-OSS models, but I'd like to
make it the standard across GPT-5 and codex models as well.

## Testing
- [x] Tested locally
- [x] Integration test
2025-08-15 11:55:53 -04:00
Jeremy Rose
917e29803b tui: include optional full command line in history display (#2334)
Add env var to show the raw, unparsed command line under parsed
commands. When we have transcript mode we should show the full command
there, but this is useful for debugging.
2025-08-14 22:06:42 -07:00
pakrym-oai
5552688621 Format multiline commands (#2333)
<img width="966" height="729" alt="image"
src="https://github.com/user-attachments/assets/fa45b7e1-cd46-427f-b2bc-8501e9e4760b"
/>
<img width="797" height="530" alt="image"
src="https://github.com/user-attachments/assets/6993eec5-e157-4df7-b558-15643ad10d64"
/>
2025-08-14 19:49:42 -07:00
pakrym-oai
76df07350a Cleanup rust login server a bit more (#2331)
Remove some extra abstractions.

---------

Co-authored-by: easong-openai <easong@openai.com>
2025-08-14 19:42:14 -07:00
easong-openai
d0b907d399 re-implement session id in status (#2332)
Basically the same thing as https://github.com/openai/codex/pull/2297
2025-08-15 02:14:46 +00:00
Parker Thompson
a075424437 Added allow-expect-in-tests / allow-unwrap-in-tests (#2328)
This PR:
* Added the clippy.toml to configure allowable expect / unwrap usage in
tests
* Removed as many expect/allow lines as possible from tests
* moved a bunch of allows to expects where possible

Note: in integration tests, non `#[test]` helper functions are not
covered by this so we had to leave a few lingering `expect(expect_used`
checks around
2025-08-14 17:59:01 -07:00
Jeremy Rose
8bdb4521c9 AGENTS.md more strongly suggests running targeted tests first (#2306) 2025-08-15 00:51:32 +00:00
Michael Bolin
dd63d61a59 fix: trying to simplify rust-ci.yml (#2327)
It turns out that https://github.com/openai/codex/pull/2324 did not
quite work as intended. Chat's new idea is to have this catch-all "CI
results" job and update our branch protection rules to require this
instead.
2025-08-14 17:44:10 -07:00
Parker Thompson
c26d42ab69 Fix AF_UNIX, sockpair, recvfrom in linux sandbox (#2309)
When using codex-tui on a linux system I was unable to run `cargo
clippy` inside of codex due to:
```
[pid 3548377] socketpair(AF_UNIX, SOCK_SEQPACKET|SOCK_CLOEXEC, 0,  <unfinished ...>
[pid 3548370] close(8 <unfinished ...>
[pid 3548377] <... socketpair resumed>0x7ffb97f4ed60) = -1 EPERM (Operation not permitted)
```
And
```
3611300 <... recvfrom resumed>0x708b8b5cffe0, 8, 0, NULL, NULL) = -1 EPERM (Operation not permitted)
```

This PR:
* Fixes a bug that disallowed AF_UNIX to allow it on `socket()`
* Adds recvfrom() to the syscall allow list, this should be fine since
we disable opening new sockets. But we should validate there is not a
open socket inheritance issue.
* Allow socketpair to be called for AF_UNIX
* Adds tests for AF_UNIX components
* All of which allows running `cargo clippy` within the sandbox on
linux, and possibly other tooling using a fork server model + AF_UNIX
comms.
2025-08-14 17:12:41 -07:00
easong-openai
e9b597cfa3 Port login server to rust (#2294)
Port the login server to rust.

---------

Co-authored-by: pakrym-oai <pakrym@openai.com>
2025-08-14 17:11:26 -07:00
Jeremy Rose
afc377bae5 clear running commands in various places (#2325)
we have a very unclear lifecycle for the chatwidget—this should only
have to be added in one place! but this fixes the "hanging commands"
issue where the active_exec_cell wasn't correctly cleared when commands
finished.

To repro w/o this PR:
1. prompt "run sleep 10"
2. once the command starts running, press <kbd>Esc</kbd>
3. prompt "run echo hi"

Expected: 

```
✓ Completed
  └ ⌨️ echo hi

codex
hi
```

Actual:

```
⚙︎ Working
  └ ⌨️ echo hi

▌ Ask Codex to do anything
```

i.e. the "Working" never changes to "Completed".

The bug is fixed with this PR.
2025-08-15 00:01:19 +00:00
Michael Bolin
333803ed04 fix: ensure rust-ci always "runs" when a PR is submitted (#2324)
Our existing path filters for `rust-ci.yml`:


235987843c/.github/workflows/rust-ci.yml (L1-L11)

made it so that PRs that touch only `README.md` would not trigger those
builds, which is a problem because our branch protection rules are set
as follows:

<img width="1569" height="1883" alt="Screenshot 2025-08-14 at 4 45
59 PM"
src="https://github.com/user-attachments/assets/5a61f8cc-cdaf-4341-abda-7faa7b46dbd4"
/>

With the existing setup, a change to `README.md` would get stuck in
limbo because not all the CI jobs required to merge would get run. It
turns out that we need to "run" all the jobs, but make them no-ops when
the `codex-rs` and `.github` folders are untouched to get the best of
both worlds.

I asked chat how to fix this, as we want CI to be fast for
documentation-only changes. It had two suggestions:

- Use https://github.com/dorny/paths-filter or some other third-party
action.
- Write an inline Bash script to avoid a third-party dependency.

This PR takes the latter approach so that we are clear about what we're
running in CI.
2025-08-14 17:00:19 -07:00
Jeremy Rose
235987843c add a timer to running exec commands (#2321)
sometimes i switch back to codex and i don't know how long a command has
been running.

<img width="744" height="462" alt="Screenshot 2025-08-14 at 3 30 07 PM"
src="https://github.com/user-attachments/assets/bd80947f-5a47-43e6-ad19-69c2995a2a29"
/>
2025-08-14 19:32:45 -04:00
Michael Bolin
6a0f709cff fix: add call_id to ApprovalParams in mcp-server/src/wire_format.rs (#2322)
Clients still need this field.
2025-08-14 16:09:12 -07:00
Michael Bolin
2ecca79663 fix: run python_multiprocessing_lock_works integration test on Mac and Linux (#2318)
The high-order bit on this PR is that it makes it so `sandbox.rs` tests
both Mac and Linux, as we introduce a general
`spawn_command_under_sandbox()` function with platform-specific
implementations for testing.

An important, and interesting, discovery in porting the test to Linux is
that (for reasons cited in the code comments), `/dev/shm` has to be
added to `writable_roots` on Linux in order for `multiprocessing.Lock`
to work there. Granting write access to `/dev/shm` comes with some
degree of risk, so we do not make this the default for Codex CLI.

Piggybacking on top of #2317, this moves the
`python_multiprocessing_lock_works` test yet again, moving
`codex-rs/core/tests/sandbox.rs` to `codex-rs/exec/tests/sandbox.rs`
because in `codex-rs/exec/tests` we can use `cargo_bin()` like so:

```
let codex_linux_sandbox_exe = assert_cmd::cargo::cargo_bin("codex-exec");
```

which is necessary so we can use `codex_linux_sandbox_exe` and therefore
`spawn_command_under_linux_sandbox` in an integration test.

This also moves `spawn_command_under_linux_sandbox()` out of `exec.rs`
and into `landlock.rs`, which makes things more consistent with
`seatbelt.rs` in `codex-core`.

For reference, https://github.com/openai/codex/pull/1808 is the PR that
made the change to Seatbelt to get this test to pass on Mac.
2025-08-14 15:47:48 -07:00
Michael Bolin
a8c7f5391c fix: move general sandbox tests to codex-rs/core/tests/sandbox.rs (#2317)
Previous to this PR, `codex-rs/core/tests/sandbox.rs` contained
integration tests that were specific to Seatbelt. This PR moves those
tests to `codex-rs/core/src/seatbelt.rs` and designates
`codex-rs/core/tests/sandbox.rs` to be used as the home for
cross-platform (well, Mac and Linux...) sandbox tests.

To start, this migrates
`python_multiprocessing_lock_works_under_seatbelt()` from #1823 to the
new `sandbox.rs` because this is the type of thing that should work on
both Mac _and_ Linux, though I still need to do some work to clean up
the test so it works on both platforms.
2025-08-14 14:48:38 -07:00
David Z Hao
992e81d9b5 test(core): add seatbelt sem lock tests (#1823)
## Summary
- add a unit test to ensure the macOS seatbelt policy allows POSIX
semaphores
- add a macOS-only test that runs a Python multiprocessing Lock under
Seatbelt

## Testing
- `cargo test -p codex_core seatbelt_base_policy_allows_ipc_posix_sem
--no-fail-fast` (failed: failed to download from
`https://static.crates.io/crates/tokio-stream/0.1.17/download`)
- `cargo test -p codex_core seatbelt_base_policy_allows_ipc_posix_sem
--no-fail-fast --offline` (failed: attempting to make an HTTP request,
but --offline was specified)
- `cargo test --all-features --no-fail-fast --offline` (failed:
attempting to make an HTTP request, but --offline was specified)
- `just fmt` (failed: command not found: just)
- `just fix` (failed: command not found: just)

Ran tests locally to confirm it passes on master and failed before my
previous change

------
https://chatgpt.com/codex/tasks/task_i_6890f221e0a4833381cfb53e11499bcc
2025-08-14 14:23:06 -07:00
Jeremy Rose
7038827bf4 fix bash commands being incorrectly quoted in display (#2313)
The "display format" of commands was sometimes producing incorrect
quoting like `echo foo '>' bar`, which is importantly different from the
actual command that was being run. This refactors ParsedCommand to have
a string in `cmd` instead of a vec, as a `vec` can't accurately capture
a full command.
2025-08-14 17:08:29 -04:00