Commit Graph

215 Commits

Author SHA1 Message Date
Michael Bolin
5907422d65 feat: annotate conversations with model_provider for filtering (#5658)
Because conversations that use the Responses API can have encrypted
reasoning messages, trying to resume a conversation with a different
provider could lead to confusing "failed to decrypt" errors. (This is
reproducible by starting a conversation using ChatGPT login and resuming
it as a conversation that uses OpenAI models via Azure.)

This changes `ListConversationsParams` to take a `model_providers:
Option<Vec<String>>` and adds `model_provider` on each
`ConversationSummary` it returns so these cases can be disambiguated.

Note this ended up making changes to
`codex-rs/core/src/rollout/tests.rs` because it had a number of cases
where it expected `Some` for the value of `next_cursor`, but the list of
rollouts was complete, so according to this docstring:


bcd64c7e72/codex-rs/app-server-protocol/src/protocol.rs (L334-L337)

If there are no more items to return, then `next_cursor` should be
`None`. This PR updates that logic.
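
The filtering and cursor behavior described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the `Page` type, the `list_conversations` function, and the integer cursor are hypothetical, and `ConversationSummary` is reduced to two fields.

```rust
/// Hypothetical, simplified stand-in for the real `ConversationSummary`.
#[derive(Debug, Clone, PartialEq)]
pub struct ConversationSummary {
    pub id: String,
    pub model_provider: String,
}

/// Hypothetical page type; only the `next_cursor` name comes from the commit message.
#[derive(Debug, PartialEq)]
pub struct Page {
    pub items: Vec<ConversationSummary>,
    pub next_cursor: Option<usize>,
}

/// Returns one page of conversations, optionally restricted to the given providers.
/// `model_providers == None` means "no filter".
pub fn list_conversations(
    all: &[ConversationSummary],
    model_providers: Option<&[String]>,
    cursor: usize,
    page_size: usize,
) -> Page {
    let matching: Vec<ConversationSummary> = all
        .iter()
        .filter(|c| match model_providers {
            Some(allowed) => allowed.iter().any(|p| p == &c.model_provider),
            None => true,
        })
        .cloned()
        .collect();

    let end = (cursor + page_size).min(matching.len());
    let start = cursor.min(end);
    // Only hand back a cursor when there is something left to fetch.
    let next_cursor = if end < matching.len() { Some(end) } else { None };
    Page {
        items: matching[start..end].to_vec(),
        next_cursor,
    }
}
```

With this shape, a client that wants to resume only ChatGPT-login conversations would pass that provider's name in `model_providers` and stop paging as soon as `next_cursor` is `None`.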






2025-10-27 02:03:30 -07:00
Ahmed Ibrahim
f178805252 Add feedback upload request handling (#5682) 2025-10-27 05:53:39 +00:00
jif-oai
80783a7bb9 fix: flaky tests (#5625) 2025-10-24 13:56:41 +01:00
jif-oai
a6b9471548 feat: end events on unified exec (#5551) 2025-10-23 18:51:34 +01:00
Thibault Sottiaux
3059373e06 fix: resume lookup for gitignored CODEX_HOME (#5311)
Walk the sessions tree instead of using file_search so gitignored
CODEX_HOME directories can resume sessions. Add a regression test that
covers a .gitignore'd sessions directory.
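
A plain directory walk of the kind described here could look like the sketch below. The function name and the "file name contains the session id" matching rule are assumptions made for illustration; the actual lookup in the PR may differ.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively searches `root` for a file whose name contains `session_id`.
/// A plain walk visits every entry, so `.gitignore` rules have no effect on it.
pub fn find_session_file(root: &Path, session_id: &str) -> io::Result<Option<PathBuf>> {
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        let path = entry.path();
        if entry.file_type()?.is_dir() {
            // Descend into subdirectories before giving up on this branch.
            if let Some(found) = find_session_file(&path, session_id)? {
                return Ok(Some(found));
            }
        } else if path
            .file_name()
            .and_then(|name| name.to_str())
            .map_or(false, |name| name.contains(session_id))
        {
            return Ok(Some(path));
        }
    }
    Ok(None)
}
```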

Fixes #5247
Fixes #5412

---------

Co-authored-by: Owen Lin <owen@openai.com>
2025-10-23 17:04:40 +00:00
jif-oai
6745b12427 chore: testing on apply_path (#5557) 2025-10-23 17:00:48 +01:00
Ahmed Ibrahim
f59978ed3d Handle cancelling/aborting while processing a turn (#5543)
Currently we collect all turn items in a vector, then add them to the
history on success. This results in losing those items on errors,
including aborting with `ctrl+c`.

This PR:
- Adds the ability for the tool call to handle cancellation
- Bubbles the turn items up to where we are recording this info

Admittedly, this logic is ad hoc and doesn't handle many error edge
cases. The right thing to do is to record to the history on the spot as
`items`/`tool calls output` arrive. However, this isn't possible because
different `task_kind`s have different `conversation_histories`, and
`try_run_turn` has no idea which thread we are using. We also cannot
pass an `arc` to the `conversation_histories` because it's a private
element of `state`.

That said, `abort` is the most common case and we should cover it
until we remove `task kind`.
2025-10-23 08:47:10 -07:00
Ahmed Ibrahim
273819aaae Move changing turn input functionalities to ConversationHistory (#5473)
We are doing some ad-hoc logic while dealing with conversation history.
Ideally, we shouldn't mutate `vec[responseitem]` manually at all and
should depend on `ConversationHistory` for those changes.

Those changes are:
- Adding input to the history
- Removing items from the history
- Correcting history

I am also adding some `error` logs for cases we ideally shouldn't face.
For example, we shouldn't be missing `toolcalls` or `outputs`, and we
shouldn't hit `ContextWindowExceeded` while performing `compact`.

This refactor will give us granular control over our context management.
2025-10-22 13:08:46 -07:00
pakrym-oai
3c90728a29 Add new thread items and rewire event parsing to use them (#5418)
1. Adds AgentMessage, Reasoning, WebSearch items.
2. Switches the ResponseItem parsing to use new items and then also emit
3. Removes user-item kind and filters out "special" (environment) user
items when returning to clients.
2025-10-22 10:14:50 -07:00
jif-oai
00b1e130b3 chore: align unified_exec (#5442)
Align `unified_exec` with b implementation
2025-10-22 11:50:18 +01:00
Michael Bolin
404cae7d40 feat: add experimental_bearer_token option to model provider definition (#5467)
While we do not want to encourage users to hardcode secrets in their
`config.toml` file, it should be possible to pass an API key
programmatically. For example, when using `codex app-server`, it is
possible to pass a "bag of configuration" as part of the
`NewConversationParams`:

682d05512f/codex-rs/app-server-protocol/src/protocol.rs (L248-L251)

When using `codex app-server`, it's not practical to change env vars of
the `codex app-server` process on the fly (which is how we usually read
API key values), so this helps with that.
2025-10-21 14:02:56 -07:00
pakrym-oai
5cd8803998 Add a baseline test for resume initial messages (#5466) 2025-10-21 11:45:01 -07:00
jif-oai
4bd68e4d9e feat: emit events for unified_exec (#5448) 2025-10-21 17:32:39 +01:00
pakrym-oai
1b10a3a1b2 Enable plan tool by default (#5384)
## Summary
- make the plan tool available by default by removing the feature flag
and always registering the handler
- drop plan-tool CLI and API toggles across the exec, TUI, MCP server,
and app server code paths
- update tests and configs to reflect the always-on plan tool and guard
workspace restriction tests against env leakage

## Testing
Manually tested the extension. 
------
https://chatgpt.com/codex/tasks/task_i_68f67a3ff2d083209562a773f814c1f9
2025-10-21 16:25:05 +00:00
Gabriel Peal
740b4a95f4 [MCP] Add configuration options to enable or disable specific tools (#5367)
Some MCP servers expose a lot of tools. In those cases, it is reasonable
to allow/denylist tools for Codex to use so it doesn't get overwhelmed
with too many tools.

The new configuration options available in the `mcp_server` toml table
are:
* `enabled_tools`
* `disabled_tools`
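
One way the two options could combine is sketched below. The `ToolFilter` type is hypothetical, and the precedence is an assumption: an absent allowlist permits everything, and the denylist is applied afterwards.

```rust
/// Hypothetical filter mirroring the `enabled_tools` / `disabled_tools` options.
pub struct ToolFilter {
    /// `None` means no allowlist was configured, so every tool passes this step.
    pub enabled_tools: Option<Vec<String>>,
    /// Tools listed here are always rejected (assumed to win over the allowlist).
    pub disabled_tools: Vec<String>,
}

impl ToolFilter {
    /// Returns true when `tool` should be exposed to the model.
    pub fn allows(&self, tool: &str) -> bool {
        let allowed = match &self.enabled_tools {
            Some(list) => list.iter().any(|t| t == tool),
            None => true,
        };
        allowed && !self.disabled_tools.iter().any(|t| t == tool)
    }
}
```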

Fixes #4796
2025-10-20 15:35:36 -07:00
pakrym-oai
9c903c4716 Add ItemStarted/ItemCompleted events for UserInputItem (#5306)
Adds a new ItemStarted event and delivers UserMessage as the first item
type (more to come).


Renames `InputItem` to `UserInput` considering we're using the `Item`
suffix for actual items.
2025-10-20 13:34:44 -07:00
jif-oai
5e4f3bbb0b chore: rework tools execution workflow (#5278)
Re-work the tool execution flow. Read `orchestrator.rs` to understand
the structure
2025-10-20 20:57:37 +01:00
Owen Lin
c84fc83222 Use int timestamps for rate limit reset_at (#5383)
The backend will be returning unix timestamps (seconds since epoch)
instead of RFC 3339 strings. This will make it more ergonomic for
developers to integrate against - no string parsing.
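
As a rough illustration of why integer timestamps are easier to consume, the sketch below turns a `reset_at` value into a `SystemTime` using only the standard library. The `u64` type and both helper names are assumptions; the actual field type is not stated in this commit message.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Converts a unix timestamp (seconds since epoch) into a `SystemTime`.
pub fn reset_time(reset_at: u64) -> SystemTime {
    UNIX_EPOCH + Duration::from_secs(reset_at)
}

/// Seconds remaining until the limit resets, or 0 if the reset is already past.
pub fn seconds_until_reset(reset_at: u64, now: SystemTime) -> u64 {
    reset_time(reset_at)
        .duration_since(now)
        .map(|d| d.as_secs())
        .unwrap_or(0)
}
```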
2025-10-20 12:26:46 -07:00
Ahmed Ibrahim
049a61bcfc Auto compact at ~90% (#5292)
Users currently hit a context-window-exceeded limit and usually don't
know what to do. This starts auto compact at ~90% of the window.
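
The threshold check can be sketched as below. The function name and parameters are hypothetical, and the exact cutoff used in the PR may differ from a strict 90%.

```rust
/// Returns true once token usage reaches 90% of the context window.
pub fn should_auto_compact(tokens_used: u64, context_window: u64) -> bool {
    // Integer math avoids float rounding: used / window >= 9 / 10.
    context_window > 0 && tokens_used * 10 >= context_window * 9
}
```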
2025-10-20 11:29:49 -07:00
pakrym-oai
540abfa05e Expand approvals integration coverage (#5358)
Improve approval coverage
2025-10-20 17:11:43 +01:00
Gabriel Peal
0170860ef2 [MCP] Prefix MCP tools names with mcp__ (#5309)
This should make it more clear that specific tools come from MCP
servers.

#4806 requested that we add the server name but we already do that.

Fixes #4806
2025-10-19 20:41:55 -04:00
pakrym-oai
2287d2afde Create independent TurnContexts (#5308)
The goal of this change:
1. Unify user input and user turn implementation.
2. Have a single place where turn/session setting overrides are applied.
3. Have a single place where turn context is created.
4. Create TurnContext only for actual turn and have a separate structure
for current session settings (reuse ConfigureSession)
2025-10-18 17:43:08 -07:00
Thibault Sottiaux
0e08dd6055 fix: switch rate limit reset handling to timestamps (#5304)
This change ensures that we store the absolute time, instead of relative
offsets, of when the primary and secondary rate limits will reset.
Previously these were recalculated relative to the current time, which
caused the displayed reset times to change over time, including after
doing a codex resume.

For previously changed sessions, this will cause the reset times to not
show due to this being a breaking change:
<img width="524" height="55" alt="Screenshot 2025-10-17 at 5 14 18 PM"
src="https://github.com/user-attachments/assets/53ebd43e-da25-4fef-9c47-94a529d40265"
/>

Fixes https://github.com/openai/codex/issues/4761
2025-10-17 17:39:37 -07:00
Gabriel Peal
40fba1bb4c [MCP] Add support for resources (#5239)
This PR adds support for [MCP
resources](https://modelcontextprotocol.io/specification/2025-06-18/server/resources)
by adding three new tools for the model:
1. `list_resources`
2. `list_resource_templates`
3. `read_resource`

These 3 tools correspond to the [three primary MCP resource protocol
messages](https://modelcontextprotocol.io/specification/2025-06-18/server/resources#protocol-messages).

Example of listing and reading a GitHub resource template
<img width="2984" height="804" alt="CleanShot 2025-10-15 at 17 31 10"
src="https://github.com/user-attachments/assets/89b7f215-2e2a-41c5-90dd-b932ac84a585"
/>

`/mcp` with Figma configured
<img width="2984" height="442" alt="CleanShot 2025-10-15 at 18 29 35"
src="https://github.com/user-attachments/assets/a7578080-2ed2-4c59-b9b4-d8461f90d8ee"
/>

Fixes #4956
2025-10-17 01:05:15 -04:00
Gabriel Peal
bdda762deb [MCP] Allow specifying cwd and additional env vars (#5246)
This makes stdio mcp servers more flexible by allowing users to specify
the cwd to run the server command from and adding additional environment
variables to be passed through to the server.

Example config using the test server in this repo:
```toml
[mcp_servers.test_stdio]
cwd = "/Users/<user>/code/codex/codex-rs"
command = "cargo"
args = ["run", "--bin", "test_stdio_server"]
env_vars = ["MCP_TEST_VALUE"]
```

@bolinfest I know you hate these env var tests but let's roll with this
for now. I may take a stab at the env guard + serial macro at some
point.
2025-10-17 00:24:43 -04:00
Gabriel Peal
a5d48a775b [MCP] Allow specifying custom headers with streamable http servers (#5241)
This adds two new config fields to streamable http mcp servers:
* `http_headers`: a map of key to value
* `env_http_headers`: a map of key to env var which will be resolved at
request time

All headers will be passed to all MCP requests to that server just like
authorization headers.
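
Resolving the two maps at request time might look like the following sketch. The function name is hypothetical, and two behaviors are assumptions: unset env vars are skipped, and an env-derived header overrides a static one with the same name.

```rust
use std::collections::HashMap;

/// Merges static headers with headers whose values come from env vars.
/// `lookup_env` is injected so the logic can be exercised without touching
/// the process environment; in practice it could be `|name| std::env::var(name).ok()`.
pub fn resolve_headers(
    http_headers: &HashMap<String, String>,
    env_http_headers: &HashMap<String, String>,
    lookup_env: impl Fn(&str) -> Option<String>,
) -> HashMap<String, String> {
    let mut resolved = http_headers.clone();
    for (header, var) in env_http_headers {
        // Assumption: a missing env var drops the header instead of failing.
        if let Some(value) = lookup_env(var.as_str()) {
            resolved.insert(header.clone(), value);
        }
    }
    resolved
}
```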

There is a test ensuring that headers are not passed to other servers.

Fixes #5180
2025-10-16 23:15:47 -04:00
Anton Panasenko
c146585cdb [codex][otel] propagate user email in otel events (#5223)
Include the user email in otel events for proper user-level attribution
in workspace setups.
2025-10-15 17:53:33 -07:00
jif-oai
8fed0b53c4 test: reduce time dependency on test harness (#5053)
Tightened the CLI integration tests to stop relying on wall-clock
sleeps: a new fs watcher helper waits for session files instead of
timing out, and SSE mocks/fixtures make the flows deterministic.
2025-10-15 09:56:59 +01:00
Dylan
00debb6399 fix(core) use regex for all shell_serialization tests (#5189)
## Summary
Thought I switched all of these to using a regex instead, but missed 2.
This should address our [flaky test
problem](https://github.com/openai/codex/actions/runs/18511206616/job/52752341520?pr=5185).

## Test Plan
- [x] Only updated unit tests
2025-10-14 16:29:02 -07:00
Dylan
0a0a10d8b3 fix: apply_patch shell_serialization tests (#4786)
## Summary
Adds additional shell_serialization tests specifically for apply_patch
and other cases.

## Test Plan
- [x] These are all tests
2025-10-14 13:00:49 -07:00
jif-oai
f7b4e29609 feat: feature flag (#4948)
Add a proper feature flag mechanism instead of having custom flags for
everything. This is only for experimental/WIP parts of the code.
It can be used through the CLI:
```bash
codex --enable unified_exec --disable view_image_tool
```

Or in the `config.toml`
```toml
# Global toggles applied to every profile unless overridden.
[features]
apply_patch_freeform = true
view_image_tool = false
```

Follow-up:
In a following PR, the goal is to have default `bundles` of features
that we can associate with a model.
2025-10-14 17:50:00 +00:00
jif-oai
268a10f917 feat: add header for task kind (#5142)
Add a header to the Responses API request carrying the task kind
(compact, review, ...) for observability purposes.
The header name is `codex-task-type`.
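
A small sketch of attaching such a header follows. Only the header name and the `compact` / `review` kinds come from the message above; the `TaskKind` enum shape and the lowercase wire values are assumptions, and the other task kinds are left out because they are not listed.

```rust
/// Header name taken from the commit message.
pub const TASK_TYPE_HEADER: &str = "codex-task-type";

/// Hypothetical subset of task kinds; the commit message names only these two.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TaskKind {
    Compact,
    Review,
}

impl TaskKind {
    /// Assumed wire format: the lowercase variant name.
    pub fn header_value(self) -> &'static str {
        match self {
            TaskKind::Compact => "compact",
            TaskKind::Review => "review",
        }
    }
}

/// Builds the (name, value) pair to add to an outgoing request.
pub fn task_header(kind: TaskKind) -> (&'static str, &'static str) {
    (TASK_TYPE_HEADER, kind.header_value())
}
```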
2025-10-14 15:17:00 +00:00
jif-oai
961ed31901 feat: make shortcut works even with capslock (#5049)
Shortcuts were not working with caps lock on. This fixes it.
2025-10-10 14:35:28 +00:00
jif-oai
85e7357973 fix: workflow cache (#5050)
Decouple cache saving to fix the `verify` steps never being run due to a
cache saving issue
2025-10-10 15:57:47 +02:00
jif-oai
3ddd4d47d0 fix: lagged output in unified_exec buffer (#4992)
Handle `Lagged` error when the broadcast buffer of the unified_exec is
full
2025-10-09 16:06:07 +00:00
jif-oai
ca6a0358de bug: sandbox denied error logs (#4874)
Check STDOUT / STDERR or the aggregated output for some logs when the
sandbox is denied
2025-10-09 16:01:01 +00:00
Gabriel Peal
d3820f4782 [MCP] Add an enabled config field (#4917)
This lets users more easily toggle MCP servers.
2025-10-08 16:24:51 -04:00
jif-oai
687a13bbe5 feat: truncate on compact (#4942)
Truncate the message during compaction if it is just too large.
Do it iteratively, as tokenization is basically free on the server side.
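
An iterative truncation loop of this general kind is sketched below. Everything here is an assumption made for illustration: the halving strategy, keeping the front of the message, and the injected `count_tokens` callback standing in for whatever tokenizer is actually consulted.

```rust
/// Repeatedly shortens `message` until `count_tokens` reports it fits in `max_tokens`.
/// Works on characters so the cut never lands inside a UTF-8 sequence.
pub fn truncate_to_fit(
    message: &str,
    max_tokens: usize,
    count_tokens: impl Fn(&str) -> usize,
) -> String {
    let mut chars: Vec<char> = message.chars().collect();
    loop {
        let candidate: String = chars.iter().collect();
        if chars.is_empty() || count_tokens(&candidate) <= max_tokens {
            return candidate;
        }
        // Assumed strategy: keep the first half and try again.
        let keep = chars.len() / 2;
        chars.truncate(keep);
    }
}
```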
2025-10-08 18:11:08 +01:00
jif-oai
f52320be86 feat: grep_files as a tool (#4820)
Add `grep_files` to be able to perform more actions in parallel
2025-10-08 11:02:50 +01:00
Gabriel Peal
a43ae86b6c [MCP] Add support for streamable http servers with codex mcp add and replace bearer token handling (#4904)
1. You can now add streamable http servers via the CLI
2. As part of this, I'm also replacing the existing plain-text
`bearer_token` config field with an env var

```
codex mcp add github --url https://api.githubcopilot.com/mcp/ --bearer-token-env-var=GITHUB_PAT
```
2025-10-07 23:21:37 -04:00
dedrisian-oai
b016a3e7d8 Remove instruction hack for /review (#4896)
We used to put the review prompt in the first user message as well to
bypass statsig overrides, but that has been resolved and instructions
are now respected, so we were duplicating the review instructions.
2025-10-07 12:47:00 -07:00
jif-oai
226215f36d feat: list_dir tool (#4817)
Add a `list_dir` tool. It is useful because we can mark it as
non-mutating and so use it in parallel
2025-10-07 19:33:19 +01:00
jif-oai
338c2c873c bug: fix flaky test (#4878)
Fix flaky test by warming up the tools
2025-10-07 19:32:49 +01:00
pakrym-oai
35a770e871 Simplify request body assertions (#4845)
We'll have a lot more tests like these
2025-10-07 09:56:39 +01:00
pakrym-oai
a90a58f7a1 Trim double Total output lines (#4787) 2025-10-05 16:41:55 -07:00
pakrym-oai
b2d81a7cac Make output assertions more explicit (#4784)
Match using precise regexes.
2025-10-05 16:01:38 -07:00
pakrym-oai
191d620707 Use response helpers when mounting SSE test responses (#4783)
## Summary
- replace manual wiremock SSE mounts in the compact suite with the
shared response helpers
- simplify the exec auth_env integration test by using the
mount_sse_once_match helper
- rely on mount_sse_sequence plus server request collection to replace
the bespoke SeqResponder utility in tests

## Testing
- just fmt

------
https://chatgpt.com/codex/tasks/task_i_68e2e238f2a88320a337f0b9e4098093
2025-10-05 21:58:16 +00:00
pakrym-oai
5c42419b02 Use assert_matches (#4756)
assert_matches is soon to be in std but is experimental for now.
2025-10-05 21:12:31 +00:00
pakrym-oai
aecbe0f333 Add helper for response created SSE events in tests (#4758)
## Summary
- add a reusable `ev_response_created` helper that builds
`response.created` SSE events for integration tests
- update the exec and core integration suites to use the new helper
instead of repeating manual JSON literals
- keep the streaming fixtures consistent by relying on the shared helper
in every touched test

## Testing
- `just fmt`


------
https://chatgpt.com/codex/tasks/task_i_68e1fe885bb883208aafffb94218da61
2025-10-05 21:11:43 +00:00
jif-oai
dc3c6bf62a feat: parallel tool calls (#4663)
Add parallel tool calls. This is configurable at the model level and
the tool level
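
The two-level gating can be illustrated with scoped threads from the standard library. This is a sketch under stated assumptions, not the PR's implementation: `ToolCall`, `supports_parallel`, and `run_tool_calls` are hypothetical names, and the fallback to sequential execution when any call opts out is an assumed policy.

```rust
use std::thread;

/// Hypothetical tool call carrying its own per-tool parallelism flag.
pub struct ToolCall {
    pub name: String,
    pub supports_parallel: bool,
}

/// Runs calls in parallel only when the model and every tool allow it;
/// otherwise runs them one by one. Results keep the input order either way.
pub fn run_tool_calls(
    calls: &[ToolCall],
    model_supports_parallel: bool,
    run: impl Fn(&ToolCall) -> String + Sync,
) -> Vec<String> {
    let parallel = model_supports_parallel && calls.iter().all(|c| c.supports_parallel);
    if !parallel {
        return calls.iter().map(|c| run(c)).collect();
    }
    thread::scope(|scope| {
        let run = &run;
        // Spawn one scoped thread per call, then join them in input order.
        let handles: Vec<_> = calls
            .iter()
            .map(|call| scope.spawn(move || run(call)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("tool call panicked"))
            .collect()
    })
}
```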
2025-10-05 16:10:49 +00:00