Commit Graph

242 Commits

Author SHA1 Message Date
Anton Panasenko
9572cfc782 [codex] add developer instructions (#5897)
we are using developer instructions for code reviews, we need to pass
them in cli as well.
2025-10-30 11:18:31 -07:00
jif-oai
f4f9695978 feat: compaction prompt configurable (#5959)
```
 codex -c compact_prompt="Summarize in bullet points"
 ```
2025-10-30 14:24:24 +00:00
jif-oai
aa76003e28 chore: unify config crates (#5958) 2025-10-30 10:28:32 +00:00
pakrym-oai
3429e82e45 Add item streaming events (#5546)
Adds AgentMessageContentDelta, ReasoningContentDelta,
ReasoningRawContentDelta item streaming events while maintaining
compatibility for old events.

---------

Co-authored-by: Owen Lin <owen@openai.com>
2025-10-29 22:33:57 +00:00
Ahmed Ibrahim
13e1d0362d Delegate review to codex instance (#5572)
In this PR, I am exploring migrating task kind to an invocation of
Codex. The main reason would be getting rid off multiple
`ConversationHistory` state and streamlining our context/history
management.

This approach depends on opening a channel between the sub-codex and
codex. This channel is responsible for forwarding `interactive`
(`approvals`) and `non-interactive` events. The `task` is responsible
for handling those events.

This opens the door for implementing `codex as a tool`, replacing
`compact` and `review`, and potentially subagents.

One consideration is this code is very similar to `app-server` specially
in the approval part. If in the future we wanted an interactive
`sub-codex` we should consider using `codex-mcp`
2025-10-29 21:04:25 +00:00
Rasmus Rygaard
39e09c289d Add a wrapper around raw response items (#5923)
We currently have nested enums when sending raw response items in the
app-server protocol. This makes downstream schemas confusing because we
need to embed `type`-discriminated enums within each other.

This PR adds a small wrapper around the response item so we can keep the
schemas separate
2025-10-29 20:32:40 +00:00
jif-oai
060637b4d4 feat: deprecation warning (#5825)
<img width="955" height="311" alt="Screenshot 2025-10-28 at 14 26 25"
src="https://github.com/user-attachments/assets/99729b3d-3bc9-4503-aab3-8dc919220ab4"
/>
2025-10-29 12:29:28 +00:00
Abhishek Bhardwaj
89591e4246 feature: Add "!cmd" user shell execution (#2471)
feature: Add "!cmd" user shell execution

This change lets users run local shell commands directly from the TUI by
prefixing their input with ! (e.g. !ls). Output is truncated to keep the
exec cell usable, and Ctrl-C cleanly
  interrupts long-running commands (e.g. !sleep 10000).

**Summary of changes**

- Route Op::RunUserShellCommand through a dedicated UserShellCommandTask
(core/src/tasks/user_shell.rs), keeping the task logic out of codex.rs.
- Reuse the existing tool router: the task constructs a ToolCall for the
local_shell tool and relies on ShellHandler, so no manual MCP tool
lookup is required.
- Emit exec lifecycle events (ExecCommandBegin/ExecCommandEnd) so the
TUI can show command metadata, live output, and exit status.

**End-to-end flow**

  **TUI handling**

1. ChatWidget::submit_user_message (TUI) intercepts messages starting
with !.
2. Non-empty commands dispatch Op::RunUserShellCommand { command };
empty commands surface a help hint.
3. No UserInput items are created, so nothing is enqueued for the model.

  **Core submission loop**
4. The submission loop routes the op to handlers::run_user_shell_command
(core/src/codex.rs).
5. A fresh TurnContext is created and Session::spawn_user_shell_command
enqueues UserShellCommandTask.

  **Task execution**
6. UserShellCommandTask::run emits TaskStartedEvent, formats the
command, and prepares a ToolCall targeting local_shell.
  7. ToolCallRuntime::handle_tool_call dispatches to ShellHandler.

  **Shell tool runtime**
8. ShellHandler::run_exec_like launches the process via the unified exec
runtime, honoring sandbox and shell policies, and emits
ExecCommandBegin/End.
9. Stdout/stderr are captured for the UI, but the task does not turn the
resulting ToolOutput into a model response.

  **Completion**
10. After ExecCommandEnd, the task finishes without an assistant
message; the session marks it complete and the exec cell displays the
final output.

  **Conversation context**

- The command and its output never enter the conversation history or the
model prompt; the flow is local-only.
  - Only exec/task events are emitted for UI rendering.

**Demo video**


https://github.com/user-attachments/assets/fcd114b0-4304-4448-a367-a04c43e0b996
2025-10-29 00:31:20 -07:00
pakrym-oai
1b8f2543ac Filter out reasoning items from previous turns (#5857)
Reduces request size and prevents 400 errors when switching between API
orgs.

Based on Responses API behavior described in
https://cookbook.openai.com/examples/responses_api/reasoning_items#caching
2025-10-28 11:39:34 -07:00
jif-oai
5ba2a17576 chore: decompose submission loop (#5854) 2025-10-28 15:23:46 +00:00
Celia Chen
4a42c4e142 [Auth] Choose which auth storage to use based on config (#5792)
This PR is a follow-up to #5591. It allows users to choose which auth
storage mode they want by using the new
`cli_auth_credentials_store_mode` config.
2025-10-27 19:41:49 -07:00
Gabriel Peal
b0bdc04c30 [MCP] Render MCP tool call result images to the model (#5600)
It's pretty amazing we have gotten here without the ability for the
model to see image content from MCP tool calls.

This PR builds off of 4391 and fixes #4819. I would like @KKcorps to get
adequete credit here but I also want to get this fix in ASAP so I gave
him a week to update it and haven't gotten a response so I'm going to
take it across the finish line.


This test highlights how absured the current situation is. I asked the
model to read this image using the Chrome MCP
<img width="2378" height="674" alt="image"
src="https://github.com/user-attachments/assets/9ef52608-72a2-4423-9f5e-7ae36b2b56e0"
/>

After this change, it correctly outputs:
> Captured the page: image dhows a dark terminal-style UI labeled
`OpenAI Codex (v0.0.0)` with prompt `model: gpt-5-codex medium` and
working directory `/codex/codex-rs`
(and more)  

Before this change, it said:
> Took the full-page screenshot you asked for. It shows a long,
horizontally repeating pattern of stylized people in orange, light-blue,
and mustard clothing, holding hands in alternating poses against a white
background. No text or other graphics-just rows of flat illustration
stretching off to the right.

Without this change, the Figma, Playwright, Chrome, and other visual MCP
servers are pretty much entirely useless.

I tested this change with the openai respones api as well as a third
party completions api
2025-10-27 17:55:57 -04:00
Ahmed Ibrahim
7226365397 Centralize truncation in conversation history (#5652)
move the truncation logic to conversation history to use on any tool
output. This will help us in avoiding edge cases while truncating the
tool calls and mcp calls.
2025-10-27 14:05:35 -07:00
Eric Traut
0c1ff1d3fd Made token refresh code resilient to missing id_token (#5782)
This PR does the following:
1. Changes `try_refresh_token` to handle the case where the endpoint
returns a response without an `id_token`. The OpenID spec indicates that
this field is optional and clients should not assume it's present.
2. Changes the `attempt_stream_responses` to propagate token refresh
errors rather than silently ignoring them.
3. Fixes a typo in a couple of error messages (unrelated to the above,
but something I noticed in passing) - "reconnect" should be spelled
without a hyphen.

This PR does not implement the additional suggestion from @pakrym-oai
that we should sign out when receiving `refresh_token_expired` from the
refresh endpoint. Leaving this as a follow-on because I'm undecided on
whether this should be implemented in `try_refresh_token` or its
callers.
2025-10-27 10:09:53 -07:00
jif-oai
afc4eaab8b feat: TUI undo op (#5629) 2025-10-27 10:55:29 +00:00
jif-oai
e92c4f6561 feat: async ghost commit (#5618) 2025-10-27 10:09:10 +00:00
Ahmed Ibrahim
f178805252 Add feedback upload request handling (#5682) 2025-10-27 05:53:39 +00:00
Anton Panasenko
6af83d86ff [codex][app-server] introduce codex/event/raw_item events (#5578) 2025-10-24 22:41:52 +00:00
Eric Traut
f8af4f5c8d Added model summary and risk assessment for commands that violate sandbox policy (#5536)
This PR adds support for a model-based summary and risk assessment for
commands that violate the sandbox policy and require user approval. This
aids the user in evaluating whether the command should be approved.

The feature works by taking a failed command and passing it back to the
model and asking it to summarize the command, give it a risk level (low,
medium, high) and a risk category (e.g. "data deletion" or "data
exfiltration"). It uses a new conversation thread so the context in the
existing thread doesn't influence the answer. If the call to the model
fails or takes longer than 5 seconds, it falls back to the current
behavior.

For now, this is an experimental feature and is gated by a config key
`experimental_sandbox_command_assessment`.

Here is a screen shot of the approval prompt showing the risk assessment
and summary.

<img width="723" height="282" alt="image"
src="https://github.com/user-attachments/assets/4597dd7c-d5a0-4e9f-9d13-414bd082fd6b"
/>
2025-10-24 15:23:44 -07:00
Gabriel Peal
ed77d2d977 [MCP] Improve startup errors for timeouts and github (#5595)
1. I have seen too many reports of people hitting startup timeout errors
and thinking Codex is broken. Hopefully this will help people
self-serve. We may also want to consider raising the timeout to ~15s.
2. Make it more clear what PAT is (personal access token) in the GitHub
error

<img width="2378" height="674" alt="CleanShot 2025-10-23 at 22 05 06"
src="https://github.com/user-attachments/assets/d148ce1d-ade3-4511-84a4-c164aefdb5c5"
/>
2025-10-24 01:54:45 -04:00
Ahmed Ibrahim
f59978ed3d Handle cancelling/aborting while processing a turn (#5543)
Currently we collect all all turn items in a vector, then we add it to
the history on success. This result in losing those items on errors
including aborting `ctrl+c`.

This PR:
- Adds the ability for the tool call to handle cancellation
- bubble the turn items up to where we are recording this info

Admittedly, this logic is an ad-hoc logic that doesn't handle a lot of
error edge cases. The right thing to do is recording to the history on
the spot as `items`/`tool calls output` come. However, this isn't
possible because of having different `task_kind` that has different
`conversation_histories`. The `try_run_turn` has no idea what thread are
we using. We cannot also pass an `arc` to the `conversation_histories`
because it's a private element of `state`.

That's said, `abort` is the most common case and we should cover it
until we remove `task kind`
2025-10-23 08:47:10 -07:00
jif-oai
8e291a1706 chore: clean handle_container_exec_with_params (#5516)
Drop `handle_container_exec_with_params` to have simpler and more
straight forward execution path
2025-10-23 09:24:01 +01:00
Ahmed Ibrahim
273819aaae Move changing turn input functionalities to ConversationHistory (#5473)
We are doing some ad-hoc logic while dealing with conversation history.
Ideally, we shouldn't mutate `vec[responseitem]` manually at all and
should depend on `ConversationHistory` for those changes.

Those changes are:
- Adding input to the history
- Removing items from the history
- Correcting history

I am also adding some `error` logs for cases we shouldn't ideally face.
For example, we shouldn't be missing `toolcalls` or `outputs`. We
shouldn't hit `ContextWindowExceeded` while performing `compact`

This refactor will give us granular control over our context management.
2025-10-22 13:08:46 -07:00
Gabriel Peal
4cd6b01494 [MCP] Remove the legacy stdio client in favor of rmcp (#5529)
I haven't heard of any issues with the studio rmcp client so let's
remove the legacy one and default to the new one.

Any code changes are moving code from the adapter inline but there
should be no meaningful functionality changes.
2025-10-22 12:06:59 -07:00
pakrym-oai
3c90728a29 Add new thread items and rewire event parsing to use them (#5418)
1. Adds AgentMessage,  Reasoning,  WebSearch items.
2. Switches the ResponseItem parsing to use new items and then also emit
3. Removes user-item kind and filters out "special" (environment) user
items when returning to clients.
2025-10-22 10:14:50 -07:00
pakrym-oai
1b10a3a1b2 Enable plan tool by default (#5384)
## Summary
- make the plan tool available by default by removing the feature flag
and always registering the handler
- drop plan-tool CLI and API toggles across the exec, TUI, MCP server,
and app server code paths
- update tests and configs to reflect the always-on plan tool and guard
workspace restriction tests against env leakage

## Testing
Manually tested the extension. 
------
https://chatgpt.com/codex/tasks/task_i_68f67a3ff2d083209562a773f814c1f9
2025-10-21 16:25:05 +00:00
pakrym-oai
789e65b9d2 Pass TurnContext around instead of sub_id (#5421)
Today `sub_id` is an ID of a single incoming Codex Op submition. We then
associate all events triggered by this operation using the same
`sub_id`.

At the same time we are also creating a TurnContext per submission and
we'd like to start associating some events (item added/item completed)
with an entire turn instead of just the operation that started it.

Using turn context when sending events give us flexibility to change
notification scheme.
2025-10-21 08:04:16 -07:00
Thibault Sottiaux
7fc01c6e9b feat: include cwd in notify payload (#5415)
Expose the session cwd in the notify payload and update docs so scripts
and extensions receive the real project path; users get accurate
project-aware notifications in CLI and VS Code.

Fixes #5387
2025-10-20 23:53:03 +00:00
Gabriel Peal
ef806456e4 [MCP] Dedicated error message for GitHub MCPs missing a personal access token (#5393)
Because the GitHub MCP is one of the most popular MCPs and it
confusingly doesn't support OAuth, we should make it more clear how to
make it work so people don't think Codex is broken.
2025-10-20 16:23:26 -07:00
pakrym-oai
9c903c4716 Add ItemStarted/ItemCompleted events for UserInputItem (#5306)
Adds a new ItemStarted event and delivers UserMessage as the first item
type (more to come).


Renames `InputItem` to `UserInput` considering we're using the `Item`
suffix for actual items.
2025-10-20 13:34:44 -07:00
jif-oai
5e4f3bbb0b chore: rework tools execution workflow (#5278)
Re-work the tool execution flow. Read `orchestrator.rs` to understand
the structure
2025-10-20 20:57:37 +01:00
Ahmed Ibrahim
049a61bcfc Auto compact at ~90% (#5292)
Users now hit a window exceeded limit and they usually don't know what
to do. This starts auto compact at ~90% of the window.
2025-10-20 11:29:49 -07:00
pakrym-oai
2287d2afde Create independent TurnContexts (#5308)
The goal of this change:
1. Unify user input and user turn implementation.
2. Have a single place where turn/session setting overrides are applied.
3. Have a single place where turn context is created.
4. Create TurnContext only for actual turn and have a separate structure
for current session settings (reuse ConfigureSession)
2025-10-18 17:43:08 -07:00
Gabriel Peal
41900e9d0f [MCP] When MCP auth expires, prompt the user to log in again. (#5300)
Similar to https://github.com/openai/codex/pull/5193 but catches a case
where the user _has_ authenticated but the auth expired or was revoked.

Before:
<img width="2976" height="632" alt="CleanShot 2025-10-17 at 14 28 11"
src="https://github.com/user-attachments/assets/7c1bd11d-c075-46cb-9298-48891eaa77fe"
/>

After:
<img width="591" height="283" alt="image"
src="https://github.com/user-attachments/assets/fc14e08c-1a33-4077-8757-ff4ed3f00f8f"
/>
2025-10-17 18:16:22 -04:00
pakrym-oai
c03e31ecf5 Support graceful agent interruption (#5287) 2025-10-17 18:52:57 +00:00
jif-oai
6915ba2100 feat: better UX during refusal (#5260)
<img width="568" height="169" alt="Screenshot 2025-10-16 at 18 28 05"
src="https://github.com/user-attachments/assets/f42e8d6d-b7de-4948-b291-a5fbb50b1312"
/>
2025-10-17 11:06:55 +02:00
Gabriel Peal
40fba1bb4c [MCP] Add support for resources (#5239)
This PR adds support for [MCP
resources](https://modelcontextprotocol.io/specification/2025-06-18/server/resources)
by adding three new tools for the model:
1. `list_resources`
2. `list_resource_templates`
3. `read_resource`

These 3 tools correspond to the [three primary MCP resource protocol
messages](https://modelcontextprotocol.io/specification/2025-06-18/server/resources#protocol-messages).

Example of listing and reading a GitHub resource tempalte
<img width="2984" height="804" alt="CleanShot 2025-10-15 at 17 31 10"
src="https://github.com/user-attachments/assets/89b7f215-2e2a-41c5-90dd-b932ac84a585"
/>

`/mcp` with Figma configured
<img width="2984" height="442" alt="CleanShot 2025-10-15 at 18 29 35"
src="https://github.com/user-attachments/assets/a7578080-2ed2-4c59-b9b4-d8461f90d8ee"
/>

Fixes #4956
2025-10-17 01:05:15 -04:00
Anton Panasenko
c146585cdb [codex][otel] propagate user email in otel events (#5223)
include user email into otel events for proper user-level attribution in
case of workspace setup
2025-10-15 17:53:33 -07:00
Michael Bolin
995f5c3614 feat: add Vec<ParsedCommand> to ExecApprovalRequestEvent (#5222)
This adds `parsed_cmd: Vec<ParsedCommand>` to `ExecApprovalRequestEvent`
in the core protocol (`protocol/src/protocol.rs`), which is also what
this field is named on `ExecCommandBeginEvent`. Honestly, I don't love
the name (it sounds like a single command, but it is actually a list of
them), but I don't want to get distracted by a naming discussion right
now.

This also adds `parsed_cmd` to `ExecCommandApprovalParams` in
`codex-rs/app-server-protocol/src/protocol.rs`, so it will be available
via `codex app-server`, as well.

For consistency, I also updated `ExecApprovalElicitRequestParams` in
`codex-rs/mcp-server/src/exec_approval.rs` to include this field under
the name `codex_parsed_cmd`, as that struct already has a number of
special `codex_*` fields. Note this is the code for when Codex is used
as an MCP _server_ and therefore has to conform to the official spec for
an MCP elicitation type.
2025-10-15 13:58:40 -07:00
Michael Bolin
f38ad65254 chore: standardize on ParsedCommand from codex_protocol (#5218)
Note these two types were identical, so it seems clear to standardize on the one in `codex_protocol` and eliminate the `Into` stuff.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/5218).
* #5222
* __->__ #5218
2025-10-15 13:00:22 -07:00
Gabriel Peal
8a281cd1f4 [MCP] Prompt mcp login when adding a streamable HTTP server that supports oauth (#5193)
1. If Codex detects that a `codex mcp add -url …` server supports oauth,
it will auto-initiate the login flow.
2. If the TUI starts and a MCP server supports oauth but isn't logged
in, it will give the user an explicit warning telling them to log in.
2025-10-15 12:27:40 -04:00
Javi
13035561cd feat: pass codex thread ID in notifier metadata (#4582) 2025-10-14 11:55:10 -07:00
jif-oai
f7b4e29609 feat: feature flag (#4948)
Add proper feature flag instead of having custom flags for everything.
This is just for experimental/wip part of the code
It can be used through CLI:
```bash
codex --enable unified_exec --disable view_image_tool
```

Or in the `config.toml`
```toml
# Global toggles applied to every profile unless overridden.
[features]
apply_patch_freeform = true
view_image_tool = false
```

Follow-up:
In a following PR, the goal is to have a default have `bundles` of
features that we can associate to a model
2025-10-14 17:50:00 +00:00
jif-oai
268a10f917 feat: add header for task kind (#5142)
Add a header in the responses API request for the task kind (compact,
review, ...) for observability purpose
The header name is `codex-task-type`
2025-10-14 15:17:00 +00:00
jif-oai
f98fa85b44 feat: message when stream get correctly resumed (#4988)
<img width="366" height="109" alt="Screenshot 2025-10-09 at 17 44 16"
src="https://github.com/user-attachments/assets/26bc6f60-11bc-4fc6-a1cc-430ca1203969"
/>
2025-10-10 09:07:14 +00:00
dedrisian-oai
4300236681 revert /name for now (#4978)
There was a regression where we'd read entire rollout contents if there
was no /name present.
2025-10-08 17:13:49 -07:00
dedrisian-oai
ec238a2c39 feat: Set chat name (#4974)
Set chat name with `/name` so they appear in the codex resume page:


https://github.com/user-attachments/assets/c0252bba-3a53-44c7-a740-f4690a3ad405
2025-10-08 16:35:35 -07:00
Gabriel Peal
3c5e12e2a4 [MCP] Add auth status to MCP servers (#4918)
This adds a queryable auth status for MCP servers which is useful:
1. To determine whether a streamable HTTP server supports auth or not
based on whether or not it supports RFC 8414-3.2
2. Allow us to build a better user experience on top of MCP status
2025-10-08 17:37:57 -04:00
Gabriel Peal
496cb801e1 [MCP] Add the ability to explicitly specify a credentials store (#4857)
This lets users/companies explicitly choose whether to force/disallow
the keyring/fallback file storage for mcp credentials.

People who develop with Codex will want to use this until we sign
binaries or else each ad-hoc debug builds will require keychain access
on every build. I don't love this and am open to other ideas for how to
handle that.


```toml
mcp_oauth_credentials_store = "auto"
mcp_oauth_credentials_store = "file"
mcp_oauth_credentials_store = "keyrung"
```
Defaults to `auto`
2025-10-07 22:39:32 -04:00
dedrisian-oai
b016a3e7d8 Remove instruction hack for /review (#4896)
We use to put the review prompt in the first user message as well to
bypass statsig overrides, but now that's been resolved and instructions
are being respected, so we're duplicating the review instructions.
2025-10-07 12:47:00 -07:00