It turns out that we want slightly different behavior for the
`SetDefaultModel` RPC because some models do not work with reasoning
(like GPT-4.1), so we should be able to explicitly clear this value.
Verified in `codex-rs/mcp-server/tests/suite/set_default_model.rs`.
This adds `SetDefaultModel`, which takes `model` and `reasoning_effort`
as optional fields. If set, the field will overwrite what is in the
user's `config.toml`.
This reuses logic that was added to support the `/model` command in the
TUI: https://github.com/openai/codex/pull/2799.
`ClientRequest::NewConversation` picks up the reasoning level from the user's defaults in `config.toml`, so it should be reported in `NewConversationResponse`.
This PR does the following:
* Adds the ability to paste or type an API key.
* Removes the `preferred_auth_method` config option. The last login
method is always persisted in auth.json, so this isn't needed.
* If OPENAI_API_KEY env variable is defined, the value is used to
prepopulate the new UI. The env variable is otherwise ignored by the
CLI.
* Adds a new MCP server entry point "login_api_key" so we can implement
this same API key behavior for the VS Code extension.
<img width="473" height="140" alt="Screenshot 2025-09-04 at 3 51 04 PM"
src="https://github.com/user-attachments/assets/c11bbd5b-8a4d-4d71-90fd-34130460f9d9"
/>
<img width="726" height="254" alt="Screenshot 2025-09-04 at 3 51 32 PM"
src="https://github.com/user-attachments/assets/6cc76b34-309a-4387-acbc-15ee5c756db9"
/>
This PR changes get history op to get path. Then, forking will use a
path. This will help us have one unified codepath for resuming/forking
conversations. Will also help in having rollout history in order. It
also fixes a bug where you won't see the UI when resuming after forking.
## Unified PTY-Based Exec Tool
Note: this requires to have this flag in the config:
`use_experimental_unified_exec_tool=true`
- Adds a PTY-backed interactive exec feature (“unified_exec”) with
session reuse via
session_id, bounded output (128 KiB), and timeout clamping (≤ 60 s).
- Protocol: introduces ResponseItem::UnifiedExec { session_id,
arguments, timeout_ms }.
- Tools: exposes unified_exec as a function tool (Responses API);
excluded from Chat
Completions payload while still supported in tool lists.
- Path handling: resolves commands via PATH (or explicit paths), with
UTF‑8/newline‑aware
truncation (truncate_middle).
- Tests: cover command parsing, path resolution, session
persistence/cleanup, multi‑session
isolation, timeouts, and truncation behavior.
This adds a simple endpoint that provides the email address encoded in
`$CODEX_HOME/auth.json`.
As noted, for now, we do not hit the server to verify this is the user's
true email address.
This PR adds an `images` field to the existing `UserMessageEvent` so we
can encode zero or more images associated with a user message. This
allows images to be restored when conversations are restored.
Adds support for `ArchiveConversation` in the JSON-RPC server that takes
a `(ConversationId, PathBuf)` pair and:
- verifies the `ConversationId` corresponds to the rollout id at the
`PathBuf`
- if so, invokes
`ConversationManager.remove_conversation(ConversationId)`
- if the `CodexConversation` was in memory, send `Shutdown` and wait for
`ShutdownComplete` with a timeout
- moves the `.jsonl` file to `$CODEX_HOME/archived_sessions`
---------
Co-authored-by: Gabriel Peal <gabriel@openai.com>
Adding the `rollout_path` to the `NewConversationResponse` makes it so a
client can perform subsequent operations on a `(ConversationId,
PathBuf)` pair. #3353 will introduce support for `ArchiveConversation`.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/3352).
* #3353
* __->__ #3352
This PR does multiple things that are necessary for conversation resume
to work from the extension. I wanted to make sure everything worked so
these changes wound up in one PR:
1. Generate more ts types
2. Resume rollout history files rather than create a new one every time
it is resumed so you don't see a duplicate conversation in history for
every resume. Chatted with @aibrahim-oai to verify this
3. Return conversation_id in conversation summaries
4. [Cleanup] Use serde and strong types for a lot of the rollout file
parsing
- In the bottom line of the TUI, print the number of tokens to 3 sigfigs
with an SI suffix, e.g. "1.23K".
- Elsewhere where we print a number, I figure it's worthwhile to print
the exact number, because e.g. it's a summary of your session. Here we print
the numbers comma-separated.
We're trying to migrate from `session_id: Uuid` to `conversation_id:
ConversationId`. Not only does this give us more type safety but it
unifies our terminology across Codex and with the implementation of
session resuming, a conversation (which can span multiple sessions) is
more appropriate.
I started this impl on https://github.com/openai/codex/pull/3219 as part
of getting resume working in the extension but it's big enough that it
should be broken out.
When item ids are sent to Responses API it will load them from the
database ignoring the provided values. This adds extra latency.
Not having the mode to store requests also allows us to simplify the
code.
## Breaking change
The `disable_response_storage` configuration option is removed.
This PR introduces introduces a new
`OutgoingMessage::AppServerNotification` variant that is designed to
wrap a `ServerNotification`, which makes the serialization more
straightforward compared to
`OutgoingMessage::Notification(OutgoingNotification)`. We still use the
latter for serializing an `Event` as a `JSONRPCMessage::Notification`,
but I will try to get away from that in the near future.
With this change, now the generated TypeScript type for
`ServerNotification` is:
```typescript
export type ServerNotification =
| { "method": "authStatusChange", "params": AuthStatusChangeNotification }
| { "method": "loginChatGptComplete", "params": LoginChatGptCompleteNotification };
```
whereas before it was:
```typescript
export type ServerNotification =
| { type: "auth_status_change"; data: AuthStatusChangeNotification }
| { type: "login_chat_gpt_complete"; data: LoginChatGptCompleteNotification };
```
Once the `Event`s are migrated to the `ServerNotification` enum in Rust,
it should be considerably easier to work with notifications on the
TypeScript side, as it will be possible to `switch (message.method)` and
check for exhaustiveness.
Though we will probably need to introduce:
```typescript
export type ServerMessage = ServerRequest | ServerNotification;
```
and then we still need to group all of the `ServerResponse` types
together, as well.
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
We had multiple issues with context size calculation:
1. `initial_prompt_tokens` calculation based on cache size is not
reliable, cache misses might set it to much higher value. For now
hardcoded to a safer constant.
2. Input context size for GPT-5 is 272k (that's where 33% came from).
Fixes.
## Summary
Follow-up to #3056
This PR updates the mcp-server interface for reading the config settings
saved by the user. At risk of introducing _another_ Config struct, I
think it makes sense to avoid tying our protocol to ConfigToml, as its
become a bit unwieldy. GetConfigTomlResponse was a de-facto struct for
this already - better to make it explicit, in my opinion.
This is technically a breaking change of the mcp-server protocol, but
given the previous interface was introduced so recently in #2725, and we
have not yet even started to call it, I propose proceeding with the
breaking change - but am open to preserving the old endpoint.
## Testing
- [x] Added additional integration test coverage
When serializing to JSON, the existing solution created an enormous
array of ints, which is far more bytes on the wire than a base64-encoded
string would be.
This PR does the following:
- divides user msgs into 3 categories: plain, user instructions, and
environment context
- Centralizes adding user instructions and environment context to a
degree
- Improve the integration testing
Building on top of #3123
Specifically this
[comment](https://github.com/openai/codex/pull/3123#discussion_r2319885089).
We need to send the user message while ignoring the User Instructions
and Environment Context we attach.
### Overview
This PR introduces the following changes:
1. Adds a unified mechanism to convert ResponseItem into EventMsg.
2. Ensures that when a session is initialized with initial history, a
vector of EventMsg is sent along with the session configuration. This
allows clients to re-render the UI accordingly.
3. Added integration testing
### Caveats
This implementation does not send every EventMsg that was previously
dispatched to clients. The excluded events fall into two categories:
• “Arguably” rolled-out events
Examples include tool calls and apply-patch calls. While these events
are conceptually rolled out, we currently only roll out ResponseItems.
These events are already being handled elsewhere and transformed into
EventMsg before being sent.
• Non-rolled-out events
Certain events such as TurnDiff, Error, and TokenCount are not rolled
out at all.
### Future Directions
At present, resuming a session involves maintaining two states:
• UI State
Clients can replay most of the important UI from the provided EventMsg
history.
• Model State
The model receives the complete session history to reconstruct its
internal state.
This design provides a solid foundation. If, in the future, more precise
UI reconstruction is needed, we have two potential paths:
1. Introduce a third data structure that allows us to derive both
ResponseItems and EventMsgs.
2. Clearly divide responsibilities: the core system ensures the
integrity of the model state, while clients are responsible for
reconstructing the UI.
## Summary
It appears that #2108 hit a merge conflict with #2355 - I failed to
notice the path difference when re-reviewing the former. This PR
rectifies that, and consolidates it into the protocol package, in line
with our philosophy of specifying types in one place.
## Testing
- [x] Adds config test for model_verbosity
- Introduce websearch end to complement the begin
- Moves the logic of adding the sebsearch tool to
create_tools_json_for_responses_api
- Making it the client responsibility to toggle the tool on or off
- Other misc in #2371 post commit feedback
- Show the query:
<img width="1392" height="151" alt="image"
src="https://github.com/user-attachments/assets/8457f1a6-f851-44cf-bcca-0d4fe460ce89"
/>
Adds custom `/prompts` to `~/.codex/prompts/<command>.md`.
<img width="239" height="107" alt="Screenshot 2025-08-25 at 6 22 42 PM"
src="https://github.com/user-attachments/assets/fe6ebbaa-1bf6-49d3-95f9-fdc53b752679"
/>
---
Details:
1. Adds `Op::ListCustomPrompts` to core.
2. Returns `ListCustomPromptsResponse` with list of `CustomPrompt`
(name, content).
3. TUI calls the operation on load, and populates the custom prompts
(excluding prompts that collide with builtins).
4. Selecting the custom prompt automatically sends the prompt to the
agent.
## Summary
Adds a GetConfig request to the MCP Protocol, so MCP clients can
evaluate the resolved config.toml settings which the harness is using.
## Testing
- [x] Added an end to end test of the endpoint
Adds web_search tool, enabling the model to use Responses API web_search
tool.
- Disabled by default, enabled by --search flag
- When --search is passed, exposes web_search_request function tool to
the model, which triggers user approval. When approved, the model can
use the web_search tool for the remainder of the turn
<img width="1033" height="294" alt="image"
src="https://github.com/user-attachments/assets/62ac6563-b946-465c-ba5d-9325af28b28f"
/>
---------
Co-authored-by: easong-openai <easong@openai.com>
We want to send an aggregated output of stderr and stdout so we don't
have to aggregate it stderr+stdout as we lose order sometimes.
---------
Co-authored-by: Gabriel Peal <gpeal@users.noreply.github.com>
This can be the underlying logic in order to start a conversation from a
previous message. will need some love in the UI.
Base for building this: #2588
This PR adds a central `AuthManager` struct that manages the auth
information used across conversations and the MCP server. Prior to this,
each conversation and the MCP server got their own private snapshots of
the auth information, and changes to one (such as a logout or token
refresh) were not seen by others.
This is especially problematic when multiple instances of the CLI are
run. For example, consider the case where you start CLI 1 and log in to
ChatGPT account X and then start CLI 2 and log out and then log in to
ChatGPT account Y. The conversation in CLI 1 is still using account X,
but if you create a new conversation, it will suddenly (and
unexpectedly) switch to account Y.
With the `AuthManager`, auth information is read from disk at the time
the `ConversationManager` is constructed, and it is cached in memory.
All new conversations use this same auth information, as do any token
refreshes.
The `AuthManager` is also used by the MCP server's GetAuthStatus
command, which now returns the auth method currently used by the MCP
server.
This PR also includes an enhancement to the GetAuthStatus command. It
now accepts two new (optional) input parameters: `include_token` and
`refresh_token`. Callers can use this to request the in-use auth token
and can optionally request to refresh the token.
The PR also adds tests for the login and auth APIs that I recently added
to the MCP server.
This PR adds the following:
* A getAuthStatus method on the mcp server. This returns the auth method
currently in use (chatgpt or apikey) or none if the user is not
authenticated. It also returns the "preferred auth method" which
reflects the `preferred_auth_method` value in the config.
* A logout method on the mcp server. If called, it logs out the user and
deletes the `auth.json` file — the same behavior in the cli's `/logout`
command.
* An `authStatusChange` event notification that is sent when the auth
status changes due to successful login or logout operations.
* Logic to pass command-line config overrides to the mcp server at
startup time. This allows use cases like `codex mcp -c
preferred_auth_method=apikey`.
## Summary
Adds a `/mcp` command to list active tools. We can extend this command
to allow configuration of MCP tools, but for now a simple list command
will help debug if your config.toml and your tools are working as
expected.
- Prevents the % left indicator from immediately decrementing to ~97%.
- Tested by prompting "hi" and noting it only decremented to 99%. And by
adding a bunch of debug logs and observing numbers.