Commit Graph

60 Commits

Author SHA1 Message Date
Celia Chen
2e81f1900d [App-server] Add auth v2 doc & update codex mcp interface auth section (#6353)
Added doc for auth v2 endpoints. Updated the auth section in Codex MCP
interface doc too.
2025-11-07 08:17:19 -08:00
Owen Lin
2030b28083 [app-server] feat: expose additional fields on Thread (#6338)
Add the following fields to Thread:

```
    pub preview: String,
    pub model_provider: String,
    pub created_at: i64,
```

Will prob need another PR once this lands:
https://github.com/openai/codex/pull/6337
2025-11-07 04:08:45 +00:00
Celia Chen
e84e39940b [App-server] Implement account/read endpoint (#6336)
This PR does two things:
1. add a new function in core that maps the core-internal plan type to
the external plan type;
2. implement account/read that get account status (v2 of
`getAuthStatus`).
2025-11-06 19:43:13 -08:00
Gabriel Peal
1b8cc8b625 [App Server] Add more session metadata to listConversations (#6337)
This unlocks a few new product experience for app server consumers
2025-11-06 17:13:24 -05:00
Owen Lin
fdb9fa301e chore: move relevant tests to app-server/tests/suite/v2 (#6289)
These are technically app-server v2 APIs, so move them to the same
directory as the others.
2025-11-06 10:53:17 -08:00
Owen Lin
6582554926 [app-server] feat: v2 Turn APIs (#6216)
Implements:
```
turn/start
turn/interrupt
```

along with their integration tests. These are relatively light wrappers
around the existing core logic, and changes to core logic are minimal.

However, an improvement made for developer ergonomics:
- `turn/start` replaces both `SendUserMessage` (no turn overrides) and
`SendUserTurn` (can override model, approval policy, etc.)
2025-11-06 16:36:36 +00:00
Thibault Sottiaux
667e841d3e feat: support models with single reasoning effort (#6300) 2025-11-05 23:06:45 -08:00
Celia Chen
229d18f4d2 [App-server] Add account/login/cancel v2 endpoint (#6288)
Add `account/login/cancel` v2 endpoint for auth. this is similar
implementation to `cancelLoginChatgpt` v1 endpoint.
2025-11-06 01:13:55 +00:00
Celia Chen
05f0b4f590 [App-server] Implement v2 for account/login/start and account/login/completed (#6183)
This PR implements `account/login/start` and `account/login/completed`.
Instead of having separate endpoints for login with chatgpt and api, we
have a single enum handling different login methods. For sync auth
methods like sign in with api key, we still send a `completed`
notification back to be compatible with the async login flow.
2025-11-05 13:52:50 -08:00
Eric Traut
d7953aed74 Fixes intermittent test failures in CI (#6282)
I'm seeing two tests fail intermittently in CI. This PR attempts to
address (or at least mitigate) the flakiness.

* summarize_context_three_requests_and_instructions - The test snapshots
server.received_requests() immediately after observing TaskComplete.
Because the OpenAI /v1/responses call is streamed, the HTTP request can
still be draining when that event fires, so wiremock occasionally
reports only two captured requests. Fix is to wait for async activity to
complete.
* archive_conversation_moves_rollout_into_archived_directory - times out
on a slow CI run. Mitigation is to increase timeout value from 10s to
20s.
2025-11-05 13:12:25 -08:00
Owen Lin
2ab1650d4d [app-server] feat: v2 Thread APIs (#6214)
Implements:
```
thread/list
thread/start
thread/resume
thread/archive
```

along with their integration tests. These are relatively light wrappers
around the existing core logic, and changes to core logic are minimal.

However, an improvement made for developer ergonomics:
- `thread/start` and `thread/resume` automatically attaches a
conversation listener internally, so clients don't have to make a
separate `AddConversationListener` call like they do today.

For consistency, also updated `model/list` and `feedback/upload` (naming
conventions, list API params).
2025-11-05 20:28:43 +00:00
Owen Lin
edf4c3f627 [app-server] feat: export.rs supports a v2 namespace, initial v2 notifications (#6212)
**Typescript and JSON schema exports**
While working on Thread/Turn/Items type definitions, I realize we will
run into name conflicts between v1 and v2 APIs (e.g. `RateLimitWindow`
which won't be reusable since v1 uses `RateLimitWindow` from `protocol/`
which uses snake_case, but we want to expose camelCase everywhere, so
we'll define a V2 version of that struct that serializes as camelCase).

To set us up for a clean and isolated v2 API, generate types into a
`v2/` namespace for both typescript and JSON schema.
- TypeScript: v2 types emit under `out_dir/v2/*.ts`, and root index.ts
now re-exports them via `export * as v2 from "./v2"`;.
- JSON Schemas: v2 definitions bundle under `#/definitions/v2/*` rather
than the root.

The location for the original types (v1 and types pulled from
`protocol/` and other core crates) haven't changed and are still at the
root. This is for backwards compatibility: no breaking changes to
existing usages of v1 APIs and types.

**Notifications**
While working on export.rs, I:
- refactored server/client notifications with macros (like we already do
for methods) so they also get exported (I noticed they weren't being
exported at all).
- removed the hardcoded list of types to export as JSON schema by
leveraging the existing macros instead
- and took a stab at API V2 notifications. These aren't wired up yet,
and I expect to iterate on these this week.
2025-11-05 01:02:39 +00:00
Celia Chen
d3187dbc17 [App-server] v2 for account/updated and account/logout (#6175)
V2 for `account/updated` and `account/logout` for app server. correspond
to old `authStatusChange` and `LogoutChatGpt` respectively. Followup PRs
will make other v2 endpoints call `account/updated` instead of
`authStatusChange` too.
2025-11-03 22:01:33 -08:00
Eric Traut
dccce34d84 Fix "archive conversation" on Windows (#6124)
Addresses issue https://github.com/openai/codex/issues/3582 where an
"archive conversation" command in the extension fails on Windows.

The problem is that the `archive_conversation` api server call is not
canonicalizing the path to the rollout path when performing its check to
verify that the rollout path is in the sessions directory. This causes
it to fail 100% of the time on Windows.

Testing: I was able to repro the error on Windows 100% prior to this
change. After the change, I'm no longer able to repro.
2025-11-02 21:41:05 -08:00
Anton Panasenko
0f22067242 [codex][app-server] improve error response for client requests (#6050) 2025-10-31 15:28:04 -07:00
pakrym-oai
2371d771cc Update user instruction message format (#6010) 2025-10-30 18:44:02 -07:00
Celia Chen
6ef658a9f9 [Hygiene] Remove include_view_image_tool config (#5976)
There's still some debate about whether we want to expose
`tools.view_image` or `feature.view_image` so those are left unchanged
for now, but this old `include_view_image_tool` config is good-to-go.
Also updated the doc to reflect that `view_image` tool is now by default
true.
2025-10-30 13:23:24 -07:00
Owen Lin
89c00611c2 [app-server] remove serde(skip_serializing_if = "Option::is_none") annotations (#5939)
We had this annotation everywhere in app-server APIs which made it so
that fields get serialized as `field?: T`, meaning if the field as
`None` we would omit the field in the payload. Removing this annotation
changes it so that we return `field: T | null` instead, which makes
codex app-server's API more aligned with the convention of public OpenAI
APIs like Responses.

Separately, remove the `#[ts(optional_fields = nullable)]` annotations
that were recently added which made all the TS types become `field?: T |
null` which is not great since clients need to handle undefined and
null.

I think generally it'll be best to have optional types be either:
- `field: T | null` (preferred, aligned with public OpenAI APIs)
- `field?: T` where we have to, such as types generated from the MCP
schema:
https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/schema/2025-06-18/schema.ts
(see changes to `mcp-types/`)

I updated @etraut-openai's unit test to check that all generated TS
types are one or the other, not both (so will error if we have a type
that has `field?: T | null`). I don't think there's currently a good use
case for that - but we can always revisit.
2025-10-30 18:18:53 +00:00
Anton Panasenko
9572cfc782 [codex] add developer instructions (#5897)
we are using developer instructions for code reviews, we need to pass
them in cli as well.
2025-10-30 11:18:31 -07:00
jif-oai
f4f9695978 feat: compaction prompt configurable (#5959)
```
 codex -c compact_prompt="Summarize in bullet points"
 ```
2025-10-30 14:24:24 +00:00
jif-oai
aa76003e28 chore: unify config crates (#5958) 2025-10-30 10:28:32 +00:00
jif-oai
db31f6966d chore: config editor (#5878)
The goal is to have a single place where we actually write files

In a follow-up PR, will move everything config related in a dedicated
module and move the helpers in a dedicated file
2025-10-29 20:52:46 +00:00
Rasmus Rygaard
39e09c289d Add a wrapper around raw response items (#5923)
We currently have nested enums when sending raw response items in the
app-server protocol. This makes downstream schemas confusing because we
need to embed `type`-discriminated enums within each other.

This PR adds a small wrapper around the response item so we can keep the
schemas separate
2025-10-29 20:32:40 +00:00
Anton Panasenko
149e198ce8 [codex][app-server] resume conversation from history (#5893) 2025-10-28 18:18:03 -07:00
Gabriel Peal
1d76ba5ebe [App Server] Allow fetching or resuming a conversation summary from the conversation id (#5890)
This PR adds an option to app server to allow conversation summaries to
be fetched from just the conversation id rather than rollout path for
convenience at the cost of some latency to discover the rollout path.

This convenience is non-trivial as it allows app servers to simply
maintain conversation ids rather than rollout paths and the associated
platform (Windows) handling associated with storing and encoding them
correctly.
2025-10-28 20:17:22 -04:00
Owen Lin
266419217e chore: use anyhow::Result for all app-server integration tests (#5836)
There's a lot of visual noise in app-server's integration tests due to
the number of `.expect("<some_msg>")` lines which are largely redundant
/ not very useful. Clean them up by using `anyhow::Result` + `?`
consistently.

Replaces the existing pattern of:
```
    let codex_home = TempDir::new().expect("create temp dir");
    create_config_toml(codex_home.path()).expect("write config.toml");

    let mut mcp = McpProcess::new(codex_home.path())
        .await
        .expect("spawn mcp process");
    timeout(DEFAULT_READ_TIMEOUT, mcp.initialize())
        .await
        .expect("initialize timeout")
        .expect("initialize request");
```

With:
```
    let codex_home = TempDir::new()?;
    create_config_toml(codex_home.path())?;

    let mut mcp = McpProcess::new(codex_home.path()).await?;
    timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
```
2025-10-28 08:10:23 -07:00
Celia Chen
4a42c4e142 [Auth] Choose which auth storage to use based on config (#5792)
This PR is a follow-up to #5591. It allows users to choose which auth
storage mode they want by using the new
`cli_auth_credentials_store_mode` config.
2025-10-27 19:41:49 -07:00
Celia Chen
eb5b1b627f [Auth] Introduce New Auth Storage Abstraction for Codex CLI (#5569)
This PR introduces a new `Auth Storage` abstraction layer that takes
care of read, write, and load of auth tokens based on the
AuthCredentialsStoreMode. It is similar to how we handle MCP client
oauth
[here](https://github.com/openai/codex/blob/main/codex-rs/rmcp-client/src/oauth.rs).
Instead of reading and writing directly from disk for auth tokens, Codex
CLI workflows now should instead use this auth storage using the public
helper functions.

This PR is just a refactor of the current code so the behavior stays the
same. We will add support for keyring and hybrid mode in follow-up PRs.

I have read the CLA Document and I hereby sign the CLA
2025-10-27 11:01:14 -07:00
Michael Bolin
5ee8a17b4e feat: introduce GetConversationSummary RPC (#5803)
This adds an RPC to the app server to the the `ConversationSummary` via
a rollout path. Now that the VS Code extension supports showing the
Codex UI in an editor panel where the URI of the panel maps to the
rollout file, we need to be able to get the `ConversationSummary` from
the rollout file directly.
2025-10-27 09:11:45 -07:00
Michael Bolin
15fa2283e7 feat: update NewConversationParams to take an optional model_provider (#5793)
An AppServer client should be able to use any (`model_provider`, `model`) in the user's config. `NewConversationParams` already supported specifying the `model`, but this PR expands it to support `model_provider`, as well.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/5793).
* #5803
* __->__ #5793
2025-10-27 09:33:30 +00:00
Michael Bolin
5907422d65 feat: annotate conversations with model_provider for filtering (#5658)
Because conversations that use the Responses API can have encrypted
reasoning messages, trying to resume a conversation with a different
provider could lead to confusing "failed to decrypt" errors. (This is
reproducible by starting a conversation using ChatGPT login and resuming
it as a conversation that uses OpenAI models via Azure.)

This changes `ListConversationsParams` to take a `model_providers:
Option<Vec<String>>` and adds `model_provider` on each
`ConversationSummary` it returns so these cases can be disambiguated.

Note this ended up making changes to
`codex-rs/core/src/rollout/tests.rs` because it had a number of cases
where it expected `Some` for the value of `next_cursor`, but the list of
rollouts was complete, so according to this docstring:


bcd64c7e72/codex-rs/app-server-protocol/src/protocol.rs (L334-L337)

If there are no more items to return, then `next_cursor` should be
`None`. This PR updates that logic.






---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/5658).
* #5803
* #5793
* __->__ #5658
2025-10-27 02:03:30 -07:00
Ahmed Ibrahim
f178805252 Add feedback upload request handling (#5682) 2025-10-27 05:53:39 +00:00
Eric Traut
0533bd2e7c Fixed flaky unit test (#5654)
This PR fixes a test that is sporadically failing in CI.

The problem is that two unit tests (the older `login_and_cancel_chatgpt`
and a recently added
`login_chatgpt_includes_forced_workspace_query_param`) exercise code
paths that start the login server. The server binds to a hard-coded
localhost port number, so attempts to start more than one server at the
same time will fail. If these two tests happen to run concurrently, one
of them will fail.

To fix this, I've added a simple mutex. We can use this same mutex for
future tests that use the same pattern.
2025-10-24 16:31:24 -07:00
Anton Panasenko
6af83d86ff [codex][app-server] introduce codex/event/raw_item events (#5578) 2025-10-24 22:41:52 +00:00
Eric Traut
f8af4f5c8d Added model summary and risk assessment for commands that violate sandbox policy (#5536)
This PR adds support for a model-based summary and risk assessment for
commands that violate the sandbox policy and require user approval. This
aids the user in evaluating whether the command should be approved.

The feature works by taking a failed command and passing it back to the
model and asking it to summarize the command, give it a risk level (low,
medium, high) and a risk category (e.g. "data deletion" or "data
exfiltration"). It uses a new conversation thread so the context in the
existing thread doesn't influence the answer. If the call to the model
fails or takes longer than 5 seconds, it falls back to the current
behavior.

For now, this is an experimental feature and is gated by a config key
`experimental_sandbox_command_assessment`.

Here is a screen shot of the approval prompt showing the risk assessment
and summary.

<img width="723" height="282" alt="image"
src="https://github.com/user-attachments/assets/4597dd7c-d5a0-4e9f-9d13-414bd082fd6b"
/>
2025-10-24 15:23:44 -07:00
Thibault Sottiaux
3059373e06 fix: resume lookup for gitignored CODEX_HOME (#5311)
Walk the sessions tree instead of using file_search so gitignored
CODEX_HOME directories can resume sessions. Add a regression test that
covers a .gitignore'd sessions directory.

Fixes #5247
Fixes #5412

---------

Co-authored-by: Owen Lin <owen@openai.com>
2025-10-23 17:04:40 +00:00
Owen Lin
aee321f62b [app-server] add new account method API stubs (#5527)
These are the schema definitions for the new JSON-RPC APIs associated
with accounts. These are not wired up to business logic yet and will
currently throw an internal error indicating these are unimplemented.
2025-10-22 15:36:11 -07:00
Owen Lin
8ae3949072 [app-server] send account/rateLimits/updated notifications (#5477)
Codex will now send an `account/rateLimits/updated` notification
whenever the user's rate limits are updated.

This is implemented by just transforming the existing TokenCount event.
2025-10-22 20:12:40 +00:00
pakrym-oai
3c90728a29 Add new thread items and rewire event parsing to use them (#5418)
1. Adds AgentMessage,  Reasoning,  WebSearch items.
2. Switches the ResponseItem parsing to use new items and then also emit
3. Removes user-item kind and filters out "special" (environment) user
items when returning to clients.
2025-10-22 10:14:50 -07:00
Anton Panasenko
682d05512f [otel] init otel for app-server (#5469) 2025-10-21 12:34:27 -07:00
Owen Lin
26f314904a [app-server] model/list API (#5382)
Adds a `model/list` paginated API that returns the list of models
supported by Codex.
2025-10-21 11:15:17 -07:00
pakrym-oai
1b10a3a1b2 Enable plan tool by default (#5384)
## Summary
- make the plan tool available by default by removing the feature flag
and always registering the handler
- drop plan-tool CLI and API toggles across the exec, TUI, MCP server,
and app server code paths
- update tests and configs to reflect the always-on plan tool and guard
workspace restriction tests against env leakage

## Testing
Manually tested the extension. 
------
https://chatgpt.com/codex/tasks/task_i_68f67a3ff2d083209562a773f814c1f9
2025-10-21 16:25:05 +00:00
Owen Lin
5c680c6587 [app-server] read rate limits API (#5302)
Adds a `GET account/rateLimits/read` API to app-server. This calls the
codex backend to fetch the user's current rate limits.

This would be helpful in checking rate limits without having to send a
message.

For calling the codex backend usage API, I generated the types and
manually copied the relevant ones into `codex-backend-openapi-types`.
It'll be nice to extend our internal openapi generator to support Rust
so we don't have to run these manual steps.

# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
2025-10-20 14:11:54 -07:00
pakrym-oai
9c903c4716 Add ItemStarted/ItemCompleted events for UserInputItem (#5306)
Adds a new ItemStarted event and delivers UserMessage as the first item
type (more to come).


Renames `InputItem` to `UserInput` considering we're using the `Item`
suffix for actual items.
2025-10-20 13:34:44 -07:00
jif-oai
5e4f3bbb0b chore: rework tools execution workflow (#5278)
Re-work the tool execution flow. Read `orchestrator.rs` to understand
the structure
2025-10-20 20:57:37 +01:00
Gabriel Peal
d87f87e25b Add forced_chatgpt_workspace_id and forced_login_method configuration options (#5303)
This PR adds support for configs to specify a forced login method
(chatgpt or api) as well as a forced chatgpt account id. This lets
enterprises uses [managed
configs](https://developers.openai.com/codex/security#managed-configuration)
to force all employees to use their company's workspace instead of their
own or any other.

When a workspace id is set, a query param is sent to the login flow
which auto-selects the given workspace or errors if the user isn't a
member of it.

This PR is large but a large % of it is tests, wiring, and required
formatting changes.

API login with chatgpt forced
<img width="1592" height="116" alt="CleanShot 2025-10-19 at 22 40 04"
src="https://github.com/user-attachments/assets/560c6bb4-a20a-4a37-95af-93df39d057dd"
/>

ChatGPT login with api forced
<img width="1018" height="100" alt="CleanShot 2025-10-19 at 22 40 29"
src="https://github.com/user-attachments/assets/d010bbbb-9c8d-4227-9eda-e55bf043b4af"
/>

Onboarding with api forced
<img width="892" height="460" alt="CleanShot 2025-10-19 at 22 41 02"
src="https://github.com/user-attachments/assets/cc0ed45c-b257-4d62-a32e-6ca7514b5edd"
/>

Onboarding with ChatGPT forced
<img width="1154" height="426" alt="CleanShot 2025-10-19 at 22 41 27"
src="https://github.com/user-attachments/assets/41c41417-dc68-4bb4-b3e7-3b7769f7e6a1"
/>

Logging in with the wrong workspace
<img width="2222" height="84" alt="CleanShot 2025-10-19 at 22 42 31"
src="https://github.com/user-attachments/assets/0ff4222c-f626-4dd3-b035-0b7fe998a046"
/>
2025-10-20 08:50:54 -07:00
Thibault Sottiaux
4f46360aa4 feat: add --add-dir flag for extra writable roots (#5335)
Add a `--add-dir` CLI flag so sessions can use extra writable roots in
addition to the ones specified in the config file. These are ephemerally
added during the session only.

Fixes #3303
Fixes #2797
2025-10-18 22:13:53 -07:00
Michael Bolin
995f5c3614 feat: add Vec<ParsedCommand> to ExecApprovalRequestEvent (#5222)
This adds `parsed_cmd: Vec<ParsedCommand>` to `ExecApprovalRequestEvent`
in the core protocol (`protocol/src/protocol.rs`), which is also what
this field is named on `ExecCommandBeginEvent`. Honestly, I don't love
the name (it sounds like a single command, but it is actually a list of
them), but I don't want to get distracted by a naming discussion right
now.

This also adds `parsed_cmd` to `ExecCommandApprovalParams` in
`codex-rs/app-server-protocol/src/protocol.rs`, so it will be available
via `codex app-server`, as well.

For consistency, I also updated `ExecApprovalElicitRequestParams` in
`codex-rs/mcp-server/src/exec_approval.rs` to include this field under
the name `codex_parsed_cmd`, as that struct already has a number of
special `codex_*` fields. Note this is the code for when Codex is used
as an MCP _server_ and therefore has to conform to the official spec for
an MCP elicitation type.
2025-10-15 13:58:40 -07:00
Fouad Matin
a5b7675e42 add(core): managed config (#3868)
## Summary

- Factor `load_config_as_toml` into `core::config_loader` so config
loading is reusable across callers.
- Layer `~/.codex/config.toml`, optional `~/.codex/managed_config.toml`,
and macOS managed preferences (base64) with recursive table merging and
scoped threads per source.

## Config Flow

```
Managed prefs (macOS profile: com.openai.codex/config_toml_base64)
                               ▲
                               │
~/.codex/managed_config.toml   │  (optional file-based override)
                               ▲
                               │
                ~/.codex/config.toml (user-defined settings)
```

- The loader searches under the resolved `CODEX_HOME` directory
(defaults to `~/.codex`).
- Managed configs let administrators ship fleet-wide overrides via
device profiles which is useful for enforcing certain settings like
sandbox or approval defaults.
- For nested hash tables: overlays merge recursively. Child tables are
merged key-by-key, while scalar or array values replace the prior layer
entirely. This lets admins add or tweak individual fields without
clobbering unrelated user settings.
2025-10-03 13:02:26 -07:00
Michael Bolin
153338c20f docs: add barebones README for codex-app-server crate (#4671) 2025-10-03 09:26:44 -07:00