Commit Graph

112 Commits

Author SHA1 Message Date
jif-oai
5e4f3bbb0b chore: rework tools execution workflow (#5278)
Rework the tool execution flow. Read `orchestrator.rs` to understand
the structure.
2025-10-20 20:57:37 +01:00
Owen Lin
c84fc83222 Use int timestamps for rate limit reset_at (#5383)
The backend will be returning unix timestamps (seconds since epoch)
instead of RFC 3339 strings. This will make it more ergonomic for
developers to integrate against - no string parsing.
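
For illustration, a minimal serde sketch of how the new shape might be consumed on the client side (the struct and field names are assumptions, not the actual codex-rs definitions):

```rust
use serde::Deserialize;

// Hypothetical shape of a rate-limit window; the real struct may differ.
// The point: `reset_at` deserializes directly as an integer, no string parsing.
#[derive(Debug, Deserialize)]
struct RateLimitWindow {
    used_percent: f64,
    /// Unix timestamp (seconds since epoch); previously an RFC 3339 string.
    reset_at: Option<u64>,
}

fn main() {
    let json = r#"{ "used_percent": 42.5, "reset_at": 1760994406 }"#;
    let window: RateLimitWindow = serde_json::from_str(json).unwrap();
    println!("{window:?}");
}
```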
2025-10-20 12:26:46 -07:00
Ahmed Ibrahim
049a61bcfc Auto compact at ~90% (#5292)
Users who hit the context-window-exceeded limit usually don't know what
to do. This change starts auto compaction at ~90% of the window.
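
A rough sketch of the threshold check under assumed names (not the actual codex-rs implementation):

```rust
/// Hypothetical helper: trigger auto-compaction once the conversation has
/// consumed roughly 90% of the model's context window.
const AUTO_COMPACT_THRESHOLD: f64 = 0.90;

fn should_auto_compact(tokens_used: u64, context_window: u64) -> bool {
    context_window > 0
        && (tokens_used as f64 / context_window as f64) >= AUTO_COMPACT_THRESHOLD
}

fn main() {
    assert!(!should_auto_compact(100_000, 200_000)); // 50% used
    assert!(should_auto_compact(185_000, 200_000)); // 92.5% used
}
```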
2025-10-20 11:29:49 -07:00
Thibault Sottiaux
0e08dd6055 fix: switch rate limit reset handling to timestamps (#5304)
This change ensures that we store the absolute time at which the primary
and secondary rate limits will reset, instead of relative offsets.
Previously these were recalculated relative to the current time, which
caused the displayed reset times to change over time, including after
doing a codex resume.

For sessions recorded before this change, the reset times will not show,
since this is a breaking change:
<img width="524" height="55" alt="Screenshot 2025-10-17 at 5 14 18 PM" src="https://github.com/user-attachments/assets/53ebd43e-da25-4fef-9c47-94a529d40265" />

Fixes https://github.com/openai/codex/issues/4761
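
A minimal sketch of the idea, assuming the server reports a relative "resets in N seconds" value (names are placeholders, not the codex-rs API):

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Pin the relative offset to an absolute unix timestamp once, at receipt
/// time, so the displayed reset time no longer drifts as the clock advances
/// or after a resume.
fn absolute_reset_at(resets_in_seconds: u64) -> u64 {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is before the unix epoch")
        .as_secs();
    now + resets_in_seconds
}

fn main() {
    println!("resets at {}", absolute_reset_at(3_600));
}
```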
2025-10-17 17:39:37 -07:00
pakrym-oai
ed5b0bfeb3 Improve error decoding response body error (#5263)
Split the Reqwest error into separate errors:
1. One for the streaming response
2. One for the initial connection failing

Include request_id where possible.

<img width="1791" height="116" alt="image"
src="https://github.com/user-attachments/assets/549aa330-acfa-496a-9898-77fa58436316"
/>
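
A hedged sketch of the split described above; the real variant names and error types in codex-rs are likely different:

```rust
/// Illustrative only: one variant for the initial connection failing, one for
/// a failure while decoding the streamed response body, with request_id kept
/// where it is available.
#[derive(Debug)]
enum ClientError {
    Connection { source: reqwest::Error },
    Stream { source: reqwest::Error, request_id: Option<String> },
}

impl std::fmt::Display for ClientError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            ClientError::Connection { source } => write!(f, "failed to connect: {source}"),
            ClientError::Stream { source, request_id } => match request_id {
                Some(id) => write!(f, "error decoding response body (request id {id}): {source}"),
                None => write!(f, "error decoding response body: {source}"),
            },
        }
    }
}
```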
2025-10-16 14:51:42 -07:00
Anton Panasenko
c146585cdb [codex][otel] propagate user email in otel events (#5223)
Include the user email in otel events for proper user-level attribution
in the workspace setup case.
2025-10-15 17:53:33 -07:00
jif-oai
268a10f917 feat: add header for task kind (#5142)
Add a header to the Responses API request for the task kind (compact,
review, ...) for observability purposes.
The header name is `codex-task-type`.
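
A small sketch of attaching that header with reqwest (the kind strings beyond "compact" and "review" are assumptions):

```rust
use reqwest::header::{HeaderMap, HeaderValue};

/// Sketch: attach the `codex-task-type` observability header to an outgoing
/// Responses API request.
fn with_task_kind_header(mut headers: HeaderMap, task_kind: &'static str) -> HeaderMap {
    headers.insert("codex-task-type", HeaderValue::from_static(task_kind));
    headers
}

fn main() {
    let headers = with_task_kind_header(HeaderMap::new(), "compact");
    assert_eq!(
        headers.get("codex-task-type").unwrap().to_str().unwrap(),
        "compact"
    );
}
```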
2025-10-14 15:17:00 +00:00
Michael Bolin
fe8122e514 fix: change log_sse_event() so it no longer takes a closure (#4953)
Unlikely fix for https://github.com/openai/codex/issues/4381, but worth a shot given that https://github.com/openai/codex/pull/2103 changed around the same time.
2025-10-08 16:53:35 +00:00
pakrym-oai
5c42419b02 Use assert_matches (#4756)
assert_matches is soon to be in std but is experimental for now.
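
For reference, a small example of the assertion style (shown here with the `assert_matches` crate form; the unstable std equivalent lives at `std::assert_matches::assert_matches`):

```rust
use assert_matches::assert_matches;

#[allow(dead_code)]
#[derive(Debug)]
enum Event {
    Completed { token_count: u64 },
    Failed(String),
}

fn main() {
    let event = Event::Completed { token_count: 1200 };
    // Passes when the value matches the pattern; on failure the macro prints
    // the actual value alongside the expected pattern.
    assert_matches!(event, Event::Completed { .. });
}
```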
2025-10-05 21:12:31 +00:00
jif-oai
dc3c6bf62a feat: parallel tool calls (#4663)
Add parallel tool calls. This is configurable at the model level and the
tool level.
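
A rough sketch of the execution idea, assuming a tokio/futures setup; the dispatch function and types are placeholders, not the codex-rs orchestrator:

```rust
use futures::future::join_all;

/// When the model emits several tool calls in one turn and parallel execution
/// is enabled, run them concurrently instead of one after another.
async fn run_tool_calls(calls: Vec<String>) -> Vec<Result<String, String>> {
    let futures = calls.into_iter().map(|call| async move {
        // Placeholder for dispatching a single tool call.
        Ok::<String, String>(format!("ran {call}"))
    });
    join_all(futures).await
}

#[tokio::main]
async fn main() {
    let outputs = run_tool_calls(vec!["shell".into(), "apply_patch".into()]).await;
    println!("{outputs:?}");
}
```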
2025-10-05 16:10:49 +00:00
Ahmed Ibrahim
90ef94d3b3 Surface context window error to the client (#4675)
In the past, we treated `input exceeded context window` as a streaming
error and retried on it. Retrying is pointless because it won't change
the outcome. In this PR, we surface the error to the client without
retrying and also send a token count event to indicate that the context
window is full.

<img width="650" height="125" alt="image"
src="https://github.com/user-attachments/assets/c26b1213-4c27-4bfc-90f4-51a270a3efd5"
/>
2025-10-05 01:40:06 +00:00
pakrym-oai
e899ae7d8a Include request ID in the error message (#4572)
To help with issue debugging
<img width="1414" height="253" alt="image"
src="https://github.com/user-attachments/assets/254732df-44ac-4252-997a-6c5e0927355b"
/>
2025-10-01 15:36:04 -07:00
Michael Bolin
5881c0d6d4 fix: remove mcp-types from app server protocol (#4537)
We continue the separation between `codex app-server` and `codex
mcp-server`.

In particular, we introduce a new crate, `codex-app-server-protocol`,
and migrate `codex-rs/protocol/src/mcp_protocol.rs` into it, renaming it
`codex-rs/app-server-protocol/src/protocol.rs`.

Because `ConversationId` was defined in `mcp_protocol.rs`, we move it
into its own file, `codex-rs/protocol/src/conversation_id.rs`, and
because it is referenced in a ton of places, we have to touch a lot of
files as part of this PR.

We also decided to move away from strict JSON-RPC 2.0 semantics, so we
introduce `codex-rs/app-server-protocol/src/jsonrpc_lite.rs`, which is
basically the same `JSONRPCMessage` type defined in `mcp-types` except
with all of the `"jsonrpc": "2.0"` removed.

Getting rid of `"jsonrpc": "2.0"` makes our serialization logic
considerably simpler, as we can lean more heavily on serde to serialize
directly into the wire format that we use now.
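
A rough sketch of the "JSON-RPC without the envelope" idea; the names are illustrative, not the real `jsonrpc_lite.rs` definitions:

```rust
use serde::{Deserialize, Serialize};
use serde_json::Value;

/// The familiar request/response/notification shapes minus the
/// `"jsonrpc": "2.0"` field, so serde serializes straight into the wire format.
#[derive(Debug, Serialize, Deserialize)]
#[serde(untagged)]
enum JsonRpcLite {
    Request {
        id: u64,
        method: String,
        #[serde(skip_serializing_if = "Option::is_none")]
        params: Option<Value>,
    },
    Response {
        id: u64,
        result: Value,
    },
    Notification {
        method: String,
        #[serde(skip_serializing_if = "Option::is_none")]
        params: Option<Value>,
    },
}

fn main() {
    let request = JsonRpcLite::Request {
        id: 1,
        method: "newConversation".to_string(),
        params: None,
    };
    // Serializes without any "jsonrpc": "2.0" envelope:
    // {"id":1,"method":"newConversation"}
    println!("{}", serde_json::to_string(&request).unwrap());
}
```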
2025-10-01 02:16:26 +00:00
vishnu-oai
04c1782e52 OpenTelemetry events (#2103)

## otel

Codex can emit [OpenTelemetry](https://opentelemetry.io/) **log events**
that describe each run: outbound API requests, streamed responses, user
input, tool-approval decisions, and the result of every tool invocation.
Export is **disabled by default** so local runs remain self-contained.
Opt in by adding an `[otel]` table and choosing an exporter.

```toml
[otel]
environment = "staging"   # defaults to "dev"
exporter = "none"          # defaults to "none"; set to otlp-http or otlp-grpc to send events
log_user_prompt = false    # defaults to false; redact prompt text unless explicitly enabled
```

Codex tags every exported event with `service.name = "codex-cli"`, the
CLI version, and an `env` attribute so downstream collectors can
distinguish dev/staging/prod traffic. Only telemetry produced inside the
`codex_otel` crate—the events listed below—is forwarded to the exporter.

### Event catalog

Every event shares a common set of metadata fields: `event.timestamp`,
`conversation.id`, `app.version`, `auth_mode` (when available),
`user.account_id` (when available), `terminal.type`, `model`, and `slug`.

With OTEL enabled, Codex emits the following event types (in addition to
the metadata above):

- `codex.api_request`
  - `cf_ray` (optional)
  - `attempt`
  - `duration_ms`
  - `http.response.status_code` (optional)
  - `error.message` (failures)
- `codex.sse_event`
  - `event.kind`
  - `duration_ms`
  - `error.message` (failures)
  - `input_token_count` (completion only)
  - `output_token_count` (completion only)
  - `cached_token_count` (completion only, optional)
  - `reasoning_token_count` (completion only, optional)
  - `tool_token_count` (completion only)
- `codex.user_prompt`
  - `prompt_length`
  - `prompt` (redacted unless `log_user_prompt = true`)
- `codex.tool_decision`
  - `tool_name`
  - `call_id`
  - `decision` (`approved`, `approved_for_session`, `denied`, or `abort`)
  - `source` (`config` or `user`)
- `codex.tool_result`
  - `tool_name`
  - `call_id`
  - `arguments`
  - `duration_ms` (execution time for the tool)
  - `success` (`"true"` or `"false"`)
  - `output`

### Choosing an exporter

Set `otel.exporter` to control where events go:

- `none` – leaves instrumentation active but skips exporting. This is the
  default.
- `otlp-http` – posts OTLP log records to an OTLP/HTTP collector. Specify
  the endpoint, protocol, and headers your collector expects:

  ```toml
  [otel.exporter.otlp-http]
  endpoint = "https://otel.example.com/v1/logs"
  protocol = "binary"
  headers = { "x-otlp-api-key" = "${OTLP_TOKEN}" }
  ```

- `otlp-grpc` – streams OTLP log records over gRPC. Provide the endpoint
  and any metadata headers:

  ```toml
  [otel.exporter.otlp-grpc]
  endpoint = "https://otel.example.com:4317"
  headers = { "x-otlp-meta" = "abc123" }
  ```

If the exporter is `none`, nothing is written anywhere; otherwise you
must run or point to your own collector. All exporters run on a
background batch worker that is flushed on shutdown.

If you build Codex from source, the OTEL crate is still behind an `otel`
feature flag; the official prebuilt binaries ship with the feature
enabled. When the feature is disabled, the telemetry hooks become no-ops,
so the CLI continues to function without the extra dependencies.
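
For illustration, a rough sketch of how such a feature gate can reduce a hook to a no-op; this assumes a tracing-based pipeline and is not the actual `codex_otel` API:

```rust
/// With the `otel` feature enabled, forward the event to the log exporter.
#[cfg(feature = "otel")]
pub fn record_tool_result(tool_name: &str, success: bool, duration_ms: u64) {
    tracing::info!(tool_name, success, duration_ms, "codex.tool_result");
}

/// Feature disabled: the hook is an empty function, and the extra
/// dependencies are never compiled in.
#[cfg(not(feature = "otel"))]
pub fn record_tool_result(_tool_name: &str, _success: bool, _duration_ms: u64) {}
```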

---------

Co-authored-by: Anton Panasenko <apanasenko@openai.com>
2025-09-29 11:30:55 -07:00
Ahmed Ibrahim
1fba99ed85 /status followup (#4304)
- Render `send a message to load usage data` at the beginning of the
session
- Render `data not available yet` if no rate limits have been received
- nit case
- Deleted stale snapshots that were moved to
`codex-rs/tui/src/status/snapshots`
2025-09-26 18:16:54 +00:00
Ahmed Ibrahim
7355ca48c5 fix (#4251)
2025-09-25 15:12:25 -07:00
Michael Bolin
a0c37f5d07 chore: refactor attempt_stream_responses() out of stream_responses() (#4194)
I would like to be able to swap in a different way to resolve model
sampling requests, so this refactoring consolidates things behind
`attempt_stream_responses()` to make that easier. Ideally, we would
support an in-memory backend that we can use in our integration tests,
for example.
2025-09-25 10:34:07 -07:00
pakrym-oai
e85742635f Send text parameter for non-gpt-5 models (#4195)
We had a hardcoded check for gpt-5 before.

Fixes: https://github.com/openai/codex/issues/4181
2025-09-24 22:00:06 +00:00
Michael Bolin
639a6fd2f3 chore: upgrade to Rust 1.90 (#4124)
Inspired by Dependabot's attempt to do this:
https://github.com/openai/codex/pull/4029

The new version of Clippy found some unused structs that are removed in
this PR.

Though nothing stood out to me in the Release Notes in terms of things
we should start to take advantage of:
https://blog.rust-lang.org/2025/09/18/Rust-1.90.0/.
2025-09-24 08:32:00 -07:00
Ahmed Ibrahim
cb96f4f596 Add Reset in for rate limits (#4111)
- Parse the headers
- Reorganize the struct because it's getting too long
- Show the reset times in the TUI

<img width="324" height="79" alt="image"
src="https://github.com/user-attachments/assets/ca15cd48-f112-4556-91ab-1e3a9bc4683d"
/>
2025-09-24 15:31:08 +00:00
Ahmed Ibrahim
8227a5ba1b Send limits when getting rate limited (#4102)
Users need visibility on rate limits when they are rate limited.
2025-09-23 22:56:34 +00:00
pakrym-oai
fdb8dadcae Add exec output-schema parameter (#4079)
Adds structured output to `exec` via the `--structured-output`
parameter.
2025-09-23 13:59:16 -07:00
Ahmed Ibrahim
dd56750612 Change headers and struct of rate limits (#4060) 2025-09-22 21:06:20 +00:00
jif-oai
be366a31ab chore: clippy on redundant closure (#4058)
Add the redundant-closure clippy rules and let Codex fix the findings by
minimising fully qualified paths (FQP).
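
For context, a tiny example of what `clippy::redundant_closure` flags and the suggested fix:

```rust
fn double(x: i32) -> i32 {
    x * 2
}

fn main() {
    let values = vec![1, 2, 3];

    // Flagged by clippy::redundant_closure: the closure only forwards its
    // argument, so the function can be passed directly.
    let wordy: Vec<i32> = values.clone().into_iter().map(|x| double(x)).collect();

    // The suggested fix.
    let tidy: Vec<i32> = values.into_iter().map(double).collect();

    assert_eq!(wordy, tidy);
}
```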
2025-09-22 19:30:16 +00:00
Ahmed Ibrahim
04504d8218 Forward Rate limits to the UI (#3965)
We currently get information about rate limits in the response headers.
We want to forward them to the clients to have better transparency.
UI/UX plans have been discussed and this information is needed.
2025-09-20 21:26:16 -07:00
pakrym-oai
3d4acbaea0 Preserve IDs for more item types in azure (#3542)
https://github.com/openai/codex/issues/3509
2025-09-13 01:09:56 +00:00
pakrym-oai
e3c6903199 Add Azure Responses API workaround (#3528)
Azure's Responses API doesn't work well with `store: false` and response
items.

If `store = false` and an id is sent, an error is thrown that the ID is
not found. If `store = false` and an id is not sent, an error is thrown
that an ID is required.

Add detection for Azure URLs and a workaround that preserves reasoning
item IDs and sends `store: true`.
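
A hedged sketch of the workaround logic; the detection rule and field names here are assumptions, not the exact codex-rs implementation:

```rust
fn is_azure_responses_endpoint(base_url: &str) -> bool {
    base_url.contains("azure")
}

struct RequestOptions {
    /// Azure rejects `store: false` with item ids either present or omitted,
    /// so for Azure we fall back to `store: true`.
    store: bool,
    preserve_reasoning_item_ids: bool,
}

fn options_for(base_url: &str) -> RequestOptions {
    if is_azure_responses_endpoint(base_url) {
        RequestOptions { store: true, preserve_reasoning_item_ids: true }
    } else {
        RequestOptions { store: false, preserve_reasoning_item_ids: false }
    }
}

fn main() {
    let opts = options_for("https://myresource.openai.azure.com/openai/v1");
    assert!(opts.store && opts.preserve_reasoning_item_ids);
}
```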
2025-09-12 13:52:15 -07:00
jif-oai
ea225df22e feat: context compaction (#3446)
## Compact feature:
1. Stops the model when the context window becomes too large
2. Adds a user turn asking the model to summarize
3. Builds a bridge that contains all the previous user messages plus the
summary, rendered from a template
4. Starts sampling again from a clean conversation with only that bridge
(sketched below)
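
An illustrative pseudo-flow of steps 3 and 4; all names are placeholders rather than the real codex-rs types or template:

```rust
struct Bridge {
    user_messages: Vec<String>,
    summary: String,
}

/// Stand-in for the template rendering mentioned in step 3.
fn render_bridge(bridge: &Bridge) -> String {
    format!(
        "Earlier user messages:\n{}\n\nSummary of prior work:\n{}",
        bridge.user_messages.join("\n"),
        bridge.summary
    )
}

/// Step 4: the new conversation starts from a single bridge message.
fn compact(history_user_messages: Vec<String>, summary_from_model: String) -> Vec<String> {
    let bridge = Bridge {
        user_messages: history_user_messages,
        summary: summary_from_model,
    };
    vec![render_bridge(&bridge)]
}

fn main() {
    let new_history = compact(
        vec!["Fix the failing test".to_string()],
        "We patched foo.rs and re-ran the tests.".to_string(),
    );
    assert_eq!(new_history.len(), 1);
}
```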
2025-09-12 13:07:10 -07:00
jif-oai
c6fd056aa6 feat: reasoning effort as optional (#3527)
Allow the reasoning effort to be optional
2025-09-12 12:06:33 -07:00
pakrym-oai
3b5a5412bb Log cf-ray header in client traces (#3488)
## Summary
- log the `cf-ray` header when tracing HTTP responses in the Codex
client
- keep existing response status logging unchanged

## Testing
- just fmt
- just fix -p codex-core
- cargo test -p codex-core *(fails:
suite::client::azure_overrides_assign_properties_used_for_responses_url,
suite::client::env_var_overrides_loaded_auth)*

------
https://chatgpt.com/codex/tasks/task_i_68c31640dacc83209be131baf91611cd
2025-09-11 17:42:44 -07:00
Ahmed Ibrahim
44587c2443 Use PlanType enum when formatting usage-limit CTA (#3495)
- Started using the PlanType enum
- Added CTA for team/business 
- Refactored a bit to unify the logic
2025-09-11 22:01:25 +00:00
Gabriel Peal
5eab4c7ab4 Replace config.responses_originator_header_internal_override with CODEX_INTERNAL_ORIGINATOR_OVERRIDE_ENV_VAR (#3388)
The previous config approach had a few issues:
1. It is part of the config but not designed to be used externally.
2. It had to be wired through many places (look at the +/- on this PR).
3. It wasn't guaranteed to be set consistently everywhere because we
don't have a super well-defined way that configs stack. For example, the
extension would configure it during newConversation, but anything that
happened outside of that (like login) wouldn't get it.

This env var approach is cleaner and also creates one less thing we have
to deal with when coming up with a better holistic story around configs.

One downside is that I removed the unit test for the override because I
don't want to deal with setting the global env or spawning child
processes and figuring out how to introspect their originator header.
The new code is sufficiently simple, and I tested it e2e, so I feel this
is still worth it.
2025-09-09 17:23:23 -04:00
Gabriel Peal
c8fab51372 Use ConversationId instead of raw Uuids (#3282)
We're trying to migrate from `session_id: Uuid` to `conversation_id:
ConversationId`. Not only does this give us more type safety, it also
unifies our terminology across Codex; with the implementation of session
resuming, a conversation (which can span multiple sessions) is the more
appropriate concept.

I started this impl on https://github.com/openai/codex/pull/3219 as part
of getting resume working in the extension, but it's big enough that it
should be broken out.
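
The type-safety argument is the standard newtype pattern; a simplified sketch (the real type lives in codex-rs/protocol and carries more impls):

```rust
use uuid::Uuid;

/// Wrapping the raw Uuid gives a distinct type, so a session id can no longer
/// be passed where a conversation id is expected.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ConversationId(Uuid);

impl ConversationId {
    pub fn new() -> Self {
        Self(Uuid::new_v4())
    }
}

fn main() {
    let id = ConversationId::new();
    println!("{id:?}");
}
```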
2025-09-07 23:22:25 -04:00
pakrym-oai
0269096229 Move token usage/context information to session level (#3221)
Move context information into the main loop so it can be used to
interrupt the loop or start auto-compaction.
2025-09-06 15:19:23 +00:00
pakrym-oai
5775174ec2 Never store requests (#3212)
When item ids are sent to the Responses API, it loads them from the
database, ignoring the provided values. This adds extra latency.

Not having a mode that stores requests also allows us to simplify the
code.

## Breaking change

The `disable_response_storage` configuration option is removed.
2025-09-05 10:41:47 -07:00
pakrym-oai
c636f821ae Add a common way to create HTTP client (#3110)
Ensure User-Agent and originator are always sent.
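
A sketch of the "one place builds the client" idea with reqwest default headers; the header names and values here are illustrative, not the exact ones codex sends:

```rust
use reqwest::header::{HeaderMap, HeaderValue, USER_AGENT};

/// Bake User-Agent and the originator header into default headers so every
/// request made with this client carries them.
fn create_client(originator: &str) -> reqwest::Result<reqwest::Client> {
    let mut headers = HeaderMap::new();
    headers.insert(USER_AGENT, HeaderValue::from_static("codex-cli"));
    headers.insert(
        "originator",
        HeaderValue::from_str(originator).expect("invalid originator header value"),
    );
    reqwest::Client::builder().default_headers(headers).build()
}
```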
2025-09-03 10:11:02 -07:00
pakrym-oai
03e2796ca4 Move CodexAuth and AuthManager to the core crate (#3074)
Fix a long standing layering issue.
2025-09-02 18:36:19 -07:00
Eric Traut
051f185ce3 Added back the logic to handle rate-limit errors when using API key (#3070)
A previous PR removed this when adding rate-limit errors for the ChatGPT
auth path.
2025-09-02 17:50:15 -07:00
Ahmed Ibrahim
9dbe7284d2 Following up on #2371 post commit feedback (#2852)
- Introduce a websearch end event to complement the begin event
- Move the logic of adding the websearch tool to
create_tools_json_for_responses_api
- Make it the client's responsibility to toggle the tool on or off
- Other misc #2371 post-commit feedback
- Show the query:

<img width="1392" height="151" alt="image"
src="https://github.com/user-attachments/assets/8457f1a6-f851-44cf-bcca-0d4fe460ce89"
/>
2025-08-28 19:24:38 -07:00
Ahmed Ibrahim
d0e06f74e2 send context window with task started (#2752)
- Send context window with task started
- Accounting for changing the model per turn
2025-08-27 00:04:21 -07:00
Eric Traut
ab9250e714 Improved user message for rate-limit errors (#2695)
This PR improves the error message presented to the user when logged in
with ChatGPT and a rate-limit error occurs. In particular, it provides
the user with information about when the rate limit will be reset. It
removes older code that attempted to do the same but relied on parsing
of error messages that are not generated by the ChatGPT endpoint. The
new code uses newly-added error fields.
2025-08-25 21:42:10 -07:00
Eric Traut
d63e44ae29 Fixed a bug that causes token refresh to not work in a seamless manner (#2699)
This PR fixes a bug in the token refresh logic. Token refresh is
performed in a retry loop so if we receive a 401 error, we refresh the
token, then we go around the loop again and reissue the fetch with a
fresh token. The bug is that we're not using the updated token on the
second and subsequent times through the loop. The result is that we'll
try to refresh the token a few more times until we hit the retry limit
(default of 4). The 401 error is then passed back up to the caller.
Subsequent calls will use the refreshed token, so the problem clears
itself up.

The fix is straightforward — make sure we use the updated auth
information each time through the retry loop.
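
A simplified, self-contained sketch of the fixed pattern (placeholder types, not the real codex-rs auth code): re-read the possibly refreshed token on every pass through the loop instead of reusing a copy captured before it.

```rust
use std::sync::Mutex;

#[derive(Debug)]
enum FetchError {
    Unauthorized,
}

/// Placeholder auth store; the real auth handling is more involved.
struct Auth {
    token: Mutex<String>,
}

impl Auth {
    fn current_token(&self) -> String {
        self.token.lock().unwrap().clone()
    }
    fn refresh(&self) {
        *self.token.lock().unwrap() = "fresh-token".to_string();
    }
}

/// Stand-in for the real HTTP call: only the fresh token succeeds.
fn send_request(token: &str) -> Result<String, FetchError> {
    if token == "fresh-token" {
        Ok("200 OK".to_string())
    } else {
        Err(FetchError::Unauthorized)
    }
}

fn fetch_with_refresh(auth: &Auth, max_retries: u32) -> Result<String, FetchError> {
    for _attempt in 0..=max_retries {
        // Re-read the current token each iteration, not a stale copy.
        let token = auth.current_token();
        match send_request(&token) {
            Ok(response) => return Ok(response),
            Err(FetchError::Unauthorized) => auth.refresh(),
        }
    }
    Err(FetchError::Unauthorized)
}

fn main() {
    let auth = Auth { token: Mutex::new("expired-token".to_string()) };
    // Succeeds on the second pass because the refreshed token is re-read.
    assert_eq!(fetch_with_refresh(&auth, 4).unwrap(), "200 OK");
}
```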
2025-08-25 19:18:16 -07:00
Reuben Narad
363636f5eb Add web search tool (#2371)
Adds web_search tool, enabling the model to use Responses API web_search
tool.
- Disabled by default, enabled by --search flag
- When --search is passed, exposes web_search_request function tool to
the model, which triggers user approval. When approved, the model can
use the web_search tool for the remainder of the turn
<img width="1033" height="294" alt="image"
src="https://github.com/user-attachments/assets/62ac6563-b946-465c-ba5d-9325af28b28f"
/>

---------

Co-authored-by: easong-openai <easong@openai.com>
2025-08-23 22:58:56 -07:00
Ahmed Ibrahim
097782c775 Move models.rs to protocol (#2595)
Moving models.rs to protocol so we can use them in `Codex` operations
2025-08-22 22:18:54 +00:00
Dylan
236c4f76a6 [apply_patch] freeform apply_patch tool (#2576)
## Summary
GPT-5 introduced the concept of [custom
tools](https://platform.openai.com/docs/guides/function-calling#custom-tools),
which allow the model to send a raw string result back, simplifying
json-escape issues. We are migrating gpt-5 to use this by default.

However, gpt-oss models do not support custom tools, only normal
functions. So we keep both tool definitions, and provide whichever one
the model family supports.

## Testing
- [x] Tested locally with various models
- [x] Unit tests pass
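
A hedged sketch of "keep both tool definitions and provide whichever one the model family supports"; names and the schema string are illustrative, not the codex-rs definitions:

```rust
enum ToolSpec {
    /// Freeform ("custom") tool: the model returns a raw string.
    Custom { name: &'static str, description: &'static str },
    /// Classic JSON function tool for families (e.g. gpt-oss) without custom tools.
    Function { name: &'static str, parameters_json_schema: &'static str },
}

struct ModelFamily {
    supports_custom_tools: bool,
}

fn apply_patch_tool(model: &ModelFamily) -> ToolSpec {
    if model.supports_custom_tools {
        ToolSpec::Custom {
            name: "apply_patch",
            description: "Apply a patch given as a raw string",
        }
    } else {
        ToolSpec::Function {
            name: "apply_patch",
            parameters_json_schema: r#"{"type":"object","properties":{"input":{"type":"string"}}}"#,
        }
    }
}

fn main() {
    let gpt_oss = ModelFamily { supports_custom_tools: false };
    assert!(matches!(apply_patch_tool(&gpt_oss), ToolSpec::Function { .. }));
}
```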
2025-08-22 13:42:34 -07:00
Eric Traut
dc42ec0eb4 Add AuthManager and enhance GetAuthStatus command (#2577)
This PR adds a central `AuthManager` struct that manages the auth
information used across conversations and the MCP server. Prior to this,
each conversation and the MCP server got their own private snapshots of
the auth information, and changes to one (such as a logout or token
refresh) were not seen by others.

This is especially problematic when multiple instances of the CLI are
run. For example, consider the case where you start CLI 1 and log in to
ChatGPT account X and then start CLI 2 and log out and then log in to
ChatGPT account Y. The conversation in CLI 1 is still using account X,
but if you create a new conversation, it will suddenly (and
unexpectedly) switch to account Y.

With the `AuthManager`, auth information is read from disk at the time
the `ConversationManager` is constructed, and it is cached in memory.
All new conversations use this same auth information, as do any token
refreshes.

The `AuthManager` is also used by the MCP server's GetAuthStatus
command, which now returns the auth method currently used by the MCP
server.

This PR also includes an enhancement to the GetAuthStatus command. It
now accepts two new (optional) input parameters: `include_token` and
`refresh_token`. Callers can use this to request the in-use auth token
and can optionally request to refresh the token.

The PR also adds tests for the login and auth APIs that I recently added
to the MCP server.
2025-08-22 13:10:11 -07:00
vjain419
80b00a193e feat(gpt5): add model_verbosity for GPT‑5 via Responses API (#2108)
**Summary**
- Adds `model_verbosity` config (values: low, medium, high).
- Sends `text.verbosity` only for GPT‑5 family models via the Responses
API.
- Updates docs and adds serialization tests.

**Motivation**
- GPT‑5 introduces a verbosity control to steer output length/detail
without prompt surgery.
- Exposing it as a config knob keeps prompts stable and makes behavior
explicit and repeatable.

**Changes**
- Config:
  - Added `Verbosity` enum (low|medium|high).
  - Added optional `model_verbosity` to `ConfigToml`, `Config`, and
`ConfigProfile`.
- Request wiring:
  - Extended `ResponsesApiRequest` with an optional `text` object.
  - Populates `text.verbosity` only when the model family is `gpt-5`;
omitted otherwise.
- Tests:
  - Verifies `text.verbosity` serializes when set and is omitted when not
set.
- Docs:
  - Added “GPT‑5 Verbosity” section in `codex-rs/README.md`.
  - Added `model_verbosity` section to `codex-rs/config.md`.

**Usage**
- In `~/.codex/config.toml`:
  - `model = "gpt-5"`
  - `model_verbosity = "low"` (or `"medium"` default, `"high"`)
- CLI override example:
  - `codex -c model="gpt-5" -c model_verbosity="high"`

**API Impact**
- Requests to GPT‑5 via the Responses API include
`text: { verbosity: "low|medium|high" }` when configured.
- For legacy models or Chat Completions providers, `text` is omitted.

**Backward Compatibility**
- Default behavior is unchanged when `model_verbosity` is not set (server
default “medium”).

**Testing**
- Added unit tests for serialization/omission of `text.verbosity`.
- Ran `cargo fmt` and `cargo test --all-features` (all green).

**Docs**
- `README.md`: new “GPT‑5 Verbosity” note under Config with example.
- `config.md`: new `model_verbosity` section.

**Out of Scope**
- No changes to temperature/top_p or other GPT‑5 parameters.
- No changes to Chat Completions wiring.

**Risks / Notes**
- If OpenAI changes the wire shape for verbosity, we may need to update
`ResponsesApiRequest`.
- Behavior is gated to the `gpt-5` model family to avoid unexpected
effects elsewhere.

**Checklist**
- [x] Code gated to GPT‑5 family only
- [x] Docs updated (`README.md`, `config.md`)
- [x] Tests added and passing
- [x] Formatting applied

Release note: Add `model_verbosity` config to control GPT‑5 output verbosity via the Responses API (low|medium|high).
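
A small serde sketch of the request wiring described above; the type and field names approximate the ones mentioned in the PR rather than reproducing them exactly:

```rust
use serde::Serialize;

#[derive(Serialize)]
#[serde(rename_all = "lowercase")]
enum Verbosity {
    Low,
    Medium,
    High,
}

#[derive(Serialize)]
struct TextControls {
    verbosity: Verbosity,
}

#[derive(Serialize)]
struct ResponsesApiRequest {
    model: String,
    /// Populated only for the gpt-5 family; omitted from the JSON otherwise.
    #[serde(skip_serializing_if = "Option::is_none")]
    text: Option<TextControls>,
}

fn main() {
    let request = ResponsesApiRequest {
        model: "gpt-5".to_string(),
        text: Some(TextControls { verbosity: Verbosity::Low }),
    };
    // => {"model":"gpt-5","text":{"verbosity":"low"}}
    println!("{}", serde_json::to_string(&request).unwrap());
}
```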
2025-08-22 09:12:10 -07:00
Ahmed Ibrahim
c579ae41ae Fix login for internal employees (#2528)
This PR:
- Fixes login for internal employees, because we currently want to prefer
SIWC for them.
- Fixes retrying forever on unauthorized access; we need to break
eventually on max retries.
2025-08-20 14:05:20 -07:00
Michael Bolin
ce434b1219 fix: prefer config var to env var (#2495) 2025-08-20 04:51:59 +00:00
Ahmed Ibrahim
d1f1e36836 Refresh ChatGPT auth token (#2484)
ChatGPT tokens live for only 1 hour. If the session runs longer, we don't
refresh the token. We should read the expiry timestamp and attempt to
refresh before it.
2025-08-19 21:01:31 -07:00