Commit Graph

203 Commits

Author SHA1 Message Date
pakrym-oai
0cf57e1f42 Include output truncation message in tool call results (#2183)
To avoid model being confused about incomplete output.
2025-08-11 11:52:05 -07:00
Gabriel Peal
7f6408720b [1/3] Parse exec commands and format them more nicely in the UI (#2095)
# Note for reviewers
The bulk of this PR is in in the new file, `parse_command.rs`. This file
is designed to be written TDD and implemented with Codex. Do not worry
about reviewing the code, just review the unit tests (if you want). If
any cases are missing, we'll add more tests and have Codex fix them.

I think the best approach will be to land and iterate. I have some
follow-ups I want to do after this lands. The next PR after this will
let us merge (and dedupe) multiple sequential cells of the same such as
multiple read commands. The deduping will also be important because the
model often reads the same file multiple times in a row in chunks

===

This PR formats common commands like reading, formatting, testing, etc
more nicely:

It tries to extract things like file names, tests and falls back to the
cmd if it doesn't. It also only shows stdout/err if the command failed.

<img width="770" height="238" alt="CleanShot 2025-08-09 at 16 05 15"
src="https://github.com/user-attachments/assets/0ead179a-8910-486b-aa3d-7d26264d751e"
/>
<img width="348" height="158" alt="CleanShot 2025-08-09 at 16 05 32"
src="https://github.com/user-attachments/assets/4302681b-5e87-4ff3-85b4-0252c6c485a9"
/>
<img width="834" height="324" alt="CleanShot 2025-08-09 at 16 05 56 2"
src="https://github.com/user-attachments/assets/09fb3517-7bd6-40f6-a126-4172106b700f"
/>

Part 2: https://github.com/openai/codex/pull/2097
Part 3: https://github.com/openai/codex/pull/2110
2025-08-11 14:26:15 -04:00
pakrym-oai
0aa7efe05b Trace RAW sse events (#2056)
For easier parsing.
2025-08-11 10:35:03 -07:00
Yaroslav
f146981b73 feat: add JSON schema sanitization for MCP tools to ensure compatibil… (#1975)
…ity with internal JsonSchema enum

Closes: #1973 

Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>
2025-08-10 17:57:39 -07:00
Dylan
0091930f5a [core] Allow resume after client errors (#2053)
## Summary
Allow tui conversations to resume after the client fails out of retries.
I tested this with exec / mocked api failures as well, and it appears to
be fine. But happy to add an exec integration test as well!

## Testing
- [x] Added integration test
- [x] Tested locally
2025-08-08 18:21:19 -07:00
aibrahim-oai
6cfee15612 Moving the compact prompt near where it's used (#2031)
- Moved the prompt for compact to core
- Renamed it to be more clear
2025-08-08 12:43:43 -07:00
pakrym-oai
307d9957fa Fix usage limit banner grammar (#2018)
## Summary
- fix typo in usage limit banner text
- update error message tests

## Testing
- `just fmt`
- `RUSTC_BOOTSTRAP=1 just fix` *(fails: `let` expressions in this
position are unstable)*
- `RUSTC_BOOTSTRAP=1 cargo test --all-features` *(fails: `let`
expressions in this position are unstable)*

------
https://chatgpt.com/codex/tasks/task_i_689610fc1fe4832081bdd1118779b60b
2025-08-08 08:50:44 -07:00
pakrym-oai
431c9299d4 Remove part of the error message (#1983) 2025-08-08 02:01:53 +00:00
easong-openai
52e12f2b6c Revert "Streaming markdown (#1920)" (#1981)
This reverts commit 2b7139859e.
2025-08-08 01:38:39 +00:00
easong-openai
2b7139859e Streaming markdown (#1920)
We wait until we have an entire newline, then format it with markdown and stream in to the UI. This reduces time to first token but is the right thing to do with our current rendering model IMO. Also lets us add word wrapping!
2025-08-07 18:26:47 -07:00
pakrym-oai
fa0051190b Adjust error messages (#1969)
<img width="1378" height="285" alt="image"
src="https://github.com/user-attachments/assets/f0283378-f839-4a1f-8331-909694a04b1f"
/>
2025-08-07 18:24:34 -07:00
Michael Bolin
295abf3e51 chore: change CodexAuth::from_api_key() to take &str instead of String (#1970)
Good practice and simplifies some of the call sites.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1970).
* #1971
* __->__ #1970
* #1966
* #1965
* #1962
2025-08-07 16:55:33 -07:00
Michael Bolin
b991c04f86 chore: move top-level load_auth() to CodexAuth::from_codex_home() (#1966)
There are two valid ways to create an instance of `CodexAuth`:
`from_api_key()` and `from_codex_home()`. Now both are static methods of
`CodexAuth` and are listed first in the implementation.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1966).
* #1971
* #1970
* __->__ #1966
* #1965
* #1962
2025-08-07 16:49:37 -07:00
Dylan
548466df09 [client] Tune retries and backoff (#1956)
## Summary
10 is a bit excessive 😅 Also updates our backoff factor to space out
requests further.
2025-08-07 15:23:31 -07:00
Michael Bolin
7d67159587 fix: public load_auth() fn always called with include_env_var=true (#1961)
Apparently `include_env_var=false` was only used for testing, so clean
up the API a little to make that clear.


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1961).
* #1962
* __->__ #1961
2025-08-07 14:19:30 -07:00
pakrym-oai
f23c3066c8 Add capacity error (#1947) 2025-08-07 10:46:43 -07:00
pakrym-oai
a593b1c3ab Use different field for error type (#1945) 2025-08-07 10:20:33 -07:00
Michael Bolin
107d2ce4e7 fix: change OPENAI_DEFAULT_MODEL to "gpt-5" (#1943) 2025-08-07 10:13:13 -07:00
pakrym-oai
62ed5907f9 Better usage errors (#1941)
<img width="771" height="279" alt="image"
src="https://github.com/user-attachments/assets/e56f967f-bcd7-49f7-8a94-3d88df68b65a"
/>
2025-08-07 09:46:13 -07:00
Dylan
bc28b87c7b [config] Onboarding flow with persistence (#1929)
## Summary
In collaboration with @gpeal: upgrade the onboarding flow, and persist
user settings.

---------

Co-authored-by: Gabriel Peal <gabriel@openai.com>
2025-08-07 09:27:38 -07:00
pakrym-oai
7e9ecfbc6a Rename the model (#1942) 2025-08-07 09:07:51 -07:00
Michael Bolin
c2c327c723 feat: change shell_environment_policy to default to inherit="all" (#1904)
Trying to use `core` as the default has been "too clever." Users can
always take responsibility for controlling the env without this setting
at all by specifying the `env` they use when calling `codex` in the
first place.

See https://github.com/openai/codex/issues/1249.
2025-08-07 01:55:41 -07:00
Michael Bolin
13982d6b4e chore: fix outstanding review comments from the bot on #1919 (#1928)
I should have read the comments before submitting!
2025-08-07 01:30:13 -07:00
ae
28395df957 [fix] fix absolute and % token counts (#1931)
- For absolute, use non-cached input + output.
- For estimating what % of the model's context window is used, we need
to account for reasoning output tokens from prior turns being dropped
from the context window. We approximate this here by subtracting
reasoning output tokens from the total. This will be off for the current
turn and pending function calls. We can improve it later.
2025-08-07 08:13:36 +00:00
Ed Bayes
eb80614a7c Tint chat composer background (#1921)
## Summary
- give the chat composer a subtle custom background and apply it across
the full area drawn

<img width="1008" height="718" alt="composer-bg"
src="https://github.com/user-attachments/assets/4b0f7f69-722a-438a-b4e9-0165ae8865a6"
/>

- update turn interrupted to be more human readable
<img width="648" height="170" alt="CleanShot 2025-08-06 at 22 44 47@2x"
src="https://github.com/user-attachments/assets/8d35e53a-bbfa-48e7-8612-c280a54e01dd"
/>

## Testing
- `cargo test --all-features` *(fails: `let` expressions in
`core/src/client.rs` require newer rustc)*
- `just fix` *(fails: `let` expressions in `core/src/client.rs` require
newer rustc)*

------
https://chatgpt.com/codex/tasks/task_i_68941f32c1008322bbcc39ee1d29a526
2025-08-07 00:46:45 -07:00
Michael Bolin
cd5f9074af feat: add /tmp by default (#1919)
Replaces the `include_default_writable_roots` option on
`sandbox_workspace_write` (that defaulted to `true`, which was slightly
weird/annoying) with `exclude_tmpdir_env_var`, which defaults to
`false`.

Though perhaps more importantly `/tmp` is now enabled by default as part
of `sandbox_mode = "workspace-write"`, though `exclude_slash_tmp =
false` can be used to disable this.
2025-08-07 00:17:00 -07:00
aibrahim-oai
f15e0fe1df Ensure exec command end always emitted (#1908)
## Summary
- defer ExecCommandEnd emission until after sandbox resolution
- make sandbox error handler return final exec output and response
- align sandbox error stderr with response content and rename to
`final_output`
- replace unstable `let` chains in client command header logic

## Testing
- `just fmt`
- `just fix`
- `cargo test --all-features` *(fails: NotPresent in
core/tests/client.rs)*

------
https://chatgpt.com/codex/tasks/task_i_6893e63b0c408321a8e1ff2a052c4c51
2025-08-07 06:25:56 +00:00
Gabriel Peal
8a990b5401 Migrate GitWarning to OnboardingScreen (#1915)
This paves the way to do per-directory approval settings
(https://github.com/openai/codex/pull/1912).

This also lets us pass in a Config/ChatWidgetArgs into onboarding which
can then mutate it and emit the ChatWidgetArgs it wants at the end which
may be modified by the said approval settings.

<img width="1180" height="428" alt="CleanShot 2025-08-06 at 19 30 55"
src="https://github.com/user-attachments/assets/4dcfda42-0f5e-4b6d-a16d-2597109cc31c"
/>
2025-08-06 22:39:07 -04:00
pakrym-oai
57c973b571 Add 2025-08-06 model family (#1899) 2025-08-06 23:14:02 +00:00
Gabriel Peal
2d5de795aa First pass at a TUI onboarding (#1876)
This sets up the scaffolding and basic flow for a TUI onboarding
experience. It covers sign in with ChatGPT, env auth, as well as some
safety guidance.

Next up:
1. Replace the git warning screen
2. Use this to configure default approval/sandbox modes


Note the shimmer flashes are from me slicing the video, not jank.

https://github.com/user-attachments/assets/0fbe3479-fdde-41f3-87fb-a7a83ab895b8
2025-08-06 18:22:14 -04:00
pakrym-oai
8262ba58b2 Prefer env var auth over default codex auth (#1861)
## Summary
- Prioritize provider-specific API keys over default Codex auth when
building requests
- Add test to ensure provider env var auth overrides default auth

## Testing
- `just fmt`
- `just fix` *(fails: `let` expressions in this position are unstable)*
- `cargo test --all-features` *(fails: `let` expressions in this
position are unstable)*

------
https://chatgpt.com/codex/tasks/task_i_68926a104f7483208f2c8fd36763e0e3
2025-08-06 13:02:00 -07:00
Michael Bolin
64f2f2eca2 fix: support $CODEX_HOME/AGENTS.md instead of $CODEX_HOME/instructions.md (#1891)
The docs and code do not match. It turns out the docs are "right" in
they are what we have been meaning to support, so this PR updates the
code:


ae88b69b09/README.md (L298-L302)

Support for `instructions.md` is a holdover from the TypeScript CLI, so
we are just going to drop support for it altogether rather than maintain
it in perpetuity.
2025-08-06 11:48:03 -07:00
Dylan
dc468d563f [env] Remove git config for now (#1884)
## Summary
Forgot to remove this in #1869 last night! Too much of a performance hit
on the main thread. We can bring it back via an async thread on startup.
2025-08-06 08:05:17 -07:00
Dylan
3e8bcf0247 [prompts] Add <environment_context> (#1869)
## Summary
Includes a new user message in the api payload which provides useful
environment context for the model, so it knows about things like the
current working directory and the sandbox.

## Testing
Updated unit tests
2025-08-06 01:13:31 -07:00
easong-openai
f8d70d67b6 Add OSS model info (#1860)
Add somewhat arbitrarily chosen context window/output limit.
2025-08-05 22:35:00 -07:00
Dylan
725dd6be6a [approval_policy] Add OnRequest approval_policy (#1865)
## Summary
A split-up PR of #1763 , stacked on top of a tools refactor #1858 to
make the change clearer. From the previous summary:

> Let's try something new: tell the model about the sandbox, and let it
decide when it will need to break the sandbox. Some local testing
suggests that it works pretty well with zero iteration on the prompt!

## Testing
- [x] Added unit tests
- [x] Tested locally and it appears to work smoothly!
2025-08-05 20:44:20 -07:00
Dylan
aff97ed7dd [core] Separate tools config from openai client (#1858)
## Summary
In an effort to make tools easier to work with and more configurable,
I'm introducing `ToolConfig` and updating `Prompt` to take in a general
list of Tools. I think this is simpler and better for a few reasons:
- We can easily assemble tools from various sources (our own harness,
mcp servers, etc.) and we can consolidate the logic for constructing the
logic in one place that is separate from serialization.
- client.rs no longer needs arbitrary config values, it just takes in a
list of tools to serialize

A hefty portion of the PR is now updating our conversion of
`mcp_types::Tool` to `OpenAITool`, but considering that @bolinfest
accurately called this out as a TODO long ago, I think it's time we
tackled it.

## Testing
- [x] Experimented locally, no changes, as expected
- [x] Added additional unit tests
- [x] Responded to rust-review
2025-08-05 19:27:52 -07:00
Dylan
ea7d3f27bd [core] Stop escalating timeouts (#1853)
## Summary
Escalating out of sandbox is (almost always) not going to fix
long-running commands timing out - therefore we should just pass the
failure back to the model instead of asking the user to re-run a command
that took a long time anyway.

## Testing
- [x] Ran locally with a timeout and confirmed this worked as expected
2025-08-05 17:52:25 -07:00
Michael Bolin
42bd73e150 chore: remove unnecessary default_ prefix (#1854)
This prefix is not inline with the other fields on the `ConfigOverrides`
struct.
2025-08-05 14:42:49 -07:00
Michael Bolin
d365cae077 fix: when using --oss, ensure correct configuration is threaded through correctly (#1859)
This PR started as an investigation with the goal of eliminating the use
of `unsafe { std::env::set_var() }` in `ollama/src/client.rs`, as
setting environment variables in a multithreaded context is indeed
unsafe and these tests were observed to be flaky, as a result.

Though as I dug deeper into the issue, I discovered that the logic for
instantiating `OllamaClient` under test scenarios was not quite right.
In this PR, I aimed to:

- share more code between the two creation codepaths,
`try_from_oss_provider()` and `try_from_provider_with_base_url()`
- use the values from `Config` when setting up Ollama, as we have
various mechanisms for overriding config values, so we should be sure
that we are always using the ultimate `Config` for things such as the
`ModelProviderInfo` associated with the `oss` id

Once this was in place,
`OllamaClient::try_from_provider_with_base_url()` could be used in unit
tests for `OllamaClient` so it was possible to create a properly
configured client without having to set environment variables.
2025-08-05 13:55:32 -07:00
easong-openai
9285350842 Introduce --oss flag to use gpt-oss models (#1848)
This adds support for easily running Codex backed by a local Ollama
instance running our new open source models. See
https://github.com/openai/gpt-oss for details.

If you pass in `--oss` you'll be prompted to install/launch ollama, and
it will automatically download the 20b model and attempt to use it.

We'll likely want to expand this with some options later to make the
experience smoother for users who can't run the 20b or want to run the
120b.

Co-authored-by: Michael Bolin <mbolin@openai.com>
2025-08-05 11:31:11 -07:00
easong-openai
e0303dbac0 Rescue chat completion changes (#1846)
https://github.com/openai/codex/pull/1835 has some messed up history.

This adds support for streaming chat completions, which is useful for ollama. We should probably take a very skeptical eye to the code introduced in this PR.

---------

Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
2025-08-05 08:56:13 +00:00
Dylan
d31e149cb1 [prompt] Update prompt.md (#1839)
## Summary
Additional clarifications to our prompt. Still very concise, but we'll
continue to add more here.
2025-08-05 00:43:23 -07:00
Michael Bolin
136b3ee5bf chore: introduce ModelFamily abstraction (#1838)
To date, we have a number of hardcoded OpenAI model slug checks spread
throughout the codebase, which makes it hard to audit the various
special cases for each model. To mitigate this issue, this PR introduces
the idea of a `ModelFamily` that has fields to represent the existing
special cases, such as `supports_reasoning_summaries` and
`uses_local_shell_tool`.

There is a `find_family_for_model()` function that maps the raw model
slug to a `ModelFamily`. This function hardcodes all the knowledge about
the special attributes for each model. This PR then replaces the
hardcoded model name checks with checks against a `ModelFamily`.

Note `ModelFamily` is now available as `Config::model_family`. We should
ultimately remove `Config::model` in favor of
`Config::model_family::slug`.
2025-08-04 23:50:03 -07:00
easong-openai
906d449760 Stream model responses (#1810)
Stream models thoughts and responses instead of waiting for the whole
thing to come through. Very rough right now, but I'm making the risk call to push through.
2025-08-05 04:23:22 +00:00
Dylan
063083af15 [prompts] Better user_instructions handling (#1836)
## Summary
Our recent change in #1737 can sometimes lead to the model confusing
AGENTS.md context as part of the message. But a little prompting and
formatting can help fix this!

## Testing
- Ran locally with a few different prompts to verify the model
behaves well.
- Updated unit tests
2025-08-04 18:55:57 -07:00
pakrym-oai
84bcadb8d9 Restore API key and query param overrides (#1826)
Addresses https://github.com/openai/codex/issues/1796
2025-08-04 18:07:49 -07:00
Ahmed Ibrahim
e38ce39c51 Revert to 3f13ebce10 without rewriting history. Wrong merge 2025-08-04 17:03:24 -07:00
Ahmed Ibrahim
1a33de34b0 unify flag 2025-08-04 16:56:52 -07:00
Ahmed Ibrahim
bd171e5206 add raw reasoning 2025-08-04 16:49:42 -07:00