- For absolute, use non-cached input + output.
- For estimating what % of the model's context window is used, we need
to account for reasoning output tokens from prior turns being dropped
from the context window. We approximate this here by subtracting
reasoning output tokens from the total. This will be off for the current
turn and pending function calls. We can improve it later.
This will make @ more discoverable (even though it is currently not
super useful, IMO it should be used to bring files into context from
outside CWD)
---------
Co-authored-by: Gabriel Peal <gpeal@users.noreply.github.com>
Replaces the `include_default_writable_roots` option on
`sandbox_workspace_write` (that defaulted to `true`, which was slightly
weird/annoying) with `exclude_tmpdir_env_var`, which defaults to
`false`.
Though perhaps more importantly `/tmp` is now enabled by default as part
of `sandbox_mode = "workspace-write"`, though `exclude_slash_tmp =
false` can be used to disable this.
Cursor wasn't moving when inserting a file, resulting in being not at
the end of the filename when inserting the file.
This fixes it by moving the cursor to the end of the file + one trailing
space.
Example screenshot after selecting a file when typing `@`
<img width="823" height="268" alt="image"
src="https://github.com/user-attachments/assets/ec6e3741-e1ba-4752-89d2-11f14a2bd69f"
/>
This sets up the scaffolding and basic flow for a TUI onboarding
experience. It covers sign in with ChatGPT, env auth, as well as some
safety guidance.
Next up:
1. Replace the git warning screen
2. Use this to configure default approval/sandbox modes
Note the shimmer flashes are from me slicing the video, not jank.
https://github.com/user-attachments/assets/0fbe3479-fdde-41f3-87fb-a7a83ab895b8
## Summary
We have been returning `exit code 0` from the apply patch command when
writes fail, which causes our `exec` harness to pass back confusing
messages to the model. Instead, we should loudly fail so that the
harness and the model can handle these errors appropriately.
Also adds a test to confirm this behavior.
## Testing
- `cargo test -p codex-apply-patch`
- Arguably a bugfix as previously CTRL-Z didn't do anything.
- Only in TUI mode for now. This may make sense in other modes... to be
researched.
- The TUI runs the terminal in raw mode and the signals arrive as key
events, so we handle CTRL-Z as a key event just like CTRL-C.
- Not adding UI for it as a composer redesign is coming, and we can just
add it then.
- We should follow with CTRL-Z a second time doing the native terminal
action.
- Updates the launch screen to:
```
>_ You are using OpenAI Codex in ~/code/codex/codex-rs
Try one of the following commands to get started:
1. /init - Create an AGENTS.md file with instructions for Codex
2. /status - Show current session configuration and token usage
3. /compact - Compact the chat history
4. /new - Start a new chat
```
- These aren't the perfect commands, but as more land soon we can
update.
- We should also add logic later to make /init only show when there's no
existing AGENTS.md.
- Majorly need to iterate on copy.
<img width="905" height="769" alt="image"
src="https://github.com/user-attachments/assets/5912939e-fb0e-4e76-94ff-785261e2d6ee"
/>
## Summary
- Prioritize provider-specific API keys over default Codex auth when
building requests
- Add test to ensure provider env var auth overrides default auth
## Testing
- `just fmt`
- `just fix` *(fails: `let` expressions in this position are unstable)*
- `cargo test --all-features` *(fails: `let` expressions in this
position are unstable)*
------
https://chatgpt.com/codex/tasks/task_i_68926a104f7483208f2c8fd36763e0e3
The docs and code do not match. It turns out the docs are "right" in
they are what we have been meaning to support, so this PR updates the
code:
ae88b69b09/README.md (L298-L302)
Support for `instructions.md` is a holdover from the TypeScript CLI, so
we are just going to drop support for it altogether rather than maintain
it in perpetuity.
## Summary
Forgot to remove this in #1869 last night! Too much of a performance hit
on the main thread. We can bring it back via an async thread on startup.
## Summary
Includes a new user message in the api payload which provides useful
environment context for the model, so it knows about things like the
current working directory and the sandbox.
## Testing
Updated unit tests
## Summary
Have seen these tests flaking over the course of today on different
boxes. `wiremock` seems to be generally written with tokio/threads in
mind but based on the weird panics from the tests, let's see if this
helps.
- Added a `/status` command, which will be useful when we update the
home screen to print less status.
- Moved `create_config_summary_entries` to common since it's used in a
few places.
- Noticed we inconsistently had periods in slash command descriptions
and just removed them everywhere.
- Noticed the diff description was overflowing so made it shorter.
This script attempts to verify that:
- You have no local, uncommitted changes.
- You are on `main`
- The commit you are on exists on `main` also exists on the origin
`https://github.com/openai/codex`, i.e., it is not just a commit you
have pushed to your local version of `main`
As part of this, try to print better error message if/when these
conditions are violated.
https://github.com/openai/codex/pull/1868 is a related fix that was in
flight simultaenously, but after talking to @easong-openai, this:
- logs instead of renders for `BackgroundEvent`
- logs for `TurnDiff`
- renders for `PatchApplyEnd`
## Summary
A split-up PR of #1763 , stacked on top of a tools refactor #1858 to
make the change clearer. From the previous summary:
> Let's try something new: tell the model about the sandbox, and let it
decide when it will need to break the sandbox. Some local testing
suggests that it works pretty well with zero iteration on the prompt!
## Testing
- [x] Added unit tests
- [x] Tested locally and it appears to work smoothly!
## Summary
In an effort to make tools easier to work with and more configurable,
I'm introducing `ToolConfig` and updating `Prompt` to take in a general
list of Tools. I think this is simpler and better for a few reasons:
- We can easily assemble tools from various sources (our own harness,
mcp servers, etc.) and we can consolidate the logic for constructing the
logic in one place that is separate from serialization.
- client.rs no longer needs arbitrary config values, it just takes in a
list of tools to serialize
A hefty portion of the PR is now updating our conversion of
`mcp_types::Tool` to `OpenAITool`, but considering that @bolinfest
accurately called this out as a TODO long ago, I think it's time we
tackled it.
## Testing
- [x] Experimented locally, no changes, as expected
- [x] Added additional unit tests
- [x] Responded to rust-review
Previous to this PR, `ShutdownComplete` was not being handled correctly
in `codex exec`, so it always ended up printing the following to stderr:
```
ERROR codex_exec: Error receiving event: InternalAgentDied
```
Because we were not breaking out of the loop for `ShutdownComplete`,
inevitably `codex.next_event()` would get called again and
`rx_event.recv()` would fail and the error would get mapped to
`InternalAgentDied`:
ea7d3f27bd/codex-rs/core/src/codex.rs (L190-L197)
For reference, https://github.com/openai/codex/pull/1647 introduced the
`ShutdownComplete` variant.
## Summary
Escalating out of sandbox is (almost always) not going to fix
long-running commands timing out - therefore we should just pass the
failure back to the model instead of asking the user to re-run a command
that took a long time anyway.
## Testing
- [x] Ran locally with a timeout and confirmed this worked as expected
This PR started as an investigation with the goal of eliminating the use
of `unsafe { std::env::set_var() }` in `ollama/src/client.rs`, as
setting environment variables in a multithreaded context is indeed
unsafe and these tests were observed to be flaky, as a result.
Though as I dug deeper into the issue, I discovered that the logic for
instantiating `OllamaClient` under test scenarios was not quite right.
In this PR, I aimed to:
- share more code between the two creation codepaths,
`try_from_oss_provider()` and `try_from_provider_with_base_url()`
- use the values from `Config` when setting up Ollama, as we have
various mechanisms for overriding config values, so we should be sure
that we are always using the ultimate `Config` for things such as the
`ModelProviderInfo` associated with the `oss` id
Once this was in place,
`OllamaClient::try_from_provider_with_base_url()` could be used in unit
tests for `OllamaClient` so it was possible to create a properly
configured client without having to set environment variables.
I ended up force-pushing https://github.com/openai/codex/pull/1848
because CI jobs were not being triggered after updating the PR on
GitHub, so this spelling error sneaked through.
This adds support for easily running Codex backed by a local Ollama
instance running our new open source models. See
https://github.com/openai/gpt-oss for details.
If you pass in `--oss` you'll be prompted to install/launch ollama, and
it will automatically download the 20b model and attempt to use it.
We'll likely want to expand this with some options later to make the
experience smoother for users who can't run the 20b or want to run the
120b.
Co-authored-by: Michael Bolin <mbolin@openai.com>
https://github.com/openai/codex/pull/1835 has some messed up history.
This adds support for streaming chat completions, which is useful for ollama. We should probably take a very skeptical eye to the code introduced in this PR.
---------
Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
To date, we have a number of hardcoded OpenAI model slug checks spread
throughout the codebase, which makes it hard to audit the various
special cases for each model. To mitigate this issue, this PR introduces
the idea of a `ModelFamily` that has fields to represent the existing
special cases, such as `supports_reasoning_summaries` and
`uses_local_shell_tool`.
There is a `find_family_for_model()` function that maps the raw model
slug to a `ModelFamily`. This function hardcodes all the knowledge about
the special attributes for each model. This PR then replaces the
hardcoded model name checks with checks against a `ModelFamily`.
Note `ModelFamily` is now available as `Config::model_family`. We should
ultimately remove `Config::model` in favor of
`Config::model_family::slug`.
Stream models thoughts and responses instead of waiting for the whole
thing to come through. Very rough right now, but I'm making the risk call to push through.