valknar/llmx - llmx - dev.pivoine.art

Author	SHA1	Message	Date
vjain419	80b00a193e	feat(gpt5): add model_verbosity for GPT‑5 via Responses API (#2108 ) Summary - Adds `model_verbosity` config (values: low, medium, high). - Sends `text.verbosity` only for GPT‑5 family models via the Responses API. - Updates docs and adds serialization tests. Motivation - GPT‑5 introduces a verbosity control to steer output length/detail without pro mpt surgery. - Exposing it as a config knob keeps prompts stable and makes behavior explicit and repeatable. Changes - Config: - Added `Verbosity` enum (low\|medium\|high). - Added optional `model_verbosity` to `ConfigToml`, `Config`, and `ConfigProfi le`. - Request wiring: - Extended `ResponsesApiRequest` with optional `text` object. - Populates `text.verbosity` only when model family is `gpt-5`; omitted otherw ise. - Tests: - Verifies `text.verbosity` serializes when set and is omitted when not set. - Docs: - Added “GPT‑5 Verbosity” section in `codex-rs/README.md`. - Added `model_verbosity` section to `codex-rs/config.md`. Usage - In `~/.codex/config.toml`: - `model = "gpt-5"` - `model_verbosity = "low"` (or `"medium"` default, `"high"`) - CLI override example: - `codex -c model="gpt-5" -c model_verbosity="high"` API Impact - Requests to GPT‑5 via Responses API include: `text: { verbosity: "low\|medium\|h igh" }` when configured. - For legacy models or Chat Completions providers, `text` is omitted. Backward Compatibility - Default behavior unchanged when `model_verbosity` is not set (server default “ medium”). Testing - Added unit tests for serialization/omission of `text.verbosity`. - Ran `cargo fmt` and `cargo test --all-features` (all green). Docs - `README.md`: new “GPT‑5 Verbosity” note under Config with example. - `config.md`: new `model_verbosity` section. Out of Scope - No changes to temperature/top_p or other GPT‑5 parameters. - No changes to Chat Completions wiring. Risks / Notes - If OpenAI changes the wire shape for verbosity, we may need to update `Respons esApiRequest`. - Behavior gated to `gpt-5` model family to avoid unexpected effects elsewhere. Checklist - [x] Code gated to GPT‑5 family only - [x] Docs updated (`README.md`, `config.md`) - [x] Tests added and passing - [x] Formatting applied Release note: Add `model_verbosity` config to control GPT‑5 output verbosity via the Responses API (low\|medium\|high).	2025-08-22 09:12:10 -07:00
Dylan	e4c275d615	[apply-patch] Clean up apply-patch tool definitions (#2539 ) ## Summary We've experienced a bit of drift in system prompting for `apply_patch`: - As pointed out in #2030 , our prettier formatting started altering prompt.md in a few ways - We introduced a separate markdown file for apply_patch instructions in #993, but currently duplicate them in the prompt.md file - We added a first-class apply_patch tool in #2303, which has yet another definition This PR starts to consolidate our logic in a few ways: - We now only use `apply_patch_tool_instructions.md](https://github.com/openai/codex/compare/dh--apply-patch-tool-definition?expand=1#diff-d4fffee5f85cb1975d3f66143a379e6c329de40c83ed5bf03ffd3829df985bea) for system instructions - We no longer include apply_patch system instructions if the tool is specified I'm leaving the definition in openai_tools.rs as duplicated text for now because we're going to be iterated on the first-class tool soon. ## Testing - [x] Added integration tests to verify prompt stability - [x] Tested locally with several different models (gpt-5, gpt-oss, o4-mini)	2025-08-21 20:07:41 -07:00
Ahmed Ibrahim	c9963b52e9	consolidate reasoning enums into one (#2428 ) We have three enums for each of reasoning summaries and reasoning effort with same values. They can be consolidated into one.	2025-08-18 11:50:17 -07:00
Kazuhiro Sera	dcfdd2faf5	Fix #2296 Add "minimal" reasoning effort for GPT 5 models (#2326 ) This pull request resolves #2296; I've confirmed if it works by: 1. Add settings to ~/.codex/config.toml: ```toml model_reasoning_effort = "minimal" ``` 2. Run the CLI: ``` cd codex-rs cargo build && RUST_LOG=trace cargo run --bin codex /status tail -f ~/.codex/log/codex-tui.log ``` Co-authored-by: pakrym-oai <pakrym@openai.com>	2025-08-15 12:59:52 -07:00
Parker Thompson	a075424437	Added `allow-expect-in-tests` / `allow-unwrap-in-tests` (#2328 ) This PR: * Added the clippy.toml to configure allowable expect / unwrap usage in tests * Removed as many expect/allow lines as possible from tests * moved a bunch of allows to expects where possible Note: in integration tests, non `#[test]` helper functions are not covered by this so we had to leave a few lingering `expect(expect_used` checks around	2025-08-14 17:59:01 -07:00
Dylan	544980c008	[context] Store context messages in rollouts (#2243 ) ## Summary Currently, we use request-time logic to determine the user_instructions and environment_context messages. This means that neither of these values can change over time as conversations go on. We want to add in additional details here, so we're migrating these to save these messages to the rollout file instead. This is simpler for the client, and allows us to append additional environment_context messages to each turn if we want ## Testing - [x] Integration test coverage - [x] Tested locally with a few turns, confirmed model could reference environment context and cached token metrics were reasonably high	2025-08-14 14:51:13 -04:00
easong-openai	6340acd885	Re-add markdown streaming (#2029 ) Wait for newlines, then render markdown on a line by line basis. Word wrap it for the current terminal size and then spit it out line by line into the UI. Also adds tests and fixes some UI regressions.	2025-08-12 17:37:28 -07:00
pakrym-oai	6a6bf99e2c	Send prompt_cache_key (#2200 ) To optimize prompt caching performance.	2025-08-11 16:37:45 -07:00
Dylan	dc468d563f	[env] Remove git config for now (#1884 ) ## Summary Forgot to remove this in #1869 last night! Too much of a performance hit on the main thread. We can bring it back via an async thread on startup.	2025-08-06 08:05:17 -07:00
Dylan	3e8bcf0247	[prompts] Add <environment_context> (#1869 ) ## Summary Includes a new user message in the api payload which provides useful environment context for the model, so it knows about things like the current working directory and the sandbox. ## Testing Updated unit tests	2025-08-06 01:13:31 -07:00
Dylan	aff97ed7dd	[core] Separate tools config from openai client (#1858 ) ## Summary In an effort to make tools easier to work with and more configurable, I'm introducing `ToolConfig` and updating `Prompt` to take in a general list of Tools. I think this is simpler and better for a few reasons: - We can easily assemble tools from various sources (our own harness, mcp servers, etc.) and we can consolidate the logic for constructing the logic in one place that is separate from serialization. - client.rs no longer needs arbitrary config values, it just takes in a list of tools to serialize A hefty portion of the PR is now updating our conversion of `mcp_types::Tool` to `OpenAITool`, but considering that @bolinfest accurately called this out as a TODO long ago, I think it's time we tackled it. ## Testing - [x] Experimented locally, no changes, as expected - [x] Added additional unit tests - [x] Responded to rust-review	2025-08-05 19:27:52 -07:00
easong-openai	e0303dbac0	Rescue chat completion changes (#1846 ) https://github.com/openai/codex/pull/1835 has some messed up history. This adds support for streaming chat completions, which is useful for ollama. We should probably take a very skeptical eye to the code introduced in this PR. --------- Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>	2025-08-05 08:56:13 +00:00
Michael Bolin	136b3ee5bf	chore: introduce ModelFamily abstraction (#1838 ) To date, we have a number of hardcoded OpenAI model slug checks spread throughout the codebase, which makes it hard to audit the various special cases for each model. To mitigate this issue, this PR introduces the idea of a `ModelFamily` that has fields to represent the existing special cases, such as `supports_reasoning_summaries` and `uses_local_shell_tool`. There is a `find_family_for_model()` function that maps the raw model slug to a `ModelFamily`. This function hardcodes all the knowledge about the special attributes for each model. This PR then replaces the hardcoded model name checks with checks against a `ModelFamily`. Note `ModelFamily` is now available as `Config::model_family`. We should ultimately remove `Config::model` in favor of `Config::model_family::slug`.	2025-08-04 23:50:03 -07:00
Dylan	063083af15	[prompts] Better user_instructions handling (#1836 ) ## Summary Our recent change in #1737 can sometimes lead to the model confusing AGENTS.md context as part of the message. But a little prompting and formatting can help fix this! ## Testing - Ran locally with a few different prompts to verify the model behaves well. - Updated unit tests	2025-08-04 18:55:57 -07:00
pakrym-oai	e0e245cc1c	Send AGENTS.md as a separate user message (#1737 )	2025-07-30 13:56:24 -07:00
pakrym-oai	591cb6149a	Always send entire request context (#1641 ) Always store the entire conversation history. Request encrypted COT when not storing Responses. Send entire input context instead of sending previous_response_id	2025-07-23 10:37:45 -07:00
pakrym-oai	6d82907082	Add support for custom base instructions (#1645 ) Allows providing custom instructions file as a config parameter and custom instruction text via MCP tool call.	2025-07-22 09:42:22 -07:00
aibrahim-oai	2bd3314886	support deltas in core (#1587 ) - Added support for message and reasoning deltas - Skipped adding the support in the cli and tui for later - Commented a failing test (wrong merge) that needs fix in a separate PR. Side note: I think we need to disable merge when the CI don't pass.	2025-07-16 15:11:18 -07:00
Michael Bolin	8a424fcfa3	feat: add new config option: model_supports_reasoning_summaries (#1524 ) As noted in the updated docs, this makes it so that you can set: ```toml model_supports_reasoning_summaries = true ``` as a way of overriding the existing heuristic for when to set the `reasoning` field on a sampling request: `341c091c5b/codex-rs/core/src/client_common.rs (L152-L166)`	2025-07-10 14:30:33 -07:00
Rene Leonhardt	82b0cebe8b	chore(rs): update dependencies (#1494 ) ### Chores - Update cargo dependencies - Remove unused cargo dependencies - Fix clippy warnings - Update Dockerfile (package.json requires node 22) - Let Dependabot update bun, cargo, devcontainers, docker, github-actions, npm (nix still not supported) ### TODO - Upgrade dependencies with breaking changes ```shell $ cargo update --verbose Unchanged crossterm v0.28.1 (available: v0.29.0) Unchanged schemars v0.8.22 (available: v1.0.4) ```	2025-07-10 11:08:16 -07:00
Gabriel Peal	a339a7bcce	[Rust] Allow resuming a session that was killed with ctrl + c (#1387 ) Previously, if you ctrl+c'd a conversation, all subsequent turns would 400 because the Responses API never got a response for one of its call ids. This ensures that if we aren't sending a call id by hand, we generate a synthetic aborted call. Fixes #1244 https://github.com/user-attachments/assets/5126354f-b970-45f5-8c65-f811bca8294a	2025-06-26 14:40:42 -04:00
Michael Bolin	fcfe43c7df	feat: show number of tokens remaining in UI (#1388 ) When using the OpenAI Responses API, we now record the `usage` field for a `"response.completed"` event, which includes metrics about the number of tokens consumed. We also introduce `openai_model_info.rs`, which includes current data about the most common OpenAI models available via the API (specifically `context_window` and `max_output_tokens`). If Codex does not recognize the model, you can set `model_context_window` and `model_max_output_tokens` explicitly in `config.toml`. When then introduce a new event type to `protocol.rs`, `TokenCount`, which includes the `TokenUsage` for the most recent turn. Finally, we update the TUI to record the running sum of tokens used so the percentage of available context window remaining can be reported via the placeholder text for the composer: ![Screenshot 2025-06-25 at 11 20 55 PM](https://github.com/user-attachments/assets/6fd6982f-7247-4f14-84b2-2e600cb1fd49) We could certainly get much fancier with this (such as reporting the estimated cost of the conversation), but for now, we are just trying to achieve feature parity with the TypeScript CLI. Though arguably this improves upon the TypeScript CLI, as the TypeScript CLI uses heuristics to estimate the number of tokens used rather than using the `usage` information directly: `296996d74e/codex-cli/src/utils/approximate-tokens-used.ts (L3-L16)` Fixes https://github.com/openai/codex/issues/1242	2025-06-25 23:31:11 -07:00
Michael Bolin	c6fcec55fe	fix: always send full instructions when using the Responses API (#1207 ) This fixes a longstanding error in the Rust CLI where `codex.rs` contained an errant `is_first_turn` check that would exclude the user instructions for subsequent "turns" of a conversation when using the responses API (i.e., when `previous_response_id` existed). While here, renames `Prompt.instructions` to `Prompt.user_instructions` since we now have quite a few levels of instructions floating around. Also removed an unnecessary use of `clone()` in `Prompt.get_full_instructions()`.	2025-06-03 09:40:19 -07:00
Michael Bolin	6fcc528a43	fix: provide tolerance for apply_patch tool (#993 ) As explained in detail in the doc comment for `ParseMode::Lenient`, we have observed that GPT-4.1 does not always generate a valid invocation of `apply_patch`. Fortunately, the error is predictable, so we introduce some new logic to the `codex-apply-patch` crate to recover from this error. Because we would like to avoid this becoming a de facto standard (as it would be incompatible if `apply_patch` were provided as an actual executable, unless we also introduced the lenient behavior in the executable, as well), we require passing `ParseMode::Lenient` to `parse_patch_text()` to make it clear that the caller is opting into supporting this special case. Note the analogous change to the TypeScript CLI was https://github.com/openai/codex/pull/930. In addition to changing the accepted input to `apply_patch`, it also introduced additional instructions for the model, which we include in this PR. Note that `apply-patch` does not depend on either `regex` or `regex-lite`, so some of the checks are slightly more verbose to avoid introducing this dependency. That said, this PR does not leverage the existing `extract_heredoc_body_from_apply_patch_command()`, which depends on `tree-sitter` and `tree-sitter-bash`: `5a5aa89914/codex-rs/apply-patch/src/lib.rs (L191-L246)` though perhaps it should.	2025-06-03 09:06:38 -07:00
Michael Bolin	0f3cc8f842	feat: make reasoning effort/summaries configurable (#1199 ) Previous to this PR, we always set `reasoning` when making a request using the Responses API: `d7245cbbc9/codex-rs/core/src/client.rs (L108-L111)` Though if you tried to use the Rust CLI with `--model gpt-4.1`, this would fail with: ```shell "Unsupported parameter: 'reasoning.effort' is not supported with this model." ``` We take a cue from the TypeScript CLI, which does a check on the model name: `d7245cbbc9/codex-cli/src/utils/agent/agent-loop.ts (L786-L789)` This PR does a similar check, though also adds support for the following config options: ``` model_reasoning_effort = "low" \| "medium" \| "high" \| "none" model_reasoning_summary = "auto" \| "concise" \| "detailed" \| "none" ``` This way, if you have a model whose name happens to start with `"o"` (or `"codex"`?), you can set these to `"none"` to explicitly disable reasoning, if necessary. (That said, it seems unlikely anyone would use the Responses API with non-OpenAI models, but we provide an escape hatch, anyway.) This PR also updates both the TUI and `codex exec` to show `reasoning effort` and `reasoning summaries` in the header.	2025-06-02 16:01:34 -07:00
Michael Bolin	61b881d4e5	fix: agent instructions were not being included when ~/.codex/instructions.md was empty (#908 ) I had seen issues where `codex-rs` would not always write files without me pressuring it to do so, and between that and the report of https://github.com/openai/codex/issues/900, I decided to look into this further. I found two serious issues with agent instructions: (1) We were only sending agent instructions on the first turn, but looking at the TypeScript code, we should be sending them on every turn. (2) There was a serious issue where the agent instructions were frequently lost: * The TypeScript CLI appears to keep writing `~/.codex/instructions.md`: `55142e3e6c/codex-cli/src/utils/config.ts (L586)` * If `instructions.md` is present, the Rust CLI uses the contents of it INSTEAD OF the default prompt, even if `instructions.md` is empty: `55142e3e6c/codex-rs/core/src/config.rs (L202-L203)` The combination of these two things means that I have been using `codex-rs` without these key instructions: https://github.com/openai/codex/blob/main/codex-rs/core/prompt.md Looking at the TypeScript code, it appears we should be concatenating these three items every time (if they exist): * `prompt.md` * `~/.codex/instructions.md` * nearest `AGENTS.md` This PR fixes things so that: * `Config.instructions` is `None` if `instructions.md` is empty * `Payload.instructions` is now `&'a str` instead of `Option<&'a String>` because we should always have _something_ to send * `Prompt` now has a `get_full_instructions()` helper that returns a `Cow<str>` that will always include the agent instructions first.	2025-05-12 17:24:44 -07:00
Michael Bolin	b4785b5f88	feat: include "reasoning" messages in Rust TUI (#892 ) As shown in the screenshot, we now include reasoning messages from the model in the TUI under the heading "codex reasoning": ![image](https://github.com/user-attachments/assets/d8eb3dc3-2f9f-4e95-847e-d24b421249a8) To ensure these are visible by default when using `o4-mini`, this also changes the default value for `summary` (formerly `generate_summary`, which is deprecated in favor of `summary` according to the docs) from unset to `"auto"`.	2025-05-10 21:43:27 -07:00
Michael Bolin	e924070cee	feat: support the chat completions API in the Rust CLI (#862 ) This is a substantial PR to add support for the chat completions API, which in turn makes it possible to use non-OpenAI model providers (just like in the TypeScript CLI): * It moves a number of structs from `client.rs` to `client_common.rs` so they can be shared. * It introduces support for the chat completions API in `chat_completions.rs`. * It updates `ModelProviderInfo` so that `env_key` is `Option<String>` instead of `String` (for e.g., ollama) and adds a `wire_api` field * It updates `client.rs` to choose between `stream_responses()` and `stream_chat_completions()` based on the `wire_api` for the `ModelProviderInfo` * It updates the `exec` and TUI CLIs to no longer fail if the `OPENAI_API_KEY` environment variable is not set * It updates the TUI so that `EventMsg::Error` is displayed more prominently when it occurs, particularly now that it is important to alert users to the `CodexErr::EnvVar` variant. * `CodexErr::EnvVar` was updated to include an optional `instructions` field so we can preserve the behavior where we direct users to https://platform.openai.com if `OPENAI_API_KEY` is not set. * Cleaned up the "welcome message" in the TUI to ensure the model provider is displayed. * Updated the docs in `codex-rs/README.md`. To exercise the chat completions API from OpenAI models, I added the following to my `config.toml`: ```toml model = "gpt-4o" model_provider = "openai-chat-completions" [model_providers.openai-chat-completions] name = "OpenAI using Chat Completions" base_url = "https://api.openai.com/v1" env_key = "OPENAI_API_KEY" wire_api = "chat" ``` Though to test a non-OpenAI provider, I installed ollama with mistral locally on my Mac because ChatGPT said that would be a good match for my hardware: ```shell brew install ollama ollama serve ollama pull mistral ``` Then I added the following to my `~/.codex/config.toml`: ```toml model = "mistral" model_provider = "ollama" ``` Note this code could certainly use more test coverage, but I want to get this in so folks can start playing with it. For reference, I believe https://github.com/openai/codex/pull/247 was roughly the comparable PR on the TypeScript side.	2025-05-08 21:46:06 -07:00

28 Commits