valknar/llmx - llmx - dev.pivoine.art

Author	SHA1	Message	Date
Gabriel Peal	c8fab51372	Use ConversationId instead of raw Uuids (#3282 ) We're trying to migrate from `session_id: Uuid` to `conversation_id: ConversationId`. Not only does this give us more type safety but it unifies our terminology across Codex and with the implementation of session resuming, a conversation (which can span multiple sessions) is more appropriate. I started this impl on https://github.com/openai/codex/pull/3219 as part of getting resume working in the extension but it's big enough that it should be broken out.	2025-09-07 23:22:25 -04:00
pakrym-oai	0269096229	Move token usage/context information to session level (#3221 ) Move context information into the main loop so it can be used to interrupt the loop or start auto-compaction.	2025-09-06 15:19:23 +00:00
pakrym-oai	5775174ec2	Never store requests (#3212 ) When item ids are sent to Responses API it will load them from the database ignoring the provided values. This adds extra latency. Not having the mode to store requests also allows us to simplify the code. ## Breaking change The `disable_response_storage` configuration option is removed.	2025-09-05 10:41:47 -07:00
pakrym-oai	c636f821ae	Add a common way to create HTTP client (#3110 ) Ensure User-Agent and originator are always sent.	2025-09-03 10:11:02 -07:00
pakrym-oai	03e2796ca4	Move CodexAuth and AuthManager to the core crate (#3074 ) Fix a long standing layering issue.	2025-09-02 18:36:19 -07:00
Eric Traut	051f185ce3	Added back the logic to handle rate-limit errors when using API key (#3070 ) A previous PR removed this when adding rate-limit errors for the ChatGPT auth path.	2025-09-02 17:50:15 -07:00
Ahmed Ibrahim	9dbe7284d2	Following up on #2371 post commit feedback (#2852 ) - Introduce websearch end to complement the begin - Moves the logic of adding the websearch tool to create_tools_json_for_responses_api - Making it the client responsibility to toggle the tool on or off - Other misc in #2371 post commit feedback - Show the query	2025-08-28 19:24:38 -07:00
Ahmed Ibrahim	d0e06f74e2	send context window with task started (#2752 ) - Send context window with task started - Accounting for changing the model per turn	2025-08-27 00:04:21 -07:00
Eric Traut	ab9250e714	Improved user message for rate-limit errors (#2695 ) This PR improves the error message presented to the user when logged in with ChatGPT and a rate-limit error occurs. In particular, it provides the user with information about when the rate limit will be reset. It removes older code that attempted to do the same but relied on parsing of error messages that are not generated by the ChatGPT endpoint. The new code uses newly-added error fields.	2025-08-25 21:42:10 -07:00
Eric Traut	d63e44ae29	Fixed a bug that causes token refresh to not work in a seamless manner (#2699 ) This PR fixes a bug in the token refresh logic. Token refresh is performed in a retry loop so if we receive a 401 error, we refresh the token, then we go around the loop again and reissue the fetch with a fresh token. The bug is that we're not using the updated token on the second and subsequent times through the loop. The result is that we'll try to refresh the token a few more times until we hit the retry limit (default of 4). The 401 error is then passed back up to the caller. Subsequent calls will use the refreshed token, so the problem clears itself up. The fix is straightforward — make sure we use the updated auth information each time through the retry loop.	2025-08-25 19:18:16 -07:00
Reuben Narad	363636f5eb	Add web search tool (#2371 ) Adds web_search tool, enabling the model to use Responses API web_search tool. - Disabled by default, enabled by --search flag - When --search is passed, exposes web_search_request function tool to the model, which triggers user approval. When approved, the model can use the web_search tool for the remainder of the turn --------- Co-authored-by: easong-openai <easong@openai.com>	2025-08-23 22:58:56 -07:00
Ahmed Ibrahim	097782c775	Move models.rs to protocol (#2595 ) Moving models.rs to protocol so we can use them in `Codex` operations	2025-08-22 22:18:54 +00:00
Dylan	236c4f76a6	[apply_patch] freeform apply_patch tool (#2576 ) ## Summary GPT-5 introduced the concept of [custom tools](https://platform.openai.com/docs/guides/function-calling#custom-tools), which allow the model to send a raw string result back, simplifying json-escape issues. We are migrating gpt-5 to use this by default. However, gpt-oss models do not support custom tools, only normal functions. So we keep both tool definitions, and provide whichever one the model family supports. ## Testing - [x] Tested locally with various models - [x] Unit tests pass	2025-08-22 13:42:34 -07:00
Eric Traut	dc42ec0eb4	Add AuthManager and enhance GetAuthStatus command (#2577 ) This PR adds a central `AuthManager` struct that manages the auth information used across conversations and the MCP server. Prior to this, each conversation and the MCP server got their own private snapshots of the auth information, and changes to one (such as a logout or token refresh) were not seen by others. This is especially problematic when multiple instances of the CLI are run. For example, consider the case where you start CLI 1 and log in to ChatGPT account X and then start CLI 2 and log out and then log in to ChatGPT account Y. The conversation in CLI 1 is still using account X, but if you create a new conversation, it will suddenly (and unexpectedly) switch to account Y. With the `AuthManager`, auth information is read from disk at the time the `ConversationManager` is constructed, and it is cached in memory. All new conversations use this same auth information, as do any token refreshes. The `AuthManager` is also used by the MCP server's GetAuthStatus command, which now returns the auth method currently used by the MCP server. This PR also includes an enhancement to the GetAuthStatus command. It now accepts two new (optional) input parameters: `include_token` and `refresh_token`. Callers can use this to request the in-use auth token and can optionally request to refresh the token. The PR also adds tests for the login and auth APIs that I recently added to the MCP server.	2025-08-22 13:10:11 -07:00
vjain419	80b00a193e	feat(gpt5): add model_verbosity for GPT‑5 via Responses API (#2108 ) Summary - Adds `model_verbosity` config (values: low, medium, high). - Sends `text.verbosity` only for GPT‑5 family models via the Responses API. - Updates docs and adds serialization tests. Motivation - GPT‑5 introduces a verbosity control to steer output length/detail without prompt surgery. - Exposing it as a config knob keeps prompts stable and makes behavior explicit and repeatable. Changes - Config: - Added `Verbosity` enum (low|medium|high). - Added optional `model_verbosity` to `ConfigToml`, `Config`, and `ConfigProfile`. - Request wiring: - Extended `ResponsesApiRequest` with optional `text` object. - Populates `text.verbosity` only when model family is `gpt-5`; omitted otherwise. - Tests: - Verifies `text.verbosity` serializes when set and is omitted when not set. - Docs: - Added “GPT‑5 Verbosity” section in `codex-rs/README.md`. - Added `model_verbosity` section to `codex-rs/config.md`. Usage - In `~/.codex/config.toml`: - `model = "gpt-5"` - `model_verbosity = "low"` (or `"medium"` default, `"high"`) - CLI override example: - `codex -c model="gpt-5" -c model_verbosity="high"` API Impact - Requests to GPT‑5 via Responses API include: `text: { verbosity: "low|medium|high" }` when configured. - For legacy models or Chat Completions providers, `text` is omitted. Backward Compatibility - Default behavior unchanged when `model_verbosity` is not set (server default “medium”). Testing - Added unit tests for serialization/omission of `text.verbosity`. - Ran `cargo fmt` and `cargo test --all-features` (all green). Docs - `README.md`: new “GPT‑5 Verbosity” note under Config with example. - `config.md`: new `model_verbosity` section. Out of Scope - No changes to temperature/top_p or other GPT‑5 parameters. - No changes to Chat Completions wiring. Risks / Notes - If OpenAI changes the wire shape for verbosity, we may need to update `ResponsesApiRequest`. - Behavior gated to `gpt-5` model family to avoid unexpected effects elsewhere. Checklist - [x] Code gated to GPT‑5 family only - [x] Docs updated (`README.md`, `config.md`) - [x] Tests added and passing - [x] Formatting applied Release note: Add `model_verbosity` config to control GPT‑5 output verbosity via the Responses API (low|medium|high).	2025-08-22 09:12:10 -07:00
Ahmed Ibrahim	c579ae41ae	Fix login for internal employees (#2528 ) This PR: - fixes login for internal employees, because we currently want to prefer SIWC for them. - fixes retrying forever on unauthorized access; we need to break eventually on max retries.	2025-08-20 14:05:20 -07:00
Michael Bolin	ce434b1219	fix: prefer config var to env var (#2495 )	2025-08-20 04:51:59 +00:00
Ahmed Ibrahim	d1f1e36836	Refresh ChatGPT auth token (#2484 ) ChatGPT tokens live for only 1 hour. If the session is longer, we don't refresh the token. We should get the expiry timestamp and attempt to refresh before it.	2025-08-19 21:01:31 -07:00
Ahmed Ibrahim	c283f9f6ce	Add an operation to override current task context (#2431 ) - Added an operation to override current task context - Added a test to check that cache stays the same	2025-08-18 19:59:19 +00:00
Ahmed Ibrahim	c9963b52e9	consolidate reasoning enums into one (#2428 ) We have three enums for each of reasoning summaries and reasoning effort with same values. They can be consolidated into one.	2025-08-18 11:50:17 -07:00
Michael Bolin	13ed67cfc1	feat: introduce TurnContext (#2343 ) This PR introduces `TurnContext`, which is designed to hold a set of fields that should be constant for a turn of a conversation. Note that the fields of `TurnContext` were previously governed by `Session`. Ultimately, we want to enable users to change these values between turns (changing model, approval policy, etc.), though in the current implementation, the `TurnContext` is constant for the entire conversation. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2345). * #2345 * #2329 * __->__ #2343 * #2340 * #2338	2025-08-15 09:40:02 -07:00
Parker Thompson	a075424437	Added `allow-expect-in-tests` / `allow-unwrap-in-tests` (#2328 ) This PR: * Added the clippy.toml to configure allowable expect / unwrap usage in tests * Removed as many expect/allow lines as possible from tests * moved a bunch of allows to expects where possible Note: in integration tests, non `#[test]` helper functions are not covered by this so we had to leave a few lingering `expect(expect_used` checks around	2025-08-14 17:59:01 -07:00
pakrym-oai	41eb59a07d	Wait for requested delay in rate limit errors (#2266 ) Fixes: https://github.com/openai/codex/issues/2131 Response doesn't have the delay in a separate field (yet) so parse the message.	2025-08-13 15:43:54 -07:00
easong-openai	6340acd885	Re-add markdown streaming (#2029 ) Wait for newlines, then render markdown on a line by line basis. Word wrap it for the current terminal size and then spit it out line by line into the UI. Also adds tests and fixes some UI regressions.	2025-08-12 17:37:28 -07:00
pakrym-oai	cb78f2333e	Set user-agent (#2230 ) Use the same well-defined value in all cases when sending user-agent header	2025-08-12 16:40:04 +00:00
pakrym-oai	6a6bf99e2c	Send prompt_cache_key (#2200 ) To optimize prompt caching performance.	2025-08-11 16:37:45 -07:00
pakrym-oai	0aa7efe05b	Trace RAW sse events (#2056 ) For easier parsing.	2025-08-11 10:35:03 -07:00
easong-openai	52e12f2b6c	Revert "Streaming markdown (#1920 )" (#1981 ) This reverts commit `2b7139859e`.	2025-08-08 01:38:39 +00:00
easong-openai	2b7139859e	Streaming markdown (#1920 ) We wait until we have an entire newline, then format it with markdown and stream in to the UI. This reduces time to first token but is the right thing to do with our current rendering model IMO. Also lets us add word wrapping!	2025-08-07 18:26:47 -07:00
pakrym-oai	fa0051190b	Adjust error messages (#1969 )	2025-08-07 18:24:34 -07:00
pakrym-oai	f23c3066c8	Add capacity error (#1947 )	2025-08-07 10:46:43 -07:00
pakrym-oai	a593b1c3ab	Use different field for error type (#1945 )	2025-08-07 10:20:33 -07:00
pakrym-oai	62ed5907f9	Better usage errors (#1941 )	2025-08-07 09:46:13 -07:00
pakrym-oai	7e9ecfbc6a	Rename the model (#1942 )	2025-08-07 09:07:51 -07:00
pakrym-oai	57c973b571	Add 2025-08-06 model family (#1899 )	2025-08-06 23:14:02 +00:00
pakrym-oai	8262ba58b2	Prefer env var auth over default codex auth (#1861 ) ## Summary - Prioritize provider-specific API keys over default Codex auth when building requests - Add test to ensure provider env var auth overrides default auth ## Testing - `just fmt` - `just fix` (fails: `let` expressions in this position are unstable) - `cargo test --all-features` (fails: `let` expressions in this position are unstable) ------ https://chatgpt.com/codex/tasks/task_i_68926a104f7483208f2c8fd36763e0e3	2025-08-06 13:02:00 -07:00
Dylan	3e8bcf0247	[prompts] Add <environment_context> (#1869 ) ## Summary Includes a new user message in the api payload which provides useful environment context for the model, so it knows about things like the current working directory and the sandbox. ## Testing Updated unit tests	2025-08-06 01:13:31 -07:00
Dylan	aff97ed7dd	[core] Separate tools config from openai client (#1858 ) ## Summary In an effort to make tools easier to work with and more configurable, I'm introducing `ToolConfig` and updating `Prompt` to take in a general list of Tools. I think this is simpler and better for a few reasons: - We can easily assemble tools from various sources (our own harness, mcp servers, etc.) and we can consolidate the construction logic in one place that is separate from serialization. - client.rs no longer needs arbitrary config values, it just takes in a list of tools to serialize A hefty portion of the PR is now updating our conversion of `mcp_types::Tool` to `OpenAITool`, but considering that @bolinfest accurately called this out as a TODO long ago, I think it's time we tackled it. ## Testing - [x] Experimented locally, no changes, as expected - [x] Added additional unit tests - [x] Responded to rust-review	2025-08-05 19:27:52 -07:00
easong-openai	e0303dbac0	Rescue chat completion changes (#1846 ) https://github.com/openai/codex/pull/1835 has some messed up history. This adds support for streaming chat completions, which is useful for ollama. We should probably take a very skeptical eye to the code introduced in this PR. --------- Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>	2025-08-05 08:56:13 +00:00
Michael Bolin	136b3ee5bf	chore: introduce ModelFamily abstraction (#1838 ) To date, we have a number of hardcoded OpenAI model slug checks spread throughout the codebase, which makes it hard to audit the various special cases for each model. To mitigate this issue, this PR introduces the idea of a `ModelFamily` that has fields to represent the existing special cases, such as `supports_reasoning_summaries` and `uses_local_shell_tool`. There is a `find_family_for_model()` function that maps the raw model slug to a `ModelFamily`. This function hardcodes all the knowledge about the special attributes for each model. This PR then replaces the hardcoded model name checks with checks against a `ModelFamily`. Note `ModelFamily` is now available as `Config::model_family`. We should ultimately remove `Config::model` in favor of `Config::model_family::slug`.	2025-08-04 23:50:03 -07:00
Dylan	063083af15	[prompts] Better user_instructions handling (#1836 ) ## Summary Our recent change in #1737 can sometimes lead to the model confusing AGENTS.md context as part of the message. But a little prompting and formatting can help fix this! ## Testing - Ran locally with a few different prompts to verify the model behaves well. - Updated unit tests	2025-08-04 18:55:57 -07:00
pakrym-oai	84bcadb8d9	Restore API key and query param overrides (#1826 ) Addresses https://github.com/openai/codex/issues/1796	2025-08-04 18:07:49 -07:00
Ahmed Ibrahim	e38ce39c51	Revert to `3f13ebce10` without rewriting history. Wrong merge	2025-08-04 17:03:24 -07:00
Ahmed Ibrahim	1a33de34b0	unify flag	2025-08-04 16:56:52 -07:00
Ahmed Ibrahim	bd171e5206	add raw reasoning	2025-08-04 16:49:42 -07:00
pakrym-oai	88ea215c80	Add a custom originator setting (#1781 )	2025-08-01 09:55:23 -07:00
pakrym-oai	0935e6a875	Send account id when available (#1767 ) For users with multiple accounts we need to specify the account to use.	2025-07-31 15:40:19 -07:00
pakrym-oai	e0e245cc1c	Send AGENTS.md as a separate user message (#1737 )	2025-07-30 13:56:24 -07:00
pakrym-oai	ea01a5ffe2	Add support for a separate chatgpt auth endpoint (#1712 ) Adds a `CodexAuth` type that encapsulates information about available auth modes and logic for refreshing the token. Changes `Responses` API to send requests to different endpoints based on the auth type. Updates login_with_chatgpt to support API-less mode and skip the key exchange.	2025-07-30 19:40:15 +00:00
Gabriel Peal	8828f6f082	Add an experimental plan tool (#1726 ) This adds a tool the model can call to update a plan. The tool doesn't actually _do_ anything but it gives clients a chance to read and render the structured plan. We will likely iterate on the prompt and tools exposed for planning over time.	2025-07-29 14:22:02 -04:00
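
The token-refresh fix in d63e44ae29 (#2699) comes down to reading the current auth information on every pass through the retry loop rather than once before it. The following is a minimal Rust sketch of that pattern only. `Auth`, `FetchError`, `fetch`, and `fetch_with_retry` are hypothetical names invented for illustration and are not taken from the Codex codebase; the "server" here is a stub that accepts any token starting with `fresh-`.

```rust
/// Hypothetical error type: the only failure modeled here is a 401.
#[derive(Debug, PartialEq)]
enum FetchError {
    Unauthorized,
}

/// Hypothetical stand-in for shared auth state whose token can be refreshed.
struct Auth {
    token: String,
    refresh_count: u32,
}

impl Auth {
    /// Pretend to refresh the token; a real client would call an auth endpoint.
    fn refresh(&mut self) {
        self.refresh_count += 1;
        self.token = format!("fresh-{}", self.refresh_count);
    }
}

/// Stub request: succeeds only when given a refreshed token.
fn fetch(token: &str) -> Result<String, FetchError> {
    if token.starts_with("fresh-") {
        Ok(format!("ok with {token}"))
    } else {
        Err(FetchError::Unauthorized)
    }
}

/// Retry on 401, refreshing the token between attempts.
fn fetch_with_retry(auth: &mut Auth, max_retries: u32) -> Result<String, FetchError> {
    let mut attempt = 0;
    loop {
        // Read the token inside the loop so a refresh is visible to the
        // next attempt. Capturing it once before the loop reproduces the
        // bug described in the commit message.
        let token = auth.token.clone();
        match fetch(&token) {
            Ok(body) => return Ok(body),
            Err(FetchError::Unauthorized) if attempt < max_retries => {
                auth.refresh();
                attempt += 1;
            }
            // Retry budget exhausted: surface the error to the caller.
            Err(e) => return Err(e),
        }
    }
}
```

Starting from a stale token, a call such as `fetch_with_retry(&mut auth, 4)` succeeds after a single refresh in this sketch, instead of burning through the whole retry budget.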


80 Commits