valknar/llmx - llmx - dev.pivoine.art

Author	SHA1	Message	Date
Ahmed Ibrahim	e3f913f567	revert #5812 release file (#5887 ) revert #5812 release file	2025-10-28 20:06:16 +00:00
pakrym-oai	1b8f2543ac	Filter out reasoning items from previous turns (#5857 ) Reduces request size and prevents 400 errors when switching between API orgs. Based on Responses API behavior described in https://cookbook.openai.com/examples/responses_api/reasoning_items#caching	2025-10-28 11:39:34 -07:00
Jeremy Rose	65107d24a2	Fix handling of non-main default branches for cloud task submissions (#5069 ) ## Summary - detect the repository's default branch before submitting a cloud task - expose a helper in `codex_core::git_info` for retrieving the default branch name Fixes #4888 ------ https://chatgpt.com/codex/tasks/task_i_68e96093cf28832ca0c9c73fc618a309	2025-10-28 11:02:25 -07:00
Jeremy Rose	36eb071998	tui: show queued messages during response stream (#5540 ) This fixes an issue where messages sent during the final response stream would seem to disappear, because the "queued messages" UI wasn't shown during streaming.	2025-10-28 16:59:19 +00:00
Jeremy Rose	9b33ce3409	tui: wait longer for color query results (#5004 ) this bumps the timeout when reading the responses to OSC 10/11 so that we're less likely to pass the deadline halfway through reading the response.	2025-10-28 09:42:57 -07:00
zhao-oai	926c89cb20	fix advanced.md (#5833 ) table wasn't formatting correctly	2025-10-28 16:32:20 +00:00
jif-oai	5ba2a17576	chore: decompose submission loop (#5854 )	2025-10-28 15:23:46 +00:00
Owen Lin	266419217e	chore: use anyhow::Result for all app-server integration tests (#5836 ) There's a lot of visual noise in app-server's integration tests due to the number of `.expect("<some_msg>")` lines which are largely redundant / not very useful. Clean them up by using `anyhow::Result` + `?` consistently. Replaces the existing pattern of: ``` let codex_home = TempDir::new().expect("create temp dir"); create_config_toml(codex_home.path()).expect("write config.toml"); let mut mcp = McpProcess::new(codex_home.path()) .await .expect("spawn mcp process"); timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()) .await .expect("initialize timeout") .expect("initialize request"); ``` With: ``` let codex_home = TempDir::new()?; create_config_toml(codex_home.path())?; let mut mcp = McpProcess::new(codex_home.path()).await?; timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??; ```	2025-10-28 08:10:23 -07:00
jif-oai	be4bdfec93	chore: drop useless shell stuff (#5848 )	2025-10-28 14:52:52 +00:00
jif-oai	7ff142d93f	chore: speed-up pipeline (#5812 ) Speed-up pipeline by: * Decoupling tests and clippy * Use pre-built binary in tests * `sccache` for caching of the builds	2025-10-28 14:08:52 +00:00
Celia Chen	4a42c4e142	[Auth] Choose which auth storage to use based on config (#5792 ) This PR is a follow-up to #5591. It allows users to choose which auth storage mode they want by using the new `cli_auth_credentials_store_mode` config.	2025-10-27 19:41:49 -07:00
Josh McKinney	66a4b89822	feat(tui): clarify Windows auto mode requirements (#5568 ) ## Summary - Coerce Windows `workspace-write` configs back to read-only, surface the forced downgrade in the approvals popup, and funnel users toward WSL or Full Access. - Add WSL installation instructions to the Auto preset on Windows while keeping the preset available for other platforms. - Skip the trust-on-first-run prompt on native Windows so new folders remain read-only without additional confirmation. - Expose a structured sandbox policy resolution from config to flag Windows downgrades and adjust tests (core, exec, TUI) to reflect the new behavior; provide a Windows-only approvals snapshot. ## Testing - cargo fmt - cargo test -p codex-core config::tests::add_dir_override_extends_workspace_writable_roots - cargo test -p codex-exec suite::resume::exec_resume_preserves_cli_configuration_overrides - cargo test -p codex-tui chatwidget::tests::approvals_selection_popup_snapshot - cargo test -p codex-tui approvals_popup_includes_wsl_note_for_auto_mode - cargo test -p codex-tui windows_skips_trust_prompt - just fix -p codex-core - just fix -p codex-tui	2025-10-28 01:19:32 +00:00
Ahmed Ibrahim	d7b333be97	Truncate the content-item for mcp tools (#5835 ) This PR truncates the text output of MCP tools.	2025-10-28 00:39:35 +00:00
zhao-oai	4d6a42a622	fix image drag drop (#5794 ) fixing drag/drop photos bug in codex state of the world before: sometimes, when you drag screenshots into codex, the image does not properly render into context. instead, the file name is shown in quotation marks. https://github.com/user-attachments/assets/3c0e540a-505c-4ec0-b634-e9add6a73119 the screenshot is not actually included in agent context. the agent needs to manually call the view_image tool to see the screenshot. this can be unreliable especially if the image is part of a longer prompt and is dependent on the agent going out of its way to view the image. state of the world after: https://github.com/user-attachments/assets/5f2b7bf7-8a3f-4708-85f3-d68a017bfd97 now, images will always be directly embedded into chat context ## Technical Details - MacOS sends screenshot paths with a narrow no‑break space right before the “AM/PM” suffix, which used to trigger our non‑ASCII fallback in the paste burst detector. - That fallback flushed the partially buffered paste immediately, so the path arrived in two separate `handle_paste` calls (quoted prefix + `PM.png'`). The split string could not be normalized to a real path, so we showed the quoted filename instead of embedding the image. - We now append non‑ASCII characters into the burst buffer when a burst is already active. Finder’s payload stays intact, the path normalizes, and the image attaches automatically. - When no burst is active (e.g. during IME typing), non‑ASCII characters still bypass the buffer so text entry remains responsive.	2025-10-27 17:11:30 -07:00
Gabriel Peal	b0bdc04c30	[MCP] Render MCP tool call result images to the model (#5600 ) It's pretty amazing we have gotten here without the ability for the model to see image content from MCP tool calls. This PR builds off of 4391 and fixes #4819. I would like @KKcorps to get adequate credit here but I also want to get this fix in ASAP, so I gave him a week to update it and haven't gotten a response, so I'm going to take it across the finish line. This test highlights how absurd the current situation is. I asked the model to read this image using the Chrome MCP <img width="2378" height="674" alt="image" src="https://github.com/user-attachments/assets/9ef52608-72a2-4423-9f5e-7ae36b2b56e0" /> After this change, it correctly outputs: > Captured the page: image dhows a dark terminal-style UI labeled `OpenAI Codex (v0.0.0)` with prompt `model: gpt-5-codex medium` and working directory `/codex/codex-rs` (and more) Before this change, it said: > Took the full-page screenshot you asked for. It shows a long, horizontally repeating pattern of stylized people in orange, light-blue, and mustard clothing, holding hands in alternating poses against a white background. No text or other graphics-just rows of flat illustration stretching off to the right. Without this change, the Figma, Playwright, Chrome, and other visual MCP servers are pretty much entirely useless. I tested this change with the OpenAI Responses API as well as a third-party completions API.	2025-10-27 17:55:57 -04:00
Owen Lin	67a219ffc2	fix: move account struct to app-server-protocol and use camelCase (#5829 ) Makes sense to move this struct to `app-server-protocol/` since we want to serialize as camelCase, but we don't for structs defined in `protocol/` It was: ``` export type Account = { "type": "ApiKey", api_key: string, } \| { "type": "chatgpt", email: string \| null, plan_type: PlanType, }; ``` But we want: ``` export type Account = { "type": "apiKey", apiKey: string, } \| { "type": "chatgpt", email: string \| null, planType: PlanType, }; ```	2025-10-27 14:06:13 -07:00
Ahmed Ibrahim	7226365397	Centralize truncation in conversation history (#5652 ) move the truncation logic to conversation history to use on any tool output. This will help us in avoiding edge cases while truncating the tool calls and mcp calls.	2025-10-27 14:05:35 -07:00
Celia Chen	0fc295d958	[Auth] Add keyring support for Codex CLI (#5591 ) Follow-up PR to #5569. Add Keyring Support for Auth Storage in Codex CLI as well as a hybrid mode (default to persisting in keychain but fall back to file when unavailable.) It also refactors out the keyringstore implementation from rmcp-client [here](https://github.com/openai/codex/blob/main/codex-rs/rmcp-client/src/oauth.rs) to a new keyring-store crate. There will be a follow-up that picks the right credential mode depending on the config, instead of hardcoding `AuthCredentialsStoreMode::File`.	2025-10-27 12:10:11 -07:00
jif-oai	3e50f94d76	feat: support verbosity in model_family (#5821 )	2025-10-27 18:46:30 +00:00
Celia Chen	eb5b1b627f	[Auth] Introduce New Auth Storage Abstraction for Codex CLI (#5569 ) This PR introduces a new `Auth Storage` abstraction layer that takes care of read, write, and load of auth tokens based on the AuthCredentialsStoreMode. It is similar to how we handle MCP client oauth [here](https://github.com/openai/codex/blob/main/codex-rs/rmcp-client/src/oauth.rs). Instead of reading and writing directly from disk for auth tokens, Codex CLI workflows now should instead use this auth storage using the public helper functions. This PR is just a refactor of the current code so the behavior stays the same. We will add support for keyring and hybrid mode in follow-up PRs. I have read the CLA Document and I hereby sign the CLA	2025-10-27 11:01:14 -07:00
Eric Traut	0c1ff1d3fd	Made token refresh code resilient to missing `id_token` (#5782 ) This PR does the following: 1. Changes `try_refresh_token` to handle the case where the endpoint returns a response without an `id_token`. The OpenID spec indicates that this field is optional and clients should not assume it's present. 2. Changes the `attempt_stream_responses` to propagate token refresh errors rather than silently ignoring them. 3. Fixes a typo in a couple of error messages (unrelated to the above, but something I noticed in passing) - "reconnect" should be spelled without a hyphen. This PR does not implement the additional suggestion from @pakrym-oai that we should sign out when receiving `refresh_token_expired` from the refresh endpoint. Leaving this as a follow-on because I'm undecided on whether this should be implemented in `try_refresh_token` or its callers.	2025-10-27 10:09:53 -07:00
jif-oai	aea7610c76	feat: image resizing (#5446 ) Add image resizing on the client side to reduce load on the API	2025-10-27 16:58:10 +00:00
jif-oai	775fbba6e0	feat: return an error if unknown enabled/disabled feature (#5817 )	2025-10-27 16:53:00 +00:00
Michael Bolin	5ee8a17b4e	feat: introduce GetConversationSummary RPC (#5803 ) This adds an RPC to the app server to get the `ConversationSummary` via a rollout path. Now that the VS Code extension supports showing the Codex UI in an editor panel where the URI of the panel maps to the rollout file, we need to be able to get the `ConversationSummary` from the rollout file directly.	2025-10-27 09:11:45 -07:00
jif-oai	81be54b229	fix: test yield time (#5811 )	2025-10-27 11:57:29 +00:00
jif-oai	5e8659dcbc	chore: undo nits (#5631 )	2025-10-27 11:48:01 +00:00
jif-oai	2338294b39	nit: doc on session task (#5809 )	2025-10-27 11:43:33 +00:00
jif-oai	afc4eaab8b	feat: TUI undo op (#5629 )	2025-10-27 10:55:29 +00:00
jif-oai	e92c4f6561	feat: async ghost commit (#5618 )	2025-10-27 10:09:10 +00:00
Michael Bolin	15fa2283e7	feat: update NewConversationParams to take an optional model_provider (#5793 ) An AppServer client should be able to use any (`model_provider`, `model`) in the user's config. `NewConversationParams` already supported specifying the `model`, but this PR expands it to support `model_provider`, as well. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/5793). * #5803 * __->__ #5793	2025-10-27 09:33:30 +00:00
Michael Bolin	5907422d65	feat: annotate conversations with model_provider for filtering (#5658 ) Because conversations that use the Responses API can have encrypted reasoning messages, trying to resume a conversation with a different provider could lead to confusing "failed to decrypt" errors. (This is reproducible by starting a conversation using ChatGPT login and resuming it as a conversation that uses OpenAI models via Azure.) This changes `ListConversationsParams` to take a `model_providers: Option<Vec<String>>` and adds `model_provider` on each `ConversationSummary` it returns so these cases can be disambiguated. Note this ended up making changes to `codex-rs/core/src/rollout/tests.rs` because it had a number of cases where it expected `Some` for the value of `next_cursor`, but the list of rollouts was complete, so according to this docstring: `bcd64c7e72/codex-rs/app-server-protocol/src/protocol.rs (L334-L337)` If there are no more items to return, then `next_cursor` should be `None`. This PR updates that logic. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/5658). * #5803 * #5793 * __->__ #5658	2025-10-27 02:03:30 -07:00
Ahmed Ibrahim	f178805252	Add feedback upload request handling (#5682 )	2025-10-27 05:53:39 +00:00
Michael Bolin	a55b0c4bcc	fix: revert "[app-server] fix account/read response annotation (#5642 )" (#5796 ) Revert #5642 because this generates: ``` // GENERATED CODE! DO NOT MODIFY BY HAND! // This file was generated by [ts-rs](https://github.com/Aleph-Alpha/ts-rs). Do not edit this file manually. export type GetAccountResponse = Account \| null; ``` But `Account` is unknown. The unique use of `#[ts(export)]` on `GetAccountResponse` is also suspicious as are the changes to `codex-rs/app-server-protocol/src/export.rs` since the existing system has worked fine for quite some time. Though a pure backout of #5642 puts things in a state where, as the PR noted, the following does not work: ``` cargo run -p codex-app-server-protocol --bin export -- --out DIR ``` So in addition to the backout, this PR adds: ```rust #[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)] #[serde(rename_all = "camelCase")] pub struct GetAccountResponse { pub account: Account, } ``` and changes `GetAccount.response` as follows: ```diff - response: Option<Account>, + response: GetAccountResponse, ``` making it consistent with other types. With this change, I verified that both of the following work: ``` just codex generate-ts --out /tmp/somewhere cargo run -p codex-app-server-protocol --bin export -- --out /tmp/somewhere-else ``` The generated TypeScript is as follows: ```typescript // GetAccountResponse.ts import type { Account } from "./Account"; export type GetAccountResponse = { account: Account, }; ``` and ```typescript // Account.ts import type { PlanType } from "./PlanType"; export type Account = { "type": "ApiKey", api_key: string, } \| { "type": "chatgpt", email: string \| null, plan_type: PlanType, }; ``` Though while the inconsistency between `"type": "ApiKey"` and `"type": "chatgpt"` is quite concerning, I'm not sure if that format is ever written to disk in any case, but @owenlin0, I would recommend looking into that. 
Also, it appears that the types in `codex-rs/protocol/src/account.rs` are used exclusively by the `app-server-protocol` crate, so perhaps they should just be moved there?	2025-10-26 18:57:42 -07:00
Thibault Sottiaux	224222f09f	fix: use codex-exp prefix for experimental models and consider codex- models to be production (#5797 )	2025-10-27 01:55:12 +00:00
Gabriel Peal	7aab45e060	[MCP] Minor docs clarifications around stdio tokens (#5676 ) Noticed [here](https://github.com/openai/codex/issues/4707#issuecomment-3446547561)	2025-10-26 13:38:30 -04:00
Eric Traut	bcd64c7e72	Reduced runtime of unit test that was taking multiple minutes (#5688 ) Modified `build_compacted_history_truncates_overlong_user_messages` test to reduce runtime from minutes to tens of seconds	2025-10-25 23:46:08 -07:00
Eric Traut	c124f24354	Added support for `sandbox_mode` in profiles (#5686 ) Currently, `approval_policy` is supported in profiles, but `sandbox_mode` is not. This PR adds support for `sandbox_mode`. Note: a fix for this was submitted in [this PR](https://github.com/openai/codex/pull/2397), but the underlying code has changed significantly since then. This addresses issue #3034	2025-10-25 16:52:26 -07:00
pakrym-oai	c7e4e6d0ee	Skip flaky test (#5680 ) Did an investigation but couldn't find anything obvious. Let's skip for now.	2025-10-25 12:11:16 -07:00
Ahmed Ibrahim	88abbf58ce	Followup feedback (#5663 ) - Added files to be uploaded - Refactored - Updated title	2025-10-25 06:07:40 +00:00
Ahmed Ibrahim	71f838389b	Improve feedback (#5661 ) <img width="1099" height="153" alt="image" src="https://github.com/user-attachments/assets/2c901884-8baf-4b1b-b2c4-bcb61ff42be8" /> <img width="1082" height="125" alt="image" src="https://github.com/user-attachments/assets/6336e6c9-9ace-46df-a383-a807ceffa524" /> <img width="1102" height="103" alt="image" src="https://github.com/user-attachments/assets/78883682-7e44-4fa3-9e04-57f7df4766fd" />	2025-10-24 22:28:14 -07:00
Eric Traut	0533bd2e7c	Fixed flaky unit test (#5654 ) This PR fixes a test that is sporadically failing in CI. The problem is that two unit tests (the older `login_and_cancel_chatgpt` and a recently added `login_chatgpt_includes_forced_workspace_query_param`) exercise code paths that start the login server. The server binds to a hard-coded localhost port number, so attempts to start more than one server at the same time will fail. If these two tests happen to run concurrently, one of them will fail. To fix this, I've added a simple mutex. We can use this same mutex for future tests that use the same pattern.	2025-10-24 16:31:24 -07:00
Anton Panasenko	6af83d86ff	[codex][app-server] introduce codex/event/raw_item events (#5578 )	2025-10-24 22:41:52 +00:00
Gabriel Peal	e2e1b65da6	[MCP] Properly gate login after `mcp add` with `experimental_use_rmcp_client` (#5653 ) There was supposed to be a check here like in other places.	2025-10-24 18:32:15 -04:00
Gabriel Peal	817d1508bc	[MCP] Redact environment variable values in `/mcp` and `mcp get` (#5648 ) Fixes #5524	2025-10-24 18:30:20 -04:00
Eric Traut	f8af4f5c8d	Added model summary and risk assessment for commands that violate sandbox policy (#5536 ) This PR adds support for a model-based summary and risk assessment for commands that violate the sandbox policy and require user approval. This aids the user in evaluating whether the command should be approved. The feature works by taking a failed command, passing it back to the model, and asking it to summarize the command and give it a risk level (low, medium, high) and a risk category (e.g. "data deletion" or "data exfiltration"). It uses a new conversation thread so the context in the existing thread doesn't influence the answer. If the call to the model fails or takes longer than 5 seconds, it falls back to the current behavior. For now, this is an experimental feature and is gated by a config key `experimental_sandbox_command_assessment`. Here is a screenshot of the approval prompt showing the risk assessment and summary. <img width="723" height="282" alt="image" src="https://github.com/user-attachments/assets/4597dd7c-d5a0-4e9f-9d13-414bd082fd6b" />	2025-10-24 15:23:44 -07:00
pakrym-oai	a4be4d78b9	Log more types of request IDs (#5645 ) Different services return different sets of IDs, log all of them to simplify debugging.	2025-10-24 19:12:03 +00:00
Shijie Rao	00c1de0c56	Add instruction for upgrading codex with brew (#5640 ) Include instructions for upgrading codex with brew when there is a switch from formula to cask.	2025-10-24 11:30:34 -07:00
Owen Lin	190e7eb104	[app-server] fix account/read response annotation (#5642 ) The API schema export is currently broken: ``` > cargo run -p codex-app-server-protocol --bin export -- --out DIR Error: this type cannot be exported ``` This PR fixes the error message so we get more info: ``` > cargo run -p codex-app-server-protocol --bin export -- --out DIR Error: failed to export client responses: dependency core::option::Option<codex_protocol::account::Account> cannot be exported ``` And fixes the root cause which is the `account/read` response.	2025-10-24 11:17:46 -07:00
pakrym-oai	061862a0e2	Add CodexHttpClient wrapper with request logging (#5564 ) ## Summary - wrap the default reqwest::Client inside a new CodexHttpClient/CodexRequestBuilder pair and log the HTTP method, URL, and status for each request - update the auth/model/provider plumbing to use the new builder helpers so headers and bearer auth continue to be applied consistently - add the shared `http` dependency that backs the header conversion helpers ## Testing - `CODEX_SANDBOX=seatbelt CODEX_SANDBOX_NETWORK_DISABLED=1 cargo test -p codex-core` - `CODEX_SANDBOX=seatbelt CODEX_SANDBOX_NETWORK_DISABLED=1 cargo test -p codex-chatgpt` - `CODEX_SANDBOX=seatbelt CODEX_SANDBOX_NETWORK_DISABLED=1 cargo test -p codex-tui` ------ https://chatgpt.com/codex/tasks/task_i_68fa5038c17483208b1148661c5873be	2025-10-24 09:47:52 -07:00
zhao-oai	c72b2ad766	adding messaging for stale rate limits + when no rate limits are cached (#5570 )	2025-10-24 08:46:31 -07:00

1935 Commits