valknar/llmx - llmx - dev.pivoine.art

Author	SHA1	Message	Date
Ahmed Ibrahim	fac548e430	Send delegate header (#5942 ) Send delegate type header	2025-10-30 09:49:40 +00:00
zhao-oai	b34efde2f3	asdf (#5940 ) .	2025-10-30 01:10:41 +00:00
Ahmed Ibrahim	7aa46ab5fc	ignore agent message deltas for the review mode (#5937 ) The deltas produce the whole json output. ignore them.	2025-10-30 00:47:55 +00:00
pakrym-oai	3429e82e45	Add item streaming events (#5546 ) Adds AgentMessageContentDelta, ReasoningContentDelta, ReasoningRawContentDelta item streaming events while maintaining compatibility for old events. --------- Co-authored-by: Owen Lin <owen@openai.com>	2025-10-29 22:33:57 +00:00
Ahmed Ibrahim	13e1d0362d	Delegate review to codex instance (#5572 ) In this PR, I am exploring migrating task kind to an invocation of Codex. The main reason would be getting rid of multiple `ConversationHistory` states and streamlining our context/history management. This approach depends on opening a channel between the sub-codex and codex. This channel is responsible for forwarding `interactive` (`approvals`) and `non-interactive` events. The `task` is responsible for handling those events. This opens the door for implementing `codex as a tool`, replacing `compact` and `review`, and potentially subagents. One consideration is that this code is very similar to `app-server`, especially in the approval part. If in the future we wanted an interactive `sub-codex` we should consider using `codex-mcp`	2025-10-29 21:04:25 +00:00
jif-oai	db31f6966d	chore: config editor (#5878 ) The goal is to have a single place where we actually write files. A follow-up PR will move everything config-related into a dedicated module and move the helpers into a dedicated file	2025-10-29 20:52:46 +00:00
Rasmus Rygaard	39e09c289d	Add a wrapper around raw response items (#5923 ) We currently have nested enums when sending raw response items in the app-server protocol. This makes downstream schemas confusing because we need to embed `type`-discriminated enums within each other. This PR adds a small wrapper around the response item so we can keep the schemas separate	2025-10-29 20:32:40 +00:00
jif-oai	3183935bd7	feat: add output even in sandbox denied (#5908 )	2025-10-29 18:21:18 +00:00
jif-oai	060637b4d4	feat: deprecation warning (#5825 )	2025-10-29 12:29:28 +00:00
jif-oai	fa92cd92fa	chore: merge git crates (#5909 ) Merge `git-apply` and `git-tooling` into `utils/`	2025-10-29 12:11:44 +00:00
Abhishek Bhardwaj	89591e4246	feature: Add "!cmd" user shell execution (#2471 ) This change lets users run local shell commands directly from the TUI by prefixing their input with ! (e.g. !ls). Output is truncated to keep the exec cell usable, and Ctrl-C cleanly interrupts long-running commands (e.g. !sleep 10000). Summary of changes - Route Op::RunUserShellCommand through a dedicated UserShellCommandTask (core/src/tasks/user_shell.rs), keeping the task logic out of codex.rs. - Reuse the existing tool router: the task constructs a ToolCall for the local_shell tool and relies on ShellHandler, so no manual MCP tool lookup is required. - Emit exec lifecycle events (ExecCommandBegin/ExecCommandEnd) so the TUI can show command metadata, live output, and exit status. End-to-end flow TUI handling 1. ChatWidget::submit_user_message (TUI) intercepts messages starting with !. 2. Non-empty commands dispatch Op::RunUserShellCommand { command }; empty commands surface a help hint. 3. No UserInput items are created, so nothing is enqueued for the model. Core submission loop 4. The submission loop routes the op to handlers::run_user_shell_command (core/src/codex.rs). 5. A fresh TurnContext is created and Session::spawn_user_shell_command enqueues UserShellCommandTask. Task execution 6. UserShellCommandTask::run emits TaskStartedEvent, formats the command, and prepares a ToolCall targeting local_shell. 7. ToolCallRuntime::handle_tool_call dispatches to ShellHandler. Shell tool runtime 8. ShellHandler::run_exec_like launches the process via the unified exec runtime, honoring sandbox and shell policies, and emits ExecCommandBegin/End. 9. Stdout/stderr are captured for the UI, but the task does not turn the resulting ToolOutput into a model response. Completion 10. After ExecCommandEnd, the task finishes without an assistant message; the session marks it complete and the exec cell displays the final output. Conversation context - The command and its output never enter the conversation history or the model prompt; the flow is local-only. - Only exec/task events are emitted for UI rendering. Demo video https://github.com/user-attachments/assets/fcd114b0-4304-4448-a367-a04c43e0b996	2025-10-29 00:31:20 -07:00
Axojhf	802d2440b4	Fix bash detection failure in VS Code Codex extension on Windows under certain conditions (#3421 ) Found that the VS Code Codex extension throws “Error starting conversation” when initializing a conversation with Git for Windows’ bash on PATH. Debugging showed the bash-detection logic did not return as expected; this change makes it reliable in that scenario. Possibly related to issue #2841.	2025-10-28 21:29:16 -07:00
pakrym-oai	ef3e075ad6	Refresh tokens more often and log a better message when both auth and token refresh fails (#5655 )	2025-10-28 18:55:53 -07:00
Anton Panasenko	149e198ce8	[codex][app-server] resume conversation from history (#5893 )	2025-10-28 18:18:03 -07:00
zhao-oai	36113509f2	verify mime type of images (#5888 ) solves: https://github.com/openai/codex/issues/5675 Block non-image uploads in the view_image workflow. We now confirm the file’s MIME is image/* before building the data URL; otherwise we emit an “unsupported MIME type” error to the model. This stops the agent from sending application/json blobs that the Responses API rejects with 400s.	2025-10-28 14:52:51 -07:00
Ahmed Ibrahim	ef55992ab0	remove beta experimental header (#5892 )	2025-10-28 21:28:56 +00:00
pakrym-oai	1b8f2543ac	Filter out reasoning items from previous turns (#5857 ) Reduces request size and prevents 400 errors when switching between API orgs. Based on Responses API behavior described in https://cookbook.openai.com/examples/responses_api/reasoning_items#caching	2025-10-28 11:39:34 -07:00
Jeremy Rose	65107d24a2	Fix handling of non-main default branches for cloud task submissions (#5069 ) ## Summary - detect the repository's default branch before submitting a cloud task - expose a helper in `codex_core::git_info` for retrieving the default branch name Fixes #4888 ------ https://chatgpt.com/codex/tasks/task_i_68e96093cf28832ca0c9c73fc618a309	2025-10-28 11:02:25 -07:00
jif-oai	5ba2a17576	chore: decompose submission loop (#5854 )	2025-10-28 15:23:46 +00:00
jif-oai	be4bdfec93	chore: drop useless shell stuff (#5848 )	2025-10-28 14:52:52 +00:00
jif-oai	7ff142d93f	chore: speed-up pipeline (#5812 ) Speed up the pipeline by: * Decoupling tests and clippy * Using a pre-built binary in tests * Using `sccache` to cache the builds	2025-10-28 14:08:52 +00:00
Celia Chen	4a42c4e142	[Auth] Choose which auth storage to use based on config (#5792 ) This PR is a follow-up to #5591. It allows users to choose which auth storage mode they want by using the new `cli_auth_credentials_store_mode` config.	2025-10-27 19:41:49 -07:00
Josh McKinney	66a4b89822	feat(tui): clarify Windows auto mode requirements (#5568 ) ## Summary - Coerce Windows `workspace-write` configs back to read-only, surface the forced downgrade in the approvals popup, and funnel users toward WSL or Full Access. - Add WSL installation instructions to the Auto preset on Windows while keeping the preset available for other platforms. - Skip the trust-on-first-run prompt on native Windows so new folders remain read-only without additional confirmation. - Expose a structured sandbox policy resolution from config to flag Windows downgrades and adjust tests (core, exec, TUI) to reflect the new behavior; provide a Windows-only approvals snapshot. ## Testing - cargo fmt - cargo test -p codex-core config::tests::add_dir_override_extends_workspace_writable_roots - cargo test -p codex-exec suite::resume::exec_resume_preserves_cli_configuration_overrides - cargo test -p codex-tui chatwidget::tests::approvals_selection_popup_snapshot - cargo test -p codex-tui approvals_popup_includes_wsl_note_for_auto_mode - cargo test -p codex-tui windows_skips_trust_prompt - just fix -p codex-core - just fix -p codex-tui	2025-10-28 01:19:32 +00:00
Ahmed Ibrahim	d7b333be97	Truncate the content-item for mcp tools (#5835 ) This PR truncates the text output of MCP tools	2025-10-28 00:39:35 +00:00
Gabriel Peal	b0bdc04c30	[MCP] Render MCP tool call result images to the model (#5600 ) It's pretty amazing we have gotten here without the ability for the model to see image content from MCP tool calls. This PR builds off of 4391 and fixes #4819. I would like @KKcorps to get adequate credit here but I also want to get this fix in ASAP so I gave him a week to update it and haven't gotten a response so I'm going to take it across the finish line. This test highlights how absurd the current situation is. I asked the model to read an image using the Chrome MCP. After this change, it correctly outputs: > Captured the page: image dhows a dark terminal-style UI labeled `OpenAI Codex (v0.0.0)` with prompt `model: gpt-5-codex medium` and working directory `/codex/codex-rs` (and more) Before this change, it said: > Took the full-page screenshot you asked for. It shows a long, horizontally repeating pattern of stylized people in orange, light-blue, and mustard clothing, holding hands in alternating poses against a white background. No text or other graphics-just rows of flat illustration stretching off to the right. Without this change, the Figma, Playwright, Chrome, and other visual MCP servers are pretty much entirely useless. I tested this change with the openai responses api as well as a third party completions api	2025-10-27 17:55:57 -04:00
Ahmed Ibrahim	7226365397	Centralize truncation in conversation history (#5652 ) Move the truncation logic to conversation history so it can be used on any tool output. This will help us avoid edge cases while truncating the tool calls and mcp calls.	2025-10-27 14:05:35 -07:00
Celia Chen	0fc295d958	[Auth] Add keyring support for Codex CLI (#5591 ) Follow-up PR to #5569. Add Keyring Support for Auth Storage in Codex CLI as well as a hybrid mode (default to persisting in keychain but fall back to file when unavailable.) It also refactors out the keyringstore implementation from rmcp-client [here](https://github.com/openai/codex/blob/main/codex-rs/rmcp-client/src/oauth.rs) to a new keyring-store crate. There will be a follow-up that picks the right credential mode depending on the config, instead of hardcoding `AuthCredentialsStoreMode::File`.	2025-10-27 12:10:11 -07:00
jif-oai	3e50f94d76	feat: support verbosity in model_family (#5821 )	2025-10-27 18:46:30 +00:00
Celia Chen	eb5b1b627f	[Auth] Introduce New Auth Storage Abstraction for Codex CLI (#5569 ) This PR introduces a new `Auth Storage` abstraction layer that takes care of read, write, and load of auth tokens based on the AuthCredentialsStoreMode. It is similar to how we handle MCP client oauth [here](https://github.com/openai/codex/blob/main/codex-rs/rmcp-client/src/oauth.rs). Instead of reading and writing directly from disk for auth tokens, Codex CLI workflows now should instead use this auth storage using the public helper functions. This PR is just a refactor of the current code so the behavior stays the same. We will add support for keyring and hybrid mode in follow-up PRs. I have read the CLA Document and I hereby sign the CLA	2025-10-27 11:01:14 -07:00
Eric Traut	0c1ff1d3fd	Made token refresh code resilient to missing `id_token` (#5782 ) This PR does the following: 1. Changes `try_refresh_token` to handle the case where the endpoint returns a response without an `id_token`. The OpenID spec indicates that this field is optional and clients should not assume it's present. 2. Changes the `attempt_stream_responses` to propagate token refresh errors rather than silently ignoring them. 3. Fixes a typo in a couple of error messages (unrelated to the above, but something I noticed in passing) - "reconnect" should be spelled without a hyphen. This PR does not implement the additional suggestion from @pakrym-oai that we should sign out when receiving `refresh_token_expired` from the refresh endpoint. Leaving this as a follow-on because I'm undecided on whether this should be implemented in `try_refresh_token` or its callers.	2025-10-27 10:09:53 -07:00
jif-oai	aea7610c76	feat: image resizing (#5446 ) Add image resizing on the client side to reduce load on the API	2025-10-27 16:58:10 +00:00
jif-oai	775fbba6e0	feat: return an error if unknown enabled/disabled feature (#5817 )	2025-10-27 16:53:00 +00:00
Michael Bolin	5ee8a17b4e	feat: introduce GetConversationSummary RPC (#5803 ) This adds an RPC to the app server to get the `ConversationSummary` via a rollout path. Now that the VS Code extension supports showing the Codex UI in an editor panel where the URI of the panel maps to the rollout file, we need to be able to get the `ConversationSummary` from the rollout file directly.	2025-10-27 09:11:45 -07:00
jif-oai	81be54b229	fix: test yield time (#5811 )	2025-10-27 11:57:29 +00:00
jif-oai	5e8659dcbc	chore: undo nits (#5631 )	2025-10-27 11:48:01 +00:00
jif-oai	2338294b39	nit: doc on session task (#5809 )	2025-10-27 11:43:33 +00:00
jif-oai	afc4eaab8b	feat: TUI undo op (#5629 )	2025-10-27 10:55:29 +00:00
jif-oai	e92c4f6561	feat: async ghost commit (#5618 )	2025-10-27 10:09:10 +00:00
Michael Bolin	5907422d65	feat: annotate conversations with model_provider for filtering (#5658 ) Because conversations that use the Responses API can have encrypted reasoning messages, trying to resume a conversation with a different provider could lead to confusing "failed to decrypt" errors. (This is reproducible by starting a conversation using ChatGPT login and resuming it as a conversation that uses OpenAI models via Azure.) This changes `ListConversationsParams` to take a `model_providers: Option<Vec<String>>` and adds `model_provider` on each `ConversationSummary` it returns so these cases can be disambiguated. Note this ended up making changes to `codex-rs/core/src/rollout/tests.rs` because it had a number of cases where it expected `Some` for the value of `next_cursor`, but the list of rollouts was complete, so according to this docstring: `bcd64c7e72/codex-rs/app-server-protocol/src/protocol.rs (L334-L337)` If there are no more items to return, then `next_cursor` should be `None`. This PR updates that logic.	2025-10-27 02:03:30 -07:00
Ahmed Ibrahim	f178805252	Add feedback upload request handling (#5682 )	2025-10-27 05:53:39 +00:00
Thibault Sottiaux	224222f09f	fix: use codex-exp prefix for experimental models and consider codex- models to be production (#5797 )	2025-10-27 01:55:12 +00:00
Eric Traut	bcd64c7e72	Reduced runtime of unit test that was taking multiple minutes (#5688 ) Modified `build_compacted_history_truncates_overlong_user_messages` test to reduce runtime from minutes to tens of seconds	2025-10-25 23:46:08 -07:00
Eric Traut	c124f24354	Added support for `sandbox_mode` in profiles (#5686 ) Currently, `approval_policy` is supported in profiles, but `sandbox_mode` is not. This PR adds support for `sandbox_mode`. Note: a fix for this was submitted in [this PR](https://github.com/openai/codex/pull/2397), but the underlying code has changed significantly since then. This addresses issue #3034	2025-10-25 16:52:26 -07:00
Ahmed Ibrahim	71f838389b	Improve feedback (#5661 )	2025-10-24 22:28:14 -07:00
Anton Panasenko	6af83d86ff	[codex][app-server] introduce codex/event/raw_item events (#5578 )	2025-10-24 22:41:52 +00:00
Eric Traut	f8af4f5c8d	Added model summary and risk assessment for commands that violate sandbox policy (#5536 ) This PR adds support for a model-based summary and risk assessment for commands that violate the sandbox policy and require user approval. This aids the user in evaluating whether the command should be approved. The feature works by taking a failed command and passing it back to the model and asking it to summarize the command, give it a risk level (low, medium, high) and a risk category (e.g. "data deletion" or "data exfiltration"). It uses a new conversation thread so the context in the existing thread doesn't influence the answer. If the call to the model fails or takes longer than 5 seconds, it falls back to the current behavior. For now, this is an experimental feature and is gated by a config key `experimental_sandbox_command_assessment`.	2025-10-24 15:23:44 -07:00
pakrym-oai	a4be4d78b9	Log more types of request IDs (#5645 ) Different services return different sets of IDs, log all of them to simplify debugging.	2025-10-24 19:12:03 +00:00
pakrym-oai	061862a0e2	Add CodexHttpClient wrapper with request logging (#5564 ) ## Summary - wrap the default reqwest::Client inside a new CodexHttpClient/CodexRequestBuilder pair and log the HTTP method, URL, and status for each request - update the auth/model/provider plumbing to use the new builder helpers so headers and bearer auth continue to be applied consistently - add the shared `http` dependency that backs the header conversion helpers ## Testing - `CODEX_SANDBOX=seatbelt CODEX_SANDBOX_NETWORK_DISABLED=1 cargo test -p codex-core` - `CODEX_SANDBOX=seatbelt CODEX_SANDBOX_NETWORK_DISABLED=1 cargo test -p codex-chatgpt` - `CODEX_SANDBOX=seatbelt CODEX_SANDBOX_NETWORK_DISABLED=1 cargo test -p codex-tui` ------ https://chatgpt.com/codex/tasks/task_i_68fa5038c17483208b1148661c5873be	2025-10-24 09:47:52 -07:00
jif-oai	80783a7bb9	fix: flaky tests (#5625 )	2025-10-24 13:56:41 +01:00
Gabriel Peal	ed77d2d977	[MCP] Improve startup errors for timeouts and github (#5595 ) 1. I have seen too many reports of people hitting startup timeout errors and thinking Codex is broken. Hopefully this will help people self-serve. We may also want to consider raising the timeout to ~15s. 2. Make it more clear what PAT is (personal access token) in the GitHub error	2025-10-24 01:54:45 -04:00

664 Commits