valknar/llmx - llmx - dev.pivoine.art

Author	SHA1	Message	Date
Robby He	dc2f26f7b5	Fix is_api_message to correctly exclude reasoning messages (#6156 ) ## Problem The `is_api_message` function in `conversation_history.rs` had a misalignment between its documentation and implementation: - Comment stated: "Anything that is not a system message or 'reasoning' message is considered an API message" - Code behavior: Was returning `true` for `ResponseItem::Reasoning`, meaning reasoning messages were incorrectly treated as API messages This inconsistency could lead to reasoning messages being persisted in conversation history when they should be filtered out. ## Root Cause Investigation revealed that reasoning messages are explicitly excluded throughout the codebase: 1. Chat completions API (lines 267-272 in `chat_completions.rs`) omits reasoning from conversation history: ```rust ResponseItem::Reasoning { .. } | ResponseItem::Other => { // Omit these items from the conversation history. continue; } ``` 2. Existing tests like `drops_reasoning_when_last_role_is_user` and `ignores_reasoning_before_last_user` validate that reasoning should be excluded from API payloads ## Solution Fixed the `is_api_message` function to align with its documentation and the rest of the codebase: ```rust // Before: Reasoning was incorrectly returning true ResponseItem::Reasoning { .. } | ResponseItem::WebSearchCall { .. } => true, // After: Reasoning correctly returns false ResponseItem::WebSearchCall { .. } => true, ResponseItem::Reasoning { .. } | ResponseItem::Other => false, ``` ## Testing - Enhanced existing test to verify reasoning messages are properly filtered out - All 264 core tests pass, including 8 chat completions tests that validate reasoning behavior - No regressions introduced This ensures reasoning messages are consistently excluded from API message processing across the entire codebase.	2025-11-03 20:55:41 -08:00
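
The filtering rule described in the commit above can be illustrated with a small, self-contained sketch. The `ResponseItem` enum here is a reduced, hypothetical stand-in (the real enum has more variants and payload fields), and treating a `system` role as non-API is an assumption taken from the quoted comment rather than from the actual source.

```rust
/// Hypothetical, reduced stand-in for the real `ResponseItem` enum.
#[derive(Debug, Clone, PartialEq)]
enum ResponseItem {
    Message { role: String },
    WebSearchCall,
    Reasoning,
    Other,
}

/// Anything that is not a system message or a reasoning item counts
/// as an API message (assumption: `Other` is excluded as well).
fn is_api_message(item: &ResponseItem) -> bool {
    match item {
        ResponseItem::Message { role } => role.as_str() != "system",
        ResponseItem::WebSearchCall => true,
        ResponseItem::Reasoning | ResponseItem::Other => false,
    }
}

/// Keep only the items that should be recorded in conversation history.
fn filter_history(items: Vec<ResponseItem>) -> Vec<ResponseItem> {
    items.into_iter().filter(is_api_message).collect()
}
```

With this sketch, a history containing a user message, a reasoning item, and a web search call keeps only the user message and the web search call.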
Eric Traut	1e0e553304	Fixed notify handler so it passes correct `input_messages` details (#6143 ) This fixes bug #6121. The `input_messages` field passed to the notify handler is currently empty because the logic is incorrectly including the OutputText rather than InputText. I've fixed that and added proper filtering to remove messages associated with AGENTS.md and other context injected by the harness. Testing: I wrote a notify handler and verified that the user prompt is correctly passed through to the handler.	2025-11-03 14:23:04 -08:00
iceweasel-oai	07b7d28937	log sandbox commands to $CODEX_HOME instead of cwd (#6171 ) Logging commands in the Windows Sandbox is temporary, but while we are doing it, let's always write to CODEX_HOME instead of dirtying the cwd.	2025-11-03 13:12:33 -08:00
Ahmed Ibrahim	6ee7fbcfff	feat: add the time after aborting (#5996 ) Tell the model how much time passed after the user aborted the call.	2025-11-03 11:44:06 -08:00
iceweasel-oai	2eda75a8ee	Do not skip trust prompt on Windows if sandbox is enabled. (#6167 ) If the experimental windows sandbox is enabled, the trust prompt should show on Windows.	2025-11-03 11:27:45 -08:00
Vinh Nguyen	a1ee10b438	fix: improve usage URLs in status card and snapshots (#6111 ) Hi OpenAI Codex team, currently the message "Visit chatgpt.com/codex/settings/usage for up-to-date information on rate limits and credits" appears in the status card and in error messages. Without the "https://" prefix, the link cannot be clicked directly from most terminals or chat interfaces. --- The fix is intended to address this issue: - It makes the link clickable in terminals that support it, hence better accessibility - It follows standard URL formatting practices - It maintains consistency with other links in the application (like the existing "https://openai.com/chatgpt/pricing" links) Thank you!	2025-11-02 21:44:59 -08:00
Eric Traut	0c7efa0cfd	Fix incorrect "deprecated" message about experimental config key (#6131 ) When I enable `experimental_sandbox_command_assessment`, I get an incorrect deprecation warning: "experimental_sandbox_command_assessment is deprecated. Use experimental_sandbox_command_assessment instead." This PR fixes this error.	2025-11-02 16:33:09 -08:00
Eric Traut	d5853d9c47	Changes to sandbox command assessment feature based on initial experiment feedback (#6091 ) * Removed sandbox risk categories; feedback indicates that these are not that useful and "less is more" * Tweaked the assessment prompt to generate terser answers * Fixed bug in orchestrator that prevents this feature from being exposed in the extension	2025-11-01 14:52:23 -07:00
Thomas Stokes	d9118c04bf	Parse the Azure OpenAI rate limit message (#5956 ) Fixes #4161 Currently Codex uses a regex to parse the "Please try again in 1.898s" OpenAI-style rate limit message, so that it can wait the correct duration before retrying. Azure OpenAI returns a different error that looks like "Rate limit exceeded. Try again in 35 seconds." This PR extends the regex and parsing code to match in a more fuzzy manner, handling anything matching the pattern "try again in <duration><unit>".	2025-11-01 09:33:13 -07:00
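
As a rough illustration of the fuzzy "try again in <duration><unit>" matching described above, here is a minimal sketch. It deliberately uses plain string handling from the standard library instead of a regex, so it is not the approach the commit takes; the function name `parse_retry_after` and the accepted unit spellings are assumptions made for this example.

```rust
use std::time::Duration;

/// Hypothetical helper: extract a retry delay from a rate-limit message.
/// Matches "try again in <number><optional space><unit>", ignoring case.
fn parse_retry_after(message: &str) -> Option<Duration> {
    const MARKER: &str = "try again in";
    // ASCII lowercasing keeps byte offsets valid for slicing.
    let lower = message.to_ascii_lowercase();
    let start = lower.find(MARKER)? + MARKER.len();
    let rest = lower[start..].trim_start();

    // Split the numeric part (digits and a decimal point) from the unit.
    let num_end = rest
        .find(|c: char| !(c.is_ascii_digit() || c == '.'))
        .unwrap_or(rest.len());
    let value: f64 = rest[..num_end].parse().ok()?;

    // The unit is the run of letters that follows, e.g. "s" or "seconds".
    let unit: String = rest[num_end..]
        .trim_start()
        .chars()
        .take_while(|c| c.is_ascii_alphabetic())
        .collect();

    // Assumed unit spellings; anything else is treated as "no hint found".
    let seconds = match unit.as_str() {
        "ms" => value / 1000.0,
        "s" | "sec" | "secs" | "second" | "seconds" => value,
        _ => return None,
    };
    // `try_from_secs_f64` avoids a panic on out-of-range values.
    Duration::try_from_secs_f64(seconds).ok()
}
```

Both message styles quoted in the commit are handled by the same code path: the OpenAI-style "1.898s" and the Azure-style "35 seconds".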
jif-oai	611e00c862	feat: compactor 2 (#6027 ) Co-authored-by: pakrym-oai <pakrym@openai.com>	2025-10-31 14:27:08 -07:00
Ahmed Ibrahim	c8ebb2a0dc	Add warning on compact (#6052 ) This PR introduces the ability for `core` to send `warnings`, just as it can send `errors`. It also sends a warning on compaction.	2025-10-31 13:27:33 -07:00
Dylan Hurd	88e083a9d0	chore: Add shell serialization tests for json (#6043 ) ## Summary Can never have enough tests on this code path - checking that json inside a shell call is deserialized correctly. ## Tests - [x] These are tests 😎	2025-10-31 11:01:58 -07:00
Ahmed Ibrahim	1c8507b32a	Truncate total tool calls text (#5979 ) Put a cap on the aggregate output of text content on tool calls. --------- Co-authored-by: Gabriel Peal <gpeal@users.noreply.github.com>	2025-10-31 10:30:36 -07:00
jif-oai	0508823075	test: undo (#6034 )	2025-10-31 14:46:24 +00:00
pakrym-oai	2371d771cc	Update user instruction message format (#6010 )	2025-10-30 18:44:02 -07:00
Ahmed Ibrahim	dc2aeac21f	override verbosity for gpt-5-codex (#6007 ) We are seeing [reports](https://github.com/openai/codex/issues/6004) of users having verbosity in their config.toml and facing issues. gpt-5-codex doesn't accept values other than medium for verbosity.	2025-10-31 00:45:05 +00:00
Jack	f842849bec	docs: Fix markdown list item spacing in codex-rs/core/review_prompt.md (#4144 ) Fixes a Markdown parsing issue where a list item used `*` without a following space (`*Line ranges ...`). Per CommonMark, a space after the list marker is required. Updated to `* Line ranges ...` so the guideline renders as a standalone bullet. This change improves readability and prevents mis-parsing in renderers. Co-authored-by: Eric Traut <etraut@openai.com>	2025-10-30 17:39:21 -07:00
zhao-oai	dcf73970d2	rate limit errors now provide absolute time (#6000 )	2025-10-30 20:33:25 -04:00
Ahmed Ibrahim	a3d3719481	Remove last turn reasoning filtering (#5986 )	2025-10-30 23:20:32 +00:00
iceweasel-oai	87cce88f48	Windows Sandbox - Alpha version (#4905 ) - Added the new codex-windows-sandbox crate that builds both a library entry point (run_windows_sandbox_capture) and a CLI executable to launch commands inside a Windows restricted-token sandbox, including ACL management, capability SID provisioning, network lockdown, and output capture (windows-sandbox-rs/src/lib.rs:167, windows-sandbox-rs/src/main.rs:54). - Introduced the experimental WindowsSandbox feature flag and wiring so Windows builds can opt into the sandbox: SandboxType::WindowsRestrictedToken, the in-process execution path, and platform sandbox selection now honor the flag (core/src/features.rs:47, core/src/config.rs:1224, core/src/safety.rs:19, core/src/sandboxing/mod.rs:69, core/src/exec.rs:79, core/src/exec.rs:172). - Updated workspace metadata to include the new crate and its Windows-specific dependencies so the core crate can link against it (codex-rs/ Cargo.toml:91, core/Cargo.toml:86). - Added a PowerShell bootstrap script that installs the Windows toolchain, required CLI utilities, and builds the workspace to ease development on the platform (scripts/setup-windows.ps1:1). - Landed a Python smoke-test suite that exercises read-only/workspace-write policies, ACL behavior, and network denial for the Windows sandbox binary (windows-sandbox-rs/sandbox_smoketests.py:1).	2025-10-30 15:51:57 -07:00
Bernard Niset	ff6d4cec6b	fix: Update seatbelt policy for java on macOS (#3987 ) # Summary This PR is related to Issue #3978 and contains a fix to the seatbelt profile for macOS that allows running java/jdk tooling from the sandbox. I have found that the included change is the minimum change needed to make it run on my machine. There is a unit test added by codex when making this fix. I wonder if it is useful, since you need java installed on the target machine for it to be relevant. I can remove it if that is better. Fixes #3978	2025-10-30 14:25:04 -07:00
Celia Chen	6ef658a9f9	[Hygiene] Remove `include_view_image_tool` config (#5976 ) There's still some debate about whether we want to expose `tools.view_image` or `feature.view_image` so those are left unchanged for now, but this old `include_view_image_tool` config is good-to-go. Also updated the doc to reflect that `view_image` tool is now by default true.	2025-10-30 13:23:24 -07:00
Anton Panasenko	9572cfc782	[codex] add developer instructions (#5897 ) we are using developer instructions for code reviews, we need to pass them in cli as well.	2025-10-30 11:18:31 -07:00
Dylan Hurd	4a55646a02	chore: testing on freeform apply_patch (#5952 ) ## Summary Duplicates the tests in `apply_patch_cli.rs`, but tests the freeform apply_patch tool as opposed to the function call path. The good news is that all the tests pass with zero logical changes, with the exception of the heredoc, which doesn't really make sense in the freeform tool context anyway. @jif-oai since you wrote the original tests in #5557, I'd love your opinion on the right way to DRY these test cases between the two. Happy to set up a more sophisticated harness, but didn't want to go down the rabbit hole until we agreed on the right pattern ## Testing - [x] These are tests	2025-10-30 10:40:48 -07:00
jif-oai	f4f9695978	feat: compaction prompt configurable (#5959 ) ``` codex -c compact_prompt="Summarize in bullet points" ```	2025-10-30 14:24:24 +00:00
Ahmed Ibrahim	5fcc380bd9	Pass initial history as an optional to codex delegate (#5950 ) This will give us more freedom on controlling the delegation. i.e we can fork our history and run `compact`.	2025-10-30 07:22:42 -07:00
jif-oai	aa76003e28	chore: unify config crates (#5958 )	2025-10-30 10:28:32 +00:00
Ahmed Ibrahim	fac548e430	Send delegate header (#5942 ) Send delegate type header	2025-10-30 09:49:40 +00:00
zhao-oai	b34efde2f3	asdf (#5940 ) .	2025-10-30 01:10:41 +00:00
Ahmed Ibrahim	7aa46ab5fc	ignore agent message deltas for the review mode (#5937 ) The deltas produce the whole json output. ignore them.	2025-10-30 00:47:55 +00:00
pakrym-oai	3429e82e45	Add item streaming events (#5546 ) Adds AgentMessageContentDelta, ReasoningContentDelta, ReasoningRawContentDelta item streaming events while maintaining compatibility for old events. --------- Co-authored-by: Owen Lin <owen@openai.com>	2025-10-29 22:33:57 +00:00
Ahmed Ibrahim	13e1d0362d	Delegate review to codex instance (#5572 ) In this PR, I am exploring migrating task kind to an invocation of Codex. The main reason would be getting rid of multiple `ConversationHistory` states and streamlining our context/history management. This approach depends on opening a channel between the sub-codex and codex. This channel is responsible for forwarding `interactive` (`approvals`) and `non-interactive` events. The `task` is responsible for handling those events. This opens the door for implementing `codex as a tool`, replacing `compact` and `review`, and potentially subagents. One consideration is that this code is very similar to `app-server`, especially in the approval part. If in the future we want an interactive `sub-codex`, we should consider using `codex-mcp`	2025-10-29 21:04:25 +00:00
jif-oai	db31f6966d	chore: config editor (#5878 ) The goal is to have a single place where we actually write files In a follow-up PR, will move everything config related in a dedicated module and move the helpers in a dedicated file	2025-10-29 20:52:46 +00:00
Rasmus Rygaard	39e09c289d	Add a wrapper around raw response items (#5923 ) We currently have nested enums when sending raw response items in the app-server protocol. This makes downstream schemas confusing because we need to embed `type`-discriminated enums within each other. This PR adds a small wrapper around the response item so we can keep the schemas separate	2025-10-29 20:32:40 +00:00
jif-oai	3183935bd7	feat: add output even in sandbox denied (#5908 )	2025-10-29 18:21:18 +00:00
jif-oai	060637b4d4	feat: deprecation warning (#5825 )	2025-10-29 12:29:28 +00:00
jif-oai	fa92cd92fa	chore: merge git crates (#5909 ) Merge `git-apply` and `git-tooling` into `utils/`	2025-10-29 12:11:44 +00:00
Abhishek Bhardwaj	89591e4246	feature: Add "!cmd" user shell execution (#2471 ) This change lets users run local shell commands directly from the TUI by prefixing their input with ! (e.g. !ls). Output is truncated to keep the exec cell usable, and Ctrl-C cleanly interrupts long-running commands (e.g. !sleep 10000). Summary of changes - Route Op::RunUserShellCommand through a dedicated UserShellCommandTask (core/src/tasks/user_shell.rs), keeping the task logic out of codex.rs. - Reuse the existing tool router: the task constructs a ToolCall for the local_shell tool and relies on ShellHandler, so no manual MCP tool lookup is required. - Emit exec lifecycle events (ExecCommandBegin/ExecCommandEnd) so the TUI can show command metadata, live output, and exit status. End-to-end flow TUI handling 1. ChatWidget::submit_user_message (TUI) intercepts messages starting with !. 2. Non-empty commands dispatch Op::RunUserShellCommand { command }; empty commands surface a help hint. 3. No UserInput items are created, so nothing is enqueued for the model. Core submission loop 4. The submission loop routes the op to handlers::run_user_shell_command (core/src/codex.rs). 5. A fresh TurnContext is created and Session::spawn_user_shell_command enqueues UserShellCommandTask. Task execution 6. UserShellCommandTask::run emits TaskStartedEvent, formats the command, and prepares a ToolCall targeting local_shell. 7. ToolCallRuntime::handle_tool_call dispatches to ShellHandler. Shell tool runtime 8. ShellHandler::run_exec_like launches the process via the unified exec runtime, honoring sandbox and shell policies, and emits ExecCommandBegin/End. 9. Stdout/stderr are captured for the UI, but the task does not turn the resulting ToolOutput into a model response. Completion 10. After ExecCommandEnd, the task finishes without an assistant message; the session marks it complete and the exec cell displays the final output. Conversation context - The command and its output never enter the conversation history or the model prompt; the flow is local-only. - Only exec/task events are emitted for UI rendering. Demo video https://github.com/user-attachments/assets/fcd114b0-4304-4448-a367-a04c43e0b996	2025-10-29 00:31:20 -07:00
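
Steps 1 to 3 of the TUI handling described above (intercept input starting with `!`, dispatch non-empty commands, show a hint for empty ones) might look roughly like the following sketch. The `Submission` enum and `classify_input` function are hypothetical names invented for this illustration, and trimming whitespace around the command is an assumption; the real logic lives in `ChatWidget::submit_user_message`.

```rust
/// Hypothetical result of inspecting a line of TUI input.
#[derive(Debug, PartialEq)]
enum Submission {
    /// Corresponds to dispatching `Op::RunUserShellCommand { command }`.
    RunUserShellCommand { command: String },
    /// A bare "!" with no command: surface a help hint instead.
    HelpHint,
    /// Ordinary input that would be sent to the model.
    UserMessage(String),
}

/// Decide what to do with a submitted line of input.
fn classify_input(input: &str) -> Submission {
    match input.strip_prefix('!') {
        // "!" followed by nothing (or only whitespace) is not a command.
        Some(rest) if rest.trim().is_empty() => Submission::HelpHint,
        // "!ls" and similar run locally and never reach the model.
        Some(rest) => Submission::RunUserShellCommand {
            command: rest.trim().to_string(),
        },
        None => Submission::UserMessage(input.to_string()),
    }
}
```

Keeping the shell command as its own variant mirrors the point made in the commit: no user input item is created for it, so nothing is enqueued for the model.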
Axojhf	802d2440b4	Fix bash detection failure in VS Code Codex extension on Windows under certain conditions (#3421 ) Found that the VS Code Codex extension throws “Error starting conversation” when initializing a conversation with Git for Windows’ bash on PATH. Debugging showed the bash-detection logic did not return as expected; this change makes it reliable in that scenario. Possibly related to issue #2841.	2025-10-28 21:29:16 -07:00
pakrym-oai	ef3e075ad6	Refresh tokens more often and log a better message when both auth and token refresh fail (#5655 )	2025-10-28 18:55:53 -07:00
Anton Panasenko	149e198ce8	[codex][app-server] resume conversation from history (#5893 )	2025-10-28 18:18:03 -07:00
zhao-oai	36113509f2	verify mime type of images (#5888 ) solves: https://github.com/openai/codex/issues/5675 Block non-image uploads in the view_image workflow. We now confirm the file’s MIME is image/* before building the data URL; otherwise we emit an “unsupported MIME type” error to the model. This stops the agent from sending application/json blobs that the Responses API rejects with 400s.	2025-10-28 14:52:51 -07:00
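
The check described above, confirming an `image/*` MIME type before building a data URL, can be sketched as follows. The function name `image_data_url` and its signature are hypothetical, and the sketch assumes the caller has already detected the MIME type and base64-encoded the file; how the real code does either is not shown in the commit message.

```rust
/// Hypothetical helper: build a data URL only for image MIME types.
/// `mime` is assumed to be detected by the caller; `base64_payload`
/// is assumed to be the already-encoded file contents.
fn image_data_url(mime: &str, base64_payload: &str) -> Result<String, String> {
    // Reject anything that is not image/*, e.g. application/json.
    if !mime.starts_with("image/") {
        return Err(format!("unsupported MIME type: {mime}"));
    }
    Ok(format!("data:{mime};base64,{base64_payload}"))
}
```

Returning an error value rather than panicking matches the described behavior of reporting the problem back to the model instead of sending a payload the API would reject.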
Ahmed Ibrahim	ef55992ab0	remove beta experimental header (#5892 )	2025-10-28 21:28:56 +00:00
pakrym-oai	1b8f2543ac	Filter out reasoning items from previous turns (#5857 ) Reduces request size and prevents 400 errors when switching between API orgs. Based on Responses API behavior described in https://cookbook.openai.com/examples/responses_api/reasoning_items#caching	2025-10-28 11:39:34 -07:00
Jeremy Rose	65107d24a2	Fix handling of non-main default branches for cloud task submissions (#5069 ) ## Summary - detect the repository's default branch before submitting a cloud task - expose a helper in `codex_core::git_info` for retrieving the default branch name Fixes #4888 ------ https://chatgpt.com/codex/tasks/task_i_68e96093cf28832ca0c9c73fc618a309	2025-10-28 11:02:25 -07:00
jif-oai	5ba2a17576	chore: decompose submission loop (#5854 )	2025-10-28 15:23:46 +00:00
jif-oai	be4bdfec93	chore: drop useless shell stuff (#5848 )	2025-10-28 14:52:52 +00:00
jif-oai	7ff142d93f	chore: speed-up pipeline (#5812 ) Speed-up pipeline by: * Decoupling tests and clippy * Use pre-built binary in tests * `sccache` for caching of the builds	2025-10-28 14:08:52 +00:00
Celia Chen	4a42c4e142	[Auth] Choose which auth storage to use based on config (#5792 ) This PR is a follow-up to #5591. It allows users to choose which auth storage mode they want by using the new `cli_auth_credentials_store_mode` config.	2025-10-27 19:41:49 -07:00
Josh McKinney	66a4b89822	feat(tui): clarify Windows auto mode requirements (#5568 ) ## Summary - Coerce Windows `workspace-write` configs back to read-only, surface the forced downgrade in the approvals popup, and funnel users toward WSL or Full Access. - Add WSL installation instructions to the Auto preset on Windows while keeping the preset available for other platforms. - Skip the trust-on-first-run prompt on native Windows so new folders remain read-only without additional confirmation. - Expose a structured sandbox policy resolution from config to flag Windows downgrades and adjust tests (core, exec, TUI) to reflect the new behavior; provide a Windows-only approvals snapshot. ## Testing - cargo fmt - cargo test -p codex-core config::tests::add_dir_override_extends_workspace_writable_roots - cargo test -p codex-exec suite::resume::exec_resume_preserves_cli_configuration_overrides - cargo test -p codex-tui chatwidget::tests::approvals_selection_popup_snapshot - cargo test -p codex-tui approvals_popup_includes_wsl_note_for_auto_mode - cargo test -p codex-tui windows_skips_trust_prompt - just fix -p codex-core - just fix -p codex-tui	2025-10-28 01:19:32 +00:00


691 Commits