valknar/llmx - llmx - dev.pivoine.art

Author	SHA1	Message	Date
Jeremy Rose	19f46439ae	timeouts for mcp tool calls (#3959 ) defaults to 60sec, overridable with MCP_TOOL_TIMEOUT or on a per-server basis in the config.	2025-09-22 10:30:59 -07:00
jif-oai	e258ca61b4	chore: more clippy rules 2 (#4057 ) The only file to watch is the Cargo.toml. All the other changes come from `just fix` plus a few small manual fixes. The set of rules was picked arbitrarily from the list of clippy rules, trying to optimise the learning and style of the code while limiting the loss of productivity.	2025-09-22 17:16:02 +00:00
jif-oai	e5fe50d3ce	chore: unify cargo versions (#4044 ) Unify cargo versions at root	2025-09-22 16:47:01 +00:00
pakrym-oai	14a115d488	Add non_sandbox_test helper (#3880 ) Makes tests shorter	2025-09-22 14:50:41 +00:00
dedrisian-oai	5996ee0e5f	feat: Add more /review options (#3961 ) Adds the following options: 1. Review current changes 2. Review a specific commit 3. Review against a base branch (PR style) 4. Custom instructions. Also adds the following UI helpers: 1. Makes list selection searchable 2. Adds navigation to the bottom pane, so you could add a stack of popups 3. Basic custom prompt view	2025-09-21 20:18:35 -07:00
Ahmed Ibrahim	04504d8218	Forward Rate limits to the UI (#3965 ) We currently get information about rate limits in the response headers. We want to forward them to the clients to have better transparency. UI/UX plans have been discussed and this information is needed.	2025-09-20 21:26:16 -07:00
pakrym-oai	9b18875a42	Use helpers instead of fixtures (#3888 ) Move to using test helper method everywhere.	2025-09-19 06:46:25 -07:00
pakrym-oai	881c7978f1	Move responses mocking helpers to a shared lib (#3878 ) These are generally useful	2025-09-18 17:53:14 -07:00
Ahmed Ibrahim	a7fda70053	Use a unified shell tell to not break cache (#3814 ) Currently, we change the tool description according to the sandbox policy and approval policy. This breaks the cache when the user hits `/approvals`. This PR does the following: - Always uses the shell tool with the escalation parameter: removes `create_shell_tool_for_sandbox` and always uses the unified tool via `create_shell_tool`. - Rejects the function call when the model uses the escalation parameter but is not permitted to.	2025-09-19 00:08:28 +00:00
Michael Bolin	de64f5f007	fix: update try_parse_word_only_commands_sequence() to return commands in order (#3881 ) Incidentally, we had a test for this in `accepts_multiple_commands_with_allowed_operators()`, but it was verifying the bad behavior. Oops!	2025-09-18 16:07:38 -07:00
Michael Bolin	8595237505	fix: ensure cwd for conversation and sandbox are separate concerns (#3874 ) Prior to this PR, both of these functions took a single `cwd`: `71038381aa/codex-rs/core/src/seatbelt.rs (L19-L25)` `71038381aa/codex-rs/core/src/landlock.rs (L16-L23)` whereas `cwd` and `sandbox_cwd` should be set independently (fixed in this PR). Added `sandbox_distinguishes_command_and_policy_cwds()` to `codex-rs/exec/tests/suite/sandbox.rs` to verify this.	2025-09-18 14:37:06 -07:00
dedrisian-oai	62258df92f	feat: /review (#3774 ) Adds `/review` action in TUI	2025-09-18 14:14:16 -07:00
Jeremy Rose	b34e906396	Reland "refactor transcript view to handle HistoryCells" (#3753 ) Reland of #3538	2025-09-18 20:55:53 +00:00
Jeremy Rose	71038381aa	fix error on missing notifications in [tui] (#3867 ) Fixes #3811.	2025-09-18 11:25:09 -07:00
jif-oai	277fc6254e	chore: use tokio mutex and async function to prevent blocking a worker (#3850 ) Why use `tokio::sync::Mutex`: `std::sync::Mutex` is not async-aware. As a result, it blocks the entire thread instead of just yielding the task. Furthermore, it can be poisoned, which is not the case for the `tokio` mutex. Using the `tokio` mutex allows the Tokio runtime to continue running other tasks while waiting for the lock, preventing deadlocks and performance bottlenecks. In general, this is preferred in an async environment.	2025-09-18 18:21:52 +01:00
jif-oai	992b531180	fix: some nit Rust reference issues (#3849 ) Fix some small reference issues. No behavioural change. Just making the code cleaner.	2025-09-18 18:18:06 +01:00
jif-oai	4a5d6f7c71	Make ESC button work when auto-compaction (#3857 ) Only emit a task finished when the compaction comes from a `/compact`	2025-09-18 15:34:16 +00:00
pakrym-oai	d4aba772cb	Switch to uuid_v7 and tighten ConversationId usage (#3819 ) Make sure conversations have a timestamp.	2025-09-18 14:37:03 +00:00
jif-oai	4c97eeb32a	bug: Ignore tests for now (#3777 ) Ignore flaky / long tests for now	2025-09-18 10:43:45 +01:00
Thibault Sottiaux	c9505488a1	chore: update "Codex CLI harness, sandboxing, and approvals" section (#3822 )	2025-09-17 16:48:20 -07:00
dedrisian-oai	72733e34c4	Add dev message upon review out (#3758 ) Proposal: We want to record a dev message like so: ``` { "type": "message", "role": "user", "content": [ { "type": "input_text", "text": "<user_action> <context>User initiated a review task. Here's the full review output from reviewer model. User may select one or more comments to resolve.</context> <action>review</action> <results> {findings_str} </results> </user_action>" } ] }, ``` without showing it in the chat transcript. This is a rough idea, but it fixes the issue where the user finishes a review thread and asks the parent to "fix the rest of the review issues", thinking that the parent knows about it. Question: why not a tool call? Because the agent didn't make the call; a human did. Also, we haven't implemented sub-agents yet, and we'll need to think about the way we represent these human-led tool calls for the agent.	2025-09-16 18:43:32 -07:00
dedrisian-oai	7fe4021f95	Review mode core updates (#3701 ) 1. Adds the environment prompt (including cwd) to review thread 2. Prepends the review prompt as a user message (temporary fix so the instructions are not replaced on backend) 3. Sets reasoning to low 4. Sets default review model to `gpt-5-codex`	2025-09-16 13:36:51 -07:00
Dylan	11285655c4	fix: Record EnvironmentContext in SendUserTurn (#3678 ) ## Summary SendUserTurn has not been correctly handling updates to policies. While the tui protocol handles this in `Op::OverrideTurnContext`, the SendUserTurn should be appending `EnvironmentContext` messages when the sandbox settings change. MCP client behavior should match the cli behavior, so we update `SendUserTurn` message to match. ## Testing - [x] Added prompt caching tests	2025-09-16 11:32:20 -07:00
Ahmed Ibrahim	244687303b	Persist search items (#3745 ) Let's record the search items because they are part of the history.	2025-09-16 18:02:15 +00:00
Dylan	a8026d3846	fix: read-only escalations (#3673 ) ## Summary Splitting out this smaller fix from #2694 - fixes the sandbox permissions so Chat / read-only mode tool definition matches expectations ## Testing - [x] Tested locally	2025-09-15 19:01:10 -07:00
dependabot[bot]	404c126fc3	chore(deps): bump wildmatch from 2.4.0 to 2.5.0 in /codex-rs (#3619 ) Bumps [wildmatch](https://github.com/becheran/wildmatch) from 2.4.0 to 2.5.0. Release notes (sourced from wildmatch's releases, https://github.com/becheran/wildmatch/releases): v2.5.0, becheran/wildmatch#27. Commits: `b39902c` chore: Release wildmatch version 2.5.0; `87a8cf4` Merge pull request #28 from smichaku/micha/fix-unicode-case-insensitive-matching; `a3ab490` fix: Fix unicode matching for non-ASCII characters. See the full diff at https://github.com/becheran/wildmatch/compare/v2.4.0...v2.5.0. Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-09-15 12:57:17 -07:00
Jeremy Rose	0560079c41	notifications on approvals and turn end (#3329 ) Uses OSC 9 to notify when a turn ends or approval is required. Won't work in VS Code or Terminal.app, but iTerm2, kitty, and WezTerm support it :)	2025-09-15 10:22:02 -07:00
ae	5c583fe89b	feat: tweak onboarding strings (#3650 )	2025-09-15 08:49:37 -07:00
pakrym-oai	b1c291e2bb	Add file reference guidelines to gpt-5 prompt (#3651 )	2025-09-15 08:35:30 -07:00
Michael Bolin	f037b2fd56	chore: rename (#3648 )	2025-09-15 08:17:13 -07:00
Thibault Sottiaux	d60cbed691	fix: add references (#3633 )	2025-09-15 07:48:22 -07:00
jimmyfraiture2	d555b68469	fix: race condition unified exec (#3644 ) Fix race condition without storing an rx in the session	2025-09-15 06:52:39 -07:00
Thibault Sottiaux	6039f8a126	feat: tighten preset filter, tame storage load logs, enable rollout prompt by default (#3628 ) Summary - common: use exact equality for Swiftfox exclusion to avoid hiding future slugs that merely contain the substring - core: treat missing internal_storage.json as expected (debug), warn only on real IO/parse errors - tui: drop DEBUG_HIGH gate; always consider showing rollout prompt, but suppress under ApiKey auth mode	2025-09-14 23:05:41 -07:00
Ahmed Ibrahim	50262a44ce	Show abort in the resume (#3629 ) Show abort error when resuming a session	2025-09-15 05:24:30 +00:00
easong-openai	6a8e743d57	initial mcp add interface (#3543 ) Adds `codex mcp add`, `codex mcp list`, `codex mcp remove`. Currently writes to global config.	2025-09-15 04:30:56 +00:00
Thibault Sottiaux	a797051921	chore: update swiftfox_prompt.md (#3624 )	2025-09-15 04:10:35 +00:00
Ahmed Ibrahim	26f1246a89	Revert "refactor transcript view to handle HistoryCells" (#3614 ) Reverts openai/codex#3538. It panics when forking the first message. It also calculates the index incorrectly.	2025-09-15 03:39:36 +00:00
Eric Traut	900bb01486	When logging in using ChatGPT, make sure to overwrite API key (#3611 ) When logging in using ChatGPT using the `codex login` command, a successful login should write a new `auth.json` file with the ChatGPT token information. The old code attempted to retain the API key and merge the token information into the existing `auth.json` file. With the new simplified login mechanism, `auth.json` should have auth information for only ChatGPT or API Key, not both. The `codex login --api-key <key>` code path was already doing the right thing here, but the `codex login` command was incorrect. This PR fixes the problem and adds test cases for both commands.	2025-09-14 19:48:18 -07:00
Dylan	b6673838e8	fix: model family and apply_patch consistency (#3603 ) ## Summary Resolves a merge conflict between #3597 and #3560, and adds tests to double check our apply_patch configuration. ## Testing - [x] Added unit tests Co-authored-by: dedrisian-oai <dedrisian@openai.com>	2025-09-14 18:20:37 -07:00
Fouad Matin	5185d69f13	fix(core): flaky test `completed_commands_do_not_persist_sessions` (#3596 ) Fix flaky test: ``` FAIL [ 2.641s] codex-core unified_exec::tests::completed_commands_do_not_persist_sessions stdout ─── running 1 test test unified_exec::tests::completed_commands_do_not_persist_sessions ... FAILED failures: failures: unified_exec::tests::completed_commands_do_not_persist_sessions test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 235 filtered out; finished in 2.63s stderr ─── thread 'unified_exec::tests::completed_commands_do_not_persist_sessions' panicked at core/src/unified_exec/mod.rs:582:9: assertion failed: result.output.contains("codex") ```	2025-09-14 18:04:05 -07:00
dedrisian-oai	2aa84b8891	Fix EventMsg Optional (#3604 )	2025-09-15 00:34:33 +00:00
pakrym-oai	9177bdae5e	Only one branch for swiftfox (#3601 ) Make each model family have a single branch.	2025-09-14 16:56:22 -07:00
Ahmed Ibrahim	a30e5e40ee	enable-resume (#3537 ) Adds the ability to resume conversations. We have one verb, `resume`. Behavior: `tui`: `codex resume` opens the session picker; `codex resume --last` continues the last message; `codex resume <session id>` continues the conversation with `session id`. `exec`: `codex resume --last` continues the last conversation; `codex resume <session id>` continues the conversation with `session id`. Implementation: - Added a function to find the path in `~/.codex/sessions/` for a given `UUID`, which helps when resuming by session id. - Added the flags mentioned above. - Added lots of tests.	2025-09-14 19:33:19 -04:00
dedrisian-oai	b2f6fc3b9a	Fix flaky windows test (#3564 ) There are exactly 4 types of flaky tests on Windows x86 right now: 1. `review_input_isolated_from_parent_history` => Times out waiting for closing events 2. `review_does_not_emit_agent_message_on_structured_output` => Times out waiting for closing events 3. `auto_compact_runs_after_token_limit_hit` => Times out waiting for closing events 4. `auto_compact_runs_after_token_limit_hit` => Also has a problem where auto compact should add a third request, but receives 4 requests. Items 1, 2, and 3 seem to be solved by increasing threads on the Windows runner from 2 to 4. We don't know yet why item 4 is happening, but it is probably also caused by WireMock issues on Windows causing races.	2025-09-14 23:20:25 +00:00
pakrym-oai	51f88fd04a	Fix swiftfox model selector (#3598 ) The model shouldn't be saved with a suffix. The effort is a separate field.	2025-09-14 23:12:21 +00:00
pakrym-oai	916fdc2a37	Add per-model-family prompts (#3597 ) Allows more flexibility in defining prompts.	2025-09-14 22:45:15 +00:00
pakrym-oai	863d9c237e	Include command output when sending timeout to model (#3576 ) Being able to see the output helps the model decide how to handle the timeout.	2025-09-14 14:38:26 -07:00
Ahmed Ibrahim	bbea6bbf7e	Handle resuming/forking after compact (#3533 ) We need to construct the history differently when compaction happens. For this, we consider only the history after the compaction and convert the compaction to a response item. This needs to change to use `build_compact_history` once #3446 is merged.	2025-09-14 13:23:31 +00:00
Jeremy Rose	4891ee29c5	refactor transcript view to handle HistoryCells (#3538 ) No (intended) functional change. This refactors the transcript view to hold a list of HistoryCells instead of a list of Lines. This simplifies and makes much of the logic more robust, as well as laying the groundwork for future changes, e.g. live-updating history cells in the transcript. Similar to #2879 in goal. Fixes #2755.	2025-09-13 19:23:14 -07:00
Thibault Sottiaux	bac8a427f3	chore: default swiftfox models to experimental reasoning summaries (#3560 )	2025-09-13 23:40:54 +00:00
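
Commit 0560079c41 in the list above mentions using OSC 9 to notify when a turn ends or an approval is required. As a minimal sketch of what emitting such a notification could look like (this is not the code from that commit), the Rust snippet below builds the escape sequence ESC ] 9 ; message BEL and writes it to stdout. The function name `osc9_notification` and the message text are hypothetical, and stripping control characters from the message is an assumption made here so the message cannot end the sequence early.

```rust
use std::io::{self, Write};

/// Build an OSC 9 notification sequence: ESC ] 9 ; <message> BEL.
/// Hypothetical helper for illustration; not taken from the repository.
fn osc9_notification(message: &str) -> String {
    // Assumption: drop control characters (including BEL and ESC) so the
    // message cannot terminate the sequence early or inject other escapes.
    let sanitized: String = message.chars().filter(|c| !c.is_control()).collect();
    format!("\x1b]9;{sanitized}\x07")
}

fn main() -> io::Result<()> {
    let mut stdout = io::stdout();
    // Terminals that understand OSC 9 may show a desktop notification;
    // other terminals are expected to ignore the sequence.
    stdout.write_all(osc9_notification("Turn complete").as_bytes())?;
    stdout.flush()
}
```

Whether anything is displayed depends on the terminal, as the commit message itself notes.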


449 Commits