valknar/llmx - llmx - dev.pivoine.art

Author	SHA1	Message	Date
jif-oai	e5fe50d3ce	chore: unify cargo versions (#4044 ) Unify cargo versions at root	2025-09-22 16:47:01 +00:00
pakrym-oai	14a115d488	Add non_sandbox_test helper (#3880 ) Makes tests shorter	2025-09-22 14:50:41 +00:00
dedrisian-oai	5996ee0e5f	feat: Add more /review options (#3961) Adds the following options: 1. Review current changes 2. Review a specific commit 3. Review against a base branch (PR style) 4. Custom instructions. Also adds the following UI helpers: 1. Makes list selection searchable 2. Adds navigation to the bottom pane, so you can add a stack of popups 3. Basic custom prompt view	2025-09-21 20:18:35 -07:00
Ahmed Ibrahim	04504d8218	Forward Rate limits to the UI (#3965 ) We currently get information about rate limits in the response headers. We want to forward them to the clients to have better transparency. UI/UX plans have been discussed and this information is needed.	2025-09-20 21:26:16 -07:00
pakrym-oai	9b18875a42	Use helpers instead of fixtures (#3888 ) Move to using test helper method everywhere.	2025-09-19 06:46:25 -07:00
pakrym-oai	881c7978f1	Move responses mocking helpers to a shared lib (#3878 ) These are generally useful	2025-09-18 17:53:14 -07:00
Ahmed Ibrahim	a7fda70053	Use a unified shell tool to not break cache (#3814) Currently, we change the tool description according to the sandbox policy and approval policy. This breaks the cache when the user hits `/approvals`. This PR does the following: - Always use the shell with escalation parameter: - removes `create_shell_tool_for_sandbox` and always uses the unified tool via `create_shell_tool` - Rejects the function call when the model uses the escalation parameter but is not allowed to.	2025-09-19 00:08:28 +00:00
Michael Bolin	de64f5f007	fix: update try_parse_word_only_commands_sequence() to return commands in order (#3881 ) Incidentally, we had a test for this in `accepts_multiple_commands_with_allowed_operators()`, but it was verifying the bad behavior. Oops!	2025-09-18 16:07:38 -07:00
Michael Bolin	8595237505	fix: ensure cwd for conversation and sandbox are separate concerns (#3874 ) Previous to this PR, both of these functions take a single `cwd`: `71038381aa/codex-rs/core/src/seatbelt.rs (L19-L25)` `71038381aa/codex-rs/core/src/landlock.rs (L16-L23)` whereas `cwd` and `sandbox_cwd` should be set independently (fixed in this PR). Added `sandbox_distinguishes_command_and_policy_cwds()` to `codex-rs/exec/tests/suite/sandbox.rs` to verify this.	2025-09-18 14:37:06 -07:00
dedrisian-oai	62258df92f	feat: /review (#3774) Adds `/review` action in TUI	2025-09-18 14:14:16 -07:00
Jeremy Rose	b34e906396	Reland "refactor transcript view to handle HistoryCells" (#3753 ) Reland of #3538	2025-09-18 20:55:53 +00:00
Jeremy Rose	71038381aa	fix error on missing notifications in [tui] (#3867 ) Fixes #3811.	2025-09-18 11:25:09 -07:00
jif-oai	277fc6254e	chore: use tokio mutex and async function to prevent blocking a worker (#3850) ### Why Use `tokio::sync::Mutex` A `std::sync::Mutex` is not _async-aware_: waiting on it blocks the entire thread instead of just yielding the task. It can also be poisoned, which is not the case for the `tokio` Mutex. Using `tokio::sync::Mutex` allows the Tokio runtime to continue running other tasks while waiting for the lock, preventing deadlocks and performance bottlenecks. In general, this is preferred in async environments.	2025-09-18 18:21:52 +01:00
jif-oai	992b531180	fix: some nit Rust reference issues (#3849 ) Fix some small references issue. No behavioural change. Just making the code cleaner	2025-09-18 18:18:06 +01:00
jif-oai	4a5d6f7c71	Make ESC button work when auto-compaction (#3857 ) Only emit a task finished when the compaction comes from a `/compact`	2025-09-18 15:34:16 +00:00
pakrym-oai	d4aba772cb	Switch to uuid_v7 and tighten ConversationId usage (#3819 ) Make sure conversations have a timestamp.	2025-09-18 14:37:03 +00:00
jif-oai	4c97eeb32a	bug: Ignore tests for now (#3777 ) Ignore flaky / long tests for now	2025-09-18 10:43:45 +01:00
Thibault Sottiaux	c9505488a1	chore: update "Codex CLI harness, sandboxing, and approvals" section (#3822 )	2025-09-17 16:48:20 -07:00
dedrisian-oai	72733e34c4	Add dev message upon review out (#3758 ) Proposal: We want to record a dev message like so: ``` { "type": "message", "role": "user", "content": [ { "type": "input_text", "text": "<user_action> <context>User initiated a review task. Here's the full review output from reviewer model. User may select one or more comments to resolve.</context> <action>review</action> <results> {findings_str} </results> </user_action>" } ] }, ``` Without showing in the chat transcript. Rough idea, but it fixes issue where the user finishes a review thread, and asks the parent "fix the rest of the review issues" thinking that the parent knows about it. ### Question: Why not a tool call? Because the agent didn't make the call, it was a human. + we haven't implemented sub-agents yet, and we'll need to think about the way we represent these human-led tool calls for the agent.	2025-09-16 18:43:32 -07:00
dedrisian-oai	7fe4021f95	Review mode core updates (#3701 ) 1. Adds the environment prompt (including cwd) to review thread 2. Prepends the review prompt as a user message (temporary fix so the instructions are not replaced on backend) 3. Sets reasoning to low 4. Sets default review model to `gpt-5-codex`	2025-09-16 13:36:51 -07:00
Dylan	11285655c4	fix: Record EnvironmentContext in SendUserTurn (#3678 ) ## Summary SendUserTurn has not been correctly handling updates to policies. While the tui protocol handles this in `Op::OverrideTurnContext`, the SendUserTurn should be appending `EnvironmentContext` messages when the sandbox settings change. MCP client behavior should match the cli behavior, so we update `SendUserTurn` message to match. ## Testing - [x] Added prompt caching tests	2025-09-16 11:32:20 -07:00
Ahmed Ibrahim	244687303b	Persist search items (#3745 ) Let's record the search items because they are part of the history.	2025-09-16 18:02:15 +00:00
Dylan	a8026d3846	fix: read-only escalations (#3673) ## Summary Splitting out this smaller fix from #2694 - fixes the sandbox permissions so Chat / read-only mode tool definition matches expectations ## Testing - [x] Tested locally	2025-09-15 19:01:10 -07:00
dependabot[bot]	404c126fc3	chore(deps): bump wildmatch from 2.4.0 to 2.5.0 in /codex-rs (#3619) Bumps [wildmatch](https://github.com/becheran/wildmatch) from 2.4.0 to 2.5.0. Upstream commits in this release include "fix: Fix unicode matching for non-ASCII characters".	2025-09-15 12:57:17 -07:00
Jeremy Rose	0560079c41	notifications on approvals and turn end (#3329 ) uses OSC 9 to notify when a turn ends or approval is required. won't work in vs code or terminal.app but iterm2/kitty/wezterm supports it :)	2025-09-15 10:22:02 -07:00
ae	5c583fe89b	feat: tweak onboarding strings (#3650 )	2025-09-15 08:49:37 -07:00
pakrym-oai	b1c291e2bb	Add file reference guidelines to gpt-5 prompt (#3651 )	2025-09-15 08:35:30 -07:00
Michael Bolin	f037b2fd56	chore: rename (#3648 )	2025-09-15 08:17:13 -07:00
Thibault Sottiaux	d60cbed691	fix: add references (#3633 )	2025-09-15 07:48:22 -07:00
jimmyfraiture2	d555b68469	fix: race condition unified exec (#3644 ) Fix race condition without storing an rx in the session	2025-09-15 06:52:39 -07:00
Thibault Sottiaux	6039f8a126	feat: tighten preset filter, tame storage load logs, enable rollout prompt by default (#3628 ) Summary - common: use exact equality for Swiftfox exclusion to avoid hiding future slugs that merely contain the substring - core: treat missing internal_storage.json as expected (debug), warn only on real IO/parse errors - tui: drop DEBUG_HIGH gate; always consider showing rollout prompt, but suppress under ApiKey auth mode	2025-09-14 23:05:41 -07:00
Ahmed Ibrahim	50262a44ce	Show abort in the resume (#3629 ) Show abort error when resuming a session	2025-09-15 05:24:30 +00:00
easong-openai	6a8e743d57	initial mcp add interface (#3543 ) Adds `codex mcp add`, `codex mcp list`, `codex mcp remove`. Currently writes to global config.	2025-09-15 04:30:56 +00:00
Thibault Sottiaux	a797051921	chore: update swiftfox_prompt.md (#3624 )	2025-09-15 04:10:35 +00:00
Ahmed Ibrahim	26f1246a89	Revert "refactor transcript view to handle HistoryCells" (#3614 ) Reverts openai/codex#3538 It panics on forking first message. It also calculates the index in a wrong way.	2025-09-15 03:39:36 +00:00
Eric Traut	900bb01486	When logging in using ChatGPT, make sure to overwrite API key (#3611 ) When logging in using ChatGPT using the `codex login` command, a successful login should write a new `auth.json` file with the ChatGPT token information. The old code attempted to retain the API key and merge the token information into the existing `auth.json` file. With the new simplified login mechanism, `auth.json` should have auth information for only ChatGPT or API Key, not both. The `codex login --api-key <key>` code path was already doing the right thing here, but the `codex login` command was incorrect. This PR fixes the problem and adds test cases for both commands.	2025-09-14 19:48:18 -07:00
Dylan	b6673838e8	fix: model family and apply_patch consistency (#3603 ) ## Summary Resolves a merge conflict between #3597 and #3560, and adds tests to double check our apply_patch configuration. ## Testing - [x] Added unit tests --------- Co-authored-by: dedrisian-oai <dedrisian@openai.com>	2025-09-14 18:20:37 -07:00
Fouad Matin	5185d69f13	fix(core): flaky test `completed_commands_do_not_persist_sessions` (#3596 ) Fix flaky test: ``` FAIL [ 2.641s] codex-core unified_exec::tests::completed_commands_do_not_persist_sessions stdout ─── running 1 test test unified_exec::tests::completed_commands_do_not_persist_sessions ... FAILED failures: failures: unified_exec::tests::completed_commands_do_not_persist_sessions test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 235 filtered out; finished in 2.63s stderr ─── thread 'unified_exec::tests::completed_commands_do_not_persist_sessions' panicked at core/src/unified_exec/mod.rs:582:9: assertion failed: result.output.contains("codex") ```	2025-09-14 18:04:05 -07:00
dedrisian-oai	2aa84b8891	Fix EventMsg Optional (#3604 )	2025-09-15 00:34:33 +00:00
pakrym-oai	9177bdae5e	Only one branch for swiftfox (#3601 ) Make each model family have a single branch.	2025-09-14 16:56:22 -07:00
Ahmed Ibrahim	a30e5e40ee	enable-resume (#3537) Adds the ability to resume conversations. We have one verb, `resume`. Behavior in `tui`: `codex resume` opens the session picker; `codex resume --last` continues the last message; `codex resume <session id>` continues the conversation with `session id`. Behavior in `exec`: `codex resume --last` continues the last conversation; `codex resume <session id>` continues the conversation with `session id`. Implementation: - Added a function to find the path in `~/.codex/sessions/` with a `UUID`. This is helpful when resuming with a session id. - Added the flags mentioned above - Added lots of tests	2025-09-14 19:33:19 -04:00
dedrisian-oai	b2f6fc3b9a	Fix flaky windows test (#3564) There are exactly 4 types of flaky tests in Windows x86 right now: 1. `review_input_isolated_from_parent_history` => Times out waiting for closing events 2. `review_does_not_emit_agent_message_on_structured_output` => Times out waiting for closing events 3. `auto_compact_runs_after_token_limit_hit` => Times out waiting for closing events 4. `auto_compact_runs_after_token_limit_hit` => Also has a problem where auto compact should add a third request, but receives 4 requests. Items 1, 2, and 3 seem to be solved by increasing threads on the windows runner from 2 -> 4. Don't know yet why item 4 is happening, but probably also because of WireMock issues on windows causing races.	2025-09-14 23:20:25 +00:00
pakrym-oai	51f88fd04a	Fix swiftfox model selector (#3598 ) The model shouldn't be saved with a suffix. The effort is a separate field.	2025-09-14 23:12:21 +00:00
pakrym-oai	916fdc2a37	Add per-model-family prompts (#3597 ) Allows more flexibility in defining prompts.	2025-09-14 22:45:15 +00:00
pakrym-oai	863d9c237e	Include command output when sending timeout to model (#3576 ) Being able to see the output helps the model decide how to handle the timeout.	2025-09-14 14:38:26 -07:00
Ahmed Ibrahim	bbea6bbf7e	Handle resuming/forking after compact (#3533) We need to construct the history differently when a compact happens. For this, we only consider the history after the compact and convert the compact to a response item. This needs to change to use `build_compact_history` once #3446 is merged.	2025-09-14 13:23:31 +00:00
Jeremy Rose	4891ee29c5	refactor transcript view to handle HistoryCells (#3538 ) No (intended) functional change. This refactors the transcript view to hold a list of HistoryCells instead of a list of Lines. This simplifies and makes much of the logic more robust, as well as laying the groundwork for future changes, e.g. live-updating history cells in the transcript. Similar to #2879 in goal. Fixes #2755.	2025-09-13 19:23:14 -07:00
Thibault Sottiaux	bac8a427f3	chore: default swiftfox models to experimental reasoning summaries (#3560 )	2025-09-13 23:40:54 +00:00
Thibault Sottiaux	14ab1063a7	chore: rename	2025-09-12 23:17:41 -07:00
Thibault Sottiaux	19b4ed3c96	w	2025-09-12 22:44:05 -07:00
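
Commit 0560079c41 above mentions using OSC 9 to notify when a turn ends or an approval is required. A minimal sketch of what such a notification could look like on the wire, assuming the commonly used `ESC ] 9 ; <message> BEL` form. The function name `osc9_notification` is hypothetical and is not taken from the repository:

```rust
/// Builds an OSC 9 desktop-notification escape sequence.
/// Hypothetical helper for illustration only; not the repository's code.
fn osc9_notification(message: &str) -> String {
    // Drop control characters so the message cannot end the sequence early.
    let safe: String = message.chars().filter(|c| !c.is_control()).collect();
    // Layout: ESC ] 9 ; <message> BEL
    format!("\x1b]9;{}\x07", safe)
}
```

Writing the returned string to stdout (for example with `print!`) and flushing asks a supporting terminal to show the notification. As the commit message notes, support varies by terminal.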


447 Commits
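
Commit d4aba772cb in the list switches conversation ids to UUIDv7 so that conversations carry a timestamp. A sketch of why that works, assuming the standard UUIDv7 layout in which the first 48 bits hold Unix time in milliseconds. The function name `uuid_v7_timestamp_ms` is hypothetical and does not reflect how `ConversationId` is implemented:

```rust
/// Extracts the millisecond Unix timestamp from a UUIDv7 string.
/// Hypothetical helper for illustration only; returns None for
/// malformed input or for UUIDs that are not version 7.
fn uuid_v7_timestamp_ms(uuid: &str) -> Option<u64> {
    // Remove the hyphens from the canonical 8-4-4-4-12 form.
    let hex: String = uuid.chars().filter(|c| *c != '-').collect();
    if hex.len() != 32 || !hex.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    // The version nibble is the 13th hex digit.
    if hex.as_bytes()[12] != b'7' {
        return None;
    }
    // The first 12 hex digits (48 bits) are the timestamp.
    u64::from_str_radix(&hex[..12], 16).ok()
}
```

Because the timestamp occupies the leading bits, ids generated later also sort later when compared as strings, at millisecond granularity.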