valknar/llmx - llmx - dev.pivoine.art

Author	SHA1	Message	Date
Michael Bolin	08ed618f72	chore: introduce ConversationManager as a clearinghouse for all conversations (#2240 ) This PR does two things because after I got deep into the first one I started pulling on the thread to the second: - Makes `ConversationManager` the place where all in-memory conversations are created and stored. Previously, `MessageProcessor` in the `codex-mcp-server` crate was doing this via its `session_map`, but this is something that should be done in `codex-core`. - It unwinds the `ctrl_c: tokio::sync::Notify` that was threaded throughout our code. I think this made sense at one time, but now that we handle Ctrl-C within the TUI and have a proper `Op::Interrupt` event, I don't think this was quite right, so I removed it. For `codex exec` and `codex proto`, we now use `tokio::signal::ctrl_c()` directly, but we no longer make `Notify` a field of `Codex` or `CodexConversation`. Changes of note: - Adds the files `conversation_manager.rs` and `codex_conversation.rs` to `codex-core`. - `Codex` and `CodexSpawnOk` are no longer exported from `codex-core`: other crates must use `CodexConversation` instead (which is created via `ConversationManager`). - `core/src/codex_wrapper.rs` has been deleted in favor of `ConversationManager`. - `ConversationManager::new_conversation()` returns `NewConversation`, which is in line with the `new_conversation` tool we want to add to the MCP server. Note `NewConversation` includes `SessionConfiguredEvent`, so we eliminate checks in cases like `codex-rs/core/tests/client.rs` to verify `SessionConfiguredEvent` is the first event because that is now internal to `ConversationManager`. - Quite a bit of code was deleted from `codex-rs/mcp-server/src/message_processor.rs` since it no longer has to manage multiple conversations itself: it goes through `ConversationManager` instead. 
- `core/tests/live_agent.rs` has been deleted because I had to update a bunch of tests and all the tests in here were ignored, and I don't think anyone ever ran them, so this was just technical debt at this point. - Removed `notify_on_sigint()` from `util.rs` (and in a follow-up, I hope to refactor the blandly-named `util.rs` into more descriptive files). - In general, I started renaming local variables from `codex` to `conversation`, where appropriate, though admittedly I didn't do it through all the integration tests because that would have added a lot of noise to this PR. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2240). * #2264 * #2263 * __->__ #2240	2025-08-13 13:38:18 -07:00
ae	30ee24521b	fix: remove behavioral prompting from update_plan tool def (#2261 ) - Moved some of the content to the main prompt.	2025-08-13 19:05:13 +00:00
easong-openai	6340acd885	Re-add markdown streaming (#2029 ) Wait for newlines, then render markdown on a line by line basis. Word wrap it for the current terminal size and then spit it out line by line into the UI. Also adds tests and fixes some UI regressions.	2025-08-12 17:37:28 -07:00
Dylan	90d892f4fd	[prompt] Restore important guidance for shell command usage (#2211 ) ## Summary In #1939 we overhauled a lot of our prompt. This was largely good, but we're seeing some specific points of confusion from the model! This prompt update attempts to address 3 of them: - Enforcing the use of `ripgrep`, which is bundled as a dependency when installed with homebrew. We should do the same on node (in progress) - Explicit guidance on reading files in chunks. - Slight adjustment to networking sandbox language. `enabled` / `restricted` is anecdotally less confusing to the model and requires less reasoning to escalate for approval. We are going to continue iterating on shell usage and tools, but this restores us to best practices for current model snapshots. ## Testing - [x] evals - [x] local testing	2025-08-12 10:19:07 -07:00
pakrym-oai	cb78f2333e	Set user-agent (#2230 ) Use the same well-defined value in all cases when sending user-agent header	2025-08-12 16:40:04 +00:00
Michael Bolin	596a9d6a96	fix: take ExecToolCallOutput by value to avoid clone() (#2197 ) Since the output could be a large string, it seemed like a win to avoid the `clone()` in the common case.	2025-08-12 08:59:35 -07:00
dependabot[bot]	8d2c5d0d98	chore(deps): bump toml from 0.9.4 to 0.9.5 in /codex-rs (#2157 ) Bumps [toml](https://github.com/toml-rs/toml) from 0.9.4 to 0.9.5. Commits: - `bd21148` chore: Release - `ff1cb9a` docs: Update changelog - `39dd8b6` fix(parser): Improve bad quote error messages (toml-rs/toml#1014) - `137338e` chore(deps): Update Rust crate serde_json to v1.0.142 (toml-rs/toml#1022) - `d5b8c8a` fix(parser): Improve missing-open-quote errors - `ce91354` fix(parser): Don't treat trailing quotes as separate items - `8f424ed` fix(parser): Conjoin more values in unquoted string errors - `2b9a81a` fix(parser): Reduce float false positives - `f653841` fix(parser): Reduce float/bool false positives - `f4864ef` test(parser): Add case for missing start quote - See full diff in [compare view](https://github.com/toml-rs/toml/compare/toml-v0.9.4...toml-v0.9.5) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-11 17:13:37 -07:00
Dylan	d33793d31d	[prompts] integration test prompt caching (#2189 ) ## Summary Our current approach to prompt caching is fragile! The current approach works, but we are planning to update to a more resilient system (storing them in the rollout file). Let's start adding some integration tests to ensure stability while we migrate it. ## Testing - [x] These are the tests 😎	2025-08-11 17:03:13 -07:00
pakrym-oai	6a6bf99e2c	Send prompt_cache_key (#2200 ) To optimize prompt caching performance.	2025-08-11 16:37:45 -07:00
pakrym-oai	0cf57e1f42	Include output truncation message in tool call results (#2183 ) To avoid model being confused about incomplete output.	2025-08-11 11:52:05 -07:00
Gabriel Peal	7f6408720b	[1/3] Parse exec commands and format them more nicely in the UI (#2095 ) # Note for reviewers The bulk of this PR is in the new file, `parse_command.rs`. This file is designed to be written TDD and implemented with Codex. Do not worry about reviewing the code, just review the unit tests (if you want). If any cases are missing, we'll add more tests and have Codex fix them. I think the best approach will be to land and iterate. I have some follow-ups I want to do after this lands. The next PR after this will let us merge (and dedupe) multiple sequential cells of the same kind, such as multiple read commands. The deduping will also be important because the model often reads the same file multiple times in a row in chunks. === This PR formats common commands like reading, formatting, testing, etc. more nicely: it tries to extract things like file names and tests, and falls back to the cmd if it can't. It also only shows stdout/stderr if the command failed. Part 2: https://github.com/openai/codex/pull/2097 Part 3: https://github.com/openai/codex/pull/2110	2025-08-11 14:26:15 -04:00
pakrym-oai	0aa7efe05b	Trace RAW sse events (#2056 ) For easier parsing.	2025-08-11 10:35:03 -07:00
dependabot[bot]	c61911524d	chore(deps): bump tokio-util from 0.7.15 to 0.7.16 in /codex-rs (#2155 ) Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.15 to 0.7.16. Commits: - `cf6b50a` chore: prepare tokio-util v0.7.16 (tokio-rs/tokio#7507) - `416e36b` task: stabilise `JoinMap` (tokio-rs/tokio#7075) - `9741c90` sync: document cancel safety on `SetOnce::wait` (tokio-rs/tokio#7506) - `4e3f17b` codec: also apply capacity to read buffer in `Framed::with_capacity` (tokio-rs/tokio#7500) - `86cbf81` Merge 'tokio-1.47.1' into 'master' - `be8ee45` chore: prepare Tokio v1.47.1 (tokio-rs/tokio#7504) - `d9b1916` Merge 'tokio-1.43.2' into 'tokio-1.47.x' (tokio-rs/tokio#7503) - `db8edc6` chore: prepare Tokio v1.43.2 (tokio-rs/tokio#7502) - `e47565b` blocking: clarify that spawn_blocking is aborted if not yet started (tokio-rs/tokio#7501) - `4730984` readme: add 1.47 as LTS release (tokio-rs/tokio#7497) - Additional commits viewable in [compare view](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.15...tokio-util-0.7.16) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-11 09:08:21 -07:00
Yaroslav	f146981b73	feat: add JSON schema sanitization for MCP tools to ensure compatibility with internal JsonSchema enum (#1975 ) Closes: #1973 Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>	2025-08-10 17:57:39 -07:00
Dylan	0091930f5a	[core] Allow resume after client errors (#2053 ) ## Summary Allow tui conversations to resume after the client fails out of retries. I tested this with exec / mocked api failures as well, and it appears to be fine. But happy to add an exec integration test as well! ## Testing - [x] Added integration test - [x] Tested locally	2025-08-08 18:21:19 -07:00
aibrahim-oai	6cfee15612	Moving the compact prompt near where it's used (#2031 ) - Moved the prompt for compact to core - Renamed it to be more clear	2025-08-08 12:43:43 -07:00
pakrym-oai	307d9957fa	Fix usage limit banner grammar (#2018 ) ## Summary - fix typo in usage limit banner text - update error message tests ## Testing - `just fmt` - `RUSTC_BOOTSTRAP=1 just fix` (fails: `let` expressions in this position are unstable) - `RUSTC_BOOTSTRAP=1 cargo test --all-features` (fails: `let` expressions in this position are unstable) ------ https://chatgpt.com/codex/tasks/task_i_689610fc1fe4832081bdd1118779b60b	2025-08-08 08:50:44 -07:00
pakrym-oai	431c9299d4	Remove part of the error message (#1983 )	2025-08-08 02:01:53 +00:00
easong-openai	52e12f2b6c	Revert "Streaming markdown (#1920 )" (#1981 ) This reverts commit `2b7139859e`.	2025-08-08 01:38:39 +00:00
easong-openai	2b7139859e	Streaming markdown (#1920 ) We wait until we have an entire line, then format it with markdown and stream it into the UI. This increases time to first token but is the right thing to do with our current rendering model IMO. Also lets us add word wrapping!	2025-08-07 18:26:47 -07:00
pakrym-oai	fa0051190b	Adjust error messages (#1969 )	2025-08-07 18:24:34 -07:00
Michael Bolin	295abf3e51	chore: change CodexAuth::from_api_key() to take &str instead of String (#1970 ) Good practice and simplifies some of the call sites. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1970). * #1971 * __->__ #1970 * #1966 * #1965 * #1962	2025-08-07 16:55:33 -07:00
Michael Bolin	b991c04f86	chore: move top-level load_auth() to CodexAuth::from_codex_home() (#1966 ) There are two valid ways to create an instance of `CodexAuth`: `from_api_key()` and `from_codex_home()`. Now both are static methods of `CodexAuth` and are listed first in the implementation. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1966). * #1971 * #1970 * __->__ #1966 * #1965 * #1962	2025-08-07 16:49:37 -07:00
Michael Bolin	db76f32888	chore: rename CodexAuth::new() to create_dummy_codex_auth_for_testing() because it is not for general consumption (#1962 ) `CodexAuth::new()` was the first method listed in `CodexAuth`, but it is only meant to be used by tests. Rename it to `create_dummy_chatgpt_auth_for_testing()` and move it to the end of the implementation. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1962). * #1971 * #1970 * #1966 * #1965 * __->__ #1962	2025-08-07 16:33:29 -07:00
Dylan	548466df09	[client] Tune retries and backoff (#1956 ) ## Summary 10 is a bit excessive 😅 Also updates our backoff factor to space out requests further.	2025-08-07 15:23:31 -07:00
Michael Bolin	7d67159587	fix: public load_auth() fn always called with include_env_var=true (#1961 ) Apparently `include_env_var=false` was only used for testing, so clean up the API a little to make that clear. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1961). * #1962 * __->__ #1961	2025-08-07 14:19:30 -07:00
pakrym-oai	f23c3066c8	Add capacity error (#1947 )	2025-08-07 10:46:43 -07:00
pakrym-oai	a593b1c3ab	Use different field for error type (#1945 )	2025-08-07 10:20:33 -07:00
Michael Bolin	107d2ce4e7	fix: change OPENAI_DEFAULT_MODEL to "gpt-5" (#1943 )	2025-08-07 10:13:13 -07:00
pakrym-oai	62ed5907f9	Better usage errors (#1941 )	2025-08-07 09:46:13 -07:00
Dylan	bc28b87c7b	[config] Onboarding flow with persistence (#1929 ) ## Summary In collaboration with @gpeal: upgrade the onboarding flow, and persist user settings. --------- Co-authored-by: Gabriel Peal <gabriel@openai.com>	2025-08-07 09:27:38 -07:00
pakrym-oai	7e9ecfbc6a	Rename the model (#1942 )	2025-08-07 09:07:51 -07:00
ae	81b148bda2	feat: update system prompt (#1939 )	2025-08-07 04:29:50 -07:00
Michael Bolin	c2c327c723	feat: change shell_environment_policy to default to inherit="all" (#1904 ) Trying to use `core` as the default has been "too clever." Users can always take responsibility for controlling the env without this setting at all by specifying the `env` they use when calling `codex` in the first place. See https://github.com/openai/codex/issues/1249.	2025-08-07 01:55:41 -07:00
Michael Bolin	13982d6b4e	chore: fix outstanding review comments from the bot on #1919 (#1928 ) I should have read the comments before submitting!	2025-08-07 01:30:13 -07:00
ae	0334476894	feat: parse info from auth.json and show in /status (#1923 ) - `/status` renders ``` signed in with chatgpt login: example@example.com plan: plus ``` - Setup for using this info in a few more places. --------- Co-authored-by: Michael Bolin <mbolin@openai.com>	2025-08-07 01:27:45 -07:00
ae	28395df957	[fix] fix absolute and % token counts (#1931 ) - For absolute, use non-cached input + output. - For estimating what % of the model's context window is used, we need to account for reasoning output tokens from prior turns being dropped from the context window. We approximate this here by subtracting reasoning output tokens from the total. This will be off for the current turn and pending function calls. We can improve it later.	2025-08-07 08:13:36 +00:00
Ed Bayes	eb80614a7c	Tint chat composer background (#1921 ) ## Summary - give the chat composer a subtle custom background and apply it across the full area drawn - update turn interrupted to be more human readable ## Testing - `cargo test --all-features` (fails: `let` expressions in `core/src/client.rs` require newer rustc) - `just fix` (fails: `let` expressions in `core/src/client.rs` require newer rustc) ------ https://chatgpt.com/codex/tasks/task_i_68941f32c1008322bbcc39ee1d29a526	2025-08-07 00:46:45 -07:00
Michael Bolin	cd5f9074af	feat: add /tmp by default (#1919 ) Replaces the `include_default_writable_roots` option on `sandbox_workspace_write` (that defaulted to `true`, which was slightly weird/annoying) with `exclude_tmpdir_env_var`, which defaults to `false`. Perhaps more importantly, `/tmp` is now enabled by default as part of `sandbox_mode = "workspace-write"`, though `exclude_slash_tmp = true` can be used to disable this.	2025-08-07 00:17:00 -07:00
aibrahim-oai	f15e0fe1df	Ensure exec command end always emitted (#1908 ) ## Summary - defer ExecCommandEnd emission until after sandbox resolution - make sandbox error handler return final exec output and response - align sandbox error stderr with response content and rename to `final_output` - replace unstable `let` chains in client command header logic ## Testing - `just fmt` - `just fix` - `cargo test --all-features` (fails: NotPresent in core/tests/client.rs) ------ https://chatgpt.com/codex/tasks/task_i_6893e63b0c408321a8e1ff2a052c4c51	2025-08-07 06:25:56 +00:00
Gabriel Peal	8a990b5401	Migrate GitWarning to OnboardingScreen (#1915 ) This paves the way to do per-directory approval settings (https://github.com/openai/codex/pull/1912). This also lets us pass a Config/ChatWidgetArgs into onboarding, which can then mutate it and emit the ChatWidgetArgs it wants at the end, possibly modified by those approval settings.	2025-08-06 22:39:07 -04:00
pakrym-oai	57c973b571	Add 2025-08-06 model family (#1899 )	2025-08-06 23:14:02 +00:00
Gabriel Peal	2d5de795aa	First pass at a TUI onboarding (#1876 ) This sets up the scaffolding and basic flow for a TUI onboarding experience. It covers sign in with ChatGPT, env auth, as well as some safety guidance. Next up: 1. Replace the git warning screen 2. Use this to configure default approval/sandbox modes Note the shimmer flashes are from me slicing the video, not jank. https://github.com/user-attachments/assets/0fbe3479-fdde-41f3-87fb-a7a83ab895b8	2025-08-06 18:22:14 -04:00
pakrym-oai	8262ba58b2	Prefer env var auth over default codex auth (#1861 ) ## Summary - Prioritize provider-specific API keys over default Codex auth when building requests - Add test to ensure provider env var auth overrides default auth ## Testing - `just fmt` - `just fix` (fails: `let` expressions in this position are unstable) - `cargo test --all-features` (fails: `let` expressions in this position are unstable) ------ https://chatgpt.com/codex/tasks/task_i_68926a104f7483208f2c8fd36763e0e3	2025-08-06 13:02:00 -07:00
Michael Bolin	64f2f2eca2	fix: support $CODEX_HOME/AGENTS.md instead of $CODEX_HOME/instructions.md (#1891 ) The docs and code do not match. It turns out the docs are "right" in that they are what we have been meaning to support, so this PR updates the code: `ae88b69b09/README.md (L298-L302)` Support for `instructions.md` is a holdover from the TypeScript CLI, so we are just going to drop support for it altogether rather than maintain it in perpetuity.	2025-08-06 11:48:03 -07:00
Dylan	dc468d563f	[env] Remove git config for now (#1884 ) ## Summary Forgot to remove this in #1869 last night! Too much of a performance hit on the main thread. We can bring it back via an async thread on startup.	2025-08-06 08:05:17 -07:00
Dylan	3e8bcf0247	[prompts] Add <environment_context> (#1869 ) ## Summary Includes a new user message in the api payload which provides useful environment context for the model, so it knows about things like the current working directory and the sandbox. ## Testing Updated unit tests	2025-08-06 01:13:31 -07:00
easong-openai	f8d70d67b6	Add OSS model info (#1860 ) Add somewhat arbitrarily chosen context window/output limit.	2025-08-05 22:35:00 -07:00
Dylan	725dd6be6a	[approval_policy] Add OnRequest approval_policy (#1865 ) ## Summary A split-up PR of #1763 , stacked on top of a tools refactor #1858 to make the change clearer. From the previous summary: > Let's try something new: tell the model about the sandbox, and let it decide when it will need to break the sandbox. Some local testing suggests that it works pretty well with zero iteration on the prompt! ## Testing - [x] Added unit tests - [x] Tested locally and it appears to work smoothly!	2025-08-05 20:44:20 -07:00
Dylan	aff97ed7dd	[core] Separate tools config from openai client (#1858 ) ## Summary In an effort to make tools easier to work with and more configurable, I'm introducing `ToolConfig` and updating `Prompt` to take in a general list of Tools. I think this is simpler and better for a few reasons: - We can easily assemble tools from various sources (our own harness, mcp servers, etc.) and we can consolidate the logic for constructing them in one place that is separate from serialization. - client.rs no longer needs arbitrary config values; it just takes in a list of tools to serialize. A hefty portion of the PR is now updating our conversion of `mcp_types::Tool` to `OpenAITool`, but considering that @bolinfest accurately called this out as a TODO long ago, I think it's time we tackled it. ## Testing - [x] Experimented locally, no changes, as expected - [x] Added additional unit tests - [x] Responded to rust-review	2025-08-05 19:27:52 -07:00
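
As an illustration of the `ConversationManager` change in 08ed618f72 (#2240), here is a minimal Rust sketch of a single owner that creates and stores all in-memory conversations. It is not the code from `codex-core`: `Conversation`, `SessionConfigured`, the `u64` ids, the synchronous `Mutex`, and `get_conversation()` are simplifying assumptions. Only the names `ConversationManager`, `NewConversation`, and `new_conversation()` come from the commit message.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Stand-in for `CodexConversation` (assumption: the real type is richer).
#[derive(Debug)]
pub struct Conversation {
    pub id: u64,
    pub model: String,
}

/// Stand-in for `SessionConfiguredEvent` (assumption: fields are made up).
#[derive(Debug, Clone, PartialEq)]
pub struct SessionConfigured {
    pub model: String,
}

/// Returned to the caller so it never has to wait for the first event itself.
pub struct NewConversation {
    pub conversation_id: u64,
    pub conversation: Arc<Conversation>,
    pub session_configured: SessionConfigured,
}

#[derive(Default)]
struct Inner {
    next_id: u64,
    conversations: HashMap<u64, Arc<Conversation>>,
}

/// Single place where conversations are created and looked up.
#[derive(Default)]
pub struct ConversationManager {
    inner: Mutex<Inner>,
}

impl ConversationManager {
    /// Creates a conversation, stores it, and hands back a shared handle.
    pub fn new_conversation(&self, model: &str) -> NewConversation {
        let mut inner = self.inner.lock().expect("manager lock poisoned");
        let id = inner.next_id;
        inner.next_id += 1;
        let conversation = Arc::new(Conversation {
            id,
            model: model.to_string(),
        });
        inner.conversations.insert(id, Arc::clone(&conversation));
        NewConversation {
            conversation_id: id,
            conversation,
            session_configured: SessionConfigured {
                model: model.to_string(),
            },
        }
    }

    /// Hypothetical lookup by id; returns `None` for unknown conversations.
    pub fn get_conversation(&self, id: u64) -> Option<Arc<Conversation>> {
        let inner = self.inner.lock().expect("manager lock poisoned");
        inner.conversations.get(&id).cloned()
    }
}
```

Because callers only ever hold an `Arc` handed out by the manager, a server layer such as the MCP message processor would not need its own session map in this arrangement.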
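
The line-by-line approach described in 6340acd885 (#2029) and 2b7139859e (#1920) can be sketched roughly as follows: buffer incoming text deltas and release text only once a full line has arrived. `LineGate` and its methods are hypothetical names, and Markdown rendering and word wrapping are deliberately left out.

```rust
/// Collects streamed text deltas and releases only complete lines.
/// (Hypothetical helper, not a type from the codebase.)
#[derive(Default)]
pub struct LineGate {
    pending: String,
}

impl LineGate {
    /// Appends a delta and returns every line it completed,
    /// without the trailing newline.
    pub fn push(&mut self, delta: &str) -> Vec<String> {
        self.pending.push_str(delta);
        let mut complete = Vec::new();
        while let Some(idx) = self.pending.find('\n') {
            // Take everything up to and including the newline...
            let mut line: String = self.pending.drain(..=idx).collect();
            // ...then drop the newline itself.
            line.pop();
            complete.push(line);
        }
        complete
    }

    /// Returns any unterminated remainder once the stream has ended.
    pub fn finish(&mut self) -> Option<String> {
        if self.pending.is_empty() {
            None
        } else {
            Some(std::mem::take(&mut self.pending))
        }
    }
}
```

A renderer built on this would format and wrap each returned line for the current terminal width before appending it to the UI, and call `finish()` when the response completes.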
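
To show how a retry limit and a backoff factor interact, as tuned in 548466df09 (#1956), here is a small sketch of capped exponential backoff. The commit message does not state the new values, so the function names, the integer factor, and every number used with them are assumptions for illustration only.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (1-based):
/// `base * factor^(attempt - 1)`, never exceeding `cap`.
pub fn backoff_delay(base: Duration, factor: u32, attempt: u32, cap: Duration) -> Duration {
    let exponent = attempt.saturating_sub(1);
    // Saturate instead of overflowing for large attempt numbers.
    let multiplier = factor.checked_pow(exponent).unwrap_or(u32::MAX);
    base.saturating_mul(multiplier).min(cap)
}

/// Full list of delays for `max_retries` retries.
pub fn backoff_schedule(
    base: Duration,
    factor: u32,
    max_retries: u32,
    cap: Duration,
) -> Vec<Duration> {
    (1..=max_retries)
        .map(|attempt| backoff_delay(base, factor, attempt, cap))
        .collect()
}
```

With a made-up base of 200 ms, a factor of 2, and four retries, the waits are 200 ms, 400 ms, 800 ms, and 1.6 s. Lowering the retry count shortens the worst case, while raising the factor spaces requests further apart.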
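
The two calculations described in 28395df957 (#1931) can be written out as a short sketch: the absolute count is non-cached input plus output, and the context-window estimate subtracts reasoning output tokens from the total. The `TokenUsage` struct, its field names, and the rounding and clamping are assumptions, not the actual implementation.

```rust
/// Hypothetical running token totals for a conversation.
#[derive(Clone, Copy, Debug)]
pub struct TokenUsage {
    pub input_tokens: u64,
    pub cached_input_tokens: u64,
    pub output_tokens: u64,
    pub reasoning_output_tokens: u64,
    pub total_tokens: u64,
}

impl TokenUsage {
    /// Absolute count: non-cached input plus output.
    pub fn absolute(&self) -> u64 {
        self.input_tokens.saturating_sub(self.cached_input_tokens) + self.output_tokens
    }

    /// Approximate tokens still occupying the context window:
    /// total minus reasoning output, which is dropped from prior turns.
    pub fn tokens_in_context(&self) -> u64 {
        self.total_tokens.saturating_sub(self.reasoning_output_tokens)
    }

    /// Estimated percentage of the context window in use, clamped to 0..=100.
    pub fn percent_used(&self, context_window: u64) -> u8 {
        if context_window == 0 {
            return 0;
        }
        let pct = self.tokens_in_context() as f64 / context_window as f64 * 100.0;
        pct.clamp(0.0, 100.0).round() as u8
    }
}
```

As the commit message notes, this estimate is off for the current turn and for pending function calls, since their reasoning tokens are still in the window.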


232 Commits