valknar/llmx - llmx - dev.pivoine.art

Author	SHA1	Message	Date
Dylan	e4c275d615	[apply-patch] Clean up apply-patch tool definitions (#2539) ## Summary We've experienced a bit of drift in system prompting for `apply_patch`: - As pointed out in #2030, our prettier formatting started altering prompt.md in a few ways - We introduced a separate markdown file for apply_patch instructions in #993, but currently duplicate them in the prompt.md file - We added a first-class apply_patch tool in #2303, which has yet another definition This PR starts to consolidate our logic in a few ways: - We now only use [`apply_patch_tool_instructions.md`](https://github.com/openai/codex/compare/dh--apply-patch-tool-definition?expand=1#diff-d4fffee5f85cb1975d3f66143a379e6c329de40c83ed5bf03ffd3829df985bea) for system instructions - We no longer include apply_patch system instructions if the tool is specified I'm leaving the definition in openai_tools.rs as duplicated text for now because we're going to be iterating on the first-class tool soon. ## Testing - [x] Added integration tests to verify prompt stability - [x] Tested locally with several different models (gpt-5, gpt-oss, o4-mini)	2025-08-21 20:07:41 -07:00
ae	30ee24521b	fix: remove behavioral prompting from update_plan tool def (#2261) - Moved some of the content to the main prompt.	2025-08-13 19:05:13 +00:00
Dylan	90d892f4fd	[prompt] Restore important guidance for shell command usage (#2211) ## Summary In #1939 we overhauled a lot of our prompt. This was largely good, but we're seeing some specific points of confusion from the model! This prompt update attempts to address 3 of them: - Enforcing the use of `ripgrep`, which is bundled as a dependency when installed with homebrew. We should do the same on node (in progress) - Explicit guidance on reading files in chunks. - Slight adjustment to networking sandbox language. `enabled` / `restricted` is anecdotally less confusing to the model and requires less reasoning to escalate for approval. We are going to continue iterating on shell usage and tools, but this restores us to best practices for current model snapshots. ## Testing - [x] evals - [x] local testing	2025-08-12 10:19:07 -07:00
ae	81b148bda2	feat: update system prompt (#1939)	2025-08-07 04:29:50 -07:00
Dylan	d31e149cb1	[prompt] Update prompt.md (#1839) ## Summary Additional clarifications to our prompt. Still very concise, but we'll continue to add more here.	2025-08-05 00:43:23 -07:00
Dylan	063083af15	[prompts] Better user_instructions handling (#1836) ## Summary Our recent change in #1737 can sometimes lead to the model confusing AGENTS.md context as part of the message. But a little prompting and formatting can help fix this! ## Testing - Ran locally with a few different prompts to verify the model behaves well. - Updated unit tests	2025-08-04 18:55:57 -07:00
easong-openai	a6139aa003	Update prompt.md (#1819) The existing prompt is really bad. As a low-hanging fruit, let's correct the apply_patch instructions - this helps smaller models successfully apply patches.	2025-08-04 10:42:39 -07:00
easong-openai	6ce0a5875b	Initial planning tool (#1753) We need to optimize the prompt, but this causes the model to use the new planning_tool.	2025-07-31 20:45:52 +00:00
Michael Bolin	31d0d7a305	feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629) As stated in `codex-rs/README.md`: Today, Codex CLI is written in TypeScript and requires Node.js 22+ to run it. For a number of users, this runtime requirement inhibits adoption: they would be better served by a standalone executable. As maintainers, we want Codex to run efficiently in a wide range of environments with minimal overhead. We also want to take advantage of operating system-specific APIs to provide better sandboxing, where possible. To that end, we are moving forward with a Rust implementation of Codex CLI contained in this folder, which has the following benefits: - The CLI compiles to small, standalone, platform-specific binaries. - Can make direct, native calls to [seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and [landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in order to support sandboxing on Linux. - No runtime garbage collection, resulting in lower memory consumption and better, more predictable performance. Currently, the Rust implementation is materially behind the TypeScript implementation in functionality, so continue to use the TypeScript implementation for the time being. We will publish native executables via GitHub Releases as soon as we feel the Rust version is usable.	2025-04-24 13:31:40 -07:00

9 Commits