feat: context compaction (#3446)

## Compact feature

1. Stop the model when the context window becomes too large.
2. Add a user turn asking the model to summarize.
3. Build a bridge, rendered from a template, that contains all the previous user messages plus the summary.
4. Start sampling again from a clean conversation containing only that bridge.
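
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the `Message` and `Role` types, the `estimate_tokens` heuristic, the `compact` and `render_bridge` functions, the shortened template strings, and the choice to send the bridge as a user message are all assumptions made for the example.

```rust
// Hypothetical types for illustration; not the actual codex-rs definitions.
#[derive(Debug, Clone, PartialEq)]
enum Role {
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    text: String,
}

// Shortened stand-ins for templates/compact/prompt.md and history_bridge.md.
const COMPACT_PROMPT: &str =
    "Please stop coding and instead write a short memento message for the next agent.";
const BRIDGE_TEMPLATE: &str = "Here were the user messages:\n\n{{ user_messages_text }}\n\n\
Here is the summary produced by the other language model:\n\n{{ summary_text }}";

/// Crude size estimate (assumption: about four bytes per token).
fn estimate_tokens(history: &[Message]) -> usize {
    history.iter().map(|m| m.text.len() / 4 + 1).sum()
}

/// Fill the two placeholders in a single pass, so placeholder-like text
/// inside a user message is never substituted a second time.
fn render_bridge(user_messages: &[&str], summary: &str) -> String {
    let (head, rest) = BRIDGE_TEMPLATE
        .split_once("{{ user_messages_text }}")
        .expect("template has a user_messages_text placeholder");
    let (mid, tail) = rest
        .split_once("{{ summary_text }}")
        .expect("template has a summary_text placeholder");
    format!("{}{}{}{}{}", head, user_messages.join("\n\n"), mid, summary, tail)
}

/// Returns the history unchanged while it fits within `limit`; otherwise
/// returns a fresh conversation that contains only the bridge message.
/// `sample` stands in for a call to the model.
fn compact<F>(history: &[Message], limit: usize, mut sample: F) -> Vec<Message>
where
    F: FnMut(&[Message]) -> String,
{
    // Step 1: only act once the context has grown too large.
    if estimate_tokens(history) <= limit {
        return history.to_vec();
    }

    // Step 2: add a user turn asking the model to summarize.
    let mut with_prompt = history.to_vec();
    with_prompt.push(Message {
        role: Role::User,
        text: COMPACT_PROMPT.to_string(),
    });
    let summary = sample(&with_prompt);

    // Step 3: build the bridge from the earlier user messages and the summary.
    let user_messages: Vec<&str> = history
        .iter()
        .filter(|m| m.role == Role::User)
        .map(|m| m.text.as_str())
        .collect();
    let bridge = render_bridge(&user_messages, &summary);

    // Step 4: restart from a clean conversation holding only the bridge.
    vec![Message {
        role: Role::User,
        text: bridge,
    }]
}
```

In this sketch the summarization prompt itself is left out of the bridge, because the list of user messages is collected from the history as it stood before the prompt was appended.
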
2025-09-12 13:07:10 -07:00
parent d4848e558b
commit ea225df22e
14 changed files with 1243 additions and 326 deletions
--- a/codex-rs/core/templates/compact/history_bridge.md
+++ b/codex-rs/core/templates/compact/history_bridge.md
@@ -0,0 +1,7 @@
+You were originally given instructions from a user over one or more turns. Here were the user messages:
+
+{{ user_messages_text }}
+
+Another language model started to solve this problem and produced a summary of its thinking process. You also have access to the state of the tools that were used by that language model. Use this to build on the work that has already been done and avoid duplicating work. Here is the summary produced by the other language model, use the information in this summary to assist with your own analysis:
+
+{{ summary_text }}
--- a/codex-rs/core/templates/compact/prompt.md
+++ b/codex-rs/core/templates/compact/prompt.md
@@ -0,0 +1,5 @@
+You have exceeded the maximum number of tokens, please stop coding and instead write a short memento message for the next agent. Your note should:
+- Summarize what you finished and what still needs work. If there was a recent update_plan call, repeat its steps verbatim.
+- List outstanding TODOs with file paths / line numbers so they're easy to find.
+- Flag code that needs more tests (edge cases, performance, integration, etc.).
+- Record any open bugs, quirks, or setup steps that will make it easier for the next agent to pick up where you left off.