valknar/llmx - llmx - dev.pivoine.art

Author	SHA1	Message	Date
Justin Lebar	18330c2362	Format large numbers in a more readable way. (#2046 ) - In the bottom line of the TUI, print the number of tokens to 3 sigfigs with an SI suffix, e.g. "1.23K". - Elsewhere where we print a number, I figure it's worthwhile to print the exact number, because e.g. it's a summary of your session. Here we print the numbers comma-separated.	2025-09-08 21:48:48 +00:00
Gabriel Peal	c8fab51372	Use ConversationId instead of raw Uuids (#3282 ) We're trying to migrate from `session_id: Uuid` to `conversation_id: ConversationId`. Not only does this give us more type safety but it unifies our terminology across Codex and with the implementation of session resuming, a conversation (which can span multiple sessions) is more appropriate. I started this impl on https://github.com/openai/codex/pull/3219 as part of getting resume working in the extension but it's big enough that it should be broken out.	2025-09-07 23:22:25 -04:00
pakrym-oai	0269096229	Move token usage/context information to session level (#3221 ) Move context information into the main loop so it can be used to interrupt the loop or start auto-compaction.	2025-09-06 15:19:23 +00:00
Ahmed Ibrahim	2b96f9f569	Dividing UserMsgs into categories to send it back to the tui (#3127 ) This PR does the following: - divides user msgs into 3 categories: plain, user instructions, and environment context - Centralizes adding user instructions and environment context to a degree - Improve the integration testing Building on top of #3123 Specifically this [comment](https://github.com/openai/codex/pull/3123#discussion_r2319885089). We need to send the user message while ignoring the User Instructions and Environment Context we attach.	2025-09-04 05:34:50 +00:00
Ahmed Ibrahim	f2036572b6	Replay EventMsgs from Response Items when resuming a session with history. (#3123 ) ### Overview This PR introduces the following changes: 1. Adds a unified mechanism to convert ResponseItem into EventMsg. 2. Ensures that when a session is initialized with initial history, a vector of EventMsg is sent along with the session configuration. This allows clients to re-render the UI accordingly. 3. Added integration testing ### Caveats This implementation does not send every EventMsg that was previously dispatched to clients. The excluded events fall into two categories: • “Arguably” rolled-out events Examples include tool calls and apply-patch calls. While these events are conceptually rolled out, we currently only roll out ResponseItems. These events are already being handled elsewhere and transformed into EventMsg before being sent. • Non-rolled-out events Certain events such as TurnDiff, Error, and TokenCount are not rolled out at all. ### Future Directions At present, resuming a session involves maintaining two states: • UI State Clients can replay most of the important UI from the provided EventMsg history. • Model State The model receives the complete session history to reconstruct its internal state. This design provides a solid foundation. If, in the future, more precise UI reconstruction is needed, we have two potential paths: 1. Introduce a third data structure that allows us to derive both ResponseItems and EventMsgs. 2. Clearly divide responsibilities: the core system ensures the integrity of the model state, while clients are responsible for reconstructing the UI.	2025-09-04 04:47:00 +00:00
Jeremy Rose	e442ecedab	rework message styling (#2877 ) https://github.com/user-attachments/assets/cf07f62b-1895-44bb-b9c3-7a12032eb371	2025-09-02 17:29:58 +00:00
Ahmed Ibrahim	9dbe7284d2	Following up on #2371 post commit feedback (#2852 ) - Introduce websearch end to complement the begin - Moves the logic of adding the sebsearch tool to create_tools_json_for_responses_api - Making it the client responsibility to toggle the tool on or off - Other misc in #2371 post commit feedback - Show the query: <img width="1392" height="151" alt="image" src="https://github.com/user-attachments/assets/8457f1a6-f851-44cf-bcca-0d4fe460ce89" />	2025-08-28 19:24:38 -07:00
dedrisian-oai	b8e8454b3f	Custom /prompts (#2696 ) Adds custom `/prompts` to `~/.codex/prompts/<command>.md`. <img width="239" height="107" alt="Screenshot 2025-08-25 at 6 22 42 PM" src="https://github.com/user-attachments/assets/fe6ebbaa-1bf6-49d3-95f9-fdc53b752679" /> --- Details: 1. Adds `Op::ListCustomPrompts` to core. 2. Returns `ListCustomPromptsResponse` with list of `CustomPrompt` (name, content). 3. TUI calls the operation on load, and populates the custom prompts (excluding prompts that collide with builtins). 4. Selecting the custom prompt automatically sends the prompt to the agent.	2025-08-29 02:16:39 +00:00
Ahmed Ibrahim	d0e06f74e2	send context window with task started (#2752 ) - Send context window with task started - Accounting for changing the model per turn	2025-08-27 00:04:21 -07:00
Reuben Narad	363636f5eb	Add web search tool (#2371 ) Adds web_search tool, enabling the model to use Responses API web_search tool. - Disabled by default, enabled by --search flag - When --search is passed, exposes web_search_request function tool to the model, which triggers user approval. When approved, the model can use the web_search tool for the remainder of the turn <img width="1033" height="294" alt="image" src="https://github.com/user-attachments/assets/62ac6563-b946-465c-ba5d-9325af28b28f" /> --------- Co-authored-by: easong-openai <easong@openai.com>	2025-08-23 22:58:56 -07:00
Ahmed Ibrahim	957d44918d	send-aggregated output (#2364 ) We want to send an aggregated output of stderr and stdout so we don't have to aggregate it stderr+stdout as we lose order sometimes. --------- Co-authored-by: Gabriel Peal <gpeal@users.noreply.github.com>	2025-08-23 16:54:31 +00:00
Ahmed Ibrahim	311ad0ce26	fork conversation from a previous message (#2575 ) This can be the underlying logic in order to start a conversation from a previous message. will need some love in the UI. Base for building this: #2588	2025-08-22 17:06:09 -07:00
Jeremy Rose	d994019f3f	tui: coalesce command output; show unabridged commands in transcript (#2590 ) https://github.com/user-attachments/assets/effec7c7-732a-4b61-a2ae-3cb297b6b19b	2025-08-22 16:32:31 -07:00
easong-openai	8ad56be06e	Parse and expose stream errors (#2540 )	2025-08-21 01:15:24 -07:00
Dylan	e7e5fe91c8	[tui] Support /mcp command (#2430 ) ## Summary Adds a `/mcp` command to list active tools. We can extend this command to allow configuration of MCP tools, but for now a simple list command will help debug if your config.toml and your tools are working as expected.	2025-08-19 09:00:31 -07:00
Michael Bolin	b581498882	fix: introduce EventMsg::TurnAborted (#2365 ) Introduces `EventMsg::TurnAborted` that should be sent in response to `Op::Interrupt`. In the MCP server, updates the handling of a `ClientRequest::InterruptConversation` request such that it sends the `Op::Interrupt` but does not respond to the request until it sees an `EventMsg::TurnAborted`.	2025-08-17 21:40:31 -07:00
Parker Thompson	a075424437	Added `allow-expect-in-tests` / `allow-unwrap-in-tests` (#2328 ) This PR: * Added the clippy.toml to configure allowable expect / unwrap usage in tests * Removed as many expect/allow lines as possible from tests * moved a bunch of allows to expects where possible Note: in integration tests, non `#[test]` helper functions are not covered by this so we had to leave a few lingering `expect(expect_used` checks around	2025-08-14 17:59:01 -07:00
easong-openai	6340acd885	Re-add markdown streaming (#2029 ) Wait for newlines, then render markdown on a line by line basis. Word wrap it for the current terminal size and then spit it out line by line into the UI. Also adds tests and fixes some UI regressions.	2025-08-12 17:37:28 -07:00
Gabriel Peal	7f6408720b	[1/3] Parse exec commands and format them more nicely in the UI (#2095 ) # Note for reviewers The bulk of this PR is in in the new file, `parse_command.rs`. This file is designed to be written TDD and implemented with Codex. Do not worry about reviewing the code, just review the unit tests (if you want). If any cases are missing, we'll add more tests and have Codex fix them. I think the best approach will be to land and iterate. I have some follow-ups I want to do after this lands. The next PR after this will let us merge (and dedupe) multiple sequential cells of the same such as multiple read commands. The deduping will also be important because the model often reads the same file multiple times in a row in chunks === This PR formats common commands like reading, formatting, testing, etc more nicely: It tries to extract things like file names, tests and falls back to the cmd if it doesn't. It also only shows stdout/err if the command failed. <img width="770" height="238" alt="CleanShot 2025-08-09 at 16 05 15" src="https://github.com/user-attachments/assets/0ead179a-8910-486b-aa3d-7d26264d751e" /> <img width="348" height="158" alt="CleanShot 2025-08-09 at 16 05 32" src="https://github.com/user-attachments/assets/4302681b-5e87-4ff3-85b4-0252c6c485a9" /> <img width="834" height="324" alt="CleanShot 2025-08-09 at 16 05 56 2" src="https://github.com/user-attachments/assets/09fb3517-7bd6-40f6-a126-4172106b700f" /> Part 2: https://github.com/openai/codex/pull/2097 Part 3: https://github.com/openai/codex/pull/2110	2025-08-11 14:26:15 -04:00
ae	28395df957	[fix] fix absolute and % token counts (#1931 ) - For absolute, use non-cached input + output. - For estimating what % of the model's context window is used, we need to account for reasoning output tokens from prior turns being dropped from the context window. We approximate this here by subtracting reasoning output tokens from the total. This will be off for the current turn and pending function calls. We can improve it later.	2025-08-07 08:13:36 +00:00
ae	d642b07fcc	[feat] add /status slash command (#1873 ) - Added a `/status` command, which will be useful when we update the home screen to print less status. - Moved `create_config_summary_entries` to common since it's used in a few places. - Noticed we inconsistently had periods in slash command descriptions and just removed them everywhere. - Noticed the diff description was overflowing so made it shorter.	2025-08-05 23:57:52 -07:00
easong-openai	e0303dbac0	Rescue chat completion changes (#1846 ) https://github.com/openai/codex/pull/1835 has some messed up history. This adds support for streaming chat completions, which is useful for ollama. We should probably take a very skeptical eye to the code introduced in this PR. --------- Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>	2025-08-05 08:56:13 +00:00
Ahmed Ibrahim	e38ce39c51	Revert to `3f13ebce10` without rewriting history. Wrong merge	2025-08-04 17:03:24 -07:00
Ahmed Ibrahim	bd171e5206	add raw reasoning	2025-08-04 16:49:42 -07:00
Michael Bolin	3f13ebce10	[codex] stop printing error message when --output-last-message is not specified (#1828 ) Previously, `codex exec` was printing `Warning: no file to write last message to` as a warning to stderr even though `--output-last-message` was not specified, which is wrong. This fixes the code and changes `handle_last_message()` so that it is only called when `last_message_path` is `Some`.	2025-08-04 15:56:32 -07:00
Gabriel Peal	1f3318c1c5	Add a TurnDiffTracker to create a unified diff for an entire turn (#1770 ) This lets us show an accumulating diff across all patches in a turn. Refer to the docs for TurnDiffTracker for implementation details. There are multiple ways this could have been done and this felt like the right tradeoff between reliability and completeness: Pros * It will pick up all changes to files that the model touched including if they prettier or another command that updates them. * It will not pick up changes made by the user or other agents to files it didn't modify. Cons * It will pick up changes that the user made to a file that the model also touched * It will not pick up changes to codegen or files that were not modified with apply_patch	2025-08-04 11:57:04 -04:00
Jeremy Rose	78a1d49fac	fix command duration display (#1806 ) we were always displaying "0ms" before. <img width="731" height="101" alt="Screenshot 2025-08-02 at 10 51 22 PM" src="https://github.com/user-attachments/assets/f56814ed-b9a4-4164-9e78-181c60ce19b7" />	2025-08-03 11:33:44 -07:00
aibrahim-oai	f20de21cb6	collabse `stdout` and `stderr` delta events into one (#1787 )	2025-08-01 14:00:19 -07:00
aibrahim-oai	bc7beddaa2	feat: stream exec stdout events (#1786 ) ## Summary - stream command stdout as `ExecCommandStdout` events - forward streamed stdout to clients and ignore in human output processor - adjust call sites for new streaming API	2025-08-01 13:04:34 -07:00
Jeremy Rose	347c81ad00	remove conversation history widget (#1727 ) this widget is no longer used.	2025-07-30 10:05:40 -07:00
Gabriel Peal	8828f6f082	Add an experimental plan tool (#1726 ) This adds a tool the model can call to update a plan. The tool doesn't actually _do_ anything but it gives clients a chance to read and render the structured plan. We will likely iterate on the prompt and tools exposed for planning over time.	2025-07-29 14:22:02 -04:00
aibrahim-oai	b4ab7c1b73	Flaky CI fix (#1647 ) Flushing before sending `TaskCompleteEvent` and ending the submission loop to avoid race conditions.	2025-07-23 15:03:26 -07:00
Michael Bolin	e16657ca45	feat: add --json flag to `codex exec` (#1603 ) This is designed to facilitate programmatic use of Codex in a more lightweight way than using `codex mcp`. Passing `--json` to `codex exec` will print each event as a line of JSON to stdout. Note that it does not print the individual tokens as they are streamed, only full messages, as this is aimed at programmatic use rather than to power UI. <img width="1348" height="1307" alt="image" src="https://github.com/user-attachments/assets/fc7908de-b78d-46e4-a6ff-c85de28415c7" /> I changed the existing `EventProcessor` into a trait and moved the implementation to `EventProcessorWithHumanOutput`. Then I introduced an alternative implementation, `EventProcessorWithJsonOutput`. The `--json` flag determines which implementation to use.	2025-07-17 15:10:15 -07:00

33 Commits