valknar/llmx - llmx - dev.pivoine.art

Author	SHA1	Message	Date
pakrym-oai	1b8f2543ac	Filter out reasoning items from previous turns (#5857 ) Reduces request size and prevents 400 errors when switching between API orgs. Based on Responses API behavior described in https://cookbook.openai.com/examples/responses_api/reasoning_items#caching	2025-10-28 11:39:34 -07:00
Eric Traut	0c1ff1d3fd	Made token refresh code resilient to missing `id_token` (#5782 ) This PR does the following: 1. Changes `try_refresh_token` to handle the case where the endpoint returns a response without an `id_token`. The OpenID spec indicates that this field is optional and clients should not assume it's present. 2. Changes the `attempt_stream_responses` to propagate token refresh errors rather than silently ignoring them. 3. Fixes a typo in a couple of error messages (unrelated to the above, but something I noticed in passing) - "reconnect" should be spelled without a hyphen. This PR does not implement the additional suggestion from @pakrym-oai that we should sign out when receiving `refresh_token_expired` from the refresh endpoint. Leaving this as a follow-on because I'm undecided on whether this should be implemented in `try_refresh_token` or its callers.	2025-10-27 10:09:53 -07:00
jif-oai	5e8659dcbc	chore: undo nits (#5631 )	2025-10-27 11:48:01 +00:00
Eric Traut	bcd64c7e72	Reduced runtime of unit test that was taking multiple minutes (#5688 ) Modified `build_compacted_history_truncates_overlong_user_messages` test to reduce runtime from minutes to tens of seconds	2025-10-25 23:46:08 -07:00
Ahmed Ibrahim	273819aaae	Move changing turn input functionalities to `ConversationHistory` (#5473 ) We are doing some ad-hoc logic while dealing with conversation history. Ideally, we shouldn't mutate `vec[responseitem]` manually at all and should depend on `ConversationHistory` for those changes. Those changes are: - Adding input to the history - Removing items from the history - Correcting history I am also adding some `error` logs for cases we shouldn't ideally face. For example, we shouldn't be missing `toolcalls` or `outputs`. We shouldn't hit `ContextWindowExceeded` while performing `compact` This refactor will give us granular control over our context management.	2025-10-22 13:08:46 -07:00
pakrym-oai	3c90728a29	Add new thread items and rewire event parsing to use them (#5418 ) 1. Adds AgentMessage, Reasoning, WebSearch items. 2. Switches the ResponseItem parsing to use new items and then also emit 3. Removes user-item kind and filters out "special" (environment) user items when returning to clients.	2025-10-22 10:14:50 -07:00
pakrym-oai	789e65b9d2	Pass TurnContext around instead of sub_id (#5421 ) Today `sub_id` is an ID of a single incoming Codex Op submition. We then associate all events triggered by this operation using the same `sub_id`. At the same time we are also creating a TurnContext per submission and we'd like to start associating some events (item added/item completed) with an entire turn instead of just the operation that started it. Using turn context when sending events give us flexibility to change notification scheme.	2025-10-21 08:04:16 -07:00
pakrym-oai	9c903c4716	Add ItemStarted/ItemCompleted events for UserInputItem (#5306 ) Adds a new ItemStarted event and delivers UserMessage as the first item type (more to come). Renames `InputItem` to `UserInput` considering we're using the `Item` suffix for actual items.	2025-10-20 13:34:44 -07:00
jif-oai	268a10f917	feat: add header for task kind (#5142 ) Add a header in the responses API request for the task kind (compact, review, ...) for observability purpose The header name is `codex-task-type`	2025-10-14 15:17:00 +00:00
jif-oai	f98fa85b44	feat: message when stream get correctly resumed (#4988 ) <img width="366" height="109" alt="Screenshot 2025-10-09 at 17 44 16" src="https://github.com/user-attachments/assets/26bc6f60-11bc-4fc6-a1cc-430ca1203969" />	2025-10-10 09:07:14 +00:00
jif-oai	687a13bbe5	feat: truncate on compact (#4942 ) Truncate the message during compaction if it is just too large Do it iteratively as tokenization is basically free on server-side	2025-10-08 18:11:08 +01:00
Ahmed Ibrahim	90ef94d3b3	Surface context window error to the client (#4675 ) In the past, we were treating `input exceeded context window` as a streaming error and retrying on it. Retrying on it has no point because it won't change the behavior. In this PR, we surface the error to the client without retry and also send a token count event to indicate that the context window is full. <img width="650" height="125" alt="image" src="https://github.com/user-attachments/assets/c26b1213-4c27-4bfc-90f4-51a270a3efd5" />	2025-10-05 01:40:06 +00:00
jif-oai	8797145678	fix: token usage for compaction (#4281 ) Emit token usage update when draining compaction	2025-09-26 16:24:27 +02:00
jif-oai	1fc3413a46	ref: state - 2 (#4229 ) Extracting tasks in a module and start abstraction behind a Trait (more to come on this but each task will be tackled in a dedicated PR) The goal was to drop the ActiveTask and to have a (potentially) set of tasks during each turn	2025-09-26 13:49:08 +00:00
jif-oai	250b244ab4	ref: full state refactor (#4174 ) ## Current State Observations - `Session` currently holds many unrelated responsibilities (history, approval queues, task handles, rollout recorder, shell discovery, token tracking, etc.), making it hard to reason about ownership and lifetimes. - The anonymous `State` struct inside `codex.rs` mixes session-long data with turn-scoped queues and approval bookkeeping. - Turn execution (`run_task`) relies on ad-hoc local variables that should conceptually belong to a per-turn state object. - External modules (`codex::compact`, tests) frequently poke the raw `Session.state` mutex, which couples them to implementation details. - Interrupts, approvals, and rollout persistence all have bespoke cleanup paths, contributing to subtle bugs when a turn is aborted mid-flight. ## Desired End State - Keep a slim `Session` object that acts as the orchestrator and façade. It should expose a focused API (submit, approvals, interrupts, event emission) without storing unrelated fields directly. - Introduce a `state` module that encapsulates all mutable data structures: - `SessionState`: session-persistent data (history, approved commands, token/rate-limit info, maybe user preferences). - `ActiveTurn`: metadata for the currently running turn (sub-id, task kind, abort handle) and an `Arc<TurnState>`. - `TurnState`: all turn-scoped pieces (pending inputs, approval waiters, diff tracker, review history, auto-compact flags, last agent message, outstanding tool call bookkeeping). - Group long-lived helpers/managers into a dedicated `SessionServices` struct so `Session` does not accumulate "random" fields. - Provide clear, lock-safe APIs so other modules never touch raw mutexes. - Ensure every turn creates/drops a `TurnState` and that interrupts/finishes delegate cleanup to it.	2025-09-25 12:16:06 +02:00
jif-oai	af6304c641	nit: drop instruction override for auto-compact (#4137 ) drop instruction override for auto-compact as this is not used and dangerous as it invalidates the cache	2025-09-24 10:47:12 +01:00
pakrym-oai	fdb8dadcae	Add exec output-schema parameter (#4079 ) Adds structured output to `exec` via the `--structured-output` parameter.	2025-09-23 13:59:16 -07:00
jif-oai	b84a920067	chore: compact do not modify instructions (#4088 ) Keep the developer instruction and insert the summarisation message as a user message instead	2025-09-23 17:59:17 +01:00
dedrisian-oai	c415827ac2	Truncate potentially long user messages in compact message. (#4068 ) If a prior user message is massive, any future `/compact` task would fail because we're verbatim copying the user message into the new chat.	2025-09-22 23:12:26 +00:00
Jeremy Rose	b34e906396	Reland "refactor transcript view to handle HistoryCells" (#3753 ) Reland of #3538	2025-09-18 20:55:53 +00:00
jif-oai	277fc6254e	chore: use tokio mutex and async function to prevent blocking a worker (#3850 ) ### Why Use `tokio::sync::Mutex` `std::sync::Mutex` are not _async-aware_. As a result, they will block the entire thread instead of just yielding the task. Furthermore they can be poisoned which is not the case of `tokio` Mutex. This allows the Tokio runtime to continue running other tasks while waiting for the lock, preventing deadlocks and performance bottlenecks. In general, this is preferred in async environment	2025-09-18 18:21:52 +01:00
jif-oai	4a5d6f7c71	Make ESC button work when auto-compaction (#3857 ) Only emit a task finished when the compaction comes from a `/compact`	2025-09-18 15:34:16 +00:00
Ahmed Ibrahim	bbea6bbf7e	Handle resuming/forking after compact (#3533 ) We need to construct the history different when compact happens. For this, we need to just consider the history after compact and convert compact to a response item. This needs to change and use `build_compact_history` when this #3446 is merged.	2025-09-14 13:23:31 +00:00
jif-oai	ea225df22e	feat: context compaction (#3446 ) ## Compact feature: 1. Stops the model when the context window become too large 2. Add a user turn, asking for the model to summarize 3. Build a bridge that contains all the previous user message + the summary. Rendered from a template 4. Start sampling again from a clean conversation with only that bridge	2025-09-12 13:07:10 -07:00

24 Commits