valknar/llmx - llmx - dev.pivoine.art

Author	SHA1	Message	Date
Abhishek Bhardwaj	89591e4246	feature: Add "!cmd" user shell execution (#2471 ) feature: Add "!cmd" user shell execution This change lets users run local shell commands directly from the TUI by prefixing their input with ! (e.g. !ls). Output is truncated to keep the exec cell usable, and Ctrl-C cleanly interrupts long-running commands (e.g. !sleep 10000). Summary of changes - Route Op::RunUserShellCommand through a dedicated UserShellCommandTask (core/src/tasks/user_shell.rs), keeping the task logic out of codex.rs. - Reuse the existing tool router: the task constructs a ToolCall for the local_shell tool and relies on ShellHandler, so no manual MCP tool lookup is required. - Emit exec lifecycle events (ExecCommandBegin/ExecCommandEnd) so the TUI can show command metadata, live output, and exit status. End-to-end flow TUI handling 1. ChatWidget::submit_user_message (TUI) intercepts messages starting with !. 2. Non-empty commands dispatch Op::RunUserShellCommand { command }; empty commands surface a help hint. 3. No UserInput items are created, so nothing is enqueued for the model. Core submission loop 4. The submission loop routes the op to handlers::run_user_shell_command (core/src/codex.rs). 5. A fresh TurnContext is created and Session::spawn_user_shell_command enqueues UserShellCommandTask. Task execution 6. UserShellCommandTask::run emits TaskStartedEvent, formats the command, and prepares a ToolCall targeting local_shell. 7. ToolCallRuntime::handle_tool_call dispatches to ShellHandler. Shell tool runtime 8. ShellHandler::run_exec_like launches the process via the unified exec runtime, honoring sandbox and shell policies, and emits ExecCommandBegin/End. 9. Stdout/stderr are captured for the UI, but the task does not turn the resulting ToolOutput into a model response. Completion 10. After ExecCommandEnd, the task finishes without an assistant message; the session marks it complete and the exec cell displays the final output. Conversation context - The command and its output never enter the conversation history or the model prompt; the flow is local-only. - Only exec/task events are emitted for UI rendering. Demo video https://github.com/user-attachments/assets/fcd114b0-4304-4448-a367-a04c43e0b996	2025-10-29 00:31:20 -07:00
Eric Traut	ba95d9862c	Fixed bug that results in a sporadic hang when attaching images (#5891 ) Addresses https://github.com/openai/codex/issues/5773 Testing: I tested that images work (regardless of order that they are associated with the task prompt) in both the CLI and Extension. Also verified that conversations in CLI and extension with images can be resumed.	2025-10-28 14:42:46 -07:00
Celia Chen	4a42c4e142	[Auth] Choose which auth storage to use based on config (#5792 ) This PR is a follow-up to #5591. It allows users to choose which auth storage mode they want by using the new `cli_auth_credentials_store_mode` config.	2025-10-27 19:41:49 -07:00
Josh McKinney	66a4b89822	feat(tui): clarify Windows auto mode requirements (#5568 ) ## Summary - Coerce Windows `workspace-write` configs back to read-only, surface the forced downgrade in the approvals popup, and funnel users toward WSL or Full Access. - Add WSL installation instructions to the Auto preset on Windows while keeping the preset available for other platforms. - Skip the trust-on-first-run prompt on native Windows so new folders remain read-only without additional confirmation. - Expose a structured sandbox policy resolution from config to flag Windows downgrades and adjust tests (core, exec, TUI) to reflect the new behavior; provide a Windows-only approvals snapshot. ## Testing - cargo fmt - cargo test -p codex-core config::tests::add_dir_override_extends_workspace_writable_roots - cargo test -p codex-exec suite::resume::exec_resume_preserves_cli_configuration_overrides - cargo test -p codex-tui chatwidget::tests::approvals_selection_popup_snapshot - cargo test -p codex-tui approvals_popup_includes_wsl_note_for_auto_mode - cargo test -p codex-tui windows_skips_trust_prompt - just fix -p codex-core - just fix -p codex-tui	2025-10-28 01:19:32 +00:00
jif-oai	5e8659dcbc	chore: undo nits (#5631 )	2025-10-27 11:48:01 +00:00
jif-oai	afc4eaab8b	feat: TUI undo op (#5629 )	2025-10-27 10:55:29 +00:00
Michael Bolin	5907422d65	feat: annotate conversations with model_provider for filtering (#5658 ) Because conversations that use the Responses API can have encrypted reasoning messages, trying to resume a conversation with a different provider could lead to confusing "failed to decrypt" errors. (This is reproducible by starting a conversation using ChatGPT login and resuming it as a conversation that uses OpenAI models via Azure.) This changes `ListConversationsParams` to take a `model_providers: Option<Vec<String>>` and adds `model_provider` on each `ConversationSummary` it returns so these cases can be disambiguated. Note this ended up making changes to `codex-rs/core/src/rollout/tests.rs` because it had a number of cases where it expected `Some` for the value of `next_cursor`, but the list of rollouts was complete, so according to this docstring: `bcd64c7e72/codex-rs/app-server-protocol/src/protocol.rs (L334-L337)` If there are no more items to return, then `next_cursor` should be `None`. This PR updates that logic. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/5658). * #5803 * #5793 * __->__ #5658	2025-10-27 02:03:30 -07:00
Ahmed Ibrahim	f178805252	Add feedback upload request handling (#5682 )	2025-10-27 05:53:39 +00:00
Anton Panasenko	6af83d86ff	[codex][app-server] introduce codex/event/raw_item events (#5578 )	2025-10-24 22:41:52 +00:00
Eric Traut	f8af4f5c8d	Added model summary and risk assessment for commands that violate sandbox policy (#5536 ) This PR adds support for a model-based summary and risk assessment for commands that violate the sandbox policy and require user approval. This aids the user in evaluating whether the command should be approved. The feature works by taking a failed command and passing it back to the model and asking it to summarize the command, give it a risk level (low, medium, high) and a risk category (e.g. "data deletion" or "data exfiltration"). It uses a new conversation thread so the context in the existing thread doesn't influence the answer. If the call to the model fails or takes longer than 5 seconds, it falls back to the current behavior. For now, this is an experimental feature and is gated by a config key `experimental_sandbox_command_assessment`. Here is a screen shot of the approval prompt showing the risk assessment and summary. <img width="723" height="282" alt="image" src="https://github.com/user-attachments/assets/4597dd7c-d5a0-4e9f-9d13-414bd082fd6b" />	2025-10-24 15:23:44 -07:00
jif-oai	f522aafb7f	chore: drop approve all (#5503 ) Not needed anymore	2025-10-22 16:55:06 +01:00
pakrym-oai	1b10a3a1b2	Enable plan tool by default (#5384 ) ## Summary - make the plan tool available by default by removing the feature flag and always registering the handler - drop plan-tool CLI and API toggles across the exec, TUI, MCP server, and app server code paths - update tests and configs to reflect the always-on plan tool and guard workspace restriction tests against env leakage ## Testing Manually tested the extension. ------ https://chatgpt.com/codex/tasks/task_i_68f67a3ff2d083209562a773f814c1f9	2025-10-21 16:25:05 +00:00
pakrym-oai	9c903c4716	Add ItemStarted/ItemCompleted events for UserInputItem (#5306 ) Adds a new ItemStarted event and delivers UserMessage as the first item type (more to come). Renames `InputItem` to `UserInput` considering we're using the `Item` suffix for actual items.	2025-10-20 13:34:44 -07:00
Ahmed Ibrahim	049a61bcfc	Auto compact at ~90% (#5292 ) Users now hit a window exceeded limit and they usually don't know what to do. This starts auto compact at ~90% of the window.	2025-10-20 11:29:49 -07:00
pakrym-oai	cda6db6ccf	Always enable plan tool in exec (#5380 ) Fixes: https://github.com/openai/codex/issues/5359	2025-10-20 11:05:55 -07:00
Gabriel Peal	d87f87e25b	Add forced_chatgpt_workspace_id and forced_login_method configuration options (#5303 ) This PR adds support for configs to specify a forced login method (chatgpt or api) as well as a forced chatgpt account id. This lets enterprises uses [managed configs](https://developers.openai.com/codex/security#managed-configuration) to force all employees to use their company's workspace instead of their own or any other. When a workspace id is set, a query param is sent to the login flow which auto-selects the given workspace or errors if the user isn't a member of it. This PR is large but a large % of it is tests, wiring, and required formatting changes. API login with chatgpt forced <img width="1592" height="116" alt="CleanShot 2025-10-19 at 22 40 04" src="https://github.com/user-attachments/assets/560c6bb4-a20a-4a37-95af-93df39d057dd" /> ChatGPT login with api forced <img width="1018" height="100" alt="CleanShot 2025-10-19 at 22 40 29" src="https://github.com/user-attachments/assets/d010bbbb-9c8d-4227-9eda-e55bf043b4af" /> Onboarding with api forced <img width="892" height="460" alt="CleanShot 2025-10-19 at 22 41 02" src="https://github.com/user-attachments/assets/cc0ed45c-b257-4d62-a32e-6ca7514b5edd" /> Onboarding with ChatGPT forced <img width="1154" height="426" alt="CleanShot 2025-10-19 at 22 41 27" src="https://github.com/user-attachments/assets/41c41417-dc68-4bb4-b3e7-3b7769f7e6a1" /> Logging in with the wrong workspace <img width="2222" height="84" alt="CleanShot 2025-10-19 at 22 42 31" src="https://github.com/user-attachments/assets/0ff4222c-f626-4dd3-b035-0b7fe998a046" />	2025-10-20 08:50:54 -07:00
Thibault Sottiaux	4f46360aa4	feat: add --add-dir flag for extra writable roots (#5335 ) Add a `--add-dir` CLI flag so sessions can use extra writable roots in addition to the ones specified in the config file. These are ephemerally added during the session only. Fixes #3303 Fixes #2797	2025-10-18 22:13:53 -07:00
jif-oai	774892c6d7	feat: add auto-approval for codex exec (#5043 )	2025-10-15 19:03:54 +01:00
dedrisian-oai	4300236681	revert /name for now (#4978 ) There was a regression where we'd read entire rollout contents if there was no /name present.	2025-10-08 17:13:49 -07:00
dedrisian-oai	ec238a2c39	feat: Set chat name (#4974 ) Set chat name with `/name` so they appear in the codex resume page: https://github.com/user-attachments/assets/c0252bba-3a53-44c7-a740-f4690a3ad405	2025-10-08 16:35:35 -07:00
pakrym-oai	60f9e85c16	Set codex SDK TypeScript originator (#4894 ) ## Summary - ensure the TypeScript SDK sets CODEX_INTERNAL_ORIGINATOR_OVERRIDE to codex_sdk_ts when spawning the Codex CLI - extend the responses proxy test helper to capture request headers for assertions - add coverage that verifies Codex threads launched from the TypeScript SDK send the codex_sdk_ts originator header ## Testing - Not Run (not requested) ------ https://chatgpt.com/codex/tasks/task_i_68e561b125248320a487f129093d16e7	2025-10-07 14:06:41 -07:00
Tamir Duberstein	b16c985ed2	cli: fix zsh completion (#4692 ) Before this change: ``` tamird@L03G26TD12 codex-rs % codex zsh: do you wish to see all 3864 possibilities (1285 lines)? ``` After this change: ``` tamird@L03G26TD12 codex-rs % codex app-server -- [experimental] Run the app server apply a -- Apply the latest diff produced by Codex agent as a `git apply` to your local working tree cloud -- [EXPERIMENTAL] Browse tasks from Codex Cloud and apply changes locally completion -- Generate shell completion scripts debug -- Internal debugging commands exec e -- Run Codex non-interactively generate-ts -- Internal: generate TypeScript protocol bindings help -- Print this message or the help of the given subcommand(s) login -- Manage login logout -- Remove stored authentication credentials mcp -- [experimental] Run Codex as an MCP server and manage MCP servers mcp-server -- [experimental] Run the Codex MCP server (stdio transport) responses-api-proxy -- Internal: run the responses API proxy resume -- Resume a previous interactive session (picker by default; use --last to continue the most recent) ```	2025-10-07 08:07:31 -07:00
pakrym-oai	35a770e871	Simplify request body assertions (#4845 ) We'll have a lot more test like these	2025-10-07 09:56:39 +01:00
pakrym-oai	191d620707	Use response helpers when mounting SSE test responses (#4783 ) ## Summary - replace manual wiremock SSE mounts in the compact suite with the shared response helpers - simplify the exec auth_env integration test by using the mount_sse_once_match helper - rely on mount_sse_sequence plus server request collection to replace the bespoke SeqResponder utility in tests ## Testing - just fmt ------ https://chatgpt.com/codex/tasks/task_i_68e2e238f2a88320a337f0b9e4098093	2025-10-05 21:58:16 +00:00
pakrym-oai	aecbe0f333	Add helper for response created SSE events in tests (#4758 ) ## Summary - add a reusable `ev_response_created` helper that builds `response.created` SSE events for integration tests - update the exec and core integration suites to use the new helper instead of repeating manual JSON literals - keep the streaming fixtures consistent by relying on the shared helper in every touched test ## Testing - `just fmt` ------ https://chatgpt.com/codex/tasks/task_i_68e1fe885bb883208aafffb94218da61	2025-10-05 21:11:43 +00:00
Dylan	4764fc1ee7	feat: Freeform apply_patch with simple shell output (#4718 ) ## Summary This PR is an alternative approach to #4711, but instead of changing our storage, parses out shell calls in the client and reserializes them on the fly before we send them out as part of the request. What this changes: 1. Adds additional serialization logic when the ApplyPatchToolType::Freeform is in use. 2. Adds a --custom-apply-patch flag to enable this setting on a session-by-session basis. This change is delicate, but is not meant to be permanent. It is meant to be the first step in a migration: 1. (This PR) Add in-flight serialization with config 2. Update model_family default 3. Update serialization logic to store turn outputs in a structured format, with logic to serialize based on model_family setting. 4. Remove this rewrite in-flight logic. ## Test Plan - [x] Additional unit tests added - [x] Integration tests added - [x] Tested locally	2025-10-04 19:16:36 -07:00
Fouad Matin	a5b7675e42	add(core): managed config (#3868 ) ## Summary - Factor `load_config_as_toml` into `core::config_loader` so config loading is reusable across callers. - Layer `~/.codex/config.toml`, optional `~/.codex/managed_config.toml`, and macOS managed preferences (base64) with recursive table merging and scoped threads per source. ## Config Flow ``` Managed prefs (macOS profile: com.openai.codex/config_toml_base64) ▲ │ ~/.codex/managed_config.toml │ (optional file-based override) ▲ │ ~/.codex/config.toml (user-defined settings) ``` - The loader searches under the resolved `CODEX_HOME` directory (defaults to `~/.codex`). - Managed configs let administrators ship fleet-wide overrides via device profiles which is useful for enforcing certain settings like sandbox or approval defaults. - For nested hash tables: overlays merge recursively. Child tables are merged key-by-key, while scalar or array values replace the prior layer entirely. This lets admins add or tweak individual fields without clobbering unrelated user settings.	2025-10-03 13:02:26 -07:00
Michael Bolin	042d4d55d9	feat: `codex exec` writes only the final message to stdout (#4644 ) This updates `codex exec` so that, by default, most of the agent's activity is written to stderr so that only the final agent message is written to stdout. This makes it easier to pipe `codex exec` into another tool without extra filtering. I introduced `#![deny(clippy::print_stdout)]` to help enforce this change and renamed the `ts_println!()` macro to `ts_msg()` because (1) it no longer calls `println!()` and (2), `ts_eprintln!()` seemed too long of a name. While here, this also adds `-o` as an alias for `--output-last-message`. Fixes https://github.com/openai/codex/issues/1670	2025-10-03 16:22:12 +00:00
jif-oai	33d3ecbccc	chore: refactor tool handling (#4510 ) # Tool System Refactor - Centralizes tool definitions and execution in `core/src/tools/`: specs (`spec.rs`), handlers (`handlers/`), router (`router.rs`), registry/dispatch (`registry.rs`), and shared context (`context.rs`). One registry now builds the model-visible tool list and binds handlers. - Router converts model responses to tool calls; Registry dispatches with consistent telemetry via `codex-rs/otel` and unified error handling. Function, Local Shell, MCP, and experimental `unified_exec` all flow through this path; legacy shell aliases still work. - Rationale: reduce per‑tool boilerplate, keep spec/handler in sync, and make adding tools predictable and testable. Example: `read_file` - Spec: `core/src/tools/spec.rs` (see `create_read_file_tool`, registered by `build_specs`). - Handler: `core/src/tools/handlers/read_file.rs` (absolute `file_path`, 1‑indexed `offset`, `limit`, `L#: ` prefixes, safe truncation). - E2E test: `core/tests/suite/read_file.rs` validates the tool returns the requested lines. ## Next steps: - Decompose `handle_container_exec_with_params` - Add parallel tool calls	2025-10-03 13:21:06 +01:00
pakrym-oai	1d94b9111c	Use supports_color in codex exec (#4633 ) It knows how to detect github actions	2025-10-03 01:15:03 +00:00
pakrym-oai	c405d8c06c	Rename assistant message to agent message and fix item type field naming (#4610 ) Naming cleanup	2025-10-02 15:07:14 -07:00
pakrym-oai	f895d4cbb3	Minor cleanup of codex exec output (#4585 ) <img width="850" height="723" alt="image" src="https://github.com/user-attachments/assets/2ae067bf-ba6b-47bf-9ffe-d1c3f3aa1870" /> <img width="872" height="547" alt="image" src="https://github.com/user-attachments/assets/9058be24-6513-4423-9dae-2d5fd4cbf162" />	2025-10-02 14:17:42 -07:00
pakrym-oai	4c566d484a	Separate interactive and non-interactive sessions (#4612 ) Do not show exec session in VSCode/TUI selector.	2025-10-02 13:06:21 -07:00
Jeremy Rose	45936f8fbd	show "Viewed Image" when the model views an image (#4475 ) <img width="1022" height="339" alt="Screenshot 2025-09-29 at 4 22 00 PM" src="https://github.com/user-attachments/assets/12da7358-19be-4010-a71b-496ede6dfbbf" />	2025-10-02 18:36:03 +00:00
pakrym-oai	2f6fb37d72	Support CODEX_API_KEY for codex exec (#4615 ) Allows to set API key per invocation of `codex exec`	2025-10-02 09:59:45 -07:00
pakrym-oai	31102af54b	Add initial set of doc comments to the SDK (#4513 ) Also perform minor code cleanup.	2025-10-01 13:12:59 -07:00
Michael Bolin	5881c0d6d4	fix: remove mcp-types from app server protocol (#4537 ) We continue the separation between `codex app-server` and `codex mcp-server`. In particular, we introduce a new crate, `codex-app-server-protocol`, and migrate `codex-rs/protocol/src/mcp_protocol.rs` into it, renaming it `codex-rs/app-server-protocol/src/protocol.rs`. Because `ConversationId` was defined in `mcp_protocol.rs`, we move it into its own file, `codex-rs/protocol/src/conversation_id.rs`, and because it is referenced in a ton of places, we have to touch a lot of files as part of this PR. We also decide to get away from proper JSON-RPC 2.0 semantics, so we also introduce `codex-rs/app-server-protocol/src/jsonrpc_lite.rs`, which is basically the same `JSONRPCMessage` type defined in `mcp-types` except with all of the `"jsonrpc": "2.0"` removed. Getting rid of `"jsonrpc": "2.0"` makes our serialization logic considerably simpler, as we can lean heavier on serde to serialize directly into the wire format that we use now.	2025-10-01 02:16:26 +00:00
pakrym-oai	7fc3edf8a7	Remove legacy codex exec --json format (#4525 ) `codex exec --json` now maps to the behavior of `codex exec --experimental-json` with new event and item shapes. Thread events: - thread.started - turn.started - turn.completed - turn.failed - item.started - item.updated - item.completed Item types: - assistant_message - reasoning - command_execution - file_change - mcp_tool_call - web_search - todo_list - error Sample output: <details> `codex exec "list my assigned github issues" --json \| jq` ``` { "type": "thread.started", "thread_id": "01999ce5-f229-7661-8570-53312bd47ea3" } { "type": "turn.started" } { "type": "item.completed", "item": { "id": "item_0", "item_type": "reasoning", "text": "Planning to list assigned GitHub issues" } } { "type": "item.started", "item": { "id": "item_1", "item_type": "mcp_tool_call", "server": "github", "tool": "search_issues", "status": "in_progress" } } { "type": "item.completed", "item": { "id": "item_1", "item_type": "mcp_tool_call", "server": "github", "tool": "search_issues", "status": "completed" } } { "type": "item.completed", "item": { "id": "item_2", "item_type": "reasoning", "text": "Organizing final message structure" } } { "type": "item.completed", "item": { "id": "item_3", "item_type": "assistant_message", "text": "Assigned Issues\n- openai/codex#3267 – “stream error: stream disconnected before completion…” (bug) – last update 2025-09-08\n- openai/codex#3257 – “You've hit your usage limit. Try again in 4 days 20 hours 9 minutes.” – last update 2025-09-23\n- openai/codex#3054 – “reqwest SSL panic (library has no ciphers)” (bug) – last update 2025-09-03\n- openai/codex#3051 – “thread 'main' panicked at linux-sandbox/src/linux_run_main.rs:53:5:” (bug) – last update 2025-09-10\n- openai/codex#3004 – “Auto-compact when approaching context limit” (enhancement) – last update 2025-09-26\n- openai/codex#2916 – “Feature request: Add OpenAI service tier support for cost optimization” – last update 2025-09-12\n- openai/codex#1581 – “stream error: stream disconnected before completion: stream closed before response.complete; retrying...” (bug) – last update 2025-09-17" } } { "type": "turn.completed", "usage": { "input_tokens": 34785, "cached_input_tokens": 12544, "output_tokens": 560 } } ``` </details>	2025-09-30 17:21:37 -07:00
pakrym-oai	a534356fe1	Wire up web search item (#4511 ) Add handling for web search events.	2025-09-30 12:01:17 -07:00
pakrym-oai	c09e131653	Set originator for codex exec (#4485 ) Distinct from the main CLI.	2025-09-29 20:59:19 -07:00
pakrym-oai	ea82f86662	Rename conversation to thread in codex exec (#4482 )	2025-09-29 20:18:30 -07:00
pakrym-oai	a8edc57740	Add MCP tool call item to codex exec (#4481 ) No arguments/results for now. ``` { "type": "item.started", "item": { "id": "item_1", "item_type": "mcp_tool_call", "server": "github", "tool": "search_issues", "status": "in_progress" } } { "type": "item.completed", "item": { "id": "item_1", "item_type": "mcp_tool_call", "server": "github", "tool": "search_issues", "status": "completed" } } ```	2025-09-29 19:45:11 -07:00
pakrym-oai	4a80059b1b	Add turn.failed and rename session created to thread started (#4478 ) Don't produce completed when turn failed.	2025-09-29 18:38:04 -07:00
vishnu-oai	04c1782e52	OpenTelemetry events (#2103 ) ### Title ## otel Codex can emit [OpenTelemetry](https://opentelemetry.io/) log events that describe each run: outbound API requests, streamed responses, user input, tool-approval decisions, and the result of every tool invocation. Export is disabled by default so local runs remain self-contained. Opt in by adding an `[otel]` table and choosing an exporter. ```toml [otel] environment = "staging" # defaults to "dev" exporter = "none" # defaults to "none"; set to otlp-http or otlp-grpc to send events log_user_prompt = false # defaults to false; redact prompt text unless explicitly enabled ``` Codex tags every exported event with `service.name = "codex-cli"`, the CLI version, and an `env` attribute so downstream collectors can distinguish dev/staging/prod traffic. Only telemetry produced inside the `codex_otel` crate—the events listed below—is forwarded to the exporter. ### Event catalog Every event shares a common set of metadata fields: `event.timestamp`, `conversation.id`, `app.version`, `auth_mode` (when available), `user.account_id` (when available), `terminal.type`, `model`, and `slug`. With OTEL enabled Codex emits the following event types (in addition to the metadata above): - `codex.api_request` - `cf_ray` (optional) - `attempt` - `duration_ms` - `http.response.status_code` (optional) - `error.message` (failures) - `codex.sse_event` - `event.kind` - `duration_ms` - `error.message` (failures) - `input_token_count` (completion only) - `output_token_count` (completion only) - `cached_token_count` (completion only, optional) - `reasoning_token_count` (completion only, optional) - `tool_token_count` (completion only) - `codex.user_prompt` - `prompt_length` - `prompt` (redacted unless `log_user_prompt = true`) - `codex.tool_decision` - `tool_name` - `call_id` - `decision` (`approved`, `approved_for_session`, `denied`, or `abort`) - `source` (`config` or `user`) - `codex.tool_result` - `tool_name` - `call_id` - `arguments` - `duration_ms` (execution time for the tool) - `success` (`"true"` or `"false"`) - `output` ### Choosing an exporter Set `otel.exporter` to control where events go: - `none` – leaves instrumentation active but skips exporting. This is the default. - `otlp-http` – posts OTLP log records to an OTLP/HTTP collector. Specify the endpoint, protocol, and headers your collector expects: ```toml [otel] exporter = { otlp-http = { endpoint = "https://otel.example.com/v1/logs", protocol = "binary", headers = { "x-otlp-api-key" = "${OTLP_TOKEN}" } }} ``` - `otlp-grpc` – streams OTLP log records over gRPC. Provide the endpoint and any metadata headers: ```toml [otel] exporter = { otlp-grpc = { endpoint = "https://otel.example.com:4317", headers = { "x-otlp-meta" = "abc123" } }} ``` If the exporter is `none` nothing is written anywhere; otherwise you must run or point to your own collector. All exporters run on a background batch worker that is flushed on shutdown. If you build Codex from source the OTEL crate is still behind an `otel` feature flag; the official prebuilt binaries ship with the feature enabled. When the feature is disabled the telemetry hooks become no-ops so the CLI continues to function without the extra dependencies. --------- Co-authored-by: Anton Panasenko <apanasenko@openai.com>	2025-09-29 11:30:55 -07:00
pakrym-oai	cc1b21e47f	Add turn started/completed events and correct exit code on error (#4309 ) Adds new event for session completed that includes usage. Also ensures we return 1 on failures. ``` { "type": "session.created", "session_id": "019987a7-93e7-7b20-9e05-e90060e411ea" } { "type": "turn.started" } ... { "type": "turn.completed", "usage": { "input_tokens": 78913, "cached_input_tokens": 65280, "output_tokens": 1099 } } ```	2025-09-26 16:21:50 -07:00
pakrym-oai	ea095e30c1	Add todo-list tool support (#4255 ) Adds a 1-per-turn todo-list item and item.updated event ```jsonl {"type":"item.started","item":{"id":"item_6","item_type":"todo_list","items":[{"text":"Record initial two-step plan now","completed":false},{"text":"Update progress to next step","completed":false}]}} {"type":"item.updated","item":{"id":"item_6","item_type":"todo_list","items":[{"text":"Record initial two-step plan now","completed":true},{"text":"Update progress to next step","completed":false}]}} {"type":"item.completed","item":{"id":"item_6","item_type":"todo_list","items":[{"text":"Record initial two-step plan now","completed":true},{"text":"Update progress to next step","completed":false}]}} ```	2025-09-26 09:35:47 -07:00
pakrym-oai	8e3a048fec	Add codex exec testing helpers (#4254 ) Add a shortcut to create working directories and run codex exec with fake server.	2025-09-25 17:12:45 -07:00
pakrym-oai	67aab04c66	[codex exec] Add item.started and support it for command execution (#4250 ) Adds a new `item.started` event to `codex exec` and implements it for command_execution item type. ```jsonl {"type":"session.created","session_id":"019982d1-75f0-7920-b051-e0d3731a5ed8"} {"type":"item.completed","item":{"id":"item_0","item_type":"reasoning","text":"Executing commands securely\n\nI'm thinking about how the default harness typically uses \"bash -lc,\" while historically \"bash\" is what we've been using. The command should be executed as a string in our CLI, so using \"bash -lc 'echo hello'\" is optimal but calling \"echo hello\" directly feels safer. The sandbox makes sure environment variables like CODEX_SANDBOX_NETWORK_DISABLED=1 are set, so I won't ask for approval. I just need to run \"echo hello\" and correctly present the output."}} {"type":"item.completed","item":{"id":"item_1","item_type":"reasoning","text":"Preparing for tool calls\n\nI realize that I need to include a preamble before making any tool calls. So, I'll first state the preamble in the commentary channel, then proceed with the tool call. After that, I need to present the final message along with the output. It's possible that the CLI will show the output inline, but I must ensure that I present the result clearly regardless. Let's move forward and get this organized!"}} {"type":"item.completed","item":{"id":"item_2","item_type":"assistant_message","text":"Running `echo` to confirm shell access and print output."}} {"type":"item.started","item":{"id":"item_3","item_type":"command_execution","command":"bash -lc echo hello","aggregated_output":"","exit_code":null,"status":"in_progress"}} {"type":"item.completed","item":{"id":"item_3","item_type":"command_execution","command":"bash -lc echo hello","aggregated_output":"hello\n","exit_code":0,"status":"completed"}} {"type":"item.completed","item":{"id":"item_4","item_type":"assistant_message","text":"hello"}} ```	2025-09-25 22:25:02 +00:00
Jeremy Rose	4a5f05c136	make tests pass cleanly in sandbox (#4067 ) This changes the reqwest client used in tests to be sandbox-friendly, and skips a bunch of other tests that don't work inside the sandbox/without network.	2025-09-25 13:11:14 -07:00
pakrym-oai	344d4a1d68	Add explicit codex exec events (#4177 ) This pull request add a new experimental format of JSON output. You can try it using `codex exec --experimental-json`. Design takes a lot of inspiration from Responses API items and stream format. # Session and items Each invocation of `codex exec` starts or resumes a session. Session contains multiple high-level item types: 1. Assistant message 2. Assistant thinking 3. Command execution 4. File changes 5. To-do lists 6. etc. # Events Session and items are going through their life cycles which is represented by events. Session is `session.created` or `session.resumed` Items are `item.added`, `item.updated`, `item.completed`, `item.require_approval` (or other item types like `item.output_delta` when we need streaming). So a typical session can look like: <details> ``` { "type": "session.created", "session_id": "01997dac-9581-7de3-b6a0-1df8256f2752" } { "type": "item.completed", "item": { "id": "itm_0", "item_type": "assistant_message", "text": "I’ll locate the top-level README and remove its first line. Then I’ll show a quick summary of what changed." } } { "type": "item.completed", "item": { "id": "itm_1", "item_type": "command_execution", "command": "bash -lc ls -la \| sed -n '1,200p'", "aggregated_output": "pyenv: cannot rehash: /Users/pakrym/.pyenv/shims isn't writable\ntotal 192\ndrwxr-xr-x@ 33 pakrym staff 1056 Sep 24 14:36 .\ndrwxr-xr-x 41 pakrym staff 1312 Sep 24 09:17 ..\n-rw-r--r--@ 1 pakrym staff 6 Jul 9 16:16 .codespellignore\n-rw-r--r--@ 1 pakrym staff 258 Aug 13 09:40 .codespellrc\ndrwxr-xr-x@ 5 pakrym staff 160 Jul 23 08:26 .devcontainer\n-rw-r--r--@ 1 pakrym staff 6148 Jul 22 10:03 .DS_Store\ndrwxr-xr-x@ 15 pakrym staff 480 Sep 24 14:38 .git\ndrwxr-xr-x@ 12 pakrym staff 384 Sep 2 16:00 .github\n-rw-r--r--@ 1 pakrym staff 778 Jul 9 16:16 .gitignore\ndrwxr-xr-x@ 3 pakrym staff 96 Aug 11 09:37 .husky\n-rw-r--r--@ 1 pakrym staff 104 Jul 9 16:16 .npmrc\n-rw-r--r--@ 1 pakrym staff 96 Sep 2 08:52 .prettierignore\n-rw-r--r--@ 1 pakrym staff 170 Jul 9 16:16 .prettierrc.toml\ndrwxr-xr-x@ 5 pakrym staff 160 Sep 14 17:43 .vscode\ndrwxr-xr-x@ 2 pakrym staff 64 Sep 11 11:37 2025-09-11\n-rw-r--r--@ 1 pakrym staff 5505 Sep 18 09:28 AGENTS.md\n-rw-r--r--@ 1 pakrym staff 92 Sep 2 08:52 CHANGELOG.md\n-rw-r--r--@ 1 pakrym staff 1145 Jul 9 16:16 cliff.toml\ndrwxr-xr-x@ 11 pakrym staff 352 Sep 24 13:03 codex-cli\ndrwxr-xr-x@ 38 pakrym staff 1216 Sep 24 14:38 codex-rs\ndrwxr-xr-x@ 18 pakrym staff 576 Sep 23 11:01 docs\n-rw-r--r--@ 1 pakrym staff 2038 Jul 9 16:16 flake.lock\n-rw-r--r--@ 1 pakrym staff 1434 Jul 9 16:16 flake.nix\n-rw-r--r--@ 1 pakrym staff 10926 Jul 9 16:16 LICENSE\ndrwxr-xr-x@ 465 pakrym staff 14880 Jul 15 07:36 node_modules\n-rw-r--r--@ 1 pakrym staff 242 Aug 5 08:25 NOTICE\n-rw-r--r--@ 1 pakrym staff 578 Aug 14 12:31 package.json\n-rw-r--r--@ 1 pakrym staff 498 Aug 11 09:37 pnpm-lock.yaml\n-rw-r--r--@ 1 pakrym staff 58 Aug 11 09:37 pnpm-workspace.yaml\n-rw-r--r--@ 1 pakrym staff 2402 Jul 9 16:16 PNPM.md\n-rw-r--r--@ 1 pakrym staff 4393 Sep 12 14:36 README.md\ndrwxr-xr-x@ 4 pakrym staff 128 Sep 18 09:28 scripts\ndrwxr-xr-x@ 2 pakrym staff 64 Sep 11 11:34 tmp\n", "exit_code": 0, "status": "completed" } } { "type": "item.completed", "item": { "id": "itm_2", "item_type": "reasoning", "text": "Reviewing README.md file\n\nI've located the README.md file at the root, and it’s 4393 bytes. Now, I need to remove the first line, but first, I should check its content to make sure I’m patching it correctly. I’ll use sed to display the first 20 lines. By reviewing those lines, I can determine exactly what needs to be removed before I proceed with the editing. Let's do this carefully!" } } { "type": "item.completed", "item": { "id": "itm_3", "item_type": "command_execution", "command": "bash -lc sed -n '1,40p' README.md", "aggregated_output": "<h1 align=\"center\">OpenAI Codex CLI</h1>\n\n<p align=\"center\"><code>npm i -g @openai/codex</code><br />or <code>brew install codex</code></p>\n\n<p align=\"center\"><strong>Codex CLI</strong> is a coding agent from OpenAI that runs locally on your computer.\n</br>\n</br>If you want Codex in your code editor (VS Code, Cursor, Windsurf), <a href=\"https://developers.openai.com/codex/ide\">install in your IDE</a>\n</br>If you are looking for the <em>cloud-based agent</em> from OpenAI, <strong>Codex Web</strong>, go to <a href=\"https://chatgpt.com/codex\">chatgpt.com/codex</a></p>\n\n<p align=\"center\">\n <img src=\"./.github/codex-cli-splash.png\" alt=\"Codex CLI splash\" width=\"80%\" />\n </p>\n\n---\n\n## Quickstart\n\n### Installing and running Codex CLI\n\nInstall globally with your preferred package manager. If you use npm:\n\n```shell\nnpm install -g @openai/codex\n```\n\nAlternatively, if you use Homebrew:\n\n```shell\nbrew install codex\n```\n\nThen simply run `codex` to get started:\n\n```shell\ncodex\n```\n\n<details>\n<summary>You can also go to the <a href=\"https://github.com/openai/codex/releases/latest\">latest GitHub Release</a> and download the appropriate binary for your platform.</summary>\n\n", "exit_code": 0, "status": "completed" } } { "type": "item.completed", "item": { "id": "itm_4", "item_type": "reasoning", "text": "Updating README file\n\nThe user wants to update the README by removing the first line, which is an HTML h1 tag. I'll proceed to delete that line while keeping the rest intact. To do this, I need to use the apply_patch tool and create a patch in the diff format. I’ll take care to include context lines around the change. After removing the header, I’m considering whether to delete the following blank line as well to keep it tidy." } } { "type": "item.completed", "item": { "id": "itm_5", "item_type": "reasoning", "text": "Crafting the patch\n\nThe user has asked to remove just the first line, without mentioning the blank lines. Technically, leaving a blank line at the start is acceptable, but it's not ideal. Ideally, I'd prefer to remove the first line while maintaining the subsequent blank line. However, if I can also remove that blank line, it could tidy things up even more. Still, since the user's request is strictly to remove only the first line, I'll stick to that and craft the patch accordingly." } } { "type": "item.completed", "item": { "id": "itm_6", "item_type": "assistant_message", "text": "Applying the change by removing the first line from `README.md`." } } { "type": "item.completed", "item": { "id": "itm_7", "item_type": "file_change", "changes": [ { "path": "/Users/pakrym/code/codex/README.md", "kind": "update" } ], "status": "completed" } } { "type": "item.completed", "item": { "id": "itm_8", "item_type": "reasoning", "text": "Validating README Update\n\nI updated the README.md and now I need to validate that the first line has been successfully removed. To do this, I'll check the first 10 lines of the document to ensure everything looks correct. It’s always good to double-check these kinds of changes, so I’m going to take that extra step to make sure the update was executed properly. Let's see what those initial lines say!" } } { "type": "item.completed", "item": { "id": "itm_9", "item_type": "command_execution", "command": "bash -lc sed -n '1,20p' README.md", "aggregated_output": "\n<p align=\"center\"><code>npm i -g @openai/codex</code><br />or <code>brew install codex</code></p>\n\n<p align=\"center\"><strong>Codex CLI</strong> is a coding agent from OpenAI that runs locally on your computer.\n</br>\n</br>If you want Codex in your code editor (VS Code, Cursor, Windsurf), <a href=\"https://developers.openai.com/codex/ide\">install in your IDE</a>\n</br>If you are looking for the <em>cloud-based agent</em> from OpenAI, <strong>Codex Web</strong>, go to <a href=\"https://chatgpt.com/codex\">chatgpt.com/codex</a></p>\n\n<p align=\"center\">\n <img src=\"./.github/codex-cli-splash.png\" alt=\"Codex CLI splash\" width=\"80%\" />\n </p>\n\n---\n\n## Quickstart\n\n### Installing and running Codex CLI\n\nInstall globally with your preferred package manager. If you use npm:\n\n", "exit_code": 0, "status": "completed" } } { "type": "item.completed", "item": { "id": "itm_10", "item_type": "assistant_message", "text": "- Updated `README.md:1` by removing the opening H1 line.\n- The README now starts with the install instructions paragraph.\n\nWant me to also remove the leading blank line at the top?" } } ``` </details> The idea is to give users fully formatted items they can use directly in their rendering/application logic and avoid having them building up items manually based on events (unless they want to for streaming). This PR implements only the `item.completed` payload for some event types, more event types and item types to come. --------- Co-authored-by: Michael Bolin <mbolin@openai.com>	2025-09-25 17:47:09 +00:00

1 2 3 4

178 Commits