OpenTelemetry events (#2103)

### Title

## otel

Codex can emit [OpenTelemetry](https://opentelemetry.io/) **log events** that
describe each run: outbound API requests, streamed responses, user input,
tool-approval decisions, and the result of every tool invocation. Export is
**disabled by default** so local runs remain self-contained. Opt in by adding an
`[otel]` table and choosing an exporter.

```toml
[otel]
environment = "staging"   # defaults to "dev"
exporter = "none"          # defaults to "none"; set to otlp-http or otlp-grpc to send events
log_user_prompt = false    # defaults to false; redact prompt text unless explicitly enabled
```

Codex tags every exported event with `service.name = "codex-cli"`, the CLI
version, and an `env` attribute so downstream collectors can distinguish
dev/staging/prod traffic. Only telemetry produced inside the `codex_otel`
crate (the events listed below) is forwarded to the exporter.

### Event catalog

Every event shares a common set of metadata fields: `event.timestamp`,
`conversation.id`, `app.version`, `auth_mode` (when available),
`user.account_id` (when available), `terminal.type`, `model`, and `slug`.
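
For a collector-side sanity check, the metadata list above can be turned into a
small validation helper. This is a minimal Python sketch and not part of Codex:
the field names come from the list above, while the flat-dict event shape, the
`missing_metadata` helper, and the sample values are hypothetical. Fields not
marked "when available" are assumed here to be always present.

```python
# Common metadata field names, taken from the list above.
# Assumption: fields not marked "when available" are always present.
ALWAYS_PRESENT = (
    "event.timestamp",
    "conversation.id",
    "app.version",
    "terminal.type",
    "model",
    "slug",
)
# Documented as "when available", so their absence is not treated as an error.
WHEN_AVAILABLE = ("auth_mode", "user.account_id")


def missing_metadata(attributes: dict) -> list:
    """Return the always-present metadata fields absent from an event's attributes."""
    return [name for name in ALWAYS_PRESENT if name not in attributes]


# Hypothetical event attributes; every value here is made up for illustration.
sample = {
    "event.timestamp": "2025-09-29T18:30:55Z",
    "conversation.id": "example-conversation",
    "app.version": "0.0.0",
    "terminal.type": "xterm-256color",
    "model": "example-model",
}

print(missing_metadata(sample))  # prints ['slug'] because the sample omits it
```

A check like this could run in a test against whatever your collector receives;
how the attributes are actually decoded depends on the collector you use.
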

With OTEL enabled Codex emits the following event types (in addition to the
metadata above):

- `codex.api_request`
  - `cf_ray` (optional)
  - `attempt`
  - `duration_ms`
  - `http.response.status_code` (optional)
  - `error.message` (failures)
- `codex.sse_event`
  - `event.kind`
  - `duration_ms`
  - `error.message` (failures)
  - `input_token_count` (completion only)
  - `output_token_count` (completion only)
  - `cached_token_count` (completion only, optional)
  - `reasoning_token_count` (completion only, optional)
  - `tool_token_count` (completion only)
- `codex.user_prompt`
  - `prompt_length`
  - `prompt` (redacted unless `log_user_prompt = true`)
- `codex.tool_decision`
  - `tool_name`
  - `call_id`
  - `decision` (`approved`, `approved_for_session`, `denied`, or `abort`)
  - `source` (`config` or `user`)
- `codex.tool_result`
  - `tool_name`
  - `call_id`
  - `arguments`
  - `duration_ms` (execution time for the tool)
  - `success` (`"true"` or `"false"`)
  - `output`

### Choosing an exporter

Set `otel.exporter` to control where events go:

- `none` – leaves instrumentation active but skips exporting. This is the
  default.
- `otlp-http` – posts OTLP log records to an OTLP/HTTP collector. Specify the
  endpoint, protocol, and headers your collector expects:

  ```toml
  [otel]
  exporter = { otlp-http = { endpoint = "https://otel.example.com/v1/logs", protocol = "binary", headers = { "x-otlp-api-key" = "${OTLP_TOKEN}" } }}
  ```

- `otlp-grpc` – streams OTLP log records over gRPC. Provide the endpoint and any
  metadata headers:

  ```toml
  [otel]
  exporter = { otlp-grpc = { endpoint = "https://otel.example.com:4317", headers = { "x-otlp-meta" = "abc123" } }}
  ```

If the exporter is `none` nothing is written anywhere; otherwise you must run or
point to your own collector. All exporters run on a background batch worker that
is flushed on shutdown.

If you build Codex from source the OTEL crate is still behind an `otel` feature
flag; the official prebuilt binaries ship with the feature enabled. When the
feature is disabled the telemetry hooks become no-ops so the CLI continues to
function without the extra dependencies.

---------

Co-authored-by: Anton Panasenko <apanasenko@openai.com>
2025-09-29 19:30:55 +01:00
parent d15253415a
commit 04c1782e52
38 changed files with 3069 additions and 142 deletions
--- a/docs/config.md
+++ b/docs/config.md
@@ -435,6 +435,117 @@ set = { PATH = "/usr/bin", MY_FLAG = "1" }

 Currently, `CODEX_SANDBOX_NETWORK_DISABLED=1` is also added to the environment, assuming network is disabled. This is not configurable.

+## otel
+
+Codex can emit [OpenTelemetry](https://opentelemetry.io/) **log events** that
+describe each run: outbound API requests, streamed responses, user input,
+tool-approval decisions, and the result of every tool invocation. Export is
+**disabled by default** so local runs remain self-contained. Opt in by adding an
+`[otel]` table and choosing an exporter.
+
+```toml
+[otel]
+environment = "staging"   # defaults to "dev"
+exporter = "none"          # defaults to "none"; set to otlp-http or otlp-grpc to send events
+log_user_prompt = false    # defaults to false; redact prompt text unless explicitly enabled
+```
+
+Codex tags every exported event with `service.name = $ORIGINATOR` (the same
+value sent in the `originator` header, `codex_cli_rs` by default), the CLI
+version, and an `env` attribute so downstream collectors can distinguish
+dev/staging/prod traffic. Only telemetry produced inside the `codex_otel`
+crate—the events listed below—is forwarded to the exporter.
+
+### Event catalog
+
+Every event shares a common set of metadata fields: `event.timestamp`,
+`conversation.id`, `app.version`, `auth_mode` (when available),
+`user.account_id` (when available), `terminal.type`, `model`, and `slug`.
+
+With OTEL enabled Codex emits the following event types (in addition to the
+metadata above):
+
+- `codex.conversation_starts`
+  - `provider_name`
+  - `reasoning_effort` (optional)
+  - `reasoning_summary`
+  - `context_window` (optional)
+  - `max_output_tokens` (optional)
+  - `auto_compact_token_limit` (optional)
+  - `approval_policy`
+  - `sandbox_policy`
+  - `mcp_servers` (comma-separated list)
+  - `active_profile` (optional)
+- `codex.api_request`
+  - `attempt`
+  - `duration_ms`
+  - `http.response.status_code` (optional)
+  - `error.message` (failures)
+- `codex.sse_event`
+  - `event.kind`
+  - `duration_ms`
+  - `error.message` (failures)
+  - `input_token_count` (responses only)
+  - `output_token_count` (responses only)
+  - `cached_token_count` (responses only, optional)
+  - `reasoning_token_count` (responses only, optional)
+  - `tool_token_count` (responses only)
+- `codex.user_prompt`
+  - `prompt_length`
+  - `prompt` (redacted unless `log_user_prompt = true`)
+- `codex.tool_decision`
+  - `tool_name`
+  - `call_id`
+  - `decision` (`approved`, `approved_for_session`, `denied`, or `abort`)
+  - `source` (`config` or `user`)
+- `codex.tool_result`
+  - `tool_name`
+  - `call_id` (optional)
+  - `arguments` (optional)
+  - `duration_ms` (execution time for the tool)
+  - `success` (`"true"` or `"false"`)
+  - `output`
+
+These event shapes may change as we iterate.
+
+### Choosing an exporter
+
+Set `otel.exporter` to control where events go:
+
+- `none` – leaves instrumentation active but skips exporting. This is the
+  default.
+- `otlp-http` – posts OTLP log records to an OTLP/HTTP collector. Specify the
+  endpoint, protocol, and headers your collector expects:
+
+  ```toml
+  [otel]
+  exporter = { otlp-http = {
+    endpoint = "https://otel.example.com/v1/logs",
+    protocol = "binary",
+    headers = { "x-otlp-api-key" = "${OTLP_TOKEN}" }
+  }}
+  ```
+
+- `otlp-grpc` – streams OTLP log records over gRPC. Provide the endpoint and any
+  metadata headers:
+
+  ```toml
+  [otel]
+  exporter = { otlp-grpc = {
+    endpoint = "https://otel.example.com:4317",
+    headers = { "x-otlp-meta" = "abc123" }
+  }}
+  ```
+
+If the exporter is `none` nothing is written anywhere; otherwise you must run or point to your
+own collector. All exporters run on a background batch worker that is flushed on
+shutdown.
+
+If you build Codex from source the OTEL crate is still behind an `otel` feature
+flag; the official prebuilt binaries ship with the feature enabled. When the
+feature is disabled the telemetry hooks become no-ops so the CLI continues to
+function without the extra dependencies.
+
 ## notify

 Specify a program that will be executed to get notified about events generated by Codex. Note that the program will receive the notification argument as a string of JSON, e.g.: