
Michael Bolin fcfe43c7df feat: show number of tokens remaining in UI (#1388)

When using the OpenAI Responses API, we now record the `usage` field for
a `"response.completed"` event, which includes metrics about the number
of tokens consumed. We also introduce `openai_model_info.rs`, which
includes current data about the most common OpenAI models available via
the API (specifically `context_window` and `max_output_tokens`). If
Codex does not recognize the model, you can set `model_context_window`
and `model_max_output_tokens` explicitly in `config.toml`.

We then introduce a new event type to `protocol.rs`, `TokenCount`,
which includes the `TokenUsage` for the most recent turn.

Finally, we update the TUI to record the running sum of tokens used so
the percentage of available context window remaining can be reported via
the placeholder text for the composer:

![Screenshot 2025-06-25 at 11 20
55 PM](https://github.com/user-attachments/assets/6fd6982f-7247-4f14-84b2-2e600cb1fd49)

We could certainly get much fancier with this (such as reporting the
estimated cost of the conversation), but for now, we are just trying to
achieve feature parity with the TypeScript CLI.

Though arguably this improves upon the TypeScript CLI, as the TypeScript
CLI uses heuristics to estimate the number of tokens used rather than
using the `usage` information directly:


296996d74e/codex-cli/src/utils/approximate-tokens-used.ts (L3-L16)

Fixes https://github.com/openai/codex/issues/1242

2025-06-25 23:31:11 -07:00


Config

Codex supports several mechanisms for setting config values:

- Config-specific command-line flags, such as --model o3 (highest precedence).
- A generic -c/--config flag that takes a key=value pair, such as --config model="o3".
  - The key can contain dots to set a value deeper than the root, e.g. --config model_providers.openai.wire_api="chat".
  - Values can be arrays or objects, such as --config shell_environment_policy.include_only=["PATH", "HOME", "USER"].
  - For consistency with config.toml, values are in TOML format rather than JSON format, so use {a = 1, b = 2} rather than {"a": 1, "b": 2}.
  - If a value cannot be parsed as a valid TOML value, it is treated as a string value. This means that -c model="o3" and -c model=o3 are equivalent.
- The $CODEX_HOME/config.toml configuration file, where the CODEX_HOME environment variable defaults to ~/.codex. (Note that CODEX_HOME is also where logs and other Codex-related information are stored.)

Both the --config flag and the config.toml file support the following options:

model

The model that Codex should use.

model = "o3"  # overrides the default of "codex-mini-latest"

model_provider

Codex comes bundled with a number of "model providers" predefined. This config value is a string that indicates which provider to use. You can also define your own providers via model_providers.

For example, if you are running ollama with Mistral locally, then you would need to add the following to your config:

model = "mistral"
model_provider = "ollama"

because the following definition for ollama is included in Codex:

[model_providers.ollama]
name = "Ollama"
base_url = "http://localhost:11434/v1"
wire_api = "chat"

This option defaults to "openai" and the corresponding provider is defined as follows:

[model_providers.openai]
name = "OpenAI"
base_url = "https://api.openai.com/v1"
env_key = "OPENAI_API_KEY"
wire_api = "responses"

model_providers

This option lets you override and amend the default set of model providers bundled with Codex. This value is a map where the key is the value to use with model_provider to select the corresponding provider.

For example, if you wanted to add a provider that uses the OpenAI 4o model via the chat completions API, then you could add the following configuration:

# Recall that in TOML, root keys must be listed before tables.
model = "gpt-4o"
model_provider = "openai-chat-completions"

[model_providers.openai-chat-completions]
# Name of the provider that will be displayed in the Codex UI.
name = "OpenAI using Chat Completions"
# The path `/chat/completions` will be appended to this URL to make the POST
# request for the chat completions.
base_url = "https://api.openai.com/v1"
# If `env_key` is set, identifies an environment variable that must be set when
# using Codex with this provider. The value of the environment variable must be
# non-empty and will be used in the `Bearer TOKEN` HTTP header for the POST request.
env_key = "OPENAI_API_KEY"
# valid values for wire_api are "chat" and "responses".
wire_api = "chat"

approval_policy

Determines when the user should be prompted to approve whether Codex can execute a command:

# Codex has hardcoded logic that defines a set of "trusted" commands.
# Setting the approval_policy to `untrusted` means that Codex will prompt the
# user before running a command not in the "trusted" set.
#
# See https://github.com/openai/codex/issues/1260 for the plan to enable
# end-users to define their own trusted commands.
approval_policy = "untrusted"

# If the command fails when run in the sandbox, Codex asks for permission to
# retry the command outside the sandbox.
approval_policy = "on-failure"

# User is never prompted: if the command fails, Codex will automatically try
# something out. Note the `exec` subcommand always uses this mode.
approval_policy = "never"

profiles

A profile is a collection of configuration values that can be set together. Multiple profiles can be defined in config.toml and you can specify the one you want to use at runtime via the --profile flag.

Here is an example of a config.toml that defines multiple profiles:

model = "o3"
approval_policy = "untrusted"
disable_response_storage = false

# Setting `profile` is equivalent to specifying `--profile o3` on the command
# line, though the `--profile` flag can still be used to override this value.
profile = "o3"

[model_providers.openai-chat-completions]
name = "OpenAI using Chat Completions"
base_url = "https://api.openai.com/v1"
env_key = "OPENAI_API_KEY"
wire_api = "chat"

[profiles.o3]
model = "o3"
model_provider = "openai"
approval_policy = "never"

[profiles.gpt3]
model = "gpt-3.5-turbo"
model_provider = "openai-chat-completions"

[profiles.zdr]
model = "o3"
model_provider = "openai"
approval_policy = "on-failure"
disable_response_storage = true

Users can specify config values at multiple levels. Order of precedence is as follows:

custom command-line argument, e.g., --model o3
as part of a profile, where the profile is selected via the --profile flag on the command line (or via profile in the config file itself)
as an entry in config.toml, e.g., model = "o3"
the default value that comes with Codex CLI (i.e., Codex CLI defaults to codex-mini-latest)
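
The precedence order can be pictured as a lookup through layers, from most to least specific. The following is a minimal sketch, not Codex's actual resolution code: `resolve` is a hypothetical helper, and the dictionaries are loosely based on the example config.toml rather than read from a real file.

```python
def resolve(key, cli_args, profile, config_toml, defaults):
    """Return the value for `key` from the highest-precedence layer that defines it."""
    for layer in (cli_args, profile, config_toml, defaults):
        if key in layer:
            return layer[key]
    raise KeyError(key)


defaults = {"model": "codex-mini-latest"}
config_toml = {"model": "o3", "disable_response_storage": False}
profile_gpt3 = {"model": "gpt-3.5-turbo", "model_provider": "openai-chat-completions"}
profile_zdr = {"model": "o3", "disable_response_storage": True}

# The profile overrides the root entry in config.toml.
print(resolve("model", {}, profile_gpt3, config_toml, defaults))
# A command-line flag such as `--model o3` overrides the profile.
print(resolve("model", {"model": "o3"}, profile_gpt3, config_toml, defaults))
# Keys the profile does not set fall through to config.toml.
print(resolve("disable_response_storage", {}, profile_gpt3, config_toml, defaults))
```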

model_reasoning_effort

If the model name starts with "o" (as in "o3" or "o4-mini") or "codex", reasoning is enabled by default when using the Responses API. As explained in the OpenAI Platform documentation, this can be set to:

"low"
"medium" (default)
"high"

To disable reasoning, set model_reasoning_effort to "none" in your config:

model_reasoning_effort = "none"  # disable reasoning

model_reasoning_summary

Selects the style of reasoning summary. This can be set to:

"auto" (default)
"concise"
"detailed"

To disable reasoning summaries, set model_reasoning_summary to "none" in your config:

model_reasoning_summary = "none"  # disable reasoning summaries

sandbox

The sandbox configuration determines the sandbox policy that Codex uses to execute untrusted commands. The mode determines the "base policy." Currently, only workspace-write supports additional configuration options, but this may change in the future.

The default policy is read-only, which means commands can read any file on disk, but attempts to write a file or access the network will be blocked.

[sandbox]
mode = "read-only"

A more relaxed policy is workspace-write. When specified, the current working directory for the Codex task will be writable (as well as $TMPDIR on macOS). Note that the CLI defaults to using cwd where it was spawned, though this can be overridden using --cwd/-C.

[sandbox]
mode = "workspace-write"

# By default, only the cwd for the Codex session will be writable (and $TMPDIR on macOS),
# but you can specify additional writable folders in this array.
writable_roots = [
    "/tmp",
]
network_access = false  # Like read-only, this also defaults to false and can be omitted.

To disable sandboxing altogether, specify danger-full-access like so:

[sandbox]
mode = "danger-full-access"

This is reasonable to use if Codex is running in an environment that provides its own sandboxing (such as a Docker container) such that further sandboxing is unnecessary.

This option may also be necessary if you use Codex in environments where its native sandboxing mechanisms are unsupported, such as older Linux kernels or Windows.

mcp_servers

Defines the list of MCP servers that Codex can consult for tool use. Currently, only servers that are launched by executing a program that communicates over stdio are supported. For servers that use the SSE transport, consider an adapter like mcp-proxy.

Note: Codex may cache the list of tools and resources from an MCP server so that Codex can include this information in context at startup without spawning all the servers. This is designed to save resources by loading MCP servers lazily.

This config option is comparable to how Claude and Cursor define mcpServers in their respective JSON config files, though because Codex uses TOML for its config language, the format is slightly different. For example, the following config in JSON:

{
  "mcpServers": {
    "server-name": {
      "command": "npx",
      "args": ["-y", "mcp-server"],
      "env": {
        "API_KEY": "value"
      }
    }
  }
}

Should be represented as follows in ~/.codex/config.toml:

# IMPORTANT: the top-level key is `mcp_servers` rather than `mcpServers`.
[mcp_servers.server-name]
command = "npx"
args = ["-y", "mcp-server"]
env = { "API_KEY" = "value" }

disable_response_storage

Currently, customers whose accounts are set to use Zero Data Retention (ZDR) must set disable_response_storage to true so that Codex uses an alternative to the Responses API that works with ZDR:

disable_response_storage = true

shell_environment_policy

Codex spawns subprocesses (e.g. when executing a local_shell tool-call suggested by the assistant). By default it passes only a minimal core subset of your environment to those subprocesses to avoid leaking credentials. You can tune this behavior via the shell_environment_policy block in config.toml:

[shell_environment_policy]
# inherit can be "core" (default), "all", or "none"
inherit = "core"
# set to true to *skip* the filter for `"*KEY*"`, `"*SECRET*"`, and `"*TOKEN*"`
ignore_default_excludes = false
# exclude patterns (case-insensitive globs)
exclude = ["AWS_*", "AZURE_*"]
# force-set / override values
set = { CI = "1" }
# if provided, *only* vars matching these patterns are kept
include_only = ["PATH", "HOME"]

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `inherit` | string | `core` | Starting template for the environment: `core` (`HOME`, `PATH`, `USER`, …), `all` (clone full parent env), or `none` (start empty). |
| `ignore_default_excludes` | boolean | `false` | When `false`, Codex removes any var whose name contains `KEY`, `SECRET`, or `TOKEN` (case-insensitive) before other rules run. |
| `exclude` | array<string> | `[]` | Case-insensitive glob patterns to drop after the default filter. Examples: `"AWS_*"`, `"AZURE_*"`. |
| `set` | table<string,string> | `{}` | Explicit key/value overrides or additions; these always win over inherited values. |
| `include_only` | array<string> | `[]` | If non-empty, a whitelist of patterns; only variables that match one pattern survive the final step. (Generally used with `inherit = "all"`.) |

The patterns are glob style, not full regular expressions: * matches any number of characters, ? matches exactly one, and character classes like [A-Z]/[^0-9] are supported. Matching is always case-insensitive. This syntax is documented in code as EnvironmentVariablePattern (see core/src/config_types.rs).
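
To make the order of the rules concrete, here is a rough Python sketch of the filtering pipeline as the table describes it. It is an approximation for illustration, not the Codex implementation: `build_env` and `matches_any` are hypothetical names, and `CORE_VARS` contains only the three variables named in the table, whereas the real core set is longer. Note also that Python's `fnmatch` writes negated character classes as `[!0-9]` rather than `[^0-9]`.

```python
from fnmatch import fnmatchcase

# Assumption: only the variables named in the table; the real "core" set is longer.
CORE_VARS = {"HOME", "PATH", "USER"}
DEFAULT_EXCLUDES = ["*KEY*", "*SECRET*", "*TOKEN*"]


def matches_any(name, patterns):
    """Case-insensitive glob match of `name` against any of `patterns`."""
    return any(fnmatchcase(name.upper(), p.upper()) for p in patterns)


def build_env(parent, inherit="core", ignore_default_excludes=False,
              exclude=(), set_vars=None, include_only=()):
    # 1. Start from the template selected by `inherit`.
    if inherit == "all":
        env = dict(parent)
    elif inherit == "core":
        env = {k: v for k, v in parent.items() if k in CORE_VARS}
    else:  # "none"
        env = {}
    # 2. Apply the default KEY/SECRET/TOKEN filter unless it is skipped.
    if not ignore_default_excludes:
        env = {k: v for k, v in env.items() if not matches_any(k, DEFAULT_EXCLUDES)}
    # 3. Drop anything matching the user's `exclude` patterns.
    env = {k: v for k, v in env.items() if not matches_any(k, exclude)}
    # 4. Explicit `set` values override or add entries.
    env.update(set_vars or {})
    # 5. If `include_only` is non-empty, keep only matching names (final step).
    if include_only:
        env = {k: v for k, v in env.items() if matches_any(k, include_only)}
    return env


parent = {"PATH": "/usr/bin", "HOME": "/home/me", "USER": "me",
          "AWS_REGION": "us-east-1", "GITHUB_TOKEN": "x", "EDITOR": "vim"}
print(sorted(build_env(parent, inherit="all", exclude=["AWS_*"])))
```

In this sketch `include_only` runs last, so it also filters values added through `set`; that follows the table's description of it as the final step.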

If you just need a clean slate with a few custom entries you can write:

[shell_environment_policy]
inherit = "none"
set = { PATH = "/usr/bin", MY_FLAG = "1" }

Currently, CODEX_SANDBOX_NETWORK_DISABLED=1 is also added to the environment, assuming network is disabled. This is not configurable.

notify

Specify a program that will be executed to get notified about events generated by Codex. Note that the program will receive the notification argument as a string of JSON, e.g.:

{
  "type": "agent-turn-complete",
  "turn-id": "12345",
  "input-messages": ["Rename `foo` to `bar` and update the callsites."],
  "last-assistant-message": "Rename complete and verified `cargo build` succeeds."
}

The "type" property will always be set. Currently, "agent-turn-complete" is the only notification type that is supported.

As an example, here is a Python script that parses the JSON and decides whether to show a desktop push notification using terminal-notifier on macOS:

#!/usr/bin/env python3

import json
import subprocess
import sys


def main() -> int:
    if len(sys.argv) != 2:
        print("Usage: notify.py <NOTIFICATION_JSON>")
        return 1

    try:
        notification = json.loads(sys.argv[1])
    except json.JSONDecodeError:
        return 1

    match notification_type := notification.get("type"):
        case "agent-turn-complete":
            assistant_message = notification.get("last-assistant-message")
            if assistant_message:
                title = f"Codex: {assistant_message}"
            else:
                title = "Codex: Turn Complete!"
            input_messages = notification.get("input-messages", [])
            message = " ".join(input_messages)
            title += message
        case _:
            print(f"not sending a push notification for: {notification_type}")
            return 0

    subprocess.check_output(
        [
            "terminal-notifier",
            "-title",
            title,
            "-message",
            message,
            "-group",
            "codex",
            "-ignoreDnD",
            "-activate",
            "com.googlecode.iterm2",
        ]
    )

    return 0


if __name__ == "__main__":
    sys.exit(main())

To have Codex use this script for notifications, you would configure it via notify in ~/.codex/config.toml using the appropriate path to notify.py on your computer:

notify = ["python3", "/Users/mbolin/.codex/notify.py"]

history

By default, Codex CLI records messages sent to the model in $CODEX_HOME/history.jsonl. Note that on UNIX, the file permissions are set to 0600, so the file should only be readable and writable by the owner.

To disable this behavior, configure [history] as follows:

[history]
persistence = "none"  # "save-all" is the default value

file_opener

Identifies the editor/URI scheme to use for hyperlinking citations in model output. If set, citations to files in the model output will be hyperlinked using the specified URI scheme so they can be ctrl/cmd-clicked from the terminal to open them.

For example, if the model output includes a reference such as 【F:/home/user/project/main.py†L42-L50】, then this would be rewritten to link to the URI vscode://file/home/user/project/main.py:42.
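
The rewrite in this example can be approximated with a regular expression. The snippet below is a sketch under assumptions, not the Codex implementation: `citation_to_uri` is a hypothetical helper, it assumes the URI scheme is the same string as the file_opener value, and it substitutes the URI into the text instead of emitting a terminal hyperlink.

```python
import re

# Matches citations such as 【F:/home/user/project/main.py†L42-L50】.
CITATION = re.compile(r"【F:(?P<path>[^†】]+)†L(?P<start>\d+)(?:-L\d+)?】")


def citation_to_uri(text: str, scheme: str = "vscode") -> str:
    """Replace each citation with a `<scheme>://file<path>:<first line>` URI."""
    return CITATION.sub(lambda m: f"{scheme}://file{m['path']}:{m['start']}", text)


print(citation_to_uri("See 【F:/home/user/project/main.py†L42-L50】"))
```

Only the first line of the cited range is kept, matching the `:42` suffix in the example URI.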

Note this is not a general editor setting (like $EDITOR), as it only accepts a fixed set of values:

"vscode" (default)
"vscode-insiders"
"windsurf"
"cursor"
"none" to explicitly disable this feature

Currently, "vscode" is the default, though Codex does not verify VS Code is installed. As such, file_opener may default to "none" or something else in the future.

hide_agent_reasoning

Codex intermittently emits "reasoning" events that show the model’s internal "thinking" before it produces a final answer. Some users may find these events distracting, especially in CI logs or minimal terminal output.

Setting hide_agent_reasoning to true suppresses these events in both the TUI and the headless exec subcommand:

hide_agent_reasoning = true   # defaults to false

model_context_window

The size of the context window for the model, in tokens.

In general, Codex knows the context window for the most common OpenAI models, but if you are using a new model with an old version of the Codex CLI, then you can use model_context_window to tell Codex what value to use to determine how much context is left during a conversation.
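
As a rough illustration of how a context window size turns into a "context left" figure, consider the sketch below. The function name and the rounding are assumptions made for this example, and the 200,000-token window is an arbitrary value rather than a statement about any particular model.

```python
def percent_context_remaining(tokens_used: int, context_window: int) -> int:
    """Return the share of the context window still available, as a whole percent."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    # Never report a negative amount if usage exceeds the configured window.
    remaining = max(context_window - tokens_used, 0)
    return round(100 * remaining / context_window)


# Hypothetical values: 50,000 tokens used so far against a 200,000-token window.
print(percent_context_remaining(50_000, 200_000))  # 75
```

If model_context_window is set too low or too high, a figure like this will be correspondingly pessimistic or optimistic, which is why the override matters for models Codex does not recognize.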

model_max_output_tokens

This is analogous to model_context_window, but for the maximum number of output tokens for the model.

project_doc_max_bytes

Maximum number of bytes to read from an AGENTS.md file to include in the instructions sent with the first turn of a session. Defaults to 32 KiB.

tui

Options that are specific to the TUI.

[tui]
# This will make it so that Codex does not try to process mouse events, which
# means your terminal's native drag-to-select text selection and copy/paste
# should work. The tradeoff is that Codex will not receive any mouse events, so
# it will not be possible to use the mouse to scroll conversation history.
#
# Note that most terminals support holding down a modifier key when using the
# mouse to support text selection. For example, even if Codex mouse capture is
# enabled (i.e., this is set to `false`), you can still hold down alt while
# dragging the mouse to select text.
disable_mouse_capture = true  # defaults to `false`
