Commit Graph

153 Commits

Author SHA1 Message Date
ae
dc15a5cf0b feat: accept custom instructions in profiles (#1803)
Allows users to set their experimental_instructions_file in configs.

For example the below enables experimental instructions when running
`codex -p foo`.
```
[profiles.foo]
experimental_instructions_file = "/Users/foo/.codex/prompt.md"
```

# Testing
-  Running against a profile with experimental_instructions_file works.
-  Running against a profile without experimental_instructions_file
works.
-  Running against no profile with experimental_instructions_file
works.
-  Running against no profile without experimental_instructions_file
works.
2025-08-04 09:34:46 -07:00
Gabriel Peal
1f3318c1c5 Add a TurnDiffTracker to create a unified diff for an entire turn (#1770)
This lets us show an accumulating diff across all patches in a turn.
Refer to the docs for TurnDiffTracker for implementation details.

There are multiple ways this could have been done and this felt like the
right tradeoff between reliability and completeness:
*Pros*
* It will pick up all changes to files that the model touched including
if they prettier or another command that updates them.
* It will not pick up changes made by the user or other agents to files
it didn't modify.

*Cons*
* It will pick up changes that the user made to a file that the model
also touched
* It will not pick up changes to codegen or files that were not modified
with apply_patch
2025-08-04 11:57:04 -04:00
Dylan
e3565a3f43 [sandbox] Filter out certain non-sandbox errors (#1804)
## Summary
Users frequently complain about re-approving commands that have failed
for non-sandbox reasons. We can't diagnose with complete accuracy which
errors happened because of a sandbox failure, but we can start to
eliminate some common simple cases.

This PR captures the most common case I've seen, which is a `command not
found` error.

## Testing
- [x] Added unit tests
- [x] Ran a few cases locally
2025-08-03 13:05:48 -07:00
Jeremy Rose
78a1d49fac fix command duration display (#1806)
we were always displaying "0ms" before.

<img width="731" height="101" alt="Screenshot 2025-08-02 at 10 51 22 PM"
src="https://github.com/user-attachments/assets/f56814ed-b9a4-4164-9e78-181c60ce19b7"
/>
2025-08-03 11:33:44 -07:00
David Z Hao
75eecb656e Fix MacOS multiprocessing by relaxing sandbox (#1808)
The following test script fails in the codex sandbox:
```
import multiprocessing
from multiprocessing import Lock, Process

def f(lock):
    with lock:
        print("Lock acquired in child process")

if __name__ == '__main__':
    lock = Lock()
    p = Process(target=f, args=(lock,))
    p.start()
    p.join()
```

with 
```
Traceback (most recent call last):
  File "/Users/david.hao/code/codex/codex-rs/cli/test.py", line 9, in <module>
    lock = Lock()
           ^^^^^^
  File "/Users/david.hao/.local/share/uv/python/cpython-3.12.9-macos-aarch64-none/lib/python3.12/multiprocessing/context.py", line 68, in Lock
    return Lock(ctx=self.get_context())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/david.hao/.local/share/uv/python/cpython-3.12.9-macos-aarch64-none/lib/python3.12/multiprocessing/synchronize.py", line 169, in __init__
    SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)
  File "/Users/david.hao/.local/share/uv/python/cpython-3.12.9-macos-aarch64-none/lib/python3.12/multiprocessing/synchronize.py", line 57, in __init__
    sl = self._semlock = _multiprocessing.SemLock(
                         ^^^^^^^^^^^^^^^^^^^^^^^^^
PermissionError: [Errno 1] Operation not permitted
```

After reading, adding this line to the sandbox configs fixes things -
MacOS multiprocessing appears to use sem_lock(), which opens an IPC
which is considered a disk write even though no file is created. I
interrogated ChatGPT about whether it's okay to loosen, and my
impression after reading is that it is, although would appreciate a
close look


Breadcrumb: You can run `cargo run -- debug seatbelt --full-auto <cmd>`
to test the sandbox
2025-08-03 06:59:26 -07:00
aibrahim-oai
81bb1c9e26 Fix compact (#1798)
We are not recording the summary in the history.
2025-08-02 12:05:06 -07:00
Michael Bolin
80555d4ff2 feat: make .git read-only within a writable root when using Seatbelt (#1765)
To make `--full-auto` safer, this PR updates the Seatbelt policy so that
a `SandboxPolicy` with a `writable_root` that contains a `.git/`
_directory_ will make `.git/` _read-only_ (though as a follow-up, we
should also consider the case where `.git` is a _file_ with a `gitdir:
/path/to/actual/repo/.git` entry that should also be protected).

The two major changes in this PR:

- Updating `SandboxPolicy::get_writable_roots_with_cwd()` to return a
`Vec<WritableRoot>` instead of a `Vec<PathBuf>` where a `WritableRoot`
can specify a list of read-only subpaths.
- Updating `create_seatbelt_command_args()` to honor the read-only
subpaths in `WritableRoot`.

The logic to update the policy is a fairly straightforward update to
`create_seatbelt_command_args()`, but perhaps the more interesting part
of this PR is the introduction of an integration test in
`tests/sandbox.rs`. Leveraging the new API in #1785, we test
`SandboxPolicy` under various conditions, including ones where `$TMPDIR`
is not readable, which is critical for verifying the new behavior.

To ensure that Codex can run its own tests, e.g.:

```
just codex debug seatbelt --full-auto -- cargo test if_git_repo_is_writable_root_then_dot_git_folder_is_read_only
```

I had to introduce the use of `CODEX_SANDBOX=sandbox`, which is
comparable to how `CODEX_SANDBOX_NETWORK_DISABLED=1` was already being
used.

Adding a comparable change for Landlock will be done in a subsequent PR.
2025-08-01 16:11:24 -07:00
Michael Bolin
92f3566d78 chore: introduce SandboxPolicy::WorkspaceWrite::include_default_writable_roots (#1785)
Without this change, it is challenging to create integration tests to
verify that the folders not included in `writable_roots` in
`SandboxPolicy::WorkspaceWrite` are read-only because, by default,
`get_writable_roots_with_cwd()` includes `TMPDIR`, which is where most
integrationt
tests do their work.

This introduces a `use_exact_writable_roots` option to disable the
default
includes returned by `get_writable_roots_with_cwd()`.




---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1785).
* #1765
* __->__ #1785
2025-08-01 14:15:55 -07:00
aibrahim-oai
f20de21cb6 collabse stdout and stderr delta events into one (#1787) 2025-08-01 14:00:19 -07:00
aibrahim-oai
bc7beddaa2 feat: stream exec stdout events (#1786)
## Summary
- stream command stdout as `ExecCommandStdout` events
- forward streamed stdout to clients and ignore in human output
processor
- adjust call sites for new streaming API
2025-08-01 13:04:34 -07:00
pakrym-oai
88ea215c80 Add a custom originator setting (#1781) 2025-08-01 09:55:23 -07:00
aibrahim-oai
e2c994e32a Add /compact (#1527)
- Add operation to summarize the context so far.
- The operation runs a compact task that summarizes the context.
- The operation clear the previous context to free the context window
- The operation didn't use `run_task` to avoid corrupting the session
- Add /compact in the tui



https://github.com/user-attachments/assets/e06c24e5-dcfb-4806-934a-564d425a919c
2025-07-31 21:34:32 -07:00
pakrym-oai
0935e6a875 Send account id when available (#1767)
For users with multiple accounts we need to specify the account to use.
2025-07-31 15:40:19 -07:00
Michael Bolin
5a0ad5ab8f chore: refactor exec.rs: create separate seatbelt.rs and spawn.rs files (#1762)
At 550 lines, `exec.rs` was a bit large. In particular, I found it hard
to locate the Seatbelt-related code quickly without a file with
`seatbelt` in the name, so this refactors things so:

- `spawn_command_under_seatbelt()` and dependent code moves to a new
`seatbelt.rs` file
- `spawn_child_async()` and dependent code moves to a new `spawn.rs`
file
2025-07-31 13:11:47 -07:00
Michael Bolin
06c786b2da fix: ensure PatchApplyBeginEvent and PatchApplyEndEvent are dispatched reliably (#1760)
This is a follow-up to https://github.com/openai/codex/pull/1705, as
that PR inadvertently lost the logic where `PatchApplyBeginEvent` and
`PatchApplyEndEvent` events were sent when patches were auto-approved.

Though as part of this fix, I believe this also makes an important
safety fix to `assess_patch_safety()`, as there was a case that returned
`SandboxType::None`, which arguably is the thing we were trying to avoid
in #1705.

On a high level, we want there to be only one codepath where
`apply_patch` happens, which should be unified with the patch to run
`exec`, in general, so that sandboxing is applied consistently for both
cases.

Prior to this change, `apply_patch()` in `core` would either:

* exit early, delegating to `exec()` to shell out to `apply_patch` using
the appropriate sandbox
* proceed to run the logic for `apply_patch` in memory


549846b29a/codex-rs/core/src/apply_patch.rs (L61-L63)

In this implementation, only the latter would dispatch
`PatchApplyBeginEvent` and `PatchApplyEndEvent`, though the former would
dispatch `ExecCommandBeginEvent` and `ExecCommandEndEvent` for the
`apply_patch` call (or, more specifically, the `codex
--codex-run-as-apply-patch PATCH` call).

To unify things in this PR, we:

* Eliminate the back half of the `apply_patch()` function, and instead
have it also return with `DelegateToExec`, though we add an extra field
to the return value, `user_explicitly_approved_this_action`.
* In `codex.rs` where we process `DelegateToExec`, we use
`SandboxType::None` when `user_explicitly_approved_this_action` is
`true`. This means **we no longer run the apply_patch logic in memory**,
as we always `exec()`. (Note this is what allowed us to delete so much
code in `apply_patch.rs`.)
* In `codex.rs`, we further update `notify_exec_command_begin()` and
`notify_exec_command_end()` to take additional fields to determine what
type of notification to send: `ExecCommand` or `PatchApply`.

Admittedly, this PR also drops some of the functionality about giving
the user the opportunity to expand the set of writable roots as part of
approving the `apply_patch` command. I'm not sure how much that was
used, and we should probably rethink how that works as we are currently
tidying up the protocol to the TUI, in general.
2025-07-31 11:13:57 -07:00
pakrym-oai
549846b29a Add codex login --api-key (#1759)
Allow setting the API key via `codex login --api-key`
2025-07-31 17:48:49 +00:00
Jeremy Rose
be0cd34300 fix git tests (#1747)
the git tests were failing on my local machine due to gpg signing config
in my ~/.gitconfig. tests should not be affected by ~/.gitconfig, so
configure them to ignore it.
2025-07-31 09:17:59 -07:00
Michael Bolin
221ebfcccc fix: run apply_patch calls through the sandbox (#1705)
Building on the work of https://github.com/openai/codex/pull/1702, this
changes how a shell call to `apply_patch` is handled.

Previously, a shell call to `apply_patch` was always handled in-process,
never leveraging a sandbox. To determine whether the `apply_patch`
operation could be auto-approved, the
`is_write_patch_constrained_to_writable_paths()` function would check if
all the paths listed in the paths were writable. If so, the agent would
apply the changes listed in the patch.

Unfortunately, this approach afforded a loophole: symlinks!

* For a soft link, we could fix this issue by tracing the link and
checking whether the target is in the set of writable paths, however...
* ...For a hard link, things are not as simple. We can run `stat FILE`
to see if the number of links is greater than 1, but then we would have
to do something potentially expensive like `find . -inum <inode_number>`
to find the other paths for `FILE`. Further, even if this worked, this
approach runs the risk of a
[TOCTOU](https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use)
race condition, so it is not robust.

The solution, implemented in this PR, is to take the virtual execution
of the `apply_patch` CLI into an _actual_ execution using `codex
--codex-run-as-apply-patch PATCH`, which we can run under the sandbox
the user specified, just like any other `shell` call.

This, of course, assumes that the sandbox prevents writing through
symlinks as a mechanism to write to folders that are not in the writable
set configured by the sandbox. I verified this by testing the following
on both Mac and Linux:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Can running a command in SANDBOX_DIR write a file in EXPLOIT_DIR?

# Codex is run in SANDBOX_DIR, so writes should be constrianed to this directory.
SANDBOX_DIR=$(mktemp -d -p "$HOME" sandboxtesttemp.XXXXXX)
# EXPLOIT_DIR is outside of SANDBOX_DIR, so let's see if we can write to it.
EXPLOIT_DIR=$(mktemp -d -p "$HOME" sandboxtesttemp.XXXXXX)

echo "SANDBOX_DIR: $SANDBOX_DIR"
echo "EXPLOIT_DIR: $EXPLOIT_DIR"

cleanup() {
  # Only remove if it looks sane and still exists
  [[ -n "${SANDBOX_DIR:-}" && -d "$SANDBOX_DIR" ]] && rm -rf -- "$SANDBOX_DIR"
  [[ -n "${EXPLOIT_DIR:-}" && -d "$EXPLOIT_DIR" ]] && rm -rf -- "$EXPLOIT_DIR"
}

trap cleanup EXIT

echo "I am the original content" > "${EXPLOIT_DIR}/original.txt"

# Drop the -s to test hard links.
ln -s "${EXPLOIT_DIR}/original.txt" "${SANDBOX_DIR}/link-to-original.txt"

cat "${SANDBOX_DIR}/link-to-original.txt"

if [[ "$(uname)" == "Linux" ]]; then
    SANDBOX_SUBCOMMAND=landlock
else
    SANDBOX_SUBCOMMAND=seatbelt
fi

# Attempt the exploit
cd "${SANDBOX_DIR}"

codex debug "${SANDBOX_SUBCOMMAND}" bash -lc "echo pwned > ./link-to-original.txt" || true

cat "${EXPLOIT_DIR}/original.txt"
```

Admittedly, this change merits a proper integration test, but I think I
will have to do that in a follow-up PR.
2025-07-30 16:45:08 -07:00
pakrym-oai
e0e245cc1c Send AGENTS.md as a separate user message (#1737) 2025-07-30 13:56:24 -07:00
pakrym-oai
ea01a5ffe2 Add support for a separate chatgpt auth endpoint (#1712)
Adds a `CodexAuth` type that encapsulates information about available
auth modes and logic for refreshing the token.
Changes `Responses` API to send requests to different endpoints based on
the auth type.
Updates login_with_chatgpt to support API-less mode and skip the key
exchange.
2025-07-30 19:40:15 +00:00
Jeremy Rose
347c81ad00 remove conversation history widget (#1727)
this widget is no longer used.
2025-07-30 10:05:40 -07:00
aibrahim-oai
3823b32b7a Mcp protocol (#1715)
- Add typed MCP protocol surface in
`codex-rs/mcp-server/src/mcp_protocol.rs` for `requests`, `responses`,
and `notifications`
- Requests: `NewConversation`, `Connect`, `SendUserMessage`,
`GetConversations`
- Message content parts: `Text`, `Image` (`ImageUrl`/`FileId`, optional
`ImageDetail`), File (`Url`/`Id`/`inline Data`)
- Responses: `ToolCallResponseEnvelope` with optional `isError` and
`structuredContent` variants (`NewConversation`, `Connect`,
`SendUserMessageAccepted`, `GetConversations`)
- Notifications: `InitialState`, `ConnectionRevoked`, `CodexEvent`,
`Cancelled`
- Uniform `_meta` on `notifications` via `NotificationMeta`
(`conversationId`, `requestId`)
- Unit tests validate JSON wire shapes for key
`requests`/`responses`/`notifications`
2025-07-29 20:14:41 -07:00
pakrym-oai
6b10e22eb3 Trim bash lc and run with login shell (#1725)
include .zshenv, .zprofile by running with the `-l` flag and don't start
a shell inside a shell when we see the typical `bash -lc` invocation.
2025-07-29 16:49:02 -07:00
Gabriel Peal
8828f6f082 Add an experimental plan tool (#1726)
This adds a tool the model can call to update a plan. The tool doesn't
actually _do_ anything but it gives clients a chance to read and render
the structured plan. We will likely iterate on the prompt and tools
exposed for planning over time.
2025-07-29 14:22:02 -04:00
easong-openai
f8fcaaaf6f Relative instruction file (#1722)
Passing in an instruction file with a bad path led to silent failures,
also instruction relative paths were handled in an unintuitive fashion.
2025-07-29 10:06:05 -07:00
aibrahim-oai
19bef7659f Serializing the eventmsg type to snake_case (#1709)
This was an abrupt change on our clients. We need to serialize as
snake_case.
2025-07-28 10:26:27 -07:00
Michael Bolin
5ebb7dd34c chore: split apply_patch logic out of codex.rs and into apply_patch.rs (#1703)
This is a straight refactor, moving apply-patch-related code from
`codex.rs` and into the new `apply_patch.rs` file. The only "logical"
change is inlining `#[allow(clippy::unwrap_used)]` instead of declaring
`#![allow(clippy::unwrap_used)]` at the top of the file (which is
currently the case in `codex.rs`).

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1703).
* #1705
* __->__ #1703
* #1702
* #1698
* #1697
2025-07-28 09:51:22 -07:00
Michael Bolin
fcd197d596 fix: use std::env::args_os instead of std::env::args (#1698)
Apparently `std::env::args()` will panic during iteration if any
argument to the process is not valid Unicode:

https://doc.rust-lang.org/std/env/fn.args.html

Let's avoid the risk and just go with `std::env::args_os()`.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1698).
* #1705
* #1703
* #1702
* __->__ #1698
* #1697
2025-07-28 08:52:18 -07:00
Michael Bolin
2405c40026 chore: update Codex::spawn() to return a struct instead of a tuple (#1677)
Also update `init_codex()` to return a `struct` instead of a tuple, as well.
2025-07-27 20:01:35 -07:00
aibrahim-oai
5a0079fea2 Changing method in MCP notifications (#1684)
- Changing the codex/event type
2025-07-26 10:35:49 -07:00
pakrym-oai
7ee87123a6 Optionally run using user profile (#1678) 2025-07-25 11:45:23 -07:00
Michael Bolin
994c9a874d chore: use one write call per item in rollout_writer() (#1679)
Most of the time, we expect the `String` returned by
`serde_json::to_string()` to have extra capacity, so `push('\n')` is
unlikely to allocate, which seems cheaper than an extra `write(2)` call,
on average?
2025-07-25 10:43:36 -07:00
easong-openai
480e82b00d Easily Selectable History (#1672)
This update replaces the previous ratatui history widget with an
append-only log so that the terminal can handle text selection and
scrolling. It also disables streaming responses, which we'll do our best
to bring back in a later PR. It also adds a small summary of token use
after the TUI exits.
2025-07-25 01:56:40 -07:00
Pavel Bezglasny
508abbe990 Update render name in tui for approval_policy to match with config values (#1675)
Currently, codex on start shows the value for the approval policy as
name of
[AskForApproval](2437a8d17a/codex-rs/core/src/protocol.rs (L128))
enum, which differs from
[approval_policy](2437a8d17a/codex-rs/config.md (approval_policy))
config values.
E.g. "untrusted" becomes "UnlessTrusted", "on-failure" -> "OnFailure",
"never" -> "Never".
This PR changes render names of the approval policy to match with
configuration values.
2025-07-24 14:17:57 -07:00
Michael Bolin
a1641743a8 feat: expand the set of commands that can be safely identified as "trusted" (#1668)
This PR updates `is_known_safe_command()` to account for "safe
operators" to expand the set of commands that can be run without
approval. This concept existed in the TypeScript CLI, and we are
[finally!] porting it to the Rust one:


c9e2def494/codex-cli/src/approvals.ts (L531-L541)

The idea is that if we have `EXPR1 SAFE_OP EXPR2` and `EXPR1` and
`EXPR2` are considered safe independently, then `EXPR1 SAFE_OP EXPR2`
should be considered safe. Currently, `SAFE_OP` includes `&&`, `||`,
`;`, and `|`.

In the TypeScript implementation, we relied on
https://www.npmjs.com/package/shell-quote to parse the string of Bash,
as it could provide a "lightweight" parse tree, parsing `'beep || boop >
/byte'` as:

```
[ 'beep', { op: '||' }, 'boop', { op: '>' }, '/byte' ]
```

Though in this PR, we introduce the use of
https://crates.io/crates/tree-sitter-bash for parsing (which
incidentally we were already using in
[`codex-apply-patch`](c9e2def494/codex-rs/apply-patch/Cargo.toml (L18))),
which gives us a richer parse tree. (Incidentally, if you have never
played with tree-sitter, try the
[playground](https://tree-sitter.github.io/tree-sitter/7-playground.html)
and select **Bash** from the dropdown to see how it parses various
expressions.)

As a concrete example, prior to this change, our implementation of
`is_known_safe_command()` could verify things like:

```
["bash", "-lc", "grep -R \"Cargo.toml\" -n"]
```

but not:

```
["bash", "-lc", "grep -R \"Cargo.toml\" -n || true"]
```

With this change, the version with `|| true` is also accepted.

Admittedly, this PR does not expand the safety check to support
subshells, so it would reject, e.g. `bash -lc 'ls || (pwd && echo hi)'`,
but that can be addressed in a subsequent PR.
2025-07-24 14:13:30 -07:00
Michael Bolin
c9e2def494 fix: add true,false,nl to the list of trusted commands (#1676)
`nl` is a line-numbering tool that should be on the _trusted _ list, as
there is nothing concerning on https://gtfobins.github.io/gtfobins/nl/
that would merit exclusion.

`true` and `false` are also safe, though not particularly useful given
how `is_known_safe_command()` works today, but that will change with
https://github.com/openai/codex/pull/1668.
2025-07-24 12:59:36 -07:00
vishnu-oai
2437a8d17a Record Git metadata to rollout (#1598)
# Summary

- Writing effective evals for codex sessions requires context of the
overall repository state at the moment the session began
- This change adds this metadata (git repository, branch, commit hash)
to the top of the rollout of the session (if available - if not it
doesn't add anything)
- Currently, this is only effective on a clean working tree, as we can't
track uncommitted/untracked changes with the current metadata set.
Ideally in the future we may want to track unclean changes somehow, or
perhaps prompt the user to stash or commit them.

# Testing
- Added unit tests
- `cargo test && cargo clippy --tests && cargo fmt -- --config
imports_granularity=Item`

### Resulting Rollout
<img width="1243" height="127" alt="Screenshot 2025-07-17 at 1 50 00 PM"
src="https://github.com/user-attachments/assets/68108941-f015-45b2-985c-ea315ce05415"
/>
2025-07-24 11:35:28 -07:00
aibrahim-oai
b4ab7c1b73 Flaky CI fix (#1647)
Flushing before sending `TaskCompleteEvent` and ending the submission
loop to avoid race conditions.
2025-07-23 15:03:26 -07:00
Gabriel Peal
084236f717 Add call_id to patch approvals and elicitations (#1660)
Builds on https://github.com/openai/codex/pull/1659 and adds call_id to
a few more places for the same reason.
2025-07-23 15:55:35 -04:00
Gabriel Peal
bc944e77f5 Improve messages emitted for exec failures (#1659)
1. Emit call_id to exec approval elicitations for mcp client convenience
2. Remove the `-retry` from the call id for the same reason as above but
upstream the reset behavior to the mcp client
2025-07-23 14:43:53 -04:00
pakrym-oai
591cb6149a Always send entire request context (#1641)
Always store the entire conversation history.
Request encrypted COT when not storing Responses.
Send entire input context instead of sending previous_response_id
2025-07-23 10:37:45 -07:00
Michael Bolin
d6c4083f98 feat: support dotenv (including ~/.codex/.env) (#1653)
This PR adds a `load_dotenv()` helper function to the `codex-common`
crate that is available when the `cli` feature is enabled. The function
uses [`dotenvy`](https://crates.io/crates/dotenvy) to update the
environment from:

- `$CODEX_HOME/.env`
- `$(pwd)/.env`

To test:

- ran `printenv OPENAI_API_KEY` to verify the env var exists in my
environment
- ran `just codex exec hello` to verify the CLI uses my `OPENAI_API_KEY`
- ran `unset OPENAI_API_KEY`
- ran `just codex exec hello` again and got **ERROR: Missing environment
variable: `OPENAI_API_KEY`**, as expected
- created `~/.codex/.env` and added `OPENAI_API_KEY=sk-proj-...` (also
ran `chmod 400 ~/.codex/.env` for good measure)
- ran `just codex exec hello` again and it worked, verifying it picked
up `OPENAI_API_KEY` from `~/.codex/.env`

Note this functionality was available in the TypeScript CLI:
https://github.com/openai/codex/pull/122 and was recently requested over
on https://github.com/openai/codex/issues/1262#issuecomment-3093203551.
2025-07-22 15:54:33 -07:00
pakrym-oai
6d82907082 Add support for custom base instructions (#1645)
Allows providing custom instructions file as a config parameter and
custom instruction text via MCP tool call.
2025-07-22 09:42:22 -07:00
pakrym-oai
ed206d5687 Log response.failed error message and request-id (#1649)
To help with diagnosing failures.
2025-07-22 09:28:00 -07:00
Michael Bolin
d51654822f fix: use PR_SET_PDEATHSIG so to ensure child processes are killed in a timely manner (#1626)
Some users have reported issues where child processes are not cleaned up
after Codex exits (e.g., https://github.com/openai/codex/issues/1570).

This is generally a tricky issue on operating systems: if a parent
process receives `SIGKILL`, then it terminates immediately and cannot
communicate with the child.

**It only helps on Linux**, but this PR introduces the use of `prctl(2)`
so that if the parent process dies, `SIGTERM` will be delivered to the
child process. Whereas previously, I believe that if Codex spawned a
long-running process (like `tsc --watch`) and the Codex process received
`SIGKILL`, the `tsc --watch` process would be reparented to the init
process and would never be killed. Now with the use of `prctl(2)`, the
`tsc --watch` process should receive `SIGTERM` in that scenario.

We still need to come up with a solution for macOS. I've started to look
at `launchd`, but I'm researching a number of options.
2025-07-22 00:41:27 -07:00
Michael Bolin
6cf4b96f9d fix: check flags to ripgrep when deciding whether the invocation is "trusted" (#1644)
With this change, if any of `--pre`, `--hostname-bin`, `--search-zip`, or `-z` are used with a proposed invocation of `rg`, do not auto-approve.
2025-07-21 22:38:50 -07:00
Dylan
18b2b30841 [mcp-server] Add reply tool call (#1643)
## Summary
Adds a new mcp tool call, `codex-reply`, so we can continue existing
sessions. This is a first draft and does not yet support sessions from
previous processes.

## Testing
- [x] tested with mcp client
2025-07-21 21:01:56 -07:00
Michael Bolin
018003e52f feat: leverage elicitations in the MCP server (#1623)
This updates the MCP server so that if it receives an
`ExecApprovalRequest` from the `Codex` session, it in turn sends an [MCP
elicitation](https://modelcontextprotocol.io/specification/draft/client/elicitation)
to the client to ask for the approval decision. Upon getting a response,
it forwards the client's decision via `Op::ExecApproval`.

Admittedly, we should be doing the same thing for
`ApplyPatchApprovalRequest`, but this is our first time experimenting
with elicitations, so I'm inclined to defer wiring that code path up
until we feel good about how this one works.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1623).
* __->__ #1623
* #1622
* #1621
* #1620
2025-07-19 01:32:03 -04:00
Michael Bolin
e78ec00e73 chore: support MCP schema 2025-06-18 (#1621)
This updates the schema in `generate_mcp_types.py` from `2025-03-26` to
`2025-06-18`, regenerates `mcp-types/src/lib.rs`, and then updates all
the code that uses `mcp-types` to honor the changes.

Ran

```
npx @modelcontextprotocol/inspector just codex mcp
```

and verified that I was able to invoke the `codex` tool, as expected.


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1621).
* #1623
* #1622
* __->__ #1621
2025-07-19 00:09:34 -04:00
aibrahim-oai
83eefb55fb Add session loading support to Codex (#1602)
## Summary
- extend rollout format to store all session data in JSON
- add resume/write helpers for rollouts
- track session state after each conversation
- support `LoadSession` op to resume a previous rollout
- allow starting Codex with an existing session via
`experimental_resume` config variable

We need a way later for exploring the available sessions in a user
friendly way.

## Testing
- `cargo test --no-run` *(fails: `cargo: command not found`)*

------
https://chatgpt.com/codex/tasks/task_i_68792a29dd5c832190bf6930d3466fba

This video is outdated. you should use `-c experimental_resume:<full
path>` instead of `--resume <full path>`


https://github.com/user-attachments/assets/7a9975c7-aa04-4f4e-899a-9e87defd947a
2025-07-18 17:04:04 -07:00