Commit Graph

16 Commits

Author SHA1 Message Date
Celia Chen
d3187dbc17 [App-server] v2 for account/updated and account/logout (#6175)
V2 for `account/updated` and `account/logout` for app server. correspond
to old `authStatusChange` and `LogoutChatGpt` respectively. Followup PRs
will make other v2 endpoints call `account/updated` instead of
`authStatusChange` too.
2025-11-03 22:01:33 -08:00
pakrym-oai
9c903c4716 Add ItemStarted/ItemCompleted events for UserInputItem (#5306)
Adds a new ItemStarted event and delivers UserMessage as the first item
type (more to come).


Renames `InputItem` to `UserInput` considering we're using the `Item`
suffix for actual items.
2025-10-20 13:34:44 -07:00
jif-oai
8fed0b53c4 test: reduce time dependency on test harness (#5053)
Tightened the CLI integration tests to stop relying on wall-clock
sleeps—new fs watcher helper waits for session files instead of timing
out, and SSE mocks/fixtures make the flows deterministic.
2025-10-15 09:56:59 +01:00
pakrym-oai
b2d81a7cac Make output assertions more explicit (#4784)
Match using precise regexes.
2025-10-05 16:01:38 -07:00
jif-oai
33d3ecbccc chore: refactor tool handling (#4510)
# Tool System Refactor

- Centralizes tool definitions and execution in `core/src/tools/*`:
specs (`spec.rs`), handlers (`handlers/*`), router (`router.rs`),
registry/dispatch (`registry.rs`), and shared context (`context.rs`).
One registry now builds the model-visible tool list and binds handlers.
- Router converts model responses to tool calls; Registry dispatches
with consistent telemetry via `codex-rs/otel` and unified error
handling. Function, Local Shell, MCP, and experimental `unified_exec`
all flow through this path; legacy shell aliases still work.
- Rationale: reduce per‑tool boilerplate, keep spec/handler in sync, and
make adding tools predictable and testable.

Example: `read_file`
- Spec: `core/src/tools/spec.rs` (see `create_read_file_tool`,
registered by `build_specs`).
- Handler: `core/src/tools/handlers/read_file.rs` (absolute `file_path`,
1‑indexed `offset`, `limit`, `L#: ` prefixes, safe truncation).
- E2E test: `core/tests/suite/read_file.rs` validates the tool returns
the requested lines.

## Next steps:
- Decompose `handle_container_exec_with_params` 
- Add parallel tool calls
2025-10-03 13:21:06 +01:00
pakrym-oai
8e3a048fec Add codex exec testing helpers (#4254)
Add a shortcut to create working directories and run codex exec with
fake server.
2025-09-25 17:12:45 -07:00
Jeremy Rose
4a5f05c136 make tests pass cleanly in sandbox (#4067)
This changes the reqwest client used in tests to be sandbox-friendly,
and skips a bunch of other tests that don't work inside the
sandbox/without network.
2025-09-25 13:11:14 -07:00
pakrym-oai
5c7d9e27b1 Add notifier tests (#4064)
Proposal:
1. Use anyhow for tests and avoid unwrap
2. Extract a helper for starting a test instance of codex
2025-09-23 14:25:46 +00:00
pakrym-oai
14a115d488 Add non_sandbox_test helper (#3880)
Makes tests shorter
2025-09-22 14:50:41 +00:00
pakrym-oai
881c7978f1 Move responses mocking helpers to a shared lib (#3878)
These are generally useful
2025-09-18 17:53:14 -07:00
Dylan
6df8e35314 [tools] Add apply_patch tool (#2303)
## Summary
We've been seeing a number of issues and reports with our synthetic
`apply_patch` tool, e.g. #802. Let's make this a real tool - in my
anecdotal testing, it's critical for GPT-OSS models, but I'd like to
make it the standard across GPT-5 and codex models as well.

## Testing
- [x] Tested locally
- [x] Integration test
2025-08-15 11:55:53 -04:00
Parker Thompson
a075424437 Added allow-expect-in-tests / allow-unwrap-in-tests (#2328)
This PR:
* Added the clippy.toml to configure allowable expect / unwrap usage in
tests
* Removed as many expect/allow lines as possible from tests
* moved a bunch of allows to expects where possible

Note: in integration tests, non `#[test]` helper functions are not
covered by this so we had to leave a few lingering `expect(expect_used`
checks around
2025-08-14 17:59:01 -07:00
Michael Bolin
08ed618f72 chore: introduce ConversationManager as a clearinghouse for all conversations (#2240)
This PR does two things because after I got deep into the first one I
started pulling on the thread to the second:

- Makes `ConversationManager` the place where all in-memory
conversations are created and stored. Previously, `MessageProcessor` in
the `codex-mcp-server` crate was doing this via its `session_map`, but
this is something that should be done in `codex-core`.
- It unwinds the `ctrl_c: tokio::sync::Notify` that was threaded
throughout our code. I think this made sense at one time, but now that
we handle Ctrl-C within the TUI and have a proper `Op::Interrupt` event,
I don't think this was quite right, so I removed it. For `codex exec`
and `codex proto`, we now use `tokio::signal::ctrl_c()` directly, but we
no longer make `Notify` a field of `Codex` or `CodexConversation`.

Changes of note:

- Adds the files `conversation_manager.rs` and `codex_conversation.rs`
to `codex-core`.
- `Codex` and `CodexSpawnOk` are no longer exported from `codex-core`:
other crates must use `CodexConversation` instead (which is created via
`ConversationManager`).
- `core/src/codex_wrapper.rs` has been deleted in favor of
`ConversationManager`.
- `ConversationManager::new_conversation()` returns `NewConversation`,
which is in line with the `new_conversation` tool we want to add to the
MCP server. Note `NewConversation` includes `SessionConfiguredEvent`, so
we eliminate checks in cases like `codex-rs/core/tests/client.rs` to
verify `SessionConfiguredEvent` is the first event because that is now
internal to `ConversationManager`.
- Quite a bit of code was deleted from
`codex-rs/mcp-server/src/message_processor.rs` since it no longer has to
manage multiple conversations itself: it goes through
`ConversationManager` instead.
- `core/tests/live_agent.rs` has been deleted because I had to update a
bunch of tests and all the tests in here were ignored, and I don't think
anyone ever ran them, so this was just technical debt, at this point.
- Removed `notify_on_sigint()` from `util.rs` (and in a follow-up, I
hope to refactor the blandly-named `util.rs` into more descriptive
files).
- In general, I started replacing local variables named `codex` as
`conversation`, where appropriate, though admittedly I didn't do it
through all the integration tests because that would have added a lot of
noise to this PR.




---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2240).
* #2264
* #2263
* __->__ #2240
2025-08-13 13:38:18 -07:00
easong-openai
6340acd885 Re-add markdown streaming (#2029)
Wait for newlines, then render markdown on a line by line basis. Word wrap it for the current terminal size and then spit it out line by line into the UI. Also adds tests and fixes some UI regressions.
2025-08-12 17:37:28 -07:00
Dylan
0091930f5a [core] Allow resume after client errors (#2053)
## Summary
Allow tui conversations to resume after the client fails out of retries.
I tested this with exec / mocked api failures as well, and it appears to
be fine. But happy to add an exec integration test as well!

## Testing
- [x] Added integration test
- [x] Tested locally
2025-08-08 18:21:19 -07:00
Michael Bolin
7af9cedbd7 fix: create separate test_support crates to eliminate #[allow(dead_code)] (#1667)
Because of a quirk of how implementation tests work in Rust, we had a
number of `#[allow(dead_code)]` annotations that were misleading because
the functions _were_ being used, just not by all integration tests in a
`tests/` folder, so when compiling the test that did not use the
function, clippy would complain that it was unused.

This fixes things by create a "test_support" crate under the `tests/`
folder that is imported as a dev dependency for the respective crate.
2025-07-24 12:19:46 -07:00