There are exactly 4 types of flaky tests in Windows x86 right now:

1. `review_input_isolated_from_parent_history` => Times out waiting for closing events
2. `review_does_not_emit_agent_message_on_structured_output` => Times out waiting for closing events
3. `auto_compact_runs_after_token_limit_hit` => Times out waiting for closing events
4. `auto_compact_runs_after_token_limit_hit` => Also has a problem where auto compact should add a third request, but receives 4 requests.

1, 2, and 3 seem to be solved by increasing threads on the Windows runner from 2 -> 4. Don't know yet why #4 is happening, but probably also because of WireMock issues on Windows causing races.
codex-core
This crate implements the business logic for Codex. It is designed to be used by the various Codex UIs written in Rust.
Dependencies
Note that `codex-core` makes some assumptions about certain helper utilities being available in the environment. Currently, the expectations are as follows:
macOS
Expects `/usr/bin/sandbox-exec` to be present.
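An embedding application may want to fail fast when this helper is missing instead of hitting an error at sandbox time. The following is a minimal sketch of such a check; `check_helper_present` is a hypothetical helper written for illustration and is not part of `codex-core`.

```rust
use std::path::Path;

/// Path that codex-core expects on macOS (taken from the text above).
const SANDBOX_EXEC: &str = "/usr/bin/sandbox-exec";

/// Hypothetical helper (not a codex-core API): succeeds if a regular file
/// exists at `path`, otherwise returns a human-readable error message.
fn check_helper_present(path: &Path) -> Result<(), String> {
    if path.is_file() {
        Ok(())
    } else {
        Err(format!("required helper not found: {}", path.display()))
    }
}
```

On macOS, a UI could call `check_helper_present(Path::new(SANDBOX_EXEC))` at startup and report the error to the user. The check only tests for presence; it does not verify that the file is executable.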
Linux
Expects the binary containing `codex-core` to run the equivalent of `codex debug landlock` when `arg0` is `codex-linux-sandbox`. See the `codex-arg0` crate for details.
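The `arg0` convention can be sketched as a small dispatch on the name the binary was invoked under. This is an illustration only: the `Mode` enum and `mode_from_arg0` are hypothetical names, and comparing just the file-name component of `arg0` is an assumption; the `codex-arg0` crate is the authority on the real behavior.

```rust
use std::ffi::OsStr;
use std::path::Path;

/// Name that selects the sandbox behavior, as stated in the text above.
const LINUX_SANDBOX_ARG0: &str = "codex-linux-sandbox";

/// Hypothetical modes for illustration; not types from codex-core.
#[derive(Debug, PartialEq)]
enum Mode {
    /// Behave like `codex debug landlock`.
    LinuxSandbox,
    /// Run the regular program.
    Normal,
}

/// Decide the mode from arg0. Assumption: only the final path component
/// of arg0 is compared, so symlinks in any directory would match.
fn mode_from_arg0(arg0: &OsStr) -> Mode {
    let exe_name = Path::new(arg0).file_name().and_then(|name| name.to_str());
    if exe_name == Some(LINUX_SANDBOX_ARG0) {
        Mode::LinuxSandbox
    } else {
        Mode::Normal
    }
}
```

A binary would typically pass the first element of `std::env::args_os()` to such a function before doing any other argument parsing.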
All Platforms
Expects the binary containing `codex-core` to simulate the virtual `apply_patch` CLI when `arg1` is `--codex-run-as-apply-patch`. See the `codex-arg0` crate for details.
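The `arg1` check can be sketched in the same spirit. The function below is hypothetical and only shows how the flag might be detected; how the arguments after the flag are interpreted is an assumption left to the caller, and the `codex-arg0` crate defines the actual contract.

```rust
use std::ffi::{OsStr, OsString};

/// Flag that selects the virtual apply_patch CLI, as stated in the text above.
const APPLY_PATCH_ARG1: &str = "--codex-run-as-apply-patch";

/// Hypothetical helper (not a codex-core API). `args` is the full argument
/// list including arg0. If arg1 is the apply_patch flag, returns the
/// arguments that follow it; otherwise returns None.
fn apply_patch_args(args: &[OsString]) -> Option<&[OsString]> {
    match args.get(1) {
        // `get(1)` succeeding guarantees `args.len() >= 2`, so `[2..]` is in bounds.
        Some(flag) if flag.as_os_str() == OsStr::new(APPLY_PATCH_ARG1) => Some(&args[2..]),
        _ => None,
    }
}
```

A binary could collect `std::env::args_os()` into a `Vec<OsString>`, call this function first, and fall through to its normal command-line parsing when it returns `None`.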