Wait for newlines, then render markdown on a line by line basis. Word wrap it for the current terminal size and then spit it out line by line into the UI. Also adds tests and fixes some UI regressions.
298 lines
11 KiB
Plaintext
thinking
Investigating binary sizes

I need to check the codex-rs repository to explain why the project's binaries
are large. The user is likely seeking specifics about the setup: are Rust builds
static, what features are enabled, and is debug information included? It could
be due to static linking, included OpenSSL, or how panic handling is set up. I
should look into the Cargo.toml file to confirm features and profiles without
needing to edit any code. Let's get started on this!

codex
I’m going to scan the workspace and Cargo manifests to see build profiles and
dependencies that impact binary size. Then I’ll summarize the main causes.

⚡ Running ls -la
⎿ total 6696
drwxr-xr-x@ 39 easong staff 1248 Aug 9 08:49 .
drwxr-xr-x@ 29 easong staff 928 Aug 9 08:50 ..
-rw-r--r--@ 1 easong staff 10244 Aug 8 21:13 .DS_Store
-rw-r--r--@ 1 easong staff 212 Aug 4 18:49 .gitignore
… +30 lines
-rw-r--r--@ 1 easong staff 9971 Aug 8 23:07 simpler.md
drwxr-xr-x@ 6 easong staff 192 Aug 4 22:40 target
drwxr-xr-x@ 6 easong staff 192 Aug 7 19:20 tui
-rw-r--r--@ 1 easong staff 8639 Aug 6 23:30 typewriter.md
-rw-r--r--@ 1 easong staff 7673 Aug 6 21:58 wrap-plan.md
⚡ Running sed -n '1,200p' Cargo.toml
⎿ [workspace]
members = [
"ansi-escape",
"apply-patch",
"arg0",
… +38 lines
codegen-units = 1

[patch.crates-io]
# ratatui = { path = "../../ratatui" }
ratatui = { git = "https://github.com/nornagon/ratatui", branch =
"nornagon-v0.29.0-patch" }
Optimizing Rust Release Profile

I'm reviewing the workspace's release profile, which has settings like lto=fat,
strip=symbols, and codegen-units=1 to reduce binary size. Per the Cargo profile
documentation, strip = "symbols" removes debug information as well as the symbol
table, while strip = "debuginfo" removes only debug information; both values
have been stable since Rust 1.59. The profile is therefore already stripping as
much as it can, so the remaining size must come from code and data. I’ll skim
through crates for heavy dependencies in Cargo.toml, including cli, core, exec,
linux-sandbox, tui, login, ollama, and mcp.

⚡ Running for d in ansi-escape apply-patch arg0 cli common core exec execpolicy
file-search linux-sandbox login mcp-client mcp-server mcp-types ollama tui; do
echo "--- $d/Cargo.toml"; sed -n '1,200p' $d/Cargo.toml; echo; done
⎿ --- ansi-escape/Cargo.toml
[package]
edition = "2024"
name = "codex-ansi-escape"
version = { workspace = true }
… +237 lines
"process",
"rt-multi-thread",
"signal",
] }
tracing = { version
⚡ Running sed -n '1,200p' execpolicy/Cargo.toml
⎿ [package]
name = "codex-execpolicy"
version = { workspace = true }
edition = "2024"

… +23 lines
serde_json = "1.0.142"
serde_with = { version = "3", features = ["macros"] }

[dev-dependencies]
tempfile = "3.13.0"
⚡ Running sed -n '1,200p' file-search/Cargo.toml
⎿ [package]
edition = "2024"
name = "codex-file-search"
version = { workspace = true }

… +11 lines
ignore = "0.4.23"
nucleo-matcher = "0.3.1"
serde = { version = "1", features = ["derive"] }
serde_json = "1.0.142"
tokio = { version = "1", features = ["full"] }
⚡ Running sed -n '1,200p' linux-sandbox/Cargo.toml
⎿ [package]
edition = "2024"
name = "codex-linux-sandbox"
version = { workspace = true }

… +24 lines
"macros",
"process",
"rt-multi-thread",
"signal",
] }
⚡ Running sed -n '1,200p' login/Cargo.toml
⎿ [package]
edition = "2024"
name = "codex-login"
version = { workspace = true }

… +17 lines
] }

[dev-dependencies]
pretty_assertions = "1.4.1"
tempfile = "3"
⚡ Running sed -n '1,200p' mcp-client/Cargo.toml
⎿ [package]
name = "codex-mcp-client"
version = { workspace = true }
edition = "2024"

… +13 lines
"process",
"rt-multi-thread",
"sync",
"time",
] }
⚡ Running sed -n '1,200p' mcp-server/Cargo.toml
⎿ [package]
edition = "2024"
name = "codex-mcp-server"
version = { workspace = true }

… +35 lines
mcp_test_support = { path = "tests/common" }
pretty_assertions = "1.4.1"
tempfile = "3"
tokio-test = "0.4"
wiremock = "0.6"
⚡ Running sed -n '1,200p' mcp-types/Cargo.toml
⎿ [package]
edition = "2024"
name = "mcp-types"
version = { workspace = true }

… +1 lines
workspace = true

[dependencies]
serde = { version = "1", features = ["derive"] }
serde_json = "1"
⚡ Running sed -n '1,200p' ollama/Cargo.toml
⎿ [package]
edition = "2024"
name = "codex-ollama"
version = { workspace = true }

… +22 lines
tracing = { version = "0.1.41", features = ["log"] }
wiremock = "0.6"

[dev-dependencies]
tempfile = "3"
⚡ Running sed -n '1,200p' tui/Cargo.toml
⎿ [package]
edition = "2024"
name = "codex-tui"
version = { workspace = true }

… +74 lines
chrono = { version = "0.4", features = ["serde"] }
insta = "1.43.1"
pretty_assertions = "1"
rand = "0.8"
vt100 = "0.16.2"
⚡ Running sed -n '1,200p' chatgpt/Cargo.toml
⎿ [package]
edition = "2024"
name = "codex-chatgpt"
version = { workspace = true }

… +11 lines
serde_json = "1"
tokio = { version = "1", features = ["full"] }

[dev-dependencies]
tempfile = "3"
thinking
Considering binary size impacts

I’m evaluating how binary size is influenced by various factors. There are many
binaries, such as codex and codex-exec, each of which links its dependencies
with their default features, pulling in crates like Tokio and hyper. For
instance, reqwest's default features include a TLS backend (native-tls in the
0.12 series, which wraps OpenSSL on Linux). If default features remain enabled,
that inflates the size with many additional components. I notice that for musl
targets, using vendored builds of OpenSSL could significantly increase binary
size. This is quite the intricate web of dependencies!

Analyzing musl and binary size

I’m considering the implications of building for musl with static linking.
In this case, OpenSSL likely comes into play through reqwest's default
native-tls backend (assuming the 0.12 series), and openssl-sys with the vendored
feature compiles OpenSSL into the binary, which increases its size. It’s unclear
whether they’re targeting musl, since they might more often be building for Mac
and Linux with dynamic linking instead.

I also realize there’s a difference between debug and release builds: release
should be smaller due to optimizations, but if they’re defaulting to debug
builds, that could lead to much larger binaries.

Assessing crate impacts on size

I need to be precise when analyzing binary sizes from various crates. Some
crates, like ratatui-image, pull in heavy features such as JPEG decoders,
which can really bulk things up. Then there are the tree-sitter crates that
contain large static grammar tables, adding significant size. The starlark crate
and tracing-subscriber also contribute to size increases.

With static linking per binary being standard in Rust, each executable carries
its own copy of the shared dependencies unless dynamic libraries are used. LTO
lengthens compile times, but it reduces bloat somewhat by removing dead code and
merging duplicates within each binary.

Evaluating binary size factors

Having multiple binaries indeed increases the overall size significantly.
I'm considering options like switching to opt-level = "z" or setting
panic = "abort" for reducing size. The profile already sets strip = "symbols",
which in Cargo removes debug information along with the symbol table, so
strip = "debuginfo" (which removes less) would not shrink release builds
further.

The distinction between "symbols" and "debuginfo" matters here: "debuginfo"
strips only debug information, while "symbols" strips both. That is my
recollection of the Cargo documentation; I can't re-check it because my network
access is restricted.

Explaining large binary sizes

I’m analyzing why binaries in this project are so large. The primary reasons
include monolithic static linking, where each binary includes the full
dependency graph without shared libraries. Heavy dependencies such as reqwest,
hyper, and TLS components also contribute significantly to size, along with
image decoders and other libraries.

Using MUSL targets means vendored OpenSSL can drastically increase the size due
to static linking. The release profile's strip = "symbols" already removes debug
information, so debuginfo inflates only dev-profile builds. It’s a mix of
several factors, including panic strategies and platform-specific linking
behavior.

Identifying binary size contributors

I’m considering why binaries are still large even without static linking. The
tracing-subscriber crate potentially adds weight because it can pull in regex,
depending on which features are enabled. Manifests like core/Cargo.toml show
dependencies such as reqwest and tokio, while the tui uses the image crate.

To present this clearly, I’ll format the final answer with two main sections:
"Main Causes" and "Build-Mode Notes." I can also include brief suggestions for
reducing size, but I want to stay focused on answering the user's question.

codex
Here’s what’s driving size in this workspace’s binaries.

Main Causes

- Static linking style: Each bin (codex, codex-tui, codex-exec,
  codex-mcp-server, etc.) statically links its full dependency graph, so common
  code isn’t shared at runtime across executables.
- Heavy deps (HTTP/TLS): reqwest brings in Hyper, HTTP/2, compressors, and a TLS
  stack (native-tls by default in the 0.12 series, which means OpenSSL on Linux
  and musl). In core, login, tui, and ollama you enable reqwest with
  json/stream, which still pulls a large transitive set.
- Image/terminal stack: codex-tui includes image (with jpeg), ratatui,
  crossterm, and ratatui-image, which together contribute significant code size
  (decoders, pixel ops, terminal backends).
- Parsers/VMs: tree-sitter + tree-sitter-bash (in core and apply-patch) and
  starlark (in execpolicy) include sizeable parser/VM tables and runtimes.
- Tokio runtime: Broad tokio features (rt-multi-thread, macros, process, signal)
  across many crates inflate code size even if only subsets are used per bin.
- Panic + backtraces: Default panic = unwind and backtrace support keep
  unwinding tables and landing pads that add weight.
- Per-target OpenSSL (musl): For *-unknown-linux-musl, core enables openssl-sys
  with vendored, compiling OpenSSL into the binary, which adds multiple
  megabytes per executable.
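
Before attributing size to any single cause in the list above, it can help to see which executables are actually the heaviest. The following is a minimal sketch, not something taken from the workspace: the function name list_bin_sizes is hypothetical, and the default path assumes a release build in Cargo's standard target/release directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of the workspace): print "<bytes> <name>" for
# every executable file directly inside a directory, largest first.
# Assumption: release binaries live in target/release unless a path is given.
list_bin_sizes() {
    dir="${1:-target/release}"
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    find "$dir" -maxdepth 1 -type f -perm -u+x | while IFS= read -r f; do
        # Reading from stdin makes wc print only the count; the arithmetic
        # expansion trims the padding some wc implementations add.
        size=$(( $(wc -c < "$f") ))
        printf '%s %s\n' "$size" "$(basename "$f")"
    done | sort -rn
}

# Usage (after cargo build --release):
#   list_bin_sizes target/release
```

This only ranks whole executables. Breaking one binary down per crate needs a dedicated tool, for example the third-party cargo-bloat subcommand.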

Build-Mode Notes

- Release settings: You use lto = "fat" and codegen-units = 1 (good for size),
  and strip = "symbols" removes both the symbol table and debuginfo. Release
  binaries therefore carry no debug sections; what remains is code and data
  from the dependencies listed under Main Causes.
- Debug builds: cargo build (dev profile) includes full debuginfo, no LTO, and
  debug assertions, so outputs are much larger than cargo build --release.

If you want, I can outline targeted trims (e.g., opt-level = "z",
panic = "abort", tighter tokio/reqwest features) and estimate impact per
binary.
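
Profile-level trims of this kind can be tried without editing Cargo.toml, because Cargo reads profile overrides from CARGO_PROFILE_<NAME>_<KEY> environment variables. The sketch below is an illustration under assumptions, not the project's build setup: size_profile_env is a hypothetical helper name, and the three values are example choices whose effect would need to be measured per binary.

```shell
#!/bin/sh
# Hypothetical helper (not part of the workspace): print Cargo profile
# overrides for a size-focused release build, one VAR=value pair per line.
# Note: strip=symbols removes the symbol table and debuginfo alike.
size_profile_env() {
    cat <<'EOF'
CARGO_PROFILE_RELEASE_OPT_LEVEL=z
CARGO_PROFILE_RELEASE_PANIC=abort
CARGO_PROFILE_RELEASE_STRIP=symbols
EOF
}

# Usage (needs cargo, so it is left commented out):
#   env $(size_profile_env) cargo build --release
```

Keeping the overrides in the environment makes the experiment easy to undo. panic = "abort" also changes runtime behavior (panics no longer unwind), so it is a trade-off rather than a free saving.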