// llmx/codex-rs/tui/src/history_cell.rs
use crate::diff_render::create_diff_summary;
use crate::exec_command::relativize_to_home;
use crate::exec_command::strip_bash_lc_and_escape;
use crate::markdown::append_markdown;
use crate::slash_command::SlashCommand;
use crate::text_formatting::format_and_truncate_tool_result;
use base64::Engine;
use codex_ansi_escape::ansi_escape_line;
use codex_common::create_config_summary_entries;
use codex_common::elapsed::format_duration;
use codex_core::config::Config;
use codex_core::plan_tool::PlanItemArg;
use codex_core::plan_tool::StepStatus;
use codex_core::plan_tool::UpdatePlanArgs;
use codex_core::project_doc::discover_project_doc_paths;
use codex_core::protocol::FileChange;
use codex_core::protocol::McpInvocation;
use codex_core::protocol::SandboxPolicy;
use codex_core::protocol::SessionConfiguredEvent;
use codex_core::protocol::TokenUsage;
use codex_login::get_auth_file;
use codex_login::try_read_auth_json;
use codex_protocol::parse_command::ParsedCommand;
use image::DynamicImage;
use image::ImageReader;
use mcp_types::EmbeddedResourceResource;
use mcp_types::ResourceLink;
use ratatui::prelude::*;
use ratatui::style::Color;
use ratatui::style::Modifier;
use ratatui::style::Style;
use ratatui::widgets::Paragraph;
use ratatui::widgets::WidgetRef;
use ratatui::widgets::Wrap;
use shlex::try_join as shlex_try_join;
use std::collections::HashMap;
use std::io::Cursor;
use std::path::PathBuf;
use std::time::Duration;
use std::time::Instant;
use tracing::error;
use uuid::Uuid;
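/// Captured output of a completed exec command, retained for display in the
/// conversation history.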
#[derive(Clone, Debug)]
pub(crate) struct CommandOutput {
pub(crate) exit_code: i32,
pub(crate) stdout: String,
pub(crate) stderr: String,
}
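/// Distinguishes a patch that is awaiting user approval from one that has
/// begun applying (and whether that application was auto-approved).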
pub(crate) enum PatchEventType {
ApprovalRequest,
ApplyBegin { auto_approved: bool },
}
/// Represents an event to display in the conversation history. Implementations
/// provide a `Vec<Line<'static>>` representation to make it easy to display in
/// a scrollable list.
pub(crate) trait HistoryCell: std::fmt::Debug + Send + Sync {
fn display_lines(&self) -> Vec<Line<'static>>;
fn transcript_lines(&self) -> Vec<Line<'static>> {
self.display_lines()
}
fn desired_height(&self, width: u16) -> u16 {
Paragraph::new(Text::from(self.display_lines()))
.wrap(Wrap { trim: false })
.line_count(width)
.try_into()
.unwrap_or(0)
}
}
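/// History cell that renders the same pre-computed lines in both the live view
/// and the transcript.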
#[derive(Debug)]
pub(crate) struct PlainHistoryCell {
lines: Vec<Line<'static>>,
}
impl HistoryCell for PlainHistoryCell {
fn display_lines(&self) -> Vec<Line<'static>> {
self.lines.clone()
}
}
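/// History cell that only contributes lines to the transcript; it renders
/// nothing in the live view.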
#[derive(Debug)]
pub(crate) struct TranscriptOnlyHistoryCell {
lines: Vec<Line<'static>>,
}
impl HistoryCell for TranscriptOnlyHistoryCell {
fn display_lines(&self) -> Vec<Line<'static>> {
Vec::new()
}
fn transcript_lines(&self) -> Vec<Line<'static>> {
self.lines.clone()
}
}
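/// History cell for an exec command. While `output` is `None` it renders a
/// live working/running header with the elapsed time; once the output is set,
/// it renders a completion summary instead.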
#[derive(Debug)]
pub(crate) struct ExecCell {
pub(crate) command: Vec<String>,
pub(crate) parsed: Vec<ParsedCommand>,
pub(crate) output: Option<CommandOutput>,
start_time: Option<Instant>,
}
impl HistoryCell for ExecCell {
fn display_lines(&self) -> Vec<Line<'static>> {
exec_command_lines(
&self.command,
&self.parsed,
self.output.as_ref(),
self.start_time,
)
}
}
impl WidgetRef for &ExecCell {
fn render_ref(&self, area: Rect, buf: &mut Buffer) {
Paragraph::new(Text::from(self.display_lines()))
.wrap(Wrap { trim: false })
.render(area, buf);
}
}
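/// Completed MCP tool call whose first content block was an image. The decoded
/// image is retained, but the display currently only notes that image output
/// was omitted.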
#[derive(Debug)]
struct CompletedMcpToolCallWithImageOutput {
_image: DynamicImage,
}
impl HistoryCell for CompletedMcpToolCallWithImageOutput {
fn display_lines(&self) -> Vec<Line<'static>> {
vec![
Line::from("tool result (image output omitted)"),
Line::from(""),
]
}
}
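/// Maximum number of lines to show for a single tool-call result before it is
/// truncated.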
const TOOL_CALL_MAX_LINES: usize = 5;
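/// Uppercases the first character and ASCII-lowercases the rest, e.g.
/// "workspace-write" -> "Workspace-write".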
fn title_case(s: &str) -> String {
if s.is_empty() {
return String::new();
}
let mut chars = s.chars();
let first = match chars.next() {
Some(c) => c,
None => return String::new(),
};
let rest: String = chars.as_str().to_ascii_lowercase();
first.to_uppercase().collect::<String>() + &rest
}
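/// Maps a provider id to a display name: "openai" becomes "OpenAI"; anything
/// else is title-cased.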
fn pretty_provider_name(id: &str) -> String {
if id.eq_ignore_ascii_case("openai") {
"OpenAI".to_string()
} else {
title_case(id)
}
}
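/// Builds the session banner. The first event renders the welcome text and
/// starter slash commands; subsequent events render only when the model
/// actually used differs from the one requested in the config.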
pub(crate) fn new_session_info(
config: &Config,
event: SessionConfiguredEvent,
is_first_event: bool,
) -> PlainHistoryCell {
let SessionConfiguredEvent {
model,
session_id: _,
history_log_id: _,
history_entry_count: _,
} = event;
if is_first_event {
let cwd_str = match relativize_to_home(&config.cwd) {
Some(rel) if !rel.as_os_str().is_empty() => format!("~/{}", rel.display()),
Some(_) => "~".to_string(),
None => config.cwd.display().to_string(),
};
let lines: Vec<Line<'static>> = vec![
Line::from(vec![
Span::raw(">_ ").dim(),
Span::styled(
"You are using OpenAI Codex in",
Style::default().add_modifier(Modifier::BOLD),
),
Span::raw(format!(" {cwd_str}")).dim(),
]),
Line::from("".dim()),
Line::from(" To get started, describe a task or try one of these commands:".dim()),
Line::from("".dim()),
Line::from(format!(" /init - {}", SlashCommand::Init.description()).dim()),
Line::from(format!(" /status - {}", SlashCommand::Status.description()).dim()),
Line::from(format!(" /approvals - {}", SlashCommand::Approvals.description()).dim()),
Line::from(format!(" /model - {}", SlashCommand::Model.description()).dim()),
Line::from("".dim()),
];
PlainHistoryCell { lines }
} else if config.model == model {
PlainHistoryCell { lines: Vec::new() }
} else {
let lines = vec![
Line::from("model changed:".magenta().bold()),
Line::from(format!("requested: {}", config.model)),
Line::from(format!("used: {model}")),
Line::from(""),
];
PlainHistoryCell { lines }
}
}
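/// Echoes a user-submitted prompt into the history under a "user" header.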
pub(crate) fn new_user_prompt(message: String) -> PlainHistoryCell {
let mut lines: Vec<Line<'static>> = Vec::new();
lines.push(Line::from("user".cyan().bold()));
lines.extend(message.lines().map(|l| Line::from(l.to_string())));
lines.push(Line::from(""));
PlainHistoryCell { lines }
}
pub(crate) fn new_active_exec_command(
command: Vec<String>,
parsed: Vec<ParsedCommand>,
) -> ExecCell {
ExecCell {
command,
parsed,
output: None,
start_time: Some(Instant::now()),
}
}
pub(crate) fn new_completed_exec_command(
command: Vec<String>,
parsed: Vec<ParsedCommand>,
output: CommandOutput,
) -> ExecCell {
ExecCell {
command,
parsed,
output: Some(output),
start_time: None,
}
}
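/// Formats the time elapsed since `start` with whole-second resolution.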
fn exec_duration(start: Instant) -> String {
format!("{}s", start.elapsed().as_secs())
}
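/// Dispatches to the structured renderer when parsed command metadata is
/// available, falling back to the generic renderer otherwise.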
fn exec_command_lines(
command: &[String],
parsed: &[ParsedCommand],
output: Option<&CommandOutput>,
start_time: Option<Instant>,
) -> Vec<Line<'static>> {
match parsed.is_empty() {
true => new_exec_command_generic(command, output, start_time),
false => new_parsed_command(command, parsed, output, start_time),
}
}
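/// Renders a command from its parsed representation: a status header
/// (working/completed/failed), one summary line per parsed sub-command, and
/// the command's output lines.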
fn new_parsed_command(
command: &[String],
parsed_commands: &[ParsedCommand],
output: Option<&CommandOutput>,
start_time: Option<Instant>,
) -> Vec<Line<'static>> {
let mut lines: Vec<Line> = Vec::new();
match output {
None => {
let mut spans = vec!["⚙︎ Working".magenta().bold()];
if let Some(st) = start_time {
let dur = exec_duration(st);
spans.push(format!("{dur}").dim());
}
lines.push(Line::from(spans));
}
Some(o) if o.exit_code == 0 => {
lines.push(Line::from(vec!["".green(), " Completed".into()]));
}
Some(o) => {
lines.push(Line::from(vec![
"".red(),
format!(" Failed (exit {})", o.exit_code).into(),
]));
}
};
// Optionally include the complete, unaltered command from the model.
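// Enabled when the `SHOW_FULL_COMMANDS` environment variable is set to any
// non-empty value, e.g. `SHOW_FULL_COMMANDS=1`.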
if std::env::var("SHOW_FULL_COMMANDS")
.map(|v| !v.is_empty())
.unwrap_or(false)
{
let full_cmd = shlex_try_join(command.iter().map(|s| s.as_str()))
.unwrap_or_else(|_| command.join(" "));
lines.push(Line::from(vec![
Span::styled("", Style::default().add_modifier(Modifier::DIM)),
Span::styled(
full_cmd,
Style::default()
.add_modifier(Modifier::DIM)
.add_modifier(Modifier::ITALIC),
),
]));
}
for (i, parsed) in parsed_commands.iter().enumerate() {
let text = match parsed {
ParsedCommand::Read { name, .. } => format!("📖 {name}"),
ParsedCommand::ListFiles { cmd, path } => match path {
Some(p) => format!("📂 {p}"),
None => format!("📂 {cmd}"),
},
ParsedCommand::Search { query, path, cmd } => match (query, path) {
(Some(q), Some(p)) => format!("🔎 {q} in {p}"),
(Some(q), None) => format!("🔎 {q}"),
(None, Some(p)) => format!("🔎 {p}"),
(None, None) => format!("🔎 {cmd}"),
},
ParsedCommand::Format { .. } => "✨ Formatting".to_string(),
ParsedCommand::Test { cmd } => format!("🧪 {cmd}"),
ParsedCommand::Lint { cmd, .. } => format!("🧹 {cmd}"),
ParsedCommand::Unknown { cmd } => format!("⌨️ {cmd}"),
ParsedCommand::Noop { cmd } => format!("🔄 {cmd}"),
};
let first_prefix = if i == 0 { "" } else { " " };
for (j, line_text) in text.lines().enumerate() {
let prefix = if j == 0 { first_prefix } else { " " };
lines.push(Line::from(vec![
Span::styled(prefix, Style::default().add_modifier(Modifier::DIM)),
line_text.to_string().dim(),
]));
}
}
lines.extend(output_lines(output, true, false));
lines.push(Line::from(""));
lines
}
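/// Fallback renderer for commands without parsed metadata: prints the escaped
/// command after a "Running" header, then appends the output lines.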
fn new_exec_command_generic(
command: &[String],
output: Option<&CommandOutput>,
start_time: Option<Instant>,
) -> Vec<Line<'static>> {
let mut lines: Vec<Line<'static>> = Vec::new();
let command_escaped = strip_bash_lc_and_escape(command);
let mut cmd_lines = command_escaped.lines();
if let Some(first) = cmd_lines.next() {
let mut spans: Vec<Span> = vec!["⚡ Running".magenta()];
if let Some(st) = start_time {
let dur = exec_duration(st);
spans.push(format!("{dur}").dim());
}
spans.push(" ".into());
spans.push(first.to_string().into());
lines.push(Line::from(spans));
} else {
let mut spans: Vec<Span> = vec!["⚡ Running".magenta()];
if let Some(st) = start_time {
let dur = exec_duration(st);
spans.push(format!("{dur}").dim());
}
lines.push(Line::from(spans));
}
for cont in cmd_lines {
lines.push(Line::from(cont.to_string()));
}
lines.extend(output_lines(output, false, true));
lines
}
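/// Cell shown while an MCP tool call is still in flight.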
pub(crate) fn new_active_mcp_tool_call(invocation: McpInvocation) -> PlainHistoryCell {
let title_line = Line::from(vec!["tool".magenta(), " running...".dim()]);
let lines: Vec<Line> = vec![
title_line,
format_mcp_invocation(invocation.clone()),
Line::from(""),
];
PlainHistoryCell { lines }
}
/// If the first content block is an image, return a new cell with the decoded image.
/// TODO(rgwood-dd): Handle images properly even if they're not the first result.
fn try_new_completed_mcp_tool_call_with_image_output(
result: &Result<mcp_types::CallToolResult, String>,
) -> Option<CompletedMcpToolCallWithImageOutput> {
match result {
Ok(mcp_types::CallToolResult { content, .. }) => {
if let Some(mcp_types::ContentBlock::ImageContent(image)) = content.first() {
let raw_data = match base64::engine::general_purpose::STANDARD.decode(&image.data) {
Ok(data) => data,
Err(e) => {
error!("Failed to decode image data: {e}");
return None;
}
};
let reader = match ImageReader::new(Cursor::new(raw_data)).with_guessed_format() {
Ok(reader) => reader,
Err(e) => {
error!("Failed to guess image format: {e}");
return None;
}
};
let image = match reader.decode() {
Ok(image) => image,
Err(e) => {
error!("Image decoding failed: {e}");
return None;
}
};
Some(CompletedMcpToolCallWithImageOutput { _image: image })
} else {
None
}
}
_ => None,
}
}
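/// Renders a completed MCP tool call. Image-first results get a dedicated
/// image cell; otherwise each content block is summarized on its own dimmed
/// line, with text blocks truncated to `TOOL_CALL_MAX_LINES` lines.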
pub(crate) fn new_completed_mcp_tool_call(
num_cols: usize,
invocation: McpInvocation,
duration: Duration,
success: bool,
result: Result<mcp_types::CallToolResult, String>,
) -> Box<dyn HistoryCell> {
if let Some(cell) = try_new_completed_mcp_tool_call_with_image_output(&result) {
return Box::new(cell);
}
let duration = format_duration(duration);
let status_str = if success { "success" } else { "failed" };
let title_line = Line::from(vec![
"tool".magenta(),
" ".into(),
if success {
status_str.green()
} else {
status_str.red()
},
format!(", duration: {duration}").dim(),
]);
let mut lines: Vec<Line<'static>> = Vec::new();
lines.push(title_line);
lines.push(format_mcp_invocation(invocation));
match result {
Ok(mcp_types::CallToolResult { content, .. }) => {
if !content.is_empty() {
lines.push(Line::from(""));
for tool_call_result in content {
let line_text = match tool_call_result {
mcp_types::ContentBlock::TextContent(text) => {
format_and_truncate_tool_result(
&text.text,
TOOL_CALL_MAX_LINES,
num_cols,
)
}
mcp_types::ContentBlock::ImageContent(_) => {
// TODO show images even if they're not the first result, will require a refactor of `CompletedMcpToolCall`
"<image content>".to_string()
}
mcp_types::ContentBlock::AudioContent(_) => "<audio content>".to_string(),
mcp_types::ContentBlock::EmbeddedResource(resource) => {
let uri = match resource.resource {
EmbeddedResourceResource::TextResourceContents(text) => text.uri,
EmbeddedResourceResource::BlobResourceContents(blob) => blob.uri,
};
format!("embedded resource: {uri}")
}
mcp_types::ContentBlock::ResourceLink(ResourceLink { uri, .. }) => {
format!("link: {uri}")
}
};
lines.push(Line::styled(
line_text,
Style::default().add_modifier(Modifier::DIM),
));
}
}
lines.push(Line::from(""));
}
Err(e) => {
lines.push(Line::from(vec![
Span::styled(
"Error: ",
Style::default().fg(Color::Red).add_modifier(Modifier::BOLD),
),
Span::raw(e),
]));
}
};
Box::new(PlainHistoryCell { lines })
}
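/// Builds the `/status` output: workspace, account, model, and token-usage
/// sections assembled from the current config and session state.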
pub(crate) fn new_status_output(
config: &Config,
usage: &TokenUsage,
session_id: &Option<Uuid>,
) -> PlainHistoryCell {
let mut lines: Vec<Line<'static>> = Vec::new();
lines.push(Line::from("/status".magenta()));
let config_entries = create_config_summary_entries(config);
let lookup = |k: &str| -> String {
config_entries
.iter()
.find(|(key, _)| *key == k)
.map(|(_, v)| v.clone())
.unwrap_or_default()
};
// 📂 Workspace
lines.push(Line::from(vec!["📂 ".into(), "Workspace".bold()]));
// Path (home-relative, e.g., ~/code/project)
let cwd_str = match relativize_to_home(&config.cwd) {
Some(rel) if !rel.as_os_str().is_empty() => format!("~/{}", rel.display()),
Some(_) => "~".to_string(),
None => config.cwd.display().to_string(),
};
lines.push(Line::from(vec![" • Path: ".into(), cwd_str.into()]));
// Approval mode (as-is)
lines.push(Line::from(vec![
" • Approval Mode: ".into(),
lookup("approval").into(),
]));
// Sandbox (simplified name only)
let sandbox_name = match &config.sandbox_policy {
SandboxPolicy::DangerFullAccess => "danger-full-access",
SandboxPolicy::ReadOnly => "read-only",
SandboxPolicy::WorkspaceWrite { .. } => "workspace-write",
};
lines.push(Line::from(vec![
" • Sandbox: ".into(),
sandbox_name.into(),
]));
// AGENTS.md files discovered via core's project_doc logic
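// Each path is displayed relative to cwd, using "../" segments when the
// file lives in an ancestor directory and falling back to the absolute
// path otherwise.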
let agents_list = {
match discover_project_doc_paths(config) {
Ok(paths) => {
let mut rels: Vec<String> = Vec::new();
for p in paths {
let display = if let Some(parent) = p.parent() {
if parent == config.cwd {
"AGENTS.md".to_string()
} else {
let mut cur = config.cwd.as_path();
let mut ups = 0usize;
let mut reached = false;
while let Some(c) = cur.parent() {
if cur == parent {
reached = true;
break;
}
cur = c;
ups += 1;
}
if reached {
format!("{}AGENTS.md", "../".repeat(ups))
} else if let Ok(stripped) = p.strip_prefix(&config.cwd) {
stripped.display().to_string()
} else {
p.display().to_string()
}
}
} else {
p.display().to_string()
};
rels.push(display);
}
rels
}
Err(_) => Vec::new(),
}
};
if agents_list.is_empty() {
lines.push(Line::from(" • AGENTS files: (none)"));
} else {
lines.push(Line::from(vec![
" • AGENTS files: ".into(),
agents_list.join(", ").into(),
]));
}
lines.push(Line::from(""));
// 👤 Account (only if ChatGPT tokens exist), shown under the first block
let auth_file = get_auth_file(&config.codex_home);
if let Ok(auth) = try_read_auth_json(&auth_file)
&& let Some(tokens) = auth.tokens.clone()
{
lines.push(Line::from(vec!["👤 ".into(), "Account".bold()]));
lines.push(Line::from(" • Signed in with ChatGPT"));
let info = tokens.id_token;
if let Some(email) = &info.email {
lines.push(Line::from(vec![" • Login: ".into(), email.clone().into()]));
}
match auth.openai_api_key.as_deref() {
Some(key) if !key.is_empty() => {
lines.push(Line::from(
" • Using API key. Run codex login to use ChatGPT plan",
));
}
_ => {
let plan_text = info
.get_chatgpt_plan_type()
.map(|s| title_case(&s))
.unwrap_or_else(|| "Unknown".to_string());
lines.push(Line::from(vec![" • Plan: ".into(), plan_text.into()]));
}
}
lines.push(Line::from(""));
}
// 🧠 Model
lines.push(Line::from(vec!["🧠 ".into(), "Model".bold()]));
lines.push(Line::from(vec![
" • Name: ".into(),
config.model.clone().into(),
]));
let provider_disp = pretty_provider_name(&config.model_provider_id);
lines.push(Line::from(vec![
" • Provider: ".into(),
provider_disp.into(),
]));
// Only show Reasoning fields if present in config summary
let reff = lookup("reasoning effort");
if !reff.is_empty() {
lines.push(Line::from(vec![
" • Reasoning Effort: ".into(),
title_case(&reff).into(),
]));
}
let rsum = lookup("reasoning summaries");
if !rsum.is_empty() {
lines.push(Line::from(vec![
" • Reasoning Summaries: ".into(),
title_case(&rsum).into(),
]));
}
lines.push(Line::from(""));
// 📊 Token Usage
lines.push(Line::from(vec!["📊 ".into(), "Token Usage".bold()]));
if let Some(session_id) = session_id {
lines.push(Line::from(vec![
" • Session ID: ".into(),
session_id.to_string().into(),
]));
}
// Input: <input> [+ <cached> cached]
let mut input_line_spans: Vec<Span<'static>> = vec![
" • Input: ".into(),
usage.non_cached_input().to_string().into(),
];
if let Some(cached) = usage.cached_input_tokens
&& cached > 0
{
input_line_spans.push(format!(" (+ {cached} cached)").into());
}
lines.push(Line::from(input_line_spans));
// Output: <output>
lines.push(Line::from(vec![
" • Output: ".into(),
usage.output_tokens.to_string().into(),
]));
// Total: <total>
lines.push(Line::from(vec![
" • Total: ".into(),
usage.blended_total().to_string().into(),
]));
lines.push(Line::from(""));
PlainHistoryCell { lines }
}
/// Render the `/mcp` output shown when no MCP servers are configured.
pub(crate) fn empty_mcp_output() -> PlainHistoryCell {
let lines: Vec<Line<'static>> = vec![
Line::from("/mcp".magenta()),
Line::from(""),
Line::from(vec!["🔌 ".into(), "MCP Tools".bold()]),
Line::from(""),
Line::from(" • No MCP servers configured.".italic()),
Line::from(vec![
" See the ".into(),
Span::styled(
"\u{1b}]8;;https://github.com/openai/codex/blob/main/codex-rs/config.md#mcp_servers\u{7}MCP docs\u{1b}]8;;\u{7}",
Style::default().add_modifier(Modifier::UNDERLINED),
),
" to configure them.".into(),
])
.style(Style::default().add_modifier(Modifier::DIM)),
Line::from(""),
];
PlainHistoryCell { lines }
}
/// Render MCP tools grouped by connection using the fully-qualified tool names.
pub(crate) fn new_mcp_tools_output(
config: &Config,
tools: std::collections::HashMap<String, mcp_types::Tool>,
) -> PlainHistoryCell {
let mut lines: Vec<Line<'static>> = vec![
Line::from("/mcp".magenta()),
Line::from(""),
Line::from(vec!["🔌 ".into(), "MCP Tools".bold()]),
Line::from(""),
];
if tools.is_empty() {
lines.push(Line::from(" • No MCP tools available.".italic()));
lines.push(Line::from(""));
return PlainHistoryCell { lines };
}
for (server, cfg) in config.mcp_servers.iter() {
let prefix = format!("{server}__");
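        // Tool keys are fully qualified as "<server>__<tool>"; strip the
        // server prefix so e.g. "kitty__get-cat-image" lists as "get-cat-image".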
let mut names: Vec<String> = tools
.keys()
.filter(|k| k.starts_with(&prefix))
.map(|k| k[prefix.len()..].to_string())
.collect();
names.sort();
lines.push(Line::from(vec![
" • Server: ".into(),
server.clone().into(),
]));
if !cfg.command.is_empty() {
let cmd_display = format!("{} {}", cfg.command, cfg.args.join(" "));
lines.push(Line::from(vec![
" • Command: ".into(),
cmd_display.into(),
]));
}
if let Some(env) = cfg.env.as_ref()
&& !env.is_empty()
{
let mut env_pairs: Vec<String> = env.iter().map(|(k, v)| format!("{k}={v}")).collect();
env_pairs.sort();
lines.push(Line::from(vec![
" • Env: ".into(),
env_pairs.join(" ").into(),
]));
}
if names.is_empty() {
lines.push(Line::from(" • Tools: (none)"));
} else {
lines.push(Line::from(vec![
" • Tools: ".into(),
names.join(", ").into(),
]));
}
lines.push(Line::from(""));
}
PlainHistoryCell { lines }
}
pub(crate) fn new_error_event(message: String) -> PlainHistoryCell {
let lines: Vec<Line<'static>> = vec![vec!["🖐 ".red().bold(), message.into()].into(), "".into()];
PlainHistoryCell { lines }
}
pub(crate) fn new_stream_error_event(message: String) -> PlainHistoryCell {
let lines: Vec<Line<'static>> =
vec![vec!["".magenta().bold(), message.dim()].into(), "".into()];
PlainHistoryCell { lines }
}
/// Render a user-friendly plan update styled like a checkbox todo list.
pub(crate) fn new_plan_update(update: UpdatePlanArgs) -> PlainHistoryCell {
let UpdatePlanArgs { explanation, plan } = update;
let mut lines: Vec<Line<'static>> = Vec::new();
// Header with progress summary
let total = plan.len();
let completed = plan
.iter()
.filter(|p| matches!(p.status, StepStatus::Completed))
.count();
let width: usize = 10;
let filled = if total > 0 {
(completed * width + total / 2) / total
} else {
0
};
let empty = width.saturating_sub(filled);
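    // Rounded fill, illustrative: completed = 3, total = 7, width = 10 gives
    // filled = (3 * 10 + 7 / 2) / 7 = 33 / 7 = 4, i.e. 4 of 10 cells lit.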
let mut header: Vec<Span> = Vec::new();
header.push(Span::raw("📋"));
header.push(Span::styled(
" Update plan",
Style::default().add_modifier(Modifier::BOLD).magenta(),
));
header.push(Span::raw(" ["));
if filled > 0 {
header.push(Span::styled(
"".repeat(filled),
Style::default().fg(Color::Green),
));
}
if empty > 0 {
header.push(Span::styled(
"".repeat(empty),
Style::default().add_modifier(Modifier::DIM),
));
}
header.push(Span::raw("] "));
header.push(Span::raw(format!("{completed}/{total}")));
lines.push(Line::from(header));
// Optional explanation/note from the model
if let Some(expl) = explanation.and_then(|s| {
let t = s.trim().to_string();
if t.is_empty() { None } else { Some(t) }
}) {
lines.push(Line::from("note".dim().italic()));
for l in expl.lines() {
lines.push(Line::from(l.to_string()).dim());
}
}
// Steps styled as checkbox items
if plan.is_empty() {
lines.push(Line::from("(no steps provided)".dim().italic()));
} else {
for (idx, PlanItemArg { step, status }) in plan.into_iter().enumerate() {
let (box_span, text_span) = match status {
StepStatus::Completed => (
Span::styled("", Style::default().fg(Color::Green)),
Span::styled(
step,
Style::default().add_modifier(Modifier::CROSSED_OUT | Modifier::DIM),
),
),
StepStatus::InProgress => (
Span::raw(""),
Span::styled(
step,
Style::default()
.fg(Color::Cyan)
.add_modifier(Modifier::BOLD),
),
),
StepStatus::Pending => (
Span::raw(""),
Span::styled(step, Style::default().add_modifier(Modifier::DIM)),
),
};
            let prefix = if idx == 0 {
                Span::raw("└ ")
            } else {
                Span::raw("  ")
            };
lines.push(Line::from(vec![
prefix,
box_span,
Span::raw(" "),
text_span,
]));
}
}
lines.push(Line::from(""));
PlainHistoryCell { lines }
}
/// Create a new `PendingPatch` cell that lists the file-level summary of
/// a proposed patch. The summary lines should already be formatted (e.g.
/// "A path/to/file.rs").
pub(crate) fn new_patch_event(
event_type: PatchEventType,
changes: HashMap<PathBuf, FileChange>,
) -> PlainHistoryCell {
let title = match &event_type {
PatchEventType::ApprovalRequest => "proposed patch",
PatchEventType::ApplyBegin {
auto_approved: true,
} => "✏️ Applying patch",
PatchEventType::ApplyBegin {
auto_approved: false,
} => {
let lines: Vec<Line<'static>> = vec![
Line::from("✏️ Applying patch".magenta().bold()),
Line::from(""),
];
return PlainHistoryCell { lines };
}
};
let mut lines: Vec<Line<'static>> = create_diff_summary(title, &changes, event_type);
lines.push(Line::from(""));
PlainHistoryCell { lines }
}
pub(crate) fn new_patch_apply_failure(stderr: String) -> PlainHistoryCell {
let mut lines: Vec<Line<'static>> = Vec::new();
// Failure title
lines.push(Line::from("✘ Failed to apply patch".magenta().bold()));
if !stderr.trim().is_empty() {
lines.extend(output_lines(
Some(&CommandOutput {
exit_code: 1,
stdout: String::new(),
stderr,
}),
true,
true,
));
}
lines.push(Line::from(""));
PlainHistoryCell { lines }
}
pub(crate) fn new_patch_apply_success(stdout: String) -> PlainHistoryCell {
let mut lines: Vec<Line<'static>> = Vec::new();
// Success title
lines.push(Line::from("✓ Applied patch".magenta().bold()));
if !stdout.trim().is_empty() {
let mut iter = stdout.lines();
for (i, raw) in iter.by_ref().take(TOOL_CALL_MAX_LINES).enumerate() {
            let prefix = if i == 0 { "└ " } else { "  " };
// First line is the header; dim it entirely.
if i == 0 {
let s = format!("{prefix}{raw}");
lines.push(ansi_escape_line(&s).dim());
continue;
}
// Subsequent lines should look like: "M path/to/file".
// Colorize the status letter like `git status` (e.g., M red).
let status = raw.chars().next();
let rest = raw.get(1..).unwrap_or("");
let status_span = match status {
Some('M') => "M".red(),
Some('A') => "A".green(),
Some('D') => "D".red(),
Some(other) => other.to_string().into(),
None => "".into(),
};
lines.push(Line::from(vec![
prefix.into(),
status_span,
ansi_escape_line(rest).to_string().into(),
]));
}
let remaining = iter.count();
if remaining > 0 {
lines.push(Line::from(""));
lines.push(Line::from(format!("... +{remaining} lines")).dim());
}
}
lines.push(Line::from(""));
PlainHistoryCell { lines }
}
pub(crate) fn new_reasoning_block(
full_reasoning_buffer: String,
config: &Config,
) -> TranscriptOnlyHistoryCell {
let mut lines: Vec<Line<'static>> = Vec::new();
lines.push(Line::from("thinking".magenta().italic()));
append_markdown(&full_reasoning_buffer, &mut lines, config);
lines.push(Line::from(""));
TranscriptOnlyHistoryCell { lines }
}
fn output_lines(
output: Option<&CommandOutput>,
only_err: bool,
include_angle_pipe: bool,
) -> Vec<Line<'static>> {
let CommandOutput {
exit_code,
stdout,
stderr,
} = match output {
Some(output) if only_err && output.exit_code == 0 => return vec![],
Some(output) => output,
None => return vec![],
};
let src = if *exit_code == 0 { stdout } else { stderr };
let lines: Vec<&str> = src.lines().collect();
let total = lines.len();
let limit = TOOL_CALL_MAX_LINES;
let mut out = Vec::new();
let head_end = total.min(limit);
for (i, raw) in lines[..head_end].iter().enumerate() {
let mut line = ansi_escape_line(raw);
let prefix = if i == 0 && include_angle_pipe {
""
} else {
" "
};
line.spans.insert(0, prefix.into());
line.spans.iter_mut().for_each(|span| {
span.style = span.style.add_modifier(Modifier::DIM);
});
out.push(line);
}
    // Only elide when the head and tail windows (limit lines each) cannot
    // cover the whole output; otherwise just show everything.
let show_ellipsis = total > 2 * limit;
if show_ellipsis {
let omitted = total - 2 * limit;
out.push(Line::from(format!("… +{omitted} lines")));
}
let tail_start = if show_ellipsis {
total - limit
} else {
head_end
};
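    // Illustrative: with limit = 5 and 20 total lines, lines 0..5 render
    // above "… +10 lines" and lines 15..20 render below it.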
for raw in lines[tail_start..].iter() {
let mut line = ansi_escape_line(raw);
        line.spans.insert(0, "    ".into());
line.spans.iter_mut().for_each(|span| {
span.style = span.style.add_modifier(Modifier::DIM);
});
out.push(line);
}
out
}
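/// Render an MCP invocation as `server.tool(args)` with compact JSON args,
/// e.g. (illustrative) `kitty.get-cat-image({"size":"small"})`.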
fn format_mcp_invocation<'a>(invocation: McpInvocation) -> Line<'a> {
let args_str = invocation
.arguments
.as_ref()
.map(|v| {
// Use compact form to keep things short but readable.
serde_json::to_string(v).unwrap_or_else(|_| v.to_string())
})
.unwrap_or_default();
let invocation_spans = vec![
Span::styled(invocation.server.clone(), Style::default().fg(Color::Cyan)),
Span::raw("."),
Span::styled(invocation.tool.clone(), Style::default().fg(Color::Cyan)),
Span::raw("("),
Span::styled(args_str, Style::default().add_modifier(Modifier::DIM)),
Span::raw(")"),
];
Line::from(invocation_spans)
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn parsed_command_with_newlines_starts_each_line_at_origin() {
let parsed = vec![ParsedCommand::Unknown {
cmd: "printf 'foo\nbar'".to_string(),
}];
let lines = exec_command_lines(&[], &parsed, None, None);
assert!(lines.len() >= 3);
        assert_eq!(lines[1].spans[0].content, "└ ");
        assert_eq!(lines[2].spans[0].content, "  ");
}
}