[MCP] Render MCP tool call result images to the model (#5600)

It's pretty amazing we have gotten here without the ability for the
model to see image content from MCP tool calls.

This PR builds off of 4391 and fixes #4819. I would like @KKcorps to get
adequete credit here but I also want to get this fix in ASAP so I gave
him a week to update it and haven't gotten a response so I'm going to
take it across the finish line.


This test highlights how absured the current situation is. I asked the
model to read this image using the Chrome MCP
<img width="2378" height="674" alt="image"
src="https://github.com/user-attachments/assets/9ef52608-72a2-4423-9f5e-7ae36b2b56e0"
/>

After this change, it correctly outputs:
> Captured the page: image dhows a dark terminal-style UI labeled
`OpenAI Codex (v0.0.0)` with prompt `model: gpt-5-codex medium` and
working directory `/codex/codex-rs`
(and more)  

Before this change, it said:
> Took the full-page screenshot you asked for. It shows a long,
horizontally repeating pattern of stylized people in orange, light-blue,
and mustard clothing, holding hands in alternating poses against a white
background. No text or other graphics-just rows of flat illustration
stretching off to the right.

Without this change, the Figma, Playwright, Chrome, and other visual MCP
servers are pretty much entirely useless.

I tested this change with the openai respones api as well as a third
party completions api
This commit is contained in:
Gabriel Peal
2025-10-27 14:55:57 -07:00
committed by GitHub
parent 67a219ffc2
commit b0bdc04c30
24 changed files with 749 additions and 108 deletions

View File

@@ -136,7 +136,7 @@ impl ConversationHistory {
call_id: call_id.clone(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
..Default::default()
},
},
));
@@ -183,7 +183,7 @@ impl ConversationHistory {
call_id: call_id.clone(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
..Default::default()
},
},
));
@@ -565,7 +565,7 @@ mod tests {
call_id: "call-1".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
..Default::default()
},
},
];
@@ -581,7 +581,7 @@ mod tests {
call_id: "call-2".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
..Default::default()
},
},
ResponseItem::FunctionCall {
@@ -615,7 +615,7 @@ mod tests {
call_id: "call-3".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
..Default::default()
},
},
];
@@ -848,7 +848,7 @@ mod tests {
call_id: "call-x".to_string(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
..Default::default()
},
},
]
@@ -925,7 +925,7 @@ mod tests {
call_id: "shell-1".to_string(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
..Default::default()
},
},
]
@@ -939,7 +939,7 @@ mod tests {
call_id: "orphan-1".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
..Default::default()
},
}];
let mut h = create_history_with_items(items);
@@ -979,7 +979,7 @@ mod tests {
call_id: "c2".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
..Default::default()
},
},
// Will get an inserted custom tool output
@@ -1021,7 +1021,7 @@ mod tests {
call_id: "c1".to_string(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
..Default::default()
},
},
ResponseItem::CustomToolCall {
@@ -1051,7 +1051,7 @@ mod tests {
call_id: "s1".to_string(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
..Default::default()
},
},
]
@@ -1116,7 +1116,7 @@ mod tests {
call_id: "orphan-1".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
..Default::default()
},
}];
let mut h = create_history_with_items(items);
@@ -1150,7 +1150,7 @@ mod tests {
call_id: "c2".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
..Default::default()
},
},
ResponseItem::CustomToolCall {