feat: make reasoning effort/summaries configurable (#1199)
Prior to this PR, we always set `reasoning` when making a request
using the Responses API:
https://github.com/openai/codex/blob/d7245cbbc9d8ff5446da45e5951761103492476d/codex-rs/core/src/client.rs#L108-L111
Though if you tried to use the Rust CLI with `--model gpt-4.1`, this
would fail with:
```shell
"Unsupported parameter: 'reasoning.effort' is not supported with this model."
```
We take a cue from the TypeScript CLI, which does a check on the model
name:
https://github.com/openai/codex/blob/d7245cbbc9d8ff5446da45e5951761103492476d/codex-cli/src/utils/agent/agent-loop.ts#L786-L789
This PR performs a similar check, and it also adds support for the following
config options:
```
model_reasoning_effort = "low" | "medium" | "high" | "none"
model_reasoning_summary = "auto" | "concise" | "detailed" | "none"
```
This way, if you have a model whose name happens to start with `"o"` (or
`"codex"`?), you can set these to `"none"` to explicitly disable
reasoning, if necessary. (That said, it seems unlikely anyone would use
the Responses API with non-OpenAI models, but we provide an escape
hatch, anyway.)
This PR also updates both the TUI and `codex exec` to show `reasoning
effort` and `reasoning summaries` in the header.
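For example, to explicitly disable reasoning for such a model, `config.toml` might contain the following (a minimal sketch; the two keys are the ones introduced above, and disabling both is illustrative):
```toml
# Hypothetical config.toml snippet: opt out of the reasoning parameters
# for a model whose name matches the prefix check but lacks support.
model_reasoning_effort = "none"
model_reasoning_summary = "none"
```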
use crate::config_types::ReasoningEffort as ReasoningEffortConfig;
use crate::config_types::ReasoningSummary as ReasoningSummaryConfig;
use crate::error::Result;
use crate::models::ResponseItem;
feat: show number of tokens remaining in UI (#1388)
When using the OpenAI Responses API, we now record the `usage` field for
a `"response.completed"` event, which includes metrics about the number
of tokens consumed. We also introduce `openai_model_info.rs`, which
includes current data about the most common OpenAI models available via
the API (specifically `context_window` and `max_output_tokens`). If
Codex does not recognize the model, you can set `model_context_window`
and `model_max_output_tokens` explicitly in `config.toml`.
We then introduce a new event type to `protocol.rs`, `TokenCount`,
which includes the `TokenUsage` for the most recent turn.
Finally, we update the TUI to record the running sum of tokens used so
the percentage of available context window remaining can be reported via
the placeholder text for the composer.
We could certainly get much fancier with this (such as reporting the
estimated cost of the conversation), but for now, we are just trying to
achieve feature parity with the TypeScript CLI.
Though arguably this improves upon the TypeScript CLI, as the TypeScript
CLI uses heuristics to estimate the number of tokens used rather than
using the `usage` information directly:
https://github.com/openai/codex/blob/296996d74e345b1b05d8c3451a06ace21c5ada96/codex-cli/src/utils/approximate-tokens-used.ts#L3-L16
Fixes https://github.com/openai/codex/issues/1242
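For a model Codex does not recognize, the overrides described above might look like this in `config.toml` (a sketch; the key names come from this PR, the numbers are purely illustrative):
```toml
# Hypothetical overrides for a model missing from openai_model_info.rs;
# the limits shown are placeholders, not real model data.
model_context_window = 128000
model_max_output_tokens = 16384
```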
use crate::protocol::TokenUsage;
fix: provide tolerance for apply_patch tool (#993)
As explained in detail in the doc comment for `ParseMode::Lenient`, we
have observed that GPT-4.1 does not always generate a valid invocation
of `apply_patch`. Fortunately, the error is predictable, so we introduce
some new logic to the `codex-apply-patch` crate to recover from this
error.
Because we would like to avoid this becoming a de facto standard (as it
would be incompatible if `apply_patch` were provided as an actual
executable, unless we also introduced the lenient behavior in the
executable), we require passing `ParseMode::Lenient` to
`parse_patch_text()` to make it clear that the caller is opting into
supporting this special case.
Note the analogous change to the TypeScript CLI was
https://github.com/openai/codex/pull/930. In addition to changing the
accepted input to `apply_patch`, it also introduced additional
instructions for the model, which we include in this PR.
Note that `apply-patch` does not depend on either `regex` or
`regex-lite`, so some of the checks are slightly more verbose to avoid
introducing this dependency.
That said, this PR does not leverage the existing
`extract_heredoc_body_from_apply_patch_command()`, which depends on
`tree-sitter` and `tree-sitter-bash`:
https://github.com/openai/codex/blob/5a5aa899143f9b9ef606692c401b010368b15bdb/codex-rs/apply-patch/src/lib.rs#L191-L246
though perhaps it should.
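As a rough sketch of how a caller opts into the lenient behavior (the exact signature and return type of `parse_patch_text` are assumptions here, not taken from the crate):
```rust
// Hypothetical call site: only `parse_patch_text` and `ParseMode::Lenient`
// are named in the PR description; the signature and error type are assumed.
use codex_apply_patch::{parse_patch_text, ParseMode};

fn parse_model_patch(patch_text: &str) {
    // Strict parsing stays the default elsewhere; Lenient is an explicit
    // opt-in that recovers from the predictable GPT-4.1 formatting error.
    match parse_patch_text(patch_text, ParseMode::Lenient) {
        Ok(_hunks) => { /* apply the parsed hunks */ }
        Err(err) => eprintln!("invalid patch: {err:?}"),
    }
}
```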
use codex_apply_patch::APPLY_PATCH_TOOL_INSTRUCTIONS;
use futures::Stream;
use serde::Serialize;
use std::borrow::Cow;
use std::collections::HashMap;
use std::pin::Pin;
use std::task::Context;
use std::task::Poll;
use tokio::sync::mpsc;
/// The `instructions` field in the payload sent to a model should always start
/// with this content.
const BASE_INSTRUCTIONS: &str = include_str!("../prompt.md");

/// API request payload for a single model turn.
#[derive(Default, Debug, Clone)]
pub struct Prompt {
    /// Conversation context input items.
    pub input: Vec<ResponseItem>,
    /// Optional previous response ID (when storage is enabled).
    pub prev_id: Option<String>,
    /// Optional instructions from the user to amend to the built-in agent
    /// instructions.
    pub user_instructions: Option<String>,
    /// Whether to store response on server side (disable_response_storage = !store).
    pub store: bool,

    /// Additional tools sourced from external MCP servers. Note each key is
    /// the "fully qualified" tool name (i.e., prefixed with the server name),
    /// which should be reported to the model in place of Tool::name.
    pub extra_tools: HashMap<String, mcp_types::Tool>,
}

impl Prompt {
    pub(crate) fn get_full_instructions(&self, model: &str) -> Cow<str> {
        let mut sections: Vec<&str> = vec![BASE_INSTRUCTIONS];
        if let Some(ref user) = self.user_instructions {
            sections.push(user);
        }
        if model.starts_with("gpt-4.1") {
            sections.push(APPLY_PATCH_TOOL_INSTRUCTIONS);
        }
        Cow::Owned(sections.join("\n"))
    }
}
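A minimal usage sketch of the layering above (a hypothetical in-crate test; the user instruction text and model name are illustrative):
```rust
// Illustrative: BASE_INSTRUCTIONS always leads, user instructions follow,
// and the apply_patch guidance is appended for gpt-4.1 models.
#[test]
fn full_instructions_are_layered() {
    let prompt = Prompt {
        user_instructions: Some("Prefer small, focused diffs.".to_string()),
        ..Default::default()
    };
    let instructions = prompt.get_full_instructions("gpt-4.1");
    assert!(instructions.starts_with(BASE_INSTRUCTIONS));
    assert!(instructions.ends_with(APPLY_PATCH_TOOL_INSTRUCTIONS));
}
```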
#[derive(Debug)]
pub enum ResponseEvent {
    OutputItemDone(ResponseItem),
    Completed {
        response_id: String,
        token_usage: Option<TokenUsage>,
    },
}

#[derive(Debug, Serialize)]
pub(crate) struct Reasoning {
    pub(crate) effort: OpenAiReasoningEffort,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub(crate) summary: Option<OpenAiReasoningSummary>,
}

/// See https://platform.openai.com/docs/guides/reasoning?api-mode=responses#get-started-with-reasoning
#[derive(Debug, Serialize, Default, Clone, Copy)]
#[serde(rename_all = "lowercase")]
pub(crate) enum OpenAiReasoningEffort {
    Low,
    #[default]
    Medium,
    High,
}

impl From<ReasoningEffortConfig> for Option<OpenAiReasoningEffort> {
    fn from(effort: ReasoningEffortConfig) -> Self {
        match effort {
            ReasoningEffortConfig::Low => Some(OpenAiReasoningEffort::Low),
            ReasoningEffortConfig::Medium => Some(OpenAiReasoningEffort::Medium),
            ReasoningEffortConfig::High => Some(OpenAiReasoningEffort::High),
            ReasoningEffortConfig::None => None,
        }
    }
}

/// A summary of the reasoning performed by the model. This can be useful for
/// debugging and understanding the model's reasoning process.
/// See https://platform.openai.com/docs/guides/reasoning?api-mode=responses#reasoning-summaries
#[derive(Debug, Serialize, Default, Clone, Copy)]
#[serde(rename_all = "lowercase")]
pub(crate) enum OpenAiReasoningSummary {
    #[default]
    Auto,
    Concise,
    Detailed,
}
impl From<ReasoningSummaryConfig> for Option<OpenAiReasoningSummary> {
    fn from(summary: ReasoningSummaryConfig) -> Self {
        match summary {
            ReasoningSummaryConfig::Auto => Some(OpenAiReasoningSummary::Auto),
            ReasoningSummaryConfig::Concise => Some(OpenAiReasoningSummary::Concise),
            ReasoningSummaryConfig::Detailed => Some(OpenAiReasoningSummary::Detailed),
            ReasoningSummaryConfig::None => None,
        }
    }
}

/// Request object that is serialized as JSON and POST'ed when using the
/// Responses API.
#[derive(Debug, Serialize)]
pub(crate) struct ResponsesApiRequest<'a> {
    pub(crate) model: &'a str,
    pub(crate) instructions: &'a str,
    // TODO(mbolin): ResponseItem::Other should not be serialized. Currently,
    // we code defensively to avoid this case, but perhaps we should use a
    // separate enum for serialization.
    pub(crate) input: &'a Vec<ResponseItem>,
    pub(crate) tools: &'a [serde_json::Value],
    pub(crate) tool_choice: &'static str,
    pub(crate) parallel_tool_calls: bool,
    pub(crate) reasoning: Option<Reasoning>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub(crate) previous_response_id: Option<String>,
    /// true when using the Responses API.
    pub(crate) store: bool,
    pub(crate) stream: bool,
}
pub(crate) fn create_reasoning_param_for_request(
    model: &str,
    effort: ReasoningEffortConfig,
    summary: ReasoningSummaryConfig,
) -> Option<Reasoning> {
    let effort: Option<OpenAiReasoningEffort> = effort.into();
    let effort = effort?;

    if model_supports_reasoning_summaries(model) {
        Some(Reasoning {
            effort,
            summary: summary.into(),
        })
    } else {
        None
    }
}

pub fn model_supports_reasoning_summaries(model: &str) -> bool {
    // Currently, we hardcode this rule to decide whether to enable reasoning.
    // We expect reasoning to apply only to OpenAI models, but we do not want
    // users to have to mess with their config to disable reasoning for models
    // that do not support it, such as `gpt-4.1`.
    //
    // Though if a user is using Codex with non-OpenAI models that, say, happen
    // to start with "o", then they can set `model_reasoning_effort = "none"` in
    // config.toml to disable reasoning.
    //
    // Ultimately, this should also be configurable in config.toml, but we
    // need to have defaults that "just work." Perhaps we could have a
    // "reasoning models pattern" as part of ModelProviderInfo?
    model.starts_with("o") || model.starts_with("codex")
}
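A hypothetical in-crate test sketching the resulting behavior (the model names are only examples of the prefix check):
```rust
// Illustrative: reasoning params are attached only when the model name looks
// like a reasoning-capable OpenAI model ("o*" or "codex*").
#[test]
fn reasoning_param_follows_model_name() {
    let gpt41 = create_reasoning_param_for_request(
        "gpt-4.1",
        ReasoningEffortConfig::Medium,
        ReasoningSummaryConfig::Auto,
    );
    assert!(gpt41.is_none());

    let o3 = create_reasoning_param_for_request(
        "o3",
        ReasoningEffortConfig::High,
        ReasoningSummaryConfig::Detailed,
    );
    assert!(o3.is_some());
}
```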
pub(crate) struct ResponseStream {
    pub(crate) rx_event: mpsc::Receiver<Result<ResponseEvent>>,
}

impl Stream for ResponseStream {
    type Item = Result<ResponseEvent>;

    fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
        self.rx_event.poll_recv(cx)
    }
}