fix: agent instructions were not being included when ~/.codex/instructions.md was empty (#908)
I had seen issues where `codex-rs` would not always write files without me pressuring it to do so, and between that and the report of https://github.com/openai/codex/issues/900, I decided to look into this further. I found two serious issues with agent instructions: (1) We were only sending agent instructions on the first turn, but looking at the TypeScript code, we should be sending them on every turn. (2) There was a serious issue where the agent instructions were frequently lost: * The TypeScript CLI appears to keep writing `~/.codex/instructions.md`:55142e3e6c/codex-cli/src/utils/config.ts (L586)* If `instructions.md` is present, the Rust CLI uses the contents of it INSTEAD OF the default prompt, even if `instructions.md` is empty:55142e3e6c/codex-rs/core/src/config.rs (L202-L203)The combination of these two things means that I have been using `codex-rs` without these key instructions: https://github.com/openai/codex/blob/main/codex-rs/core/prompt.md Looking at the TypeScript code, it appears we should be concatenating these three items every time (if they exist): * `prompt.md` * `~/.codex/instructions.md` * nearest `AGENTS.md` This PR fixes things so that: * `Config.instructions` is `None` if `instructions.md` is empty * `Payload.instructions` is now `&'a str` instead of `Option<&'a String>` because we should always have _something_ to send * `Prompt` now has a `get_full_instructions()` helper that returns a `Cow<str>` that will always include the agent instructions first.
This commit is contained in:
@@ -2,12 +2,17 @@ use crate::error::Result;
|
||||
use crate::models::ResponseItem;
|
||||
use futures::Stream;
|
||||
use serde::Serialize;
|
||||
use std::borrow::Cow;
|
||||
use std::collections::HashMap;
|
||||
use std::pin::Pin;
|
||||
use std::task::Context;
|
||||
use std::task::Poll;
|
||||
use tokio::sync::mpsc;
|
||||
|
||||
/// The `instructions` field in the payload sent to a model should always start
|
||||
/// with this content.
|
||||
const BASE_INSTRUCTIONS: &str = include_str!("../prompt.md");
|
||||
|
||||
/// API request payload for a single model turn.
|
||||
#[derive(Default, Debug, Clone)]
|
||||
pub struct Prompt {
|
||||
@@ -15,7 +20,8 @@ pub struct Prompt {
|
||||
pub input: Vec<ResponseItem>,
|
||||
/// Optional previous response ID (when storage is enabled).
|
||||
pub prev_id: Option<String>,
|
||||
/// Optional initial instructions (only sent on first turn).
|
||||
/// Optional instructions from the user to amend to the built-in agent
|
||||
/// instructions.
|
||||
pub instructions: Option<String>,
|
||||
/// Whether to store response on server side (disable_response_storage = !store).
|
||||
pub store: bool,
|
||||
@@ -26,6 +32,18 @@ pub struct Prompt {
|
||||
pub extra_tools: HashMap<String, mcp_types::Tool>,
|
||||
}
|
||||
|
||||
impl Prompt {
|
||||
pub(crate) fn get_full_instructions(&self) -> Cow<str> {
|
||||
match &self.instructions {
|
||||
Some(instructions) => {
|
||||
let instructions = format!("{BASE_INSTRUCTIONS}\n{instructions}");
|
||||
Cow::Owned(instructions)
|
||||
}
|
||||
None => Cow::Borrowed(BASE_INSTRUCTIONS),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Debug)]
|
||||
pub enum ResponseEvent {
|
||||
OutputItemDone(ResponseItem),
|
||||
@@ -54,8 +72,7 @@ pub(crate) enum Summary {
|
||||
#[derive(Debug, Serialize)]
|
||||
pub(crate) struct Payload<'a> {
|
||||
pub(crate) model: &'a str,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub(crate) instructions: Option<&'a String>,
|
||||
pub(crate) instructions: &'a str,
|
||||
// TODO(mbolin): ResponseItem::Other should not be serialized. Currently,
|
||||
// we code defensively to avoid this case, but perhaps we should use a
|
||||
// separate enum for serialization.
|
||||
|
||||
Reference in New Issue
Block a user