feat: Complete LLMX v0.1.0 - Rebrand from Codex with LiteLLM Integration

This release represents a comprehensive transformation of the codebase from Codex to LLMX, enhanced with LiteLLM integration to support 100+ LLM providers through a unified API. ## Major Changes ### Phase 1: Repository & Infrastructure Setup - Established new repository structure and branching strategy - Created comprehensive project documentation (CLAUDE.md, LITELLM-SETUP.md) - Set up development environment and tooling configuration ### Phase 2: Rust Workspace Transformation - Renamed all Rust crates from `codex-*` to `llmx-*` (30+ crates) - Updated package names, binary names, and workspace members - Renamed core modules: codex.rs → llmx.rs, codex_delegate.rs → llmx_delegate.rs - Updated all internal references, imports, and type names - Renamed directories: codex-rs/ → llmx-rs/, codex-backend-openapi-models/ → llmx-backend-openapi-models/ - Fixed all Rust compilation errors after mass rename ### Phase 3: LiteLLM Integration - Integrated LiteLLM for multi-provider LLM support (Anthropic, OpenAI, Azure, Google AI, AWS Bedrock, etc.) - Implemented OpenAI-compatible Chat Completions API support - Added model family detection and provider-specific handling - Updated authentication to support LiteLLM API keys - Renamed environment variables: OPENAI_BASE_URL → LLMX_BASE_URL - Added LLMX_API_KEY for unified authentication - Enhanced error handling for Chat Completions API responses - Implemented fallback mechanisms between Responses API and Chat Completions API ### Phase 4: TypeScript/Node.js Components - Renamed npm package: @codex/codex-cli → @valknar/llmx - Updated TypeScript SDK to use new LLMX APIs and endpoints - Fixed all TypeScript compilation and linting errors - Updated SDK tests to support both API backends - Enhanced mock server to handle multiple API formats - Updated build scripts for cross-platform packaging ### Phase 5: Configuration & Documentation - Updated all configuration files to use LLMX naming - Rewrote README and documentation for LLMX branding - Updated config paths: ~/.codex/ → ~/.llmx/ - Added comprehensive LiteLLM setup guide - Updated all user-facing strings and help text - Created release plan and migration documentation ### Phase 6: Testing & Validation - Fixed all Rust tests for new naming scheme - Updated snapshot tests in TUI (36 frame files) - Fixed authentication storage tests - Updated Chat Completions payload and SSE tests - Fixed SDK tests for new API endpoints - Ensured compatibility with Claude Sonnet 4.5 model - Fixed test environment variables (LLMX_API_KEY, LLMX_BASE_URL) ### Phase 7: Build & Release Pipeline - Updated GitHub Actions workflows for LLMX binary names - Fixed rust-release.yml to reference llmx-rs/ instead of codex-rs/ - Updated CI/CD pipelines for new package names - Made Apple code signing optional in release workflow - Enhanced npm packaging resilience for partial platform builds - Added Windows sandbox support to workspace - Updated dotslash configuration for new binary names ### Phase 8: Final Polish - Renamed all assets (.github images, labels, templates) - Updated VSCode and DevContainer configurations - Fixed all clippy warnings and formatting issues - Applied cargo fmt and prettier formatting across codebase - Updated issue templates and pull request templates - Fixed all remaining UI text references ## Technical Details **Breaking Changes:** - Binary name changed from `codex` to `llmx` - Config directory changed from `~/.codex/` to `~/.llmx/` - Environment variables renamed (CODEX_* → LLMX_*) - npm package renamed to `@valknar/llmx` **New Features:** - Support for 100+ LLM providers via LiteLLM - Unified authentication with LLMX_API_KEY - Enhanced model provider detection and handling - Improved error handling and fallback mechanisms **Files Changed:** - 578 files modified across Rust, TypeScript, and documentation - 30+ Rust crates renamed and updated - Complete rebrand of UI, CLI, and documentation - All tests updated and passing **Dependencies:** - Updated Cargo.lock with new package names - Updated npm dependencies in llmx-cli - Enhanced OpenAPI models for LLMX backend This release establishes LLMX as a standalone project with comprehensive LiteLLM integration, maintaining full backward compatibility with existing functionality while opening support for a wide ecosystem of LLM providers. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Sebastian Krüger <support@pivoine.art>
2025-11-12 20:40:44 +01:00
parent 052b052832
commit 3c7efc58c8
1248 changed files with 10085 additions and 9580 deletions
--- a/llmx-rs/app-server/src/error_code.rs
+++ b/llmx-rs/app-server/src/error_code.rs
@@ -0,0 +1,2 @@
+pub(crate) const INVALID_REQUEST_ERROR_CODE: i64 = -32600;
+pub(crate) const INTERNAL_ERROR_CODE: i64 = -32603;
--- a/llmx-rs/app-server/src/fuzzy_file_search.rs
+++ b/llmx-rs/app-server/src/fuzzy_file_search.rs
@@ -0,0 +1,93 @@
+use std::num::NonZero;
+use std::num::NonZeroUsize;
+use std::path::Path;
+use std::path::PathBuf;
+use std::sync::Arc;
+use std::sync::atomic::AtomicBool;
+
+use llmx_app_server_protocol::FuzzyFileSearchResult;
+use llmx_file_search as file_search;
+use tokio::task::JoinSet;
+use tracing::warn;
+
+const LIMIT_PER_ROOT: usize = 50;
+const MAX_THREADS: usize = 12;
+const COMPUTE_INDICES: bool = true;
+
+pub(crate) async fn run_fuzzy_file_search(
+    query: String,
+    roots: Vec<String>,
+    cancellation_flag: Arc<AtomicBool>,
+) -> Vec<FuzzyFileSearchResult> {
+    #[expect(clippy::expect_used)]
+    let limit_per_root =
+        NonZero::new(LIMIT_PER_ROOT).expect("LIMIT_PER_ROOT should be a valid non-zero usize");
+
+    let cores = std::thread::available_parallelism()
+        .map(std::num::NonZero::get)
+        .unwrap_or(1);
+    let threads = cores.min(MAX_THREADS);
+    let threads_per_root = (threads / roots.len()).max(1);
+    let threads = NonZero::new(threads_per_root).unwrap_or(NonZeroUsize::MIN);
+
+    let mut files: Vec<FuzzyFileSearchResult> = Vec::new();
+    let mut join_set = JoinSet::new();
+
+    for root in roots {
+        let search_dir = PathBuf::from(&root);
+        let query = query.clone();
+        let cancel_flag = cancellation_flag.clone();
+        join_set.spawn_blocking(move || {
+            match file_search::run(
+                query.as_str(),
+                limit_per_root,
+                &search_dir,
+                Vec::new(),
+                threads,
+                cancel_flag,
+                COMPUTE_INDICES,
+                true,
+            ) {
+                Ok(res) => Ok((root, res)),
+                Err(err) => Err((root, err)),
+            }
+        });
+    }
+
+    while let Some(res) = join_set.join_next().await {
+        match res {
+            Ok(Ok((root, res))) => {
+                for m in res.matches {
+                    let path = m.path;
+                    //TODO(shijie): Move file name generation to file_search lib.
+                    let file_name = Path::new(&path)
+                        .file_name()
+                        .map(|name| name.to_string_lossy().into_owned())
+                        .unwrap_or_else(|| path.clone());
+                    let result = FuzzyFileSearchResult {
+                        root: root.clone(),
+                        path,
+                        file_name,
+                        score: m.score,
+                        indices: m.indices,
+                    };
+                    files.push(result);
+                }
+            }
+            Ok(Err((root, err))) => {
+                warn!("fuzzy-file-search in dir '{root}' failed: {err}");
+            }
+            Err(err) => {
+                warn!("fuzzy-file-search join_next failed: {err}");
+            }
+        }
+    }
+
+    files.sort_by(file_search::cmp_by_score_desc_then_path_asc::<
+        FuzzyFileSearchResult,
+        _,
+        _,
+    >(|f| f.score, |f| f.path.as_str()));
+
+    files
+}
--- a/llmx-rs/app-server/src/lib.rs
+++ b/llmx-rs/app-server/src/lib.rs
@@ -0,0 +1,172 @@
+#![deny(clippy::print_stdout, clippy::print_stderr)]
+
+use llmx_common::CliConfigOverrides;
+use llmx_core::config::Config;
+use llmx_core::config::ConfigOverrides;
+use opentelemetry_appender_tracing::layer::OpenTelemetryTracingBridge;
+use std::io::ErrorKind;
+use std::io::Result as IoResult;
+use std::path::PathBuf;
+
+use crate::message_processor::MessageProcessor;
+use crate::outgoing_message::OutgoingMessage;
+use crate::outgoing_message::OutgoingMessageSender;
+use llmx_app_server_protocol::JSONRPCMessage;
+use llmx_feedback::LlmxFeedback;
+use tokio::io::AsyncBufReadExt;
+use tokio::io::AsyncWriteExt;
+use tokio::io::BufReader;
+use tokio::io::{self};
+use tokio::sync::mpsc;
+use tracing::Level;
+use tracing::debug;
+use tracing::error;
+use tracing::info;
+use tracing_subscriber::EnvFilter;
+use tracing_subscriber::Layer;
+use tracing_subscriber::filter::Targets;
+use tracing_subscriber::layer::SubscriberExt;
+use tracing_subscriber::util::SubscriberInitExt;
+
+mod error_code;
+mod fuzzy_file_search;
+mod llmx_message_processor;
+mod message_processor;
+mod models;
+mod outgoing_message;
+
+/// Size of the bounded channels used to communicate between tasks. The value
+/// is a balance between throughput and memory usage – 128 messages should be
+/// plenty for an interactive CLI.
+const CHANNEL_CAPACITY: usize = 128;
+
+pub async fn run_main(
+    llmx_linux_sandbox_exe: Option<PathBuf>,
+    cli_config_overrides: CliConfigOverrides,
+) -> IoResult<()> {
+    // Set up channels.
+    let (incoming_tx, mut incoming_rx) = mpsc::channel::<JSONRPCMessage>(CHANNEL_CAPACITY);
+    let (outgoing_tx, mut outgoing_rx) = mpsc::unbounded_channel::<OutgoingMessage>();
+
+    // Task: read from stdin, push to `incoming_tx`.
+    let stdin_reader_handle = tokio::spawn({
+        async move {
+            let stdin = io::stdin();
+            let reader = BufReader::new(stdin);
+            let mut lines = reader.lines();
+
+            while let Some(line) = lines.next_line().await.unwrap_or_default() {
+                match serde_json::from_str::<JSONRPCMessage>(&line) {
+                    Ok(msg) => {
+                        if incoming_tx.send(msg).await.is_err() {
+                            // Receiver gone – nothing left to do.
+                            break;
+                        }
+                    }
+                    Err(e) => error!("Failed to deserialize JSONRPCMessage: {e}"),
+                }
+            }
+
+            debug!("stdin reader finished (EOF)");
+        }
+    });
+
+    // Parse CLI overrides once and derive the base Config eagerly so later
+    // components do not need to work with raw TOML values.
+    let cli_kv_overrides = cli_config_overrides.parse_overrides().map_err(|e| {
+        std::io::Error::new(
+            ErrorKind::InvalidInput,
+            format!("error parsing -c overrides: {e}"),
+        )
+    })?;
+    let config = Config::load_with_cli_overrides(cli_kv_overrides, ConfigOverrides::default())
+        .await
+        .map_err(|e| {
+            std::io::Error::new(ErrorKind::InvalidData, format!("error loading config: {e}"))
+        })?;
+
+    let feedback = LlmxFeedback::new();
+
+    let otel =
+        llmx_core::otel_init::build_provider(&config, env!("CARGO_PKG_VERSION")).map_err(|e| {
+            std::io::Error::new(
+                ErrorKind::InvalidData,
+                format!("error loading otel config: {e}"),
+            )
+        })?;
+
+    // Install a simple subscriber so `tracing` output is visible.  Users can
+    // control the log level with `RUST_LOG`.
+    let stderr_fmt = tracing_subscriber::fmt::layer()
+        .with_writer(std::io::stderr)
+        .with_filter(EnvFilter::from_default_env());
+
+    let feedback_layer = tracing_subscriber::fmt::layer()
+        .with_writer(feedback.make_writer())
+        .with_ansi(false)
+        .with_target(false)
+        .with_filter(Targets::new().with_default(Level::TRACE));
+
+    let _ = tracing_subscriber::registry()
+        .with(stderr_fmt)
+        .with(feedback_layer)
+        .with(otel.as_ref().map(|provider| {
+            OpenTelemetryTracingBridge::new(&provider.logger).with_filter(
+                tracing_subscriber::filter::filter_fn(llmx_core::otel_init::llmx_export_filter),
+            )
+        }))
+        .try_init();
+
+    // Task: process incoming messages.
+    let processor_handle = tokio::spawn({
+        let outgoing_message_sender = OutgoingMessageSender::new(outgoing_tx);
+        let mut processor = MessageProcessor::new(
+            outgoing_message_sender,
+            llmx_linux_sandbox_exe,
+            std::sync::Arc::new(config),
+            feedback.clone(),
+        );
+        async move {
+            while let Some(msg) = incoming_rx.recv().await {
+                match msg {
+                    JSONRPCMessage::Request(r) => processor.process_request(r).await,
+                    JSONRPCMessage::Response(r) => processor.process_response(r).await,
+                    JSONRPCMessage::Notification(n) => processor.process_notification(n).await,
+                    JSONRPCMessage::Error(e) => processor.process_error(e),
+                }
+            }
+
+            info!("processor task exited (channel closed)");
+        }
+    });
+
+    // Task: write outgoing messages to stdout.
+    let stdout_writer_handle = tokio::spawn(async move {
+        let mut stdout = io::stdout();
+        while let Some(outgoing_message) = outgoing_rx.recv().await {
+            let Ok(value) = serde_json::to_value(outgoing_message) else {
+                error!("Failed to convert OutgoingMessage to JSON value");
+                continue;
+            };
+            match serde_json::to_string(&value) {
+                Ok(mut json) => {
+                    json.push('\n');
+                    if let Err(e) = stdout.write_all(json.as_bytes()).await {
+                        error!("Failed to write to stdout: {e}");
+                        break;
+                    }
+                }
+                Err(e) => error!("Failed to serialize JSONRPCMessage: {e}"),
+            }
+        }
+
+        info!("stdout writer exited (channel closed)");
+    });
+
+    // Wait for all tasks to finish.  The typical exit path is the stdin reader
+    // hitting EOF which, once it drops `incoming_tx`, propagates shutdown to
+    // the processor and then to the stdout task.
+    let _ = tokio::join!(stdin_reader_handle, processor_handle, stdout_writer_handle);
+
+    Ok(())
+}
--- a/llmx-rs/app-server/src/llmx_message_processor.rs
+++ b/llmx-rs/app-server/src/llmx_message_processor.rs
--- a/llmx-rs/app-server/src/main.rs
+++ b/llmx-rs/app-server/src/main.rs
@@ -0,0 +1,10 @@
+use llmx_app_server::run_main;
+use llmx_arg0::arg0_dispatch_or_else;
+use llmx_common::CliConfigOverrides;
+
+fn main() -> anyhow::Result<()> {
+    arg0_dispatch_or_else(|llmx_linux_sandbox_exe| async move {
+        run_main(llmx_linux_sandbox_exe, CliConfigOverrides::default()).await?;
+        Ok(())
+    })
+}
--- a/llmx-rs/app-server/src/message_processor.rs
+++ b/llmx-rs/app-server/src/message_processor.rs
@@ -0,0 +1,159 @@
+use std::path::PathBuf;
+
+use crate::error_code::INVALID_REQUEST_ERROR_CODE;
+use crate::llmx_message_processor::LlmxMessageProcessor;
+use crate::outgoing_message::OutgoingMessageSender;
+use llmx_app_server_protocol::ClientInfo;
+use llmx_app_server_protocol::ClientRequest;
+use llmx_app_server_protocol::InitializeResponse;
+
+use llmx_app_server_protocol::JSONRPCError;
+use llmx_app_server_protocol::JSONRPCErrorError;
+use llmx_app_server_protocol::JSONRPCNotification;
+use llmx_app_server_protocol::JSONRPCRequest;
+use llmx_app_server_protocol::JSONRPCResponse;
+use llmx_core::AuthManager;
+use llmx_core::ConversationManager;
+use llmx_core::config::Config;
+use llmx_core::default_client::USER_AGENT_SUFFIX;
+use llmx_core::default_client::get_llmx_user_agent;
+use llmx_feedback::LlmxFeedback;
+use llmx_protocol::protocol::SessionSource;
+use std::sync::Arc;
+
+pub(crate) struct MessageProcessor {
+    outgoing: Arc<OutgoingMessageSender>,
+    llmx_message_processor: LlmxMessageProcessor,
+    initialized: bool,
+}
+
+impl MessageProcessor {
+    /// Create a new `MessageProcessor`, retaining a handle to the outgoing
+    /// `Sender` so handlers can enqueue messages to be written to stdout.
+    pub(crate) fn new(
+        outgoing: OutgoingMessageSender,
+        llmx_linux_sandbox_exe: Option<PathBuf>,
+        config: Arc<Config>,
+        feedback: LlmxFeedback,
+    ) -> Self {
+        let outgoing = Arc::new(outgoing);
+        let auth_manager = AuthManager::shared(
+            config.llmx_home.clone(),
+            false,
+            config.cli_auth_credentials_store_mode,
+        );
+        let conversation_manager = Arc::new(ConversationManager::new(
+            auth_manager.clone(),
+            SessionSource::VSCode,
+        ));
+        let llmx_message_processor = LlmxMessageProcessor::new(
+            auth_manager,
+            conversation_manager,
+            outgoing.clone(),
+            llmx_linux_sandbox_exe,
+            config,
+            feedback,
+        );
+
+        Self {
+            outgoing,
+            llmx_message_processor,
+            initialized: false,
+        }
+    }
+
+    pub(crate) async fn process_request(&mut self, request: JSONRPCRequest) {
+        let request_id = request.id.clone();
+        let request_json = match serde_json::to_value(&request) {
+            Ok(request_json) => request_json,
+            Err(err) => {
+                let error = JSONRPCErrorError {
+                    code: INVALID_REQUEST_ERROR_CODE,
+                    message: format!("Invalid request: {err}"),
+                    data: None,
+                };
+                self.outgoing.send_error(request_id, error).await;
+                return;
+            }
+        };
+
+        let llmx_request = match serde_json::from_value::<ClientRequest>(request_json) {
+            Ok(llmx_request) => llmx_request,
+            Err(err) => {
+                let error = JSONRPCErrorError {
+                    code: INVALID_REQUEST_ERROR_CODE,
+                    message: format!("Invalid request: {err}"),
+                    data: None,
+                };
+                self.outgoing.send_error(request_id, error).await;
+                return;
+            }
+        };
+
+        match llmx_request {
+            // Handle Initialize internally so LlmxMessageProcessor does not have to concern
+            // itself with the `initialized` bool.
+            ClientRequest::Initialize { request_id, params } => {
+                if self.initialized {
+                    let error = JSONRPCErrorError {
+                        code: INVALID_REQUEST_ERROR_CODE,
+                        message: "Already initialized".to_string(),
+                        data: None,
+                    };
+                    self.outgoing.send_error(request_id, error).await;
+                    return;
+                } else {
+                    let ClientInfo {
+                        name,
+                        title: _title,
+                        version,
+                    } = params.client_info;
+                    let user_agent_suffix = format!("{name}; {version}");
+                    if let Ok(mut suffix) = USER_AGENT_SUFFIX.lock() {
+                        *suffix = Some(user_agent_suffix);
+                    }
+
+                    let user_agent = get_llmx_user_agent();
+                    let response = InitializeResponse { user_agent };
+                    self.outgoing.send_response(request_id, response).await;
+
+                    self.initialized = true;
+                    return;
+                }
+            }
+            _ => {
+                if !self.initialized {
+                    let error = JSONRPCErrorError {
+                        code: INVALID_REQUEST_ERROR_CODE,
+                        message: "Not initialized".to_string(),
+                        data: None,
+                    };
+                    self.outgoing.send_error(request_id, error).await;
+                    return;
+                }
+            }
+        }
+
+        self.llmx_message_processor
+            .process_request(llmx_request)
+            .await;
+    }
+
+    pub(crate) async fn process_notification(&self, notification: JSONRPCNotification) {
+        // Currently, we do not expect to receive any notifications from the
+        // client, so we just log them.
+        tracing::info!("<- notification: {:?}", notification);
+    }
+
+    /// Handle a standalone JSON-RPC response originating from the peer.
+    pub(crate) async fn process_response(&mut self, response: JSONRPCResponse) {
+        tracing::info!("<- response: {:?}", response);
+        let JSONRPCResponse { id, result, .. } = response;
+        self.outgoing.notify_client_response(id, result).await
+    }
+
+    /// Handle an error object received from the peer.
+    pub(crate) fn process_error(&mut self, err: JSONRPCError) {
+        tracing::error!("<- error: {:?}", err);
+    }
+}
--- a/llmx-rs/app-server/src/models.rs
+++ b/llmx-rs/app-server/src/models.rs
@@ -0,0 +1,39 @@
+use llmx_app_server_protocol::AuthMode;
+use llmx_app_server_protocol::Model;
+use llmx_app_server_protocol::ReasoningEffortOption;
+use llmx_common::model_presets::ModelPreset;
+use llmx_common::model_presets::ReasoningEffortPreset;
+use llmx_common::model_presets::builtin_model_presets;
+
+pub fn supported_models(auth_mode: Option<AuthMode>) -> Vec<Model> {
+    builtin_model_presets(auth_mode)
+        .into_iter()
+        .map(model_from_preset)
+        .collect()
+}
+
+fn model_from_preset(preset: ModelPreset) -> Model {
+    Model {
+        id: preset.id.to_string(),
+        model: preset.model.to_string(),
+        display_name: preset.display_name.to_string(),
+        description: preset.description.to_string(),
+        supported_reasoning_efforts: reasoning_efforts_from_preset(
+            preset.supported_reasoning_efforts,
+        ),
+        default_reasoning_effort: preset.default_reasoning_effort,
+        is_default: preset.is_default,
+    }
+}
+
+fn reasoning_efforts_from_preset(
+    efforts: &'static [ReasoningEffortPreset],
+) -> Vec<ReasoningEffortOption> {
+    efforts
+        .iter()
+        .map(|preset| ReasoningEffortOption {
+            reasoning_effort: preset.effort,
+            description: preset.description.to_string(),
+        })
+        .collect()
+}
--- a/llmx-rs/app-server/src/outgoing_message.rs
+++ b/llmx-rs/app-server/src/outgoing_message.rs
@@ -0,0 +1,261 @@
+use std::collections::HashMap;
+use std::sync::atomic::AtomicI64;
+use std::sync::atomic::Ordering;
+
+use llmx_app_server_protocol::JSONRPCErrorError;
+use llmx_app_server_protocol::RequestId;
+use llmx_app_server_protocol::Result;
+use llmx_app_server_protocol::ServerNotification;
+use llmx_app_server_protocol::ServerRequest;
+use llmx_app_server_protocol::ServerRequestPayload;
+use serde::Serialize;
+use tokio::sync::Mutex;
+use tokio::sync::mpsc;
+use tokio::sync::oneshot;
+use tracing::warn;
+
+use crate::error_code::INTERNAL_ERROR_CODE;
+
+/// Sends messages to the client and manages request callbacks.
+pub(crate) struct OutgoingMessageSender {
+    next_request_id: AtomicI64,
+    sender: mpsc::UnboundedSender<OutgoingMessage>,
+    request_id_to_callback: Mutex<HashMap<RequestId, oneshot::Sender<Result>>>,
+}
+
+impl OutgoingMessageSender {
+    pub(crate) fn new(sender: mpsc::UnboundedSender<OutgoingMessage>) -> Self {
+        Self {
+            next_request_id: AtomicI64::new(0),
+            sender,
+            request_id_to_callback: Mutex::new(HashMap::new()),
+        }
+    }
+
+    pub(crate) async fn send_request(
+        &self,
+        request: ServerRequestPayload,
+    ) -> oneshot::Receiver<Result> {
+        let id = RequestId::Integer(self.next_request_id.fetch_add(1, Ordering::Relaxed));
+        let outgoing_message_id = id.clone();
+        let (tx_approve, rx_approve) = oneshot::channel();
+        {
+            let mut request_id_to_callback = self.request_id_to_callback.lock().await;
+            request_id_to_callback.insert(id, tx_approve);
+        }
+
+        let outgoing_message =
+            OutgoingMessage::Request(request.request_with_id(outgoing_message_id));
+        let _ = self.sender.send(outgoing_message);
+        rx_approve
+    }
+
+    pub(crate) async fn notify_client_response(&self, id: RequestId, result: Result) {
+        let entry = {
+            let mut request_id_to_callback = self.request_id_to_callback.lock().await;
+            request_id_to_callback.remove_entry(&id)
+        };
+
+        match entry {
+            Some((id, sender)) => {
+                if let Err(err) = sender.send(result) {
+                    warn!("could not notify callback for {id:?} due to: {err:?}");
+                }
+            }
+            None => {
+                warn!("could not find callback for {id:?}");
+            }
+        }
+    }
+
+    pub(crate) async fn send_response<T: Serialize>(&self, id: RequestId, response: T) {
+        match serde_json::to_value(response) {
+            Ok(result) => {
+                let outgoing_message = OutgoingMessage::Response(OutgoingResponse { id, result });
+                let _ = self.sender.send(outgoing_message);
+            }
+            Err(err) => {
+                self.send_error(
+                    id,
+                    JSONRPCErrorError {
+                        code: INTERNAL_ERROR_CODE,
+                        message: format!("failed to serialize response: {err}"),
+                        data: None,
+                    },
+                )
+                .await;
+            }
+        }
+    }
+
+    pub(crate) async fn send_server_notification(&self, notification: ServerNotification) {
+        let _ = self
+            .sender
+            .send(OutgoingMessage::AppServerNotification(notification));
+    }
+
+    /// All notifications should be migrated to [`ServerNotification`] and
+    /// [`OutgoingMessage::Notification`] should be removed.
+    pub(crate) async fn send_notification(&self, notification: OutgoingNotification) {
+        let outgoing_message = OutgoingMessage::Notification(notification);
+        let _ = self.sender.send(outgoing_message);
+    }
+
+    pub(crate) async fn send_error(&self, id: RequestId, error: JSONRPCErrorError) {
+        let outgoing_message = OutgoingMessage::Error(OutgoingError { id, error });
+        let _ = self.sender.send(outgoing_message);
+    }
+}
+
+/// Outgoing message from the server to the client.
+#[derive(Debug, Clone, Serialize)]
+#[serde(untagged)]
+pub(crate) enum OutgoingMessage {
+    Request(ServerRequest),
+    Notification(OutgoingNotification),
+    /// AppServerNotification is specific to the case where this is run as an
+    /// "app server" as opposed to an MCP server.
+    AppServerNotification(ServerNotification),
+    Response(OutgoingResponse),
+    Error(OutgoingError),
+}
+
+#[derive(Debug, Clone, PartialEq, Serialize)]
+pub(crate) struct OutgoingNotification {
+    pub method: String,
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub params: Option<serde_json::Value>,
+}
+
+#[derive(Debug, Clone, PartialEq, Serialize)]
+pub(crate) struct OutgoingResponse {
+    pub id: RequestId,
+    pub result: Result,
+}
+
+#[derive(Debug, Clone, PartialEq, Serialize)]
+pub(crate) struct OutgoingError {
+    pub error: JSONRPCErrorError,
+    pub id: RequestId,
+}
+
+#[cfg(test)]
+mod tests {
+    use llmx_app_server_protocol::AccountLoginCompletedNotification;
+    use llmx_app_server_protocol::AccountRateLimitsUpdatedNotification;
+    use llmx_app_server_protocol::AccountUpdatedNotification;
+    use llmx_app_server_protocol::AuthMode;
+    use llmx_app_server_protocol::LoginChatGptCompleteNotification;
+    use llmx_app_server_protocol::RateLimitSnapshot;
+    use llmx_app_server_protocol::RateLimitWindow;
+    use pretty_assertions::assert_eq;
+    use serde_json::json;
+    use uuid::Uuid;
+
+    use super::*;
+
+    #[test]
+    fn verify_server_notification_serialization() {
+        let notification =
+            ServerNotification::LoginChatGptComplete(LoginChatGptCompleteNotification {
+                login_id: Uuid::nil(),
+                success: true,
+                error: None,
+            });
+
+        let jsonrpc_notification = OutgoingMessage::AppServerNotification(notification);
+        assert_eq!(
+            json!({
+                "method": "loginChatGptComplete",
+                "params": {
+                    "loginId": Uuid::nil(),
+                    "success": true,
+                    "error": null,
+                },
+            }),
+            serde_json::to_value(jsonrpc_notification)
+                .expect("ensure the strum macros serialize the method field correctly"),
+            "ensure the strum macros serialize the method field correctly"
+        );
+    }
+
+    #[test]
+    fn verify_account_login_completed_notification_serialization() {
+        let notification =
+            ServerNotification::AccountLoginCompleted(AccountLoginCompletedNotification {
+                login_id: Some(Uuid::nil().to_string()),
+                success: true,
+                error: None,
+            });
+
+        let jsonrpc_notification = OutgoingMessage::AppServerNotification(notification);
+        assert_eq!(
+            json!({
+                "method": "account/login/completed",
+                "params": {
+                    "loginId": Uuid::nil().to_string(),
+                    "success": true,
+                    "error": null,
+                },
+            }),
+            serde_json::to_value(jsonrpc_notification)
+                .expect("ensure the notification serializes correctly"),
+            "ensure the notification serializes correctly"
+        );
+    }
+
+    #[test]
+    fn verify_account_rate_limits_notification_serialization() {
+        let notification =
+            ServerNotification::AccountRateLimitsUpdated(AccountRateLimitsUpdatedNotification {
+                rate_limits: RateLimitSnapshot {
+                    primary: Some(RateLimitWindow {
+                        used_percent: 25,
+                        window_duration_mins: Some(15),
+                        resets_at: Some(123),
+                    }),
+                    secondary: None,
+                },
+            });
+
+        let jsonrpc_notification = OutgoingMessage::AppServerNotification(notification);
+        assert_eq!(
+            json!({
+                "method": "account/rateLimits/updated",
+                "params": {
+                    "rateLimits": {
+                        "primary": {
+                            "usedPercent": 25,
+                            "windowDurationMins": 15,
+                            "resetsAt": 123
+                        },
+                        "secondary": null
+                    }
+                },
+            }),
+            serde_json::to_value(jsonrpc_notification)
+                .expect("ensure the notification serializes correctly"),
+            "ensure the notification serializes correctly"
+        );
+    }
+
+    #[test]
+    fn verify_account_updated_notification_serialization() {
+        let notification = ServerNotification::AccountUpdated(AccountUpdatedNotification {
+            auth_mode: Some(AuthMode::ApiKey),
+        });
+
+        let jsonrpc_notification = OutgoingMessage::AppServerNotification(notification);
+        assert_eq!(
+            json!({
+                "method": "account/updated",
+                "params": {
+                    "authMode": "apikey"
+                },
+            }),
+            serde_json::to_value(jsonrpc_notification)
+                .expect("ensure the notification serializes correctly"),
+            "ensure the notification serializes correctly"
+        );
+    }
+}