feat: context compaction (#3446)

## Compact feature:
1. Stops the model when the context window become too large
2. Add a user turn, asking for the model to summarize
3. Build a bridge that contains all the previous user message + the
summary. Rendered from a template
4. Start sampling again from a clean conversation with only that bridge
This commit is contained in:
jif-oai
2025-09-12 13:07:10 -07:00
committed by GitHub
parent d4848e558b
commit ea225df22e
14 changed files with 1243 additions and 326 deletions

View File

@@ -104,6 +104,12 @@ impl ModelClient {
.or_else(|| get_model_info(&self.config.model_family).map(|info| info.context_window))
}
pub fn get_auto_compact_token_limit(&self) -> Option<i64> {
self.config.model_auto_compact_token_limit.or_else(|| {
get_model_info(&self.config.model_family).and_then(|info| info.auto_compact_token_limit)
})
}
/// Dispatches to either the Responses or Chat implementation depending on
/// the provider config. Public callers always invoke `stream()` the
/// specialised helpers are private to avoid accidental misuse.