Files
llmx/codex-rs/core/src/exec_env.rs

197 lines
6.5 KiB
Rust
Raw Normal View History

feat: introduce support for shell_environment_policy in config.toml (#1061) To date, when handling `shell` and `local_shell` tool calls, we were spawning new processes using the environment inherited from the Codex process itself. This means that the sensitive `OPENAI_API_KEY` that Codex needs to talk to OpenAI models was made available to everything run by `shell` and `local_shell`. While there are cases where that might be useful, it does not seem like a good default. This PR introduces a complex `shell_environment_policy` config option to control the `env` used with these tool calls. It is inevitably a bit complex so that it is possible to override individual components of the policy so without having to restate the entire thing. Details are in the updated `README.md` in this PR, but here is the relevant bit that explains the individual fields of `shell_environment_policy`: | Field | Type | Default | Description | | ------------------------- | -------------------------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------- | | `inherit` | string | `core` | Starting template for the environment:<br>`core` (`HOME`, `PATH`, `USER`, …), `all` (clone full parent env), or `none` (start empty). | | `ignore_default_excludes` | boolean | `false` | When `false`, Codex removes any var whose **name** contains `KEY`, `SECRET`, or `TOKEN` (case-insensitive) before other rules run. | | `exclude` | array&lt;string&gt; | `[]` | Case-insensitive glob patterns to drop after the default filter.<br>Examples: `"AWS_*"`, `"AZURE_*"`. | | `set` | table&lt;string,string&gt; | `{}` | Explicit key/value overrides or additions – always win over inherited values. | | `include_only` | array&lt;string&gt; | `[]` | If non-empty, a whitelist of patterns; only variables that match _one_ pattern survive the final step. (Generally used with `inherit = "all"`.) | In particular, note that the default is `inherit = "core"`, so: * if you have extra env variables that you want to inherit from the parent process, use `inherit = "all"` and then specify `include_only` * if you have extra env variables where you want to hardcode the values, the default `inherit = "core"` will work fine, but then you need to specify `set` This configuration is not battle-tested, so we will probably still have to play with it a bit. `core/src/exec_env.rs` has the critical business logic as well as unit tests. Though if nothing else, previous to this change: ``` $ cargo run --bin codex -- debug seatbelt -- printenv OPENAI_API_KEY # ...prints OPENAI_API_KEY... ``` But after this change it does not print anything (as desired). One final thing to call out about this PR is that the `configure_command!` macro we use in `core/src/exec.rs` has to do some complex logic with respect to how it builds up the `env` for the process being spawned under Landlock/seccomp. Specifically, doing `cmd.env_clear()` followed by `cmd.envs(&$env_map)` (which is arguably the most intuitive way to do it) caused the Landlock unit tests to fail because the processes spawned by the unit tests started failing in unexpected ways! If we forgo `env_clear()` in favor of updating env vars one at a time, the tests still pass. The comment in the code talks about this a bit, and while I would like to investigate this more, I need to move on for the moment, but I do plan to come back to it to fully understand what is going on. For example, this suggests that we might not be able to spawn a C program that calls `env_clear()`, which would be...weird. We may still have to fiddle with our Landlock config if that is the case.
2025-05-22 09:51:19 -07:00
use crate::config_types::EnvironmentVariablePattern;
use crate::config_types::ShellEnvironmentPolicy;
use crate::config_types::ShellEnvironmentPolicyInherit;
use std::collections::HashMap;
use std::collections::HashSet;
/// Construct an environment map based on the rules in the specified policy. The
/// resulting map can be passed directly to `Command::envs()` after calling
/// `env_clear()` to ensure no unintended variables are leaked to the spawned
/// process.
///
/// The derivation follows the algorithm documented in the struct-level comment
/// for [`ShellEnvironmentPolicy`].
pub fn create_env(policy: &ShellEnvironmentPolicy) -> HashMap<String, String> {
populate_env(std::env::vars(), policy)
}
fn populate_env<I>(vars: I, policy: &ShellEnvironmentPolicy) -> HashMap<String, String>
where
I: IntoIterator<Item = (String, String)>,
{
// Step 1 determine the starting set of variables based on the
// `inherit` strategy.
let mut env_map: HashMap<String, String> = match policy.inherit {
ShellEnvironmentPolicyInherit::All => vars.into_iter().collect(),
ShellEnvironmentPolicyInherit::None => HashMap::new(),
ShellEnvironmentPolicyInherit::Core => {
const CORE_VARS: &[&str] = &[
"HOME", "LOGNAME", "PATH", "SHELL", "USER", "USERNAME", "TMPDIR", "TEMP", "TMP",
];
let allow: HashSet<&str> = CORE_VARS.iter().copied().collect();
vars.into_iter()
.filter(|(k, _)| allow.contains(k.as_str()))
.collect()
}
};
// Internal helper does `name` match **any** pattern in `patterns`?
let matches_any = |name: &str, patterns: &[EnvironmentVariablePattern]| -> bool {
patterns.iter().any(|pattern| pattern.matches(name))
};
// Step 2 Apply the default exclude if not disabled.
if !policy.ignore_default_excludes {
let default_excludes = vec![
EnvironmentVariablePattern::new_case_insensitive("*KEY*"),
EnvironmentVariablePattern::new_case_insensitive("*SECRET*"),
EnvironmentVariablePattern::new_case_insensitive("*TOKEN*"),
];
env_map.retain(|k, _| !matches_any(k, &default_excludes));
}
// Step 3 Apply custom excludes.
if !policy.exclude.is_empty() {
env_map.retain(|k, _| !matches_any(k, &policy.exclude));
}
// Step 4 Apply user-provided overrides.
for (key, val) in &policy.r#set {
env_map.insert(key.clone(), val.clone());
}
// Step 5 If include_only is non-empty, keep *only* the matching vars.
if !policy.include_only.is_empty() {
env_map.retain(|k, _| matches_any(k, &policy.include_only));
}
env_map
}
#[cfg(test)]
mod tests {
#![allow(clippy::unwrap_used, clippy::expect_used)]
use super::*;
use crate::config_types::ShellEnvironmentPolicyInherit;
use maplit::hashmap;
fn make_vars(pairs: &[(&str, &str)]) -> Vec<(String, String)> {
pairs
.iter()
.map(|(k, v)| (k.to_string(), v.to_string()))
.collect()
}
#[test]
fn test_core_inherit_and_default_excludes() {
let vars = make_vars(&[
("PATH", "/usr/bin"),
("HOME", "/home/user"),
("API_KEY", "secret"),
("SECRET_TOKEN", "t"),
]);
let policy = ShellEnvironmentPolicy::default(); // inherit Core, default excludes on
let result = populate_env(vars, &policy);
let expected: HashMap<String, String> = hashmap! {
"PATH".to_string() => "/usr/bin".to_string(),
"HOME".to_string() => "/home/user".to_string(),
};
assert_eq!(result, expected);
}
#[test]
fn test_include_only() {
let vars = make_vars(&[("PATH", "/usr/bin"), ("FOO", "bar")]);
let policy = ShellEnvironmentPolicy {
// skip default excludes so nothing is removed prematurely
ignore_default_excludes: true,
include_only: vec![EnvironmentVariablePattern::new_case_insensitive("*PATH")],
..Default::default()
};
let result = populate_env(vars, &policy);
let expected: HashMap<String, String> = hashmap! {
"PATH".to_string() => "/usr/bin".to_string(),
};
assert_eq!(result, expected);
}
#[test]
fn test_set_overrides() {
let vars = make_vars(&[("PATH", "/usr/bin")]);
let mut policy = ShellEnvironmentPolicy {
ignore_default_excludes: true,
..Default::default()
};
policy.r#set.insert("NEW_VAR".to_string(), "42".to_string());
let result = populate_env(vars, &policy);
let expected: HashMap<String, String> = hashmap! {
"PATH".to_string() => "/usr/bin".to_string(),
"NEW_VAR".to_string() => "42".to_string(),
};
assert_eq!(result, expected);
}
#[test]
fn test_inherit_all() {
let vars = make_vars(&[("PATH", "/usr/bin"), ("FOO", "bar")]);
let policy = ShellEnvironmentPolicy {
inherit: ShellEnvironmentPolicyInherit::All,
ignore_default_excludes: true, // keep everything
..Default::default()
};
let result = populate_env(vars.clone(), &policy);
let expected: HashMap<String, String> = vars.into_iter().collect();
assert_eq!(result, expected);
}
#[test]
fn test_inherit_all_with_default_excludes() {
let vars = make_vars(&[("PATH", "/usr/bin"), ("API_KEY", "secret")]);
let policy = ShellEnvironmentPolicy {
inherit: ShellEnvironmentPolicyInherit::All,
..Default::default()
};
let result = populate_env(vars, &policy);
let expected: HashMap<String, String> = hashmap! {
"PATH".to_string() => "/usr/bin".to_string(),
};
assert_eq!(result, expected);
}
#[test]
fn test_inherit_none() {
let vars = make_vars(&[("PATH", "/usr/bin"), ("HOME", "/home")]);
let mut policy = ShellEnvironmentPolicy {
inherit: ShellEnvironmentPolicyInherit::None,
ignore_default_excludes: true,
..Default::default()
};
policy
.r#set
.insert("ONLY_VAR".to_string(), "yes".to_string());
let result = populate_env(vars, &policy);
let expected: HashMap<String, String> = hashmap! {
"ONLY_VAR".to_string() => "yes".to_string(),
};
assert_eq!(result, expected);
}
}