Removed supports_prompt_caching parameter that was causing 400 errors.
Prompt caching is automatically enabled by Anthropic when the client
sends cache_control blocks in messages - no config needed.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Added supports_prompt_caching: true to all Claude models:
- claude-sonnet-4
- claude-sonnet-4.5
- claude-3-5-sonnet
- claude-3-opus
- claude-3-haiku
This enables Anthropic's prompt caching feature across all models,
significantly reducing latency and costs for repeated requests
with the same system prompts.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Configure LiteLLM to use existing Redis from core stack for caching:
- Enabled cache with Redis backend
- Set TTL to 1 hour for cached responses
- Uses core_redis container on default port
This will improve performance by caching API responses.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Disabled cache setting that requires Redis configuration.
Prompt caching at the Anthropic API level is still enabled
via supports_prompt_caching setting.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Reduce database logging overhead and enable prompt caching:
- Disabled verbose logging (set_verbose: false)
- Disabled spend tracking logs to reduce DB writes
- Disabled tag tracking and daily spend logs
- Removed success/failure callbacks
- Enabled prompt caching for claude-sonnet-4.5
- Set log level to ERROR only
- Removed --detailed_debug flag from command
This should significantly improve response times by eliminating
unnecessary database writes for every request.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Replace HTTP Basic Auth with Authelia ForwardAuth for consistent
authentication across infrastructure:
- Asciinema Admin (admin.asciinema.dev.pivoine.art): Removed Basic Auth,
added Authelia protection
- FaceFusion (facefusion.ai.pivoine.art): Removed Basic Auth, added
Authelia protection
Updated Authelia access control to include both services with one_factor
policy.
All services now use Authelia for authentication, eliminating the need
to manage separate Basic Auth credentials.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add Mailpit service to NET stack with web UI at mailpit.pivoine.art
- Configure Mailpit to relay all emails through IONOS SMTP
- Migrate all 11+ services to use Mailpit instead of direct IONOS SMTP:
* SEXY: Directus API
* UTIL: Joplin, Mattermost, Vaultwarden, Tandoor, Linkwarden
* DEV: Gitea, n8n, Asciinema
* AI: Open WebUI
* NET: Netdata (via msmtp)
- Centralize SMTP credentials in mailpit-relay.yaml
- Simplify service configs (no auth/TLS for internal SMTP)
- Enable email monitoring via Mailpit web UI with Basic Auth
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Replace DISABLE_SWAGGER_UI with NO_DOCS and NO_REDOC
- Following official LiteLLM documentation for disabling API docs
- Disables both Swagger UI and Redoc documentation interfaces
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add --disable_swagger flag to LiteLLM command
- Improves security by hiding API documentation interface
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Watchtower was trying to pull updates from Docker Hub for facefusion-patched:3.5.0-cpu
which only exists locally, causing spam errors. Disabled Watchtower monitoring for this
container since it's a custom-built image with NSFW filter patches.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Fixed line number and function names to match actual source
- Added validation to ensure patch was applied
- Updated patch file with correct context
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
This patch file documents the exact change made to content_analyser.py
for disabling the NSFW content filter in Facefusion 3.5.0.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Added echo statements to track script execution
- Added -v flag to rm to show deleted files
- Confirmed deletion is working correctly
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Previous approach caused infinite download loop. Now waits for models
to download, then deletes NSFW models once, allowing Gradio to start.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Added Facefusion face swapping service to the AI stack:
**Configuration:**
- URL: https://facefusion.ai.pivoine.art
- Image: facefusion/facefusion:3.5.0-cpu
- Port: 7865
- Container: ai_facefusion
- Volume: ai_facefusion_data
- HTTP Basic Auth protection
- CPU execution mode (GPU when available)
**Changes:**
- Added facefusion service to ai/compose.yaml
- Added AI_FACEFUSION_* env vars to arty.yml
- Created ai_facefusion_data volume
- Removed old standalone facefusion stack
- Removed ai/README-export.md and ai/webui-export.py
Facefusion will run on CPU until GPU server is available.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
According to litellm docs, drop_params only drops OpenAI parameters.
Since prompt_cache_key is an Anthropic-specific parameter, we need
to use additional_drop_params to explicitly drop it.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Explicitly set drop_params and supports_prompt_caching=false for
claude-sonnet-4.5 model to prevent prompt_cache_key parameter from
being sent to Anthropic API.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Add router_settings and default_litellm_params to ensure unsupported
parameters like prompt_cache_key are properly dropped when using codex
with the litellm proxy.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Added disable_responses_id_security setting to allow Codex CLI to access
the /responses endpoint without 401 errors. This removes the encryption
requirement on response IDs while maintaining API key authentication.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Created database initialization script following the core stack pattern.
The script automatically creates required databases on first initialization:
- openwebui: Open WebUI application database
- litellm: LiteLLM proxy database for API key management and tracking
Changes:
- Created ai/postgres/init/01-init-databases.sh
- Mounted init directory in ai_postgres service
- Added automatic privilege grants to AI_DB_USER
Note: Init script only runs on first database creation when volume is empty.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
LiteLLM now uses the ai_postgres database instance with a dedicated
'litellm' database for API key management, usage tracking, and rate limiting.
Changes:
- Set DATABASE_URL to postgresql://ai:password@ai_postgres:5432/litellm
- Added depends_on ai_postgres to ensure DB starts first
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Removed authentication middleware to simplify access. LiteLLM now relies
solely on Bearer token authentication via LITELLM_MASTER_KEY.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Re-enabled LITELLM_MASTER_KEY for proper API key authentication.
LiteLLM supports master key without database for simple auth scenarios.
- LiteLLM validates Bearer token against master key
- Open WebUI uses same key for internal communication
- External access requires both HTTP Basic Auth + API key
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Removed LITELLM_MASTER_KEY as it requires a database for virtual key
management. Security is already provided by HTTP Basic Auth on the
public Traefik endpoint. Internal Open WebUI communication doesn't
need additional API key auth.
Security layers:
- Public access: HTTP Basic Auth via Traefik
- Internal LiteLLM: Network isolation (no auth needed)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>