Commit Graph

15 Commits

8622f9dfa0 fix: remove drop_params from individual model configs
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-16 18:53:44 +01:00
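For reference, a minimal sketch of the config shape after this commit, assuming drop_params now lives once at the global litellm_settings level (the model entry shape is taken from the initial setup commit at the bottom):

```yaml
model_list:
  - model_name: claude-sonnet-4.5
    litellm_params:
      model: anthropic/claude-sonnet-4-5-20250929
      api_key: os.environ/ANTHROPIC_API_KEY
      # drop_params removed from here...

litellm_settings:
  drop_params: true   # ...and kept once at the global level
```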
0146d1f043 fix: remove invalid supports_prompt_caching parameter
Removed the supports_prompt_caching parameter, which was causing 400 errors.
Prompt caching is enabled automatically by Anthropic when the client
sends cache_control blocks in its messages; no proxy configuration is needed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-16 16:09:17 +01:00
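To illustrate the point about cache_control: caching is requested per content block in the API call itself, not in the proxy config. A sketch of such a request body, shown as YAML (field names follow Anthropic's Messages API):

```yaml
model: claude-sonnet-4-5-20250929
max_tokens: 1024
system:
  - type: text
    text: "...long reusable system prompt..."
    cache_control:
      type: ephemeral   # asks Anthropic to cache this prefix
messages:
  - role: user
    content: "Hello"
```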
d26310afb7 feat: enable prompt caching for all Claude models
Added supports_prompt_caching: true to all Claude models:
- claude-sonnet-4
- claude-sonnet-4.5
- claude-3-5-sonnet
- claude-3-opus
- claude-3-haiku

This enables Anthropic's prompt caching feature across all models,
significantly reducing latency and costs for repeated requests
with the same system prompts.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-16 16:07:29 +01:00
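A sketch of how this commit adds the flag to one entry; the placement under litellm_params is an assumption, and the commit above (0146d1f043) removes it again after the upstream API returned 400s:

```yaml
# Added to each Claude entry in model_list:
litellm_params:
  model: anthropic/claude-sonnet-4-5-20250929
  supports_prompt_caching: true   # reverted in 0146d1f043
```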
2014a82efb feat: enable Redis caching for LiteLLM
Configure LiteLLM to use the existing Redis from the core stack for caching:
- Enabled cache with Redis backend
- Set TTL to 1 hour for cached responses
- Uses core_redis container on default port

This improves performance by serving repeated identical requests from
Redis instead of re-calling the upstream API.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-16 16:05:14 +01:00
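A sketch of the cache block, using LiteLLM's documented Redis cache_params and the core_redis host named in the commit:

```yaml
litellm_settings:
  cache: true
  cache_params:
    type: redis
    host: core_redis   # existing Redis from the core stack
    port: 6379         # default Redis port
    ttl: 3600          # cache responses for one hour
```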
5cec1415ad fix: disable LiteLLM cache to avoid Redis requirement
Disabled the cache setting, which requires a Redis configuration.
Prompt caching at the Anthropic API level remains enabled via the
supports_prompt_caching setting.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-16 16:04:39 +01:00
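The change itself is a one-liner; sketched:

```yaml
litellm_settings:
  cache: false   # proxy-level response cache off; Anthropic-side prompt caching is unaffected
```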
8a18ae753d perf: optimize LiteLLM for better performance
Reduce database logging overhead and enable prompt caching:

- Disabled verbose logging (set_verbose: false)
- Disabled spend tracking logs to reduce DB writes
- Disabled tag tracking and daily spend logs
- Removed success/failure callbacks
- Enabled prompt caching for claude-sonnet-4.5
- Set log level to ERROR only
- Removed --detailed_debug flag from command

This should significantly improve response times by eliminating
unnecessary database writes for every request.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-16 16:03:19 +01:00
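A sketch of the settings this commit touches. set_verbose and disable_spend_logs are documented LiteLLM options; the exact flag names for tag tracking and daily spend logs vary by LiteLLM version, so they are omitted here:

```yaml
litellm_settings:
  set_verbose: false         # no verbose request/response logging
  # success_callback / failure_callback entries removed entirely

general_settings:
  disable_spend_logs: true   # skip per-request spend writes to the DB
```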
3ddc76e213 fix: add additional_drop_params at global litellm_settings level 2025-11-11 12:36:49 +01:00
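Sketched at the global level; the parameter list comes from the commit below, and whether LiteLLM honors additional_drop_params globally is exactly what this change tests:

```yaml
litellm_settings:
  drop_params: true
  additional_drop_params: ["prompt_cache_key"]
```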
cabac4b767 fix: use additional_drop_params to explicitly drop prompt_cache_key
According to the litellm docs, drop_params only drops the standard
OpenAI parameters. Since prompt_cache_key is not in that standard set,
we need to use additional_drop_params to explicitly drop it.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 12:33:10 +01:00
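The per-model form, sketched; additional_drop_params inside a model's litellm_params is the documented placement:

```yaml
# Inside the claude-sonnet-4.5 entry's litellm_params:
litellm_params:
  additional_drop_params: ["prompt_cache_key"]
```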
da0dc2363a fix: disable prompt caching and responses API in litellm
- Add LITELLM_DROP_PARAMS environment variable
- Disable cache in litellm_settings
- Attempt to disable responses API endpoint
- Remove invalid supports_prompt_caching parameter

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 12:27:06 +01:00
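A sketch of the compose-side part of this change; the service layout follows the initial setup commit at the bottom, and LITELLM_DROP_PARAMS is the variable named in the commit:

```yaml
services:
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    environment:
      LITELLM_DROP_PARAMS: "true"   # env-level fallback for dropping unsupported params
```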
813823995c fix: disable prompt caching for claude-sonnet-4.5
Explicitly set drop_params and supports_prompt_caching=false for the
claude-sonnet-4.5 model to prevent the prompt_cache_key parameter from
being sent to the Anthropic API.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 12:22:27 +01:00
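Sketched as the two keys this commit adds under the model's litellm_params:

```yaml
# Inside the claude-sonnet-4.5 entry's litellm_params:
litellm_params:
  drop_params: true                # drop unsupported OpenAI params
  supports_prompt_caching: false   # keep prompt_cache_key away from the Anthropic API
```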
f36e0fa9eb fix: enhance litellm parameter dropping for codex compatibility
Add router_settings and default_litellm_params to ensure unsupported
parameters like prompt_cache_key are properly dropped when using codex
with the litellm proxy.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 12:14:00 +01:00
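A sketch of the shape described here; default_litellm_params is a Router option, and treating router_settings keys as Router kwargs is an assumption (the commits above later switch to additional_drop_params instead):

```yaml
router_settings:
  default_litellm_params:
    drop_params: true   # default applied to every deployment the router creates
```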
ce6c60d8e0 fix: disable responses ID security for Codex CLI compatibility
Added disable_responses_id_security setting to allow Codex CLI to access
the /responses endpoint without 401 errors. This removes the encryption
requirement on response IDs while maintaining API key authentication.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 19:00:55 +01:00
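Sketched as a general_settings flag; the key name comes from the commit, and its placement here is an assumption:

```yaml
general_settings:
  disable_responses_id_security: true   # let Codex CLI reuse plain response IDs on /responses
```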
cdb8d2ef34 fix: correct LiteLLM environment variable syntax
Changed API key reference from ${ANTHROPIC_API_KEY} to
os.environ/ANTHROPIC_API_KEY to match LiteLLM's documented syntax.

The os.environ/ prefix tells LiteLLM to use os.getenv() to retrieve
the environment variable at runtime, which is the correct way to
reference environment variables in LiteLLM config files.

Reference: https://docs.litellm.ai/docs/proxy/deploy

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 00:30:07 +01:00
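The documented form, sketched for one model entry:

```yaml
model_list:
  - model_name: claude-sonnet-4.5
    litellm_params:
      model: anthropic/claude-sonnet-4-5-20250929
      api_key: os.environ/ANTHROPIC_API_KEY   # resolved with os.getenv() at runtime
      # not: api_key: ${ANTHROPIC_API_KEY}
```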
424e6d044d fix: configure LiteLLM without database requirement 2025-11-08 23:02:07 +01:00
8eae3c650f feat: add LiteLLM proxy for Anthropic Claude models
Added LiteLLM as an OpenAI-compatible proxy for Anthropic's API to
enable Claude models in Open WebUI.

**New Service: litellm**
- Image: ghcr.io/berriai/litellm:main-latest
- Internal proxy on port 4000
- Converts Anthropic API to OpenAI-compatible format
- Health check with 30s intervals
- Not exposed via Traefik (internal only)

**LiteLLM Configuration (litellm-config.yaml)**
- Claude Sonnet 4 (claude-sonnet-4-20250514)
- Claude Sonnet 4.5 (claude-sonnet-4-5-20250929)
- Claude 3.5 Sonnet (claude-3-5-sonnet-20241022)
- Claude 3 Opus (claude-3-opus-20240229)
- Claude 3 Haiku (claude-3-haiku-20240307)

**Open WebUI Configuration Updates**
- Changed OPENAI_API_BASE_URLS to point to LiteLLM proxy
- URL: http://litellm:4000/v1
- Added litellm as dependency for webui service
- Dummy API key for proxy authentication

**Why LiteLLM?**
Anthropic's API uses a different endpoint structure and different
authentication headers from OpenAI's. LiteLLM acts as a translation
layer, allowing Open WebUI to use Claude models through its
OpenAI-compatible interface.

**Available Models in Open WebUI**
- claude-sonnet-4 (latest Claude Sonnet 4)
- claude-sonnet-4.5 (Claude Sonnet 4.5)
- claude-3-5-sonnet
- claude-3-opus
- claude-3-haiku

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-08 22:58:09 +01:00
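A sketch of the compose wiring this commit describes. Service and variable names follow the commit text; the exact health-check command and volume paths are assumptions, and Traefik labels are omitted since the service is internal only:

```yaml
services:
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    command: ["--config", "/app/config.yaml", "--port", "4000"]
    volumes:
      - ./litellm-config.yaml:/app/config.yaml:ro
    environment:
      ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:4000/health/liveliness"]
      interval: 30s

  webui:
    environment:
      OPENAI_API_BASE_URLS: http://litellm:4000/v1
      OPENAI_API_KEYS: dummy-key   # the proxy only needs a placeholder key
    depends_on:
      - litellm
```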