- Add BGE embedding model config (port 8002) to litellm-config.yaml
- Add GPU_VLLM_EMBED_URL env var to compose and .env
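A minimal sketch of what the new litellm-config.yaml entry might look like (the model alias and BGE variant are assumptions; the os.environ/ indirection matches the pattern used elsewhere in this config):

```yaml
model_list:
  - model_name: bge-embedding                   # assumed alias
    litellm_params:
      model: openai/BAAI/bge-m3                 # assumed model id served by vLLM
      api_base: os.environ/GPU_VLLM_EMBED_URL   # e.g. http://<gpu-tailscale-ip>:8002/v1
      api_key: "none"                           # vLLM requires no key by default
```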
- Remove crawl4ai service from ai/compose.yaml (will use local MCP instead)
- Remove crawl4ai backup volume from core/compose.yaml
- Add core/backrest/config.json (infrastructure as code)
- Change backrest from volume to bind-mounted config
- Update CLAUDE.md and README.md documentation
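The bind-mount change might look like this (the in-container path is an assumption based on Backrest's Docker defaults):

```yaml
backrest:
  volumes:
    # Previously a named volume; now the config lives in the repo
    - ./backrest/config.json:/config/config.json
    - backrest-data:/data
```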
Supervisor's XML-RPC API v3.0 (Supervisor 4.3.0) only supports the 2-parameter
readLog(offset, length) call, not a 3-parameter variant with a filename.
The SUPERVISOR_LOGFILE environment variable is not consulted by the API.
Testing showed:
- Working: server.supervisor.readLog(-4096, 0)
- Failing: server.supervisor.readLog(-4096, 4096, '/path/to/log')
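A minimal reproduction of that test with Python's standard xmlrpc.client (host and port are assumptions, matching the Tailscale address used elsewhere in this stack):

```python
import xmlrpc.client

server = xmlrpc.client.ServerProxy("http://100.121.199.88:9001/RPC2")

# Works: a negative offset with length 0 returns the trailing bytes of the main log
print(server.supervisor.readLog(-4096, 0))

# Fails with an XML-RPC fault: readLog accepts exactly (offset, length)
# server.supervisor.readLog(-4096, 4096, "/path/to/log")
```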
- Replaced nginx:alpine proxy with dev.pivoine.art/valknar/supervisor-ui:latest
- Modern Next.js UI with real-time SSE updates, batch operations, and charts
- Changed service port from 80 (nginx) to 3000 (Next.js)
- Removed supervisor-nginx.conf (no longer needed)
- Kept same URL (supervisor.ai.pivoine.art) and Authelia SSO protection
- Added health check for /api/health endpoint
- Service connects to RunPod Supervisor via Tailscale (SUPERVISOR_HOST/PORT)
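A sketch of the new service definition (the health-check command and the reuse of GPU_TAILSCALE_IP are assumptions):

```yaml
supervisor:
  image: dev.pivoine.art/valknar/supervisor-ui:latest
  environment:
    SUPERVISOR_HOST: ${GPU_TAILSCALE_IP}   # assumption: same Tailscale IP var as other GPU services
    SUPERVISOR_PORT: "9001"
  healthcheck:
    test: ["CMD", "wget", "-qO-", "http://localhost:3000/api/health"]
    interval: 30s
    timeout: 5s
    retries: 3
```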
- Replace GPU_TAILSCALE_IP interpolation with GPU_VLLM_API_URL
- LiteLLM requires full URL in api_base with os.environ/ syntax
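The change to the model entry, roughly (the upstream model id is an assumption):

```yaml
# Before (broken): os.environ/ references cannot be embedded in a larger string
#   api_base: http://os.environ/GPU_TAILSCALE_IP:8000/v1
# After: the env var holds the complete URL and is referenced on its own
- model_name: qwen-2.5-7b
  litellm_params:
    model: openai/Qwen/Qwen2.5-7B-Instruct    # assumed served model id
    api_base: os.environ/GPU_VLLM_API_URL     # e.g. http://100.121.199.88:8000/v1
```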
- Replace COMFYUI_BACKEND_HOST and SUPERVISOR_BACKEND_HOST with GPU_TAILSCALE_IP
- Update LiteLLM config to use os.environ/GPU_TAILSCALE_IP for vLLM models
- Add GPU_TAILSCALE_IP env var to LiteLLM service
- Configure qwen-2.5-7b and llama-3.1-8b to route through orchestrator
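The env passthrough in compose.yaml might look like:

```yaml
litellm:
  environment:
    GPU_TAILSCALE_IP: ${GPU_TAILSCALE_IP}
```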
- Remove hardcoded default values from compose.yaml
- Backend IPs now managed via environment variables only
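In compose interpolation terms, the change is roughly:

```yaml
# Before: a fallback IP baked into compose.yaml
#   COMFYUI_BACKEND_HOST: ${COMFYUI_BACKEND_HOST:-100.121.199.88}
# After: the value must come from the environment / .env
COMFYUI_BACKEND_HOST: ${COMFYUI_BACKEND_HOST}
```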
Add an nginx reverse proxy service for the Supervisor web UI at supervisor.ai.pivoine.art with Authelia authentication. Proxies to the RunPod GPU instance via Tailscale (100.121.199.88:9001).
Changes:
- Create supervisor-nginx.conf for nginx proxy configuration
- Add supervisor service to docker-compose with Traefik labels
- Add supervisor.ai.pivoine.art to Authelia protected domains
- Remove deprecated Flux-related files
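A sketch of the service (the router and middleware wiring is an assumption, modeled on the ComfyUI setup below):

```yaml
supervisor:
  image: nginx:alpine
  volumes:
    - ./supervisor-nginx.conf:/etc/nginx/conf.d/default.conf:ro
  labels:
    - traefik.enable=true
    - traefik.http.routers.supervisor.rule=Host(`supervisor.ai.pivoine.art`)
    - traefik.http.routers.supervisor.middlewares=net-authelia@docker
```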
- Replace hardcoded IP in comfyui-nginx.conf with env vars
- Add COMFYUI_BACKEND_HOST and COMFYUI_BACKEND_PORT to compose.yaml
- Use envsubst to substitute the variables at container startup (see the sketch after this list)
- Defaults: 100.121.199.88:8188 (current RunPod Tailscale IP)
- Add ComfyUI service to AI stack using nginx:alpine as reverse proxy
- Proxy to RunPod ComfyUI via Tailscale (100.121.199.88:8188)
- Configure Traefik routing for comfy.ai.pivoine.art
- Enable Authelia SSO middleware (net-authelia)
- Support WebSocket connections for real-time updates
- Set appropriate timeouts for image generation (300s)
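A sketch combining both ComfyUI commits above; the WebSocket and 300s timeout settings live in comfyui-nginx.conf, the template path relies on the official nginx image's built-in envsubst pass over /etc/nginx/templates/, and the router name is an assumption:

```yaml
comfyui:
  image: nginx:alpine
  environment:
    COMFYUI_BACKEND_HOST: ${COMFYUI_BACKEND_HOST:-100.121.199.88}
    COMFYUI_BACKEND_PORT: ${COMFYUI_BACKEND_PORT:-8188}
  volumes:
    # the nginx entrypoint runs envsubst on *.template files at startup
    - ./comfyui-nginx.conf:/etc/nginx/templates/default.conf.template:ro
  labels:
    - traefik.enable=true
    - traefik.http.routers.comfyui.rule=Host(`comfy.ai.pivoine.art`)
    - traefik.http.routers.comfyui.middlewares=net-authelia@docker
```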
- Set drop_params: false in litellm_settings
- Set modify_params: false in litellm_settings
- Set drop_params: false in default_litellm_params
- Commented out LITELLM_DROP_PARAMS env var
- Removed --drop_params command flag
With these options previously enabled, LiteLLM was stripping critical streaming
parameters, causing vLLM streaming responses to collapse into empty deltas.
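Per the bullets above, the resulting config is roughly:

```yaml
litellm_settings:
  drop_params: false     # pass params through instead of silently dropping them
  modify_params: false   # do not rewrite request params in flight

default_litellm_params:
  drop_params: false
```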
- Reverted direct orchestrator connection to WebUI
- Added stream: true parameter to qwen-2.5-7b model config
- Kept LiteLLM as the single proxy for all models
- Configure WebUI with both LiteLLM and direct orchestrator API base URLs
- This bypasses LiteLLM's streaming issues for the qwen-2.5-7b model
- WebUI will now show models from both endpoints
- Allows testing if LiteLLM is the bottleneck for streaming
Related to the streaming fix in RunPod models/vllm/server.py.
Reduce database logging overhead and enable prompt caching:
- Disabled verbose logging (set_verbose: false)
- Disabled spend tracking logs to reduce DB writes
- Disabled tag tracking and daily spend logs
- Removed success/failure callbacks
- Enabled prompt caching for claude-sonnet-4.5
- Set log level to ERROR only
- Removed --detailed_debug flag from command
This should significantly improve response times by eliminating
unnecessary database writes for every request.
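A partial sketch of the settings named above; the remaining toggles (spend/tag/daily-spend logs, callbacks) use version-specific keys, so verify them against the LiteLLM docs for your release:

```yaml
litellm_settings:
  set_verbose: false

general_settings:
  disable_spend_logs: true   # skip the per-request spend row in the DB
```

The ERROR-only log level can be set via the LITELLM_LOG environment variable.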
Replace HTTP Basic Auth with Authelia ForwardAuth for consistent
authentication across the infrastructure:
- Asciinema Admin (admin.asciinema.dev.pivoine.art): Removed Basic Auth,
added Authelia protection
- FaceFusion (facefusion.ai.pivoine.art): Removed Basic Auth, added
Authelia protection
Updated Authelia access control to include both services with one_factor
policy.
All services now use Authelia for authentication, eliminating the need
to manage separate Basic Auth credentials.
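The corresponding Authelia rules (structure per Authelia's access_control schema):

```yaml
access_control:
  rules:
    - domain: admin.asciinema.dev.pivoine.art
      policy: one_factor
    - domain: facefusion.ai.pivoine.art
      policy: one_factor
```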
- Add Mailpit service to NET stack with web UI at mailpit.pivoine.art
- Configure Mailpit to relay all emails through IONOS SMTP
- Migrate all 11+ services to use Mailpit instead of direct IONOS SMTP:
* SEXY: Directus API
* UTIL: Joplin, Mattermost, Vaultwarden, Tandoor, Linkwarden
* DEV: Gitea, n8n, Asciinema
* AI: Open WebUI
* NET: Netdata (via msmtp)
- Centralize SMTP credentials in mailpit-relay.yaml
- Simplify service configs (no auth/TLS for internal SMTP)
- Enable email monitoring via Mailpit web UI with Basic Auth
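A sketch of mailpit-relay.yaml (the IONOS hostname, auth mode, and mailbox are assumptions; check Mailpit's SMTP relay docs for the exact field names in your version):

```yaml
host: smtp.ionos.de            # assumed IONOS SMTP endpoint
port: 587
starttls: true                 # assumed; IONOS submission uses STARTTLS on 587
auth: login
username: mailer@pivoine.art   # hypothetical mailbox
password: "<ionos-smtp-password>"
```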
- Replace DISABLE_SWAGGER_UI with NO_DOCS and NO_REDOC
- Following official LiteLLM documentation for disabling API docs
- Disables both Swagger UI and Redoc documentation interfaces
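The compose change, per the LiteLLM docs referenced above:

```yaml
litellm:
  environment:
    NO_DOCS: "True"    # disables the Swagger UI
    NO_REDOC: "True"   # disables the Redoc interface
```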
- Add --disable_swagger flag to LiteLLM command
- Improves security by hiding API documentation interface
Watchtower was trying to pull updates from Docker Hub for facefusion-patched:3.5.0-cpu,
which only exists locally, flooding the logs with pull errors. Disabled Watchtower
monitoring for this container, since it is a custom-built image with NSFW-filter patches.
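The opt-out is a single Watchtower label on the service (sketch):

```yaml
facefusion:
  image: facefusion-patched:3.5.0-cpu   # built locally, never pushed to Docker Hub
  labels:
    - com.centurylinklabs.watchtower.enable=false
```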