Commit Graph

63 Commits

20ba9952a1 feat: upscale service 2025-11-27 12:13:57 +01:00
8bdcde4b90 fix: supervisor env 2025-11-26 22:58:16 +01:00
5d232c7d9b feat: audiocraft 2025-11-26 22:54:10 +01:00
d57a1241d2 feat(ai): add bge-large-en-v1.5 embedding model to litellm
- Add BGE embedding model config (port 8002) to litellm-config.yaml
- Add GPU_VLLM_EMBED_URL env var to compose and .env

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 06:40:36 +01:00
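A minimal sketch of the litellm-config.yaml entry the commit above describes — the model name, port, and GPU_VLLM_EMBED_URL come from the message; the openai/ routing prefix and the dummy api_key are assumptions:

```yaml
model_list:
  - model_name: bge-large-en-v1.5
    litellm_params:
      model: openai/BAAI/bge-large-en-v1.5      # vLLM exposes an OpenAI-compatible /v1/embeddings
      api_base: os.environ/GPU_VLLM_EMBED_URL   # full URL to the embed server on port 8002
      api_key: dummy                            # vLLM needs no key; placeholder only
```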
ef0309838c refactor(ai): remove crawl4ai service, add backrest config to repo
- Remove crawl4ai service from ai/compose.yaml (will use local MCP instead)
- Remove crawl4ai backup volume from core/compose.yaml
- Add core/backrest/config.json (infrastructure as code)
- Change backrest from volume to bind-mounted config
- Update CLAUDE.md and README.md documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 06:20:22 +01:00
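A sketch of the volume-to-bind-mount change in core/compose.yaml; the container-side path is an assumption:

```yaml
backrest:
  volumes:
    # before: a named volume; after: the config file tracked in the repo
    - ./backrest/config.json:/config/config.json
```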
071a74a996 revert(ai): remove SUPERVISOR_LOGFILE env var from supervisor-ui
The Supervisor XML-RPC API v3.0 (Supervisor 4.3.0) only supports the 2-parameter
readLog(offset, length) call, not a 3-parameter call with a filename.
The SUPERVISOR_LOGFILE environment variable is therefore not used by the API.

Testing showed:
- Working: server.supervisor.readLog(-4096, 0)
- Failing: server.supervisor.readLog(-4096, 4096, '/path/to/log')

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 23:01:10 +01:00
74b3748b23 feat(ai): add SUPERVISOR_LOGFILE env var to supervisor-ui for RunPod logs
Configure supervisor-ui to use the correct logfile path (/workspace/logs/supervisord.log)
for the RunPod Supervisor instance. Fixes the logs page error on https://supervisor.ai.pivoine.art/logs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 22:49:33 +01:00
87216ab26a fix: remove healthcheck from supervisor-ui service 2025-11-23 20:38:37 +01:00
9e2b19e7f6 feat: replace nginx supervisor proxy with modern supervisor-ui
- Replaced nginx:alpine proxy with dev.pivoine.art/valknar/supervisor-ui:latest
- Modern Next.js UI with real-time SSE updates, batch operations, and charts
- Changed service port from 80 (nginx) to 3000 (Next.js)
- Removed supervisor-nginx.conf (no longer needed)
- Kept same URL (supervisor.ai.pivoine.art) and Authelia SSO protection
- Added health check for /api/health endpoint
- Service connects to RunPod Supervisor via Tailscale (SUPERVISOR_HOST/PORT)
2025-11-23 20:18:29 +01:00
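A sketch of what the replacement service could look like, assuming Traefik label names and the net-authelia middleware seen elsewhere in this log; note the healthcheck was dropped again in 87216ab26a above:

```yaml
supervisor-ui:
  image: dev.pivoine.art/valknar/supervisor-ui:latest
  environment:
    SUPERVISOR_HOST: ${GPU_TAILSCALE_IP}   # RunPod Supervisor over Tailscale
    SUPERVISOR_PORT: "9001"
  healthcheck:                             # removed again in 87216ab26a above
    test: ["CMD", "wget", "-qO-", "http://127.0.0.1:3000/api/health"]
    interval: 30s
  labels:
    - traefik.enable=true
    - traefik.http.routers.supervisor.rule=Host(`supervisor.ai.pivoine.art`)
    - traefik.http.routers.supervisor.middlewares=net-authelia@docker
    - traefik.http.services.supervisor.loadbalancer.server.port=3000
```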
a80c6b931b fix: update compose.yaml to use new GPU_VLLM URLs 2025-11-23 16:22:54 +01:00
779e76974d fix: use complete URL env var for vLLM API base
- Replace GPU_TAILSCALE_IP interpolation with GPU_VLLM_API_URL
- LiteLLM requires full URL in api_base with os.environ/ syntax

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 13:17:37 +01:00
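A sketch of the two sides of this change — the env var in compose.yaml and its use in litellm-config.yaml; the model id is an assumption:

```yaml
# compose.yaml
litellm:
  environment:
    GPU_VLLM_API_URL: ${GPU_VLLM_API_URL}   # complete URL, e.g. http://<tailscale-ip>:8000/v1

# litellm-config.yaml
model_list:
  - model_name: qwen-2.5-7b
    litellm_params:
      model: openai/Qwen/Qwen2.5-7B-Instruct  # model id assumed
      api_base: os.environ/GPU_VLLM_API_URL   # os.environ/ takes the whole value; no string interpolation
```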
f3f32c163f feat: consolidate GPU IP with single GPU_TAILSCALE_IP variable
- Replace COMFYUI_BACKEND_HOST and SUPERVISOR_BACKEND_HOST with GPU_TAILSCALE_IP
- Update LiteLLM config to use os.environ/GPU_TAILSCALE_IP for vLLM models
- Add GPU_TAILSCALE_IP env var to LiteLLM service
- Configure qwen-2.5-7b and llama-3.1-8b to route through orchestrator

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 13:05:33 +01:00
e00e959543 Update backend IPs for ComfyUI and Supervisor proxies
- Remove hardcoded default values from compose.yaml
- Backend IPs now managed via environment variables only

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 02:11:19 +01:00
0fd2eacad1 feat: add Supervisor proxy with Authelia SSO
Add an nginx reverse proxy service for the Supervisor web UI at supervisor.ai.pivoine.art with Authelia authentication. Proxies to the RunPod GPU instance via Tailscale (100.121.199.88:9001).

Changes:
- Create supervisor-nginx.conf for nginx proxy configuration
- Add supervisor service to docker-compose with Traefik labels
- Add supervisor.ai.pivoine.art to Authelia protected domains
- Remove deprecated Flux-related files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 13:19:02 +01:00
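A sketch of the proxy service, assuming conventional Traefik label names; the net-authelia middleware name matches the ComfyUI commit below:

```yaml
supervisor:
  image: nginx:alpine
  volumes:
    - ./supervisor-nginx.conf:/etc/nginx/conf.d/default.conf:ro
  labels:   # routing as in the supervisor-ui sketch above, but against nginx on port 80
    - traefik.enable=true
    - traefik.http.routers.supervisor.rule=Host(`supervisor.ai.pivoine.art`)
    - traefik.http.routers.supervisor.middlewares=net-authelia@docker
    - traefik.http.services.supervisor.loadbalancer.server.port=80
```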
ae1c349b55 feat: make ComfyUI backend IP/port configurable via environment variables
- Replace hardcoded IP in comfyui-nginx.conf with env vars
- Add COMFYUI_BACKEND_HOST and COMFYUI_BACKEND_PORT to compose.yaml
- Use envsubst to substitute variables at container startup
- Defaults: 100.121.199.88:8188 (current RunPod Tailscale IP)
2025-11-21 21:24:51 +01:00
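One way to get the envsubst-at-startup behavior described above is the stock nginx image's template mechanism, which substitutes env vars into /etc/nginx/templates/*.template when the container starts — a sketch, assuming that mechanism rather than a custom entrypoint:

```yaml
comfyui:
  image: nginx:alpine
  environment:
    COMFYUI_BACKEND_HOST: ${COMFYUI_BACKEND_HOST:-100.121.199.88}
    COMFYUI_BACKEND_PORT: ${COMFYUI_BACKEND_PORT:-8188}
  volumes:
    # rendered to /etc/nginx/conf.d/default.conf with the vars substituted
    - ./comfyui-nginx.conf:/etc/nginx/templates/default.conf.template:ro
```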
904f7d3c2e feat(ai): add ComfyUI proxy service with Authelia SSO
- Add ComfyUI service to AI stack using nginx:alpine as reverse proxy
- Proxy to RunPod ComfyUI via Tailscale (100.121.199.88:8188)
- Configure Traefik routing for comfy.ai.pivoine.art
- Enable Authelia SSO middleware (net-authelia)
- Support WebSocket connections for real-time updates
- Set appropriate timeouts for image generation (300s)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 20:56:20 +01:00
9a964cff3c feat: add Flux image generation function for Open WebUI
- Add flux_image_gen.py manifold function for Flux.1 Schnell
- Auto-mount functions via Docker volume (./functions:/app/backend/data/functions:ro)
- Add comprehensive setup guide in FLUX_SETUP.md
- Update CLAUDE.md with Flux integration documentation
- Infrastructure as code approach - no manual import needed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 20:20:33 +01:00
155016da97 debug: enable DEBUG logging for LiteLLM to troubleshoot streaming 2025-11-21 19:10:00 +01:00
01a345979b fix: disable drop_params to preserve streaming metadata in LiteLLM
- Set drop_params: false in litellm_settings
- Set modify_params: false in litellm_settings
- Set drop_params: false in default_litellm_params
- Commented out LITELLM_DROP_PARAMS env var
- Removed --drop_params command flag

These settings were stripping critical streaming parameters, causing
vLLM streaming responses to collapse into empty deltas.
2025-11-21 18:46:33 +01:00
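The relevant litellm-config.yaml fragment, with key names taken from the message:

```yaml
litellm_settings:
  drop_params: false     # keep provider-specific params instead of stripping them
  modify_params: false

default_litellm_params:
  drop_params: false
```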
c58b5d36ba revert: remove direct WebUI connection, focus on fixing LiteLLM streaming
- Reverted direct orchestrator connection to WebUI
- Added stream: true parameter to qwen-2.5-7b model config
- Keep LiteLLM as single proxy for all models
2025-11-21 18:42:46 +01:00
62fcf832da feat: add direct RunPod orchestrator connection to WebUI for streaming bypass
- Configure WebUI with both LiteLLM and direct orchestrator API base URLs
- This bypasses LiteLLM's streaming issues for the qwen-2.5-7b model
- WebUI will now show models from both endpoints
- Allows testing whether LiteLLM is the bottleneck for streaming

Related to streaming fix in RunPod models/vllm/server.py
2025-11-21 18:38:31 +01:00
103bbbad51 debug: enable INFO logging in LiteLLM for troubleshooting
Enable detailed logging to debug qwen model requests from WebUI.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 17:13:38 +01:00
6aea9d018e feat(ai): disable Ollama API in WebUI, use LiteLLM only 2025-11-21 16:57:20 +01:00
8a18ae753d perf: optimize LiteLLM for better performance
Reduce database logging overhead and enable prompt caching:

- Disabled verbose logging (set_verbose: false)
- Disabled spend tracking logs to reduce DB writes
- Disabled tag tracking and daily spend logs
- Removed success/failure callbacks
- Enabled prompt caching for claude-sonnet-4.5
- Set log level to ERROR only
- Removed --detailed_debug flag from command

This should significantly improve response times by eliminating
unnecessary database writes for every request.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-16 16:03:19 +01:00
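A partial sketch of the settings involved; set_verbose comes from the message, while the spend-log key name and its placement under general_settings are assumptions:

```yaml
litellm_settings:
  set_verbose: false        # from the message
general_settings:
  disable_spend_logs: true  # skip per-request spend writes to Postgres (key name assumed)
```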
ffbcecc09d feat: replace Basic Auth with Authelia
Replace HTTP Basic Auth with Authelia ForwardAuth for consistent
authentication across infrastructure:

- Asciinema Admin (admin.asciinema.dev.pivoine.art): Removed Basic Auth,
  added Authelia protection
- FaceFusion (facefusion.ai.pivoine.art): Removed Basic Auth, added
  Authelia protection

Updated Authelia access control to include both services with one_factor
policy.

All services now use Authelia for authentication, eliminating the need
to manage separate Basic Auth credentials.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-15 21:54:27 +01:00
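The corresponding Authelia access-control rules would look roughly like this (domains and policy from the message; surrounding config omitted):

```yaml
access_control:
  rules:
    - domain: admin.asciinema.dev.pivoine.art
      policy: one_factor
    - domain: facefusion.ai.pivoine.art
      policy: one_factor
```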
51267cc674 feat: add Mailpit SMTP relay and migrate all services
- Add Mailpit service to NET stack with web UI at mailpit.pivoine.art
- Configure Mailpit to relay all emails through IONOS SMTP
- Migrate all 11+ services to use Mailpit instead of direct IONOS SMTP:
  * SEXY: Directus API
  * UTIL: Joplin, Mattermost, Vaultwarden, Tandoor, Linkwarden
  * DEV: Gitea, n8n, Asciinema
  * AI: Open WebUI
  * NET: Netdata (via msmtp)
- Centralize SMTP credentials in mailpit-relay.yaml
- Simplify service configs (no auth/TLS for internal SMTP)
- Enable email monitoring via Mailpit web UI with Basic Auth

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-15 18:34:38 +01:00
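A sketch of the relay setup using Mailpit's documented MP_SMTP_RELAY_* options; the file path is an assumption:

```yaml
mailpit:
  image: axllent/mailpit
  environment:
    MP_SMTP_RELAY_CONFIG: /data/mailpit-relay.yaml  # IONOS credentials live here
    MP_SMTP_RELAY_ALL: "true"                       # relay every message, not just known recipients
  volumes:
    - ./mailpit-relay.yaml:/data/mailpit-relay.yaml:ro
```

Services then point their SMTP config at mailpit:1025 with no auth or TLS, as the message notes.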
709dcd8882 fix: use correct NO_DOCS and NO_REDOC environment variables
- Replace DISABLE_SWAGGER_UI with NO_DOCS and NO_REDOC
- Following official LiteLLM documentation for disabling API docs
- Disables both Swagger UI and Redoc documentation interfaces

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 02:17:40 +01:00
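The working configuration from this commit, as compose environment entries:

```yaml
litellm:
  environment:
    NO_DOCS: "true"    # disables Swagger UI
    NO_REDOC: "true"   # disables Redoc
```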
b66e28d874 fix: use DISABLE_SWAGGER_UI environment variable instead of invalid flag
- Remove invalid --disable_swagger command flag
- Add DISABLE_SWAGGER_UI=true environment variable
- Fixes LiteLLM startup error

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 02:15:31 +01:00
f1ff42f452 feat: disable Swagger UI in LiteLLM proxy
- Add --disable_swagger flag to LiteLLM command
- Improves security by hiding API documentation interface

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 02:14:43 +01:00
2934caa9ed fix: disable Watchtower for Facefusion custom local image
Watchtower was trying to pull updates from Docker Hub for facefusion-patched:3.5.0-cpu,
which only exists locally, flooding the logs with errors. Disabled Watchtower monitoring
for this container since it is a custom-built image with NSFW filter patches.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 08:30:51 +01:00
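Watchtower is disabled per container with its enable label — a sketch:

```yaml
facefusion:
  image: facefusion-patched:3.5.0-cpu   # built locally; not on Docker Hub
  labels:
    - com.centurylinklabs.watchtower.enable=false
```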
f71b150263 feat: add tty flag for Gradio to start properly 2025-11-13 06:18:58 +01:00
95099a443e feat: build custom Facefusion image with NSFW filter patch baked in 2025-11-13 06:05:42 +01:00
8f406f62c1 fix: add command with -u flag to start Facefusion 2025-11-13 06:01:09 +01:00
c2d25dde59 fix: remove entrypoint override to use default Facefusion startup 2025-11-13 05:59:05 +01:00
3c56f05286 fix: add Gradio environment variables and remove conflicting command 2025-11-13 05:52:13 +01:00
59f2e8b0fc refactor: use source code patch instead of deleting NSFW models
Cleaner solution based on Reddit community feedback:
- Patch content_analyser.py to return False (always safe)
- Remove unused config file
- Remove config volume mount from compose
- Much simpler and more reliable than the file-deletion approach

Credit: https://www.reddit.com/r/StableDiffusion/comments/1m2w5af/

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 04:23:38 +00:00
5768fe65ff feat: disable NSFW filter in Facefusion
- Add entrypoint script to continuously delete NSFW model files
- Add Facefusion config file (for future use)
- NSFW content filtering now disabled

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 03:52:21 +00:00
c30d2d7407 chore: facefusion 2025-11-12 16:42:41 +01:00
8445256b0f chore: facefusion 2025-11-12 16:38:57 +01:00
9f9119358a fix: add Python unbuffered flag to see Gradio startup logs 2025-11-12 11:01:23 +01:00
b7f03a313f fix: use port 7865 for both Gradio and Traefik 2025-11-12 10:56:30 +01:00
08cce3479f fix: add command back with python3 and default port 7860 2025-11-12 10:51:35 +01:00
22eaaa9b30 fix: remove custom command and use default Gradio port 7860 for Facefusion 2025-11-12 10:50:11 +01:00
8ac025a14c fix: add command to start Facefusion web UI 2025-11-12 09:42:31 +01:00
8b77f92028 feat: integrate Facefusion into AI stack
Added Facefusion face swapping service to the AI stack:

**Configuration:**
- URL: https://facefusion.ai.pivoine.art
- Image: facefusion/facefusion:3.5.0-cpu
- Port: 7865
- Container: ai_facefusion
- Volume: ai_facefusion_data
- HTTP Basic Auth protection
- CPU execution mode (GPU when available)

**Changes:**
- Added facefusion service to ai/compose.yaml
- Added AI_FACEFUSION_* env vars to arty.yml
- Created ai_facefusion_data volume
- Removed old standalone facefusion stack
- Removed ai/README-export.md and ai/webui-export.py

Facefusion will run on CPU until a GPU server is available.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-12 09:36:52 +01:00
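A sketch of the service as described in the message; the data mount path and the Basic Auth middleware wiring are assumptions:

```yaml
facefusion:
  image: facefusion/facefusion:3.5.0-cpu
  container_name: ai_facefusion
  volumes:
    - ai_facefusion_data:/root/.facefusion   # mount path assumed
  labels:
    - traefik.enable=true
    - traefik.http.routers.facefusion.rule=Host(`facefusion.ai.pivoine.art`)
    - traefik.http.routers.facefusion.middlewares=facefusion-basicauth@docker  # later replaced by Authelia (ffbcecc09d above)
    - traefik.http.services.facefusion.loadbalancer.server.port=7865

volumes:
  ai_facefusion_data:
```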
da0dc2363a fix: disable prompt caching and responses API in litellm
- Add LITELLM_DROP_PARAMS environment variable
- Disable cache in litellm_settings
- Attempt to disable responses API endpoint
- Remove invalid supports_prompt_caching parameter

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 12:27:06 +01:00
db69b30d06 feat: add PostgreSQL initialization script for AI stack
Created a database initialization script following the core stack pattern.
The script automatically creates the required databases on first initialization:
- openwebui: Open WebUI application database
- litellm: LiteLLM proxy database for API key management and tracking

Changes:
- Created ai/postgres/init/01-init-databases.sh
- Mounted init directory in ai_postgres service
- Added automatic privilege grants to AI_DB_USER

Note: the init script only runs on first initialization, when the data volume is empty.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 18:36:50 +01:00
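The mount follows the postgres image convention: scripts in /docker-entrypoint-initdb.d run only when the data directory is empty. A sketch (image tag and volume name assumed):

```yaml
ai_postgres:
  image: postgres:16   # tag assumed
  volumes:
    - ./postgres/init:/docker-entrypoint-initdb.d:ro
    - ai_postgres_data:/var/lib/postgresql/data
```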
5a6b007cf3 feat: connect LiteLLM to AI PostgreSQL database
LiteLLM now uses the ai_postgres database instance with a dedicated
'litellm' database for API key management, usage tracking, and rate limiting.

Changes:
- Set DATABASE_URL to postgresql://ai:password@ai_postgres:5432/litellm
- Added depends_on ai_postgres to ensure DB starts first

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 18:34:10 +01:00
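The compose fragment described above, with the password read from the environment rather than hardcoded (variable name assumed):

```yaml
litellm:
  environment:
    DATABASE_URL: postgresql://ai:${AI_DB_PASSWORD}@ai_postgres:5432/litellm
  depends_on:
    - ai_postgres
```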
b6cb155da8 fix: remove HTTP Basic Auth from LiteLLM proxy
Removed authentication middleware to simplify access. LiteLLM now relies
solely on Bearer token authentication via LITELLM_MASTER_KEY.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 18:30:57 +01:00
87654f5ae8 feat: enable LiteLLM API key authentication
Re-enabled LITELLM_MASTER_KEY for proper API key authentication.
LiteLLM supports a master key without a database for simple auth scenarios.

- LiteLLM validates Bearer token against master key
- Open WebUI uses same key for internal communication
- External access requires both HTTP Basic Auth + API key

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 18:25:57 +01:00
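A sketch of the key wiring between the two services; Open WebUI's OPENAI_* variables and LiteLLM's default port 4000 are standard, while the internal hostname is an assumption:

```yaml
litellm:
  environment:
    LITELLM_MASTER_KEY: ${LITELLM_MASTER_KEY}   # Bearer token clients must present

openwebui:
  environment:
    OPENAI_API_BASE_URL: http://litellm:4000/v1   # internal hostname assumed
    OPENAI_API_KEY: ${LITELLM_MASTER_KEY}         # WebUI authenticates with the same key
```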