Commit Graph

131 Commits

SHA1 Message Date
120bf7c385 feat(ai): bge over litellm 2025-11-30 20:12:07 +01:00
c9e3a5cc4f fix: add resolver for runtime DNS resolution in nginx 2025-11-28 09:48:25 +01:00
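
By way of context (a sketch, not the repository's actual config): nginx resolves proxy_pass hostnames once at startup by default, so a backend that comes up later or changes IP breaks the proxy until reload. The usual fix is a resolver directive pointing at Docker's embedded DNS plus a variable upstream, which forces per-request resolution:

```nginx
# Illustrative only; the upstream name is hypothetical.
# 127.0.0.11 is Docker's embedded DNS; valid=30s caps how long an
# answer is cached before nginx re-resolves it.
resolver 127.0.0.11 valid=30s ipv6=off;

server {
    listen 80;

    location / {
        # Assigning the upstream to a variable defers the DNS lookup
        # to request time instead of config-load time.
        set $backend "http://gpu-service:8188";
        proxy_pass $backend;
    }
}
```
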
b2b444fb98 fix: add Tailscale DNS to GPU proxy containers 2025-11-28 09:47:06 +01:00
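
At the compose level, a matching change is typically just a dns: entry so the proxy containers can resolve tailnet hostnames (service name and image are illustrative; 100.100.100.100 is Tailscale's MagicDNS resolver):

```yaml
services:
  gpu-proxy:
    image: nginx:alpine
    dns:
      - 100.100.100.100   # Tailscale MagicDNS; resolves tailnet hostnames
```
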
dcc29d20f0 fix: traefik labels 2025-11-28 09:39:39 +01:00
19ad30e8c4 revert: tailscale docker sidecar 2025-11-28 09:31:21 +01:00
99e39ee6e6 fix: tailscale sidecar dns 2025-11-28 09:12:52 +01:00
ed83e64727 fix: tailscale sidecar dns 2025-11-28 09:09:10 +01:00
18e6741596 fix: tailscale sidecar dns 2025-11-28 09:07:12 +01:00
c83b77ebdb feat: tailscale sidecar 2025-11-28 08:59:42 +01:00
a6e4540e84 feat: tailscale sidecar 2025-11-28 08:41:30 +01:00
dbdf33e78e feat: tailscale sidecar 2025-11-28 08:40:30 +01:00
74f618bcbb feat: tailscale sidecar 2025-11-28 08:36:50 +01:00
0c7fe219f7 feat: tailscale sidecar 2025-11-28 08:32:26 +01:00
6d0a15a969 fix: ai compose tailscale dns 2025-11-28 08:21:23 +01:00
f4dd7c7d9d fix: litellm compose 2025-11-28 08:14:13 +01:00
608b5ba793 fix: nginx audio mime types 2025-11-27 16:45:14 +01:00
2e45252793 fix: nginx proxy timeouts 2025-11-27 15:24:38 +01:00
20ba9952a1 feat: upscale service 2025-11-27 12:13:57 +01:00
69869ec3fb fix: remove vllm embedding 2025-11-27 01:11:43 +01:00
cc270c8539 fix: vllm model ids 2025-11-27 00:49:53 +01:00
8bdcde4b90 fix: supervisor env 2025-11-26 22:58:16 +01:00
5d232c7d9b feat: audiocraft 2025-11-26 22:54:10 +01:00
cef233b678 chore: remove qwen 2025-11-26 21:03:43 +01:00
b63ddbffbd fix(ai): correct bge embedding model name to hosted_vllm/openai prefix 2025-11-25 06:44:33 +01:00
d57a1241d2 feat(ai): add bge-large-en-v1.5 embedding model to litellm 2025-11-25 06:40:36 +01:00
- Add BGE embedding model config (port 8002) to litellm-config.yaml
- Add GPU_VLLM_EMBED_URL env var to compose and .env
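
Putting this commit together with the prefix fix above it, the LiteLLM entry plausibly converges on something like the sketch below; only the port (8002), the env var name, and the hosted_vllm/openai prefix come from the commit messages, the rest is assumption:

```yaml
model_list:
  - model_name: bge-large-en-v1.5
    litellm_params:
      # prefix per b63ddbffbd; the exact model id is an assumption
      model: hosted_vllm/openai/BAAI/bge-large-en-v1.5
      # e.g. http://<gpu-tailscale-ip>:8002/v1
      api_base: os.environ/GPU_VLLM_EMBED_URL
      api_key: EMPTY
```
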
ef0309838c refactor(ai): remove crawl4ai service, add backrest config to repo 2025-11-25 06:20:22 +01:00
- Remove crawl4ai service from ai/compose.yaml (will use local MCP instead)
- Remove crawl4ai backup volume from core/compose.yaml
- Add core/backrest/config.json (infrastructure as code)
- Change backrest from a named volume to a bind-mounted config
- Update CLAUDE.md and README.md documentation
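
The backrest part of this commit reduces to swapping a named volume for a bind mount, roughly as below; the in-container config path is an assumption, not taken from the repo:

```yaml
services:
  backrest:
    volumes:
      # config now lives in the repo instead of an opaque named volume
      - ./backrest/config.json:/config/config.json
```
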
071a74a996 revert(ai): remove SUPERVISOR_LOGFILE env var from supervisor-ui 2025-11-23 23:01:10 +01:00
The Supervisor XML-RPC API v3.0 (Supervisor 4.3.0) only supports the two-parameter
readLog(offset, length) call, not a three-parameter call with a filename, so the
SUPERVISOR_LOGFILE environment variable is never consulted by the API.

Testing showed:
- Working: server.supervisor.readLog(-4096, 0)
- Failing: server.supervisor.readLog(-4096, 4096, '/path/to/log')
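
The probe described here is reproducible with Python's standard XML-RPC client; the host and port below are the Tailscale address quoted later in this log, and should be treated as placeholders:

```python
import xmlrpc.client

# Supervisor's XML-RPC endpoint lives at /RPC2 on the web UI port.
server = xmlrpc.client.ServerProxy("http://100.121.199.88:9001/RPC2")

# Works on Supervisor 4.3.0: two-parameter form reads the main log;
# a negative offset reads from the end of the log.
print(server.supervisor.readLog(-4096, 0))

# Fails: the API defines no three-parameter readLog taking a filename,
# which is why SUPERVISOR_LOGFILE had no effect and was reverted.
# server.supervisor.readLog(-4096, 4096, "/path/to/log")
```
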
74b3748b23 feat(ai): add SUPERVISOR_LOGFILE env var to supervisor-ui for RunPod logs 2025-11-23 22:49:33 +01:00
Configure supervisor-ui to use the correct logfile path (/workspace/logs/supervisord.log)
for the RunPod Supervisor instance. Fixes the logs page error on https://supervisor.ai.pivoine.art/logs

87216ab26a fix: remove healthcheck from supervisor-ui service 2025-11-23 20:38:37 +01:00
9e2b19e7f6 feat: replace nginx supervisor proxy with modern supervisor-ui 2025-11-23 20:18:29 +01:00
- Replaced nginx:alpine proxy with dev.pivoine.art/valknar/supervisor-ui:latest
- Modern Next.js UI with real-time SSE updates, batch operations, and charts
- Changed service port from 80 (nginx) to 3000 (Next.js)
- Removed supervisor-nginx.conf (no longer needed)
- Kept same URL (supervisor.ai.pivoine.art) and Authelia SSO protection
- Added health check for /api/health endpoint
- Service connects to RunPod Supervisor via Tailscale (SUPERVISOR_HOST/PORT)
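
Assembled from the bullets above, the replacement service plausibly takes this shape (the Traefik label names, the health check command, and the env wiring beyond SUPERVISOR_HOST/PORT are assumptions):

```yaml
services:
  supervisor-ui:
    image: dev.pivoine.art/valknar/supervisor-ui:latest
    environment:
      SUPERVISOR_HOST: ${GPU_TAILSCALE_IP}   # RunPod box over Tailscale
      SUPERVISOR_PORT: "9001"
    healthcheck:                             # removed again in 87216ab26a
      test: ["CMD", "wget", "-qO-", "http://localhost:3000/api/health"]
      interval: 30s
    labels:
      - traefik.enable=true
      - traefik.http.routers.supervisor.rule=Host(`supervisor.ai.pivoine.art`)
      - traefik.http.routers.supervisor.middlewares=net-authelia@docker
      - traefik.http.services.supervisor.loadbalancer.server.port=3000
```
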
a80c6b931b fix: update compose.yaml to use new GPU_VLLM URLs 2025-11-23 16:22:54 +01:00
64c02228d8 fix: use EMPTY api_key for vLLM servers 2025-11-23 16:17:27 +01:00
55d9bef18a fix: remove api_key from vLLM config to fix authentication error 2025-11-23 16:16:37 +01:00
vLLM servers don't validate API keys, so LiteLLM shouldn't pass them.

7fc945e179 fix: update LiteLLM config for direct vLLM server access 2025-11-23 16:10:20 +01:00
- Replace orchestrator routing with direct vLLM server connections
- Qwen 2.5 7B on port 8000 (GPU_VLLM_QWEN_URL)
- Llama 3.1 8B on port 8001 (GPU_VLLM_LLAMA_URL)
- Simplify architecture by removing the orchestrator proxy layer
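
Together with the api_key fixes in the two commits above, the resulting model_list would read roughly as follows; the ports and env var names are from the commit message, the model ids are assumptions:

```yaml
model_list:
  - model_name: qwen-2.5-7b
    litellm_params:
      model: hosted_vllm/Qwen/Qwen2.5-7B-Instruct   # id assumed
      api_base: os.environ/GPU_VLLM_QWEN_URL         # vLLM on port 8000
      api_key: EMPTY   # vLLM ignores keys, but the client expects a value
  - model_name: llama-3.1-8b
    litellm_params:
      model: hosted_vllm/meta-llama/Llama-3.1-8B-Instruct   # id assumed
      api_base: os.environ/GPU_VLLM_LLAMA_URL               # vLLM on port 8001
      api_key: EMPTY
```
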
94ab4ae6dd feat: enable system message support for qwen-2.5-7b 2025-11-23 14:36:34 +01:00
779e76974d fix: use complete URL env var for vLLM API base 2025-11-23 13:17:37 +01:00
- Replace GPU_TAILSCALE_IP interpolation with GPU_VLLM_API_URL
- LiteLLM requires a full URL in api_base when using the os.environ/ syntax
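
The gotcha behind this commit: LiteLLM's os.environ/ reference substitutes a whole value and cannot be embedded inside a longer string, so the variable must hold the complete URL. A sketch of the broken and working forms:

```yaml
# Does NOT work -- os.environ/ cannot be interpolated into a larger string:
#   api_base: http://os.environ/GPU_TAILSCALE_IP:8000/v1
# Works -- the env var carries the complete URL:
api_base: os.environ/GPU_VLLM_API_URL   # e.g. http://100.x.y.z:8000/v1
```
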
f3f32c163f feat: consolidate GPU IP with single GPU_TAILSCALE_IP variable 2025-11-23 13:05:33 +01:00
- Replace COMFYUI_BACKEND_HOST and SUPERVISOR_BACKEND_HOST with GPU_TAILSCALE_IP
- Update LiteLLM config to use os.environ/GPU_TAILSCALE_IP for vLLM models
- Add GPU_TAILSCALE_IP env var to LiteLLM service
- Configure qwen-2.5-7b and llama-3.1-8b to route through the orchestrator

e00e959543 Update backend IPs for ComfyUI and Supervisor proxies 2025-11-23 02:11:19 +01:00
- Remove hardcoded default values from compose.yaml
- Backend IPs are now managed via environment variables only

0fd2eacad1 feat: add Supervisor proxy with Authelia SSO 2025-11-22 13:19:02 +01:00
Add an nginx reverse proxy service for the Supervisor web UI at supervisor.ai.pivoine.art
with Authelia authentication. Proxies to the RunPod GPU instance via Tailscale
(100.121.199.88:9001).

Changes:
- Create supervisor-nginx.conf for the nginx proxy configuration
- Add supervisor service to docker-compose with Traefik labels
- Add supervisor.ai.pivoine.art to Authelia protected domains
- Remove deprecated Flux-related files

bf402adb25 Add Llama 3.1 8B model to LiteLLM configuration 2025-11-21 21:30:18 +01:00
ae1c349b55 feat: make ComfyUI backend IP/port configurable via environment variables 2025-11-21 21:24:51 +01:00
- Replace hardcoded IP in comfyui-nginx.conf with env vars
- Add COMFYUI_BACKEND_HOST and COMFYUI_BACKEND_PORT to compose.yaml
- Use envsubst to substitute variables at container startup
- Defaults: 100.121.199.88:8188 (current RunPod Tailscale IP)
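
The stock nginx image runs envsubst over *.template files in /etc/nginx/templates at startup, which is the most common way to implement what this commit describes; whether the repo uses the templates directory or a custom entrypoint is an assumption:

```yaml
services:
  comfyui:
    image: nginx:alpine
    environment:
      # defaults per the commit message; overridable from .env
      COMFYUI_BACKEND_HOST: ${COMFYUI_BACKEND_HOST:-100.121.199.88}
      COMFYUI_BACKEND_PORT: ${COMFYUI_BACKEND_PORT:-8188}
    volumes:
      # ${COMFYUI_BACKEND_HOST} etc. in the template are substituted at startup
      - ./comfyui-nginx.conf:/etc/nginx/templates/default.conf.template:ro
```
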
66d8c82e47 Remove Flux and MusicGen models from LiteLLM config 2025-11-21 21:11:29 +01:00
ComfyUI now handles Flux image generation directly.
MusicGen is not being used and has been removed.

904f7d3c2e feat(ai): add ComfyUI proxy service with Authelia SSO 2025-11-21 20:56:20 +01:00
- Add ComfyUI service to AI stack using nginx:alpine as reverse proxy
- Proxy to RunPod ComfyUI via Tailscale (100.121.199.88:8188)
- Configure Traefik routing for comfy.ai.pivoine.art
- Enable Authelia SSO middleware (net-authelia)
- Support WebSocket connections for real-time updates
- Set appropriate timeouts for image generation (300s)
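
The WebSocket and timeout parts of such a proxy typically look like the sketch below (illustrative, not the repo's actual comfyui-nginx.conf):

```nginx
location / {
    proxy_pass http://100.121.199.88:8188;   # RunPod ComfyUI over Tailscale
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;  # WebSocket upgrade for live updates
    proxy_set_header Connection "upgrade";
    proxy_read_timeout 300s;                 # long image-generation requests
    proxy_send_timeout 300s;
}
```
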
9a964cff3c feat: add Flux image generation function for Open WebUI 2025-11-21 20:20:33 +01:00
- Add flux_image_gen.py manifold function for Flux.1 Schnell
- Auto-mount functions via Docker volume (./functions:/app/backend/data/functions:ro)
- Add comprehensive setup guide in FLUX_SETUP.md
- Update CLAUDE.md with Flux integration documentation
- Infrastructure-as-code approach - no manual import needed
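
The auto-mount mechanism is a single volume line; the path is quoted verbatim in the bullet above, while the service name is assumed:

```yaml
services:
  open-webui:
    volumes:
      # functions ship with the repo; read-only inside the container
      - ./functions:/app/backend/data/functions:ro
```
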
0999e5d29f feat: re-enable Redis caching in LiteLLM now that streaming is fixed 2025-11-21 19:40:57 +01:00
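
For reference, LiteLLM's Redis cache is switched on with a litellm_settings block along these lines (the env var names here are assumptions):

```yaml
litellm_settings:
  cache: true
  cache_params:
    type: redis
    host: os.environ/REDIS_HOST
    port: os.environ/REDIS_PORT
```
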
ec903c16c2 fix: use hosted_vllm/openai/ prefix for vLLM model via orchestrator 2025-11-21 19:18:33 +01:00
155016da97 debug: enable DEBUG logging for LiteLLM to troubleshoot streaming 2025-11-21 19:10:00 +01:00
c81f312e9e fix: use correct vLLM model ID from /v1/models endpoint 2025-11-21 19:06:56 +01:00
fe0cf487ee fix: use correct vLLM model name with hosted_vllm prefix 2025-11-21 19:02:44 +01:00
81d4058c5d revert: back to openai prefix for vLLM OpenAI-compatible endpoint 2025-11-21 18:57:10 +01:00
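
The prefix churn in these last commits reflects how LiteLLM routes requests: the segment before the first slash selects the provider, and the remainder is passed through as the model id, which must match what the vLLM server's /v1/models endpoint reports. A sketch of the distinction (model id illustrative):

```yaml
# openai/<id>       -> generic OpenAI-compatible passthrough provider
# hosted_vllm/<id>  -> LiteLLM's vLLM provider
# ec903c16c2 settled on the combined form; the id below is an assumption:
model: hosted_vllm/openai/Qwen2.5-7B-Instruct
```
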