505 Commits

SHA1 Message Date
c55f41408a fix(ai): litellm config 2025-11-30 23:03:32 +01:00
7bca766247 fix(ai): litellm config 2025-11-30 22:32:49 +01:00
120bf7c385 feat(ai): bge over litellm 2025-11-30 20:12:07 +01:00
35e0f232f9 revert: remove GPU_TAILSCALE_HOST from arty.yml
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-28 09:58:44 +01:00
cd9256f09c fix: use Tailscale IP for GPU_TAILSCALE_HOST (MagicDNS doesn't work from Docker)
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-28 09:53:14 +01:00
c9e3a5cc4f fix: add resolver for runtime DNS resolution in nginx
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-28 09:48:25 +01:00
b2b444fb98 fix: add Tailscale DNS to GPU proxy containers
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-28 09:47:06 +01:00
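The DNS fixes above (b2b444fb98, c9e3a5cc4f, cd9256f09c) all work around Tailscale MagicDNS names not resolving from inside Docker. A minimal compose sketch of the container-side part, with illustrative service names (100.100.100.100 is Tailscale's MagicDNS resolver; the fallback IP is the RunPod node's Tailscale address quoted elsewhere in this log):

```yaml
services:
  gpu-proxy:                  # illustrative name, not the repo's actual service
    image: nginx:alpine
    dns:
      - 100.100.100.100       # Tailscale MagicDNS resolver on the tailnet
    environment:
      # cd9256f09c falls back to the raw Tailscale IP because MagicDNS
      # names still were not resolvable from the container
      GPU_TAILSCALE_HOST: ${GPU_TAILSCALE_HOST:-100.121.199.88}
```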
dcc29d20f0 fix: traefik labels 2025-11-28 09:39:39 +01:00
19ad30e8c4 revert: tailscale docker sidecar 2025-11-28 09:31:21 +01:00
99e39ee6e6 fix: tailscale sidecar dns 2025-11-28 09:12:52 +01:00
ed83e64727 fix: tailscale sidecar dns 2025-11-28 09:09:10 +01:00
18e6741596 fix: tailscale sidecar dns 2025-11-28 09:07:12 +01:00
c83b77ebdb feat: tailscale sidecar 2025-11-28 08:59:42 +01:00
6568dd10b5 feat: tailscale sidecar 2025-11-28 08:42:40 +01:00
a6e4540e84 feat: tailscale sidecar 2025-11-28 08:41:30 +01:00
dbdf33e78e feat: tailscale sidecar 2025-11-28 08:40:30 +01:00
74f618bcbb feat: tailscale sidecar 2025-11-28 08:36:50 +01:00
0c7fe219f7 feat: tailscale sidecar 2025-11-28 08:32:26 +01:00
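The sidecar attempts above (all reverted in 19ad30e8c4) follow the common tailscale/tailscale sidecar pattern: one container joins the tailnet and the proxy shares its network namespace. A rough sketch under that assumption; service names, hostname, and volumes are illustrative:

```yaml
services:
  tailscale:
    image: tailscale/tailscale:latest
    hostname: ai-gateway               # illustrative tailnet hostname
    environment:
      TS_AUTHKEY: ${TS_AUTHKEY}        # auth key from the Tailscale admin console
      TS_STATE_DIR: /var/lib/tailscale # persist node identity across restarts
    volumes:
      - tailscale-state:/var/lib/tailscale
    devices:
      - /dev/net/tun
    cap_add:
      - NET_ADMIN

  gpu-proxy:
    image: nginx:alpine
    network_mode: service:tailscale    # share the sidecar's network namespace

volumes:
  tailscale-state:
```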
6d0a15a969 fix: ai compose tailscale dns 2025-11-28 08:21:23 +01:00
f4dd7c7d9d fix: litellm compose 2025-11-28 08:14:13 +01:00
608b5ba793 fix: nginx audio mime types 2025-11-27 16:45:14 +01:00
2e45252793 fix: nginx proxy timeouts 2025-11-27 15:24:38 +01:00
20ba9952a1 feat: upscale service 2025-11-27 12:13:57 +01:00
69869ec3fb fix: remove vllm embedding 2025-11-27 01:11:43 +01:00
cc270c8539 fix: vllm model ids 2025-11-27 00:49:53 +01:00
8bdcde4b90 fix: supervisor env 2025-11-26 22:58:16 +01:00
2ab43e8fd3 fix: authelia for audiocraft 2025-11-26 22:56:30 +01:00
5d232c7d9b feat: audiocraft 2025-11-26 22:54:10 +01:00
cef233b678 chore: remove qwen 2025-11-26 21:03:43 +01:00
b63ddbffbd fix(ai): correct bge embedding model name to hosted_vllm/openai prefix
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 06:44:33 +01:00
d57a1241d2 feat(ai): add bge-large-en-v1.5 embedding model to litellm
- Add BGE embedding model config (port 8002) to litellm-config.yaml
- Add GPU_VLLM_EMBED_URL env var to compose and .env

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 06:40:36 +01:00
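Combined with the prefix fix in b63ddbffbd above, the resulting LiteLLM entry would look roughly like this; the served model id is an assumption (BAAI/bge-large-en-v1.5 is the upstream name):

```yaml
model_list:
  - model_name: bge-large-en-v1.5
    litellm_params:
      model: hosted_vllm/BAAI/bge-large-en-v1.5  # hosted_vllm/ is LiteLLM's prefix for self-hosted vLLM servers
      api_base: os.environ/GPU_VLLM_EMBED_URL    # vLLM embedding server on port 8002
      api_key: EMPTY
```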
ef0309838c refactor(ai): remove crawl4ai service, add backrest config to repo
- Remove crawl4ai service from ai/compose.yaml (will use local MCP instead)
- Remove crawl4ai backup volume from core/compose.yaml
- Add core/backrest/config.json (infrastructure as code)
- Change backrest from volume to bind-mounted config
- Update CLAUDE.md and README.md documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 06:20:22 +01:00
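The backrest part of ef0309838c is the usual volume-to-bind-mount swap so the config is tracked in git. A sketch of the intent only; the image name and container path are assumptions, not read from core/compose.yaml:

```yaml
  backrest:
    image: garethgeorge/backrest:latest            # assumed image
    volumes:
      # before: a named volume holding a generated config
      # after:  the tracked file core/backrest/config.json, bind-mounted
      - ./backrest/config.json:/config/config.json # assumed container path
```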
071a74a996 revert(ai): remove SUPERVISOR_LOGFILE env var from supervisor-ui
The Supervisor XML-RPC API v3.0 (Supervisor 4.3.0) only supports the 2-parameter
readLog(offset, length) call, not a 3-parameter call with a filename.
The SUPERVISOR_LOGFILE environment variable is therefore not used by the API.

Testing showed:
- Working: server.supervisor.readLog(-4096, 0)
- Failing: server.supervisor.readLog(-4096, 4096, '/path/to/log')

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 23:01:10 +01:00
74b3748b23 feat(ai): add SUPERVISOR_LOGFILE env var to supervisor-ui for RunPod logs
Configure supervisor-ui to use the correct logfile path (/workspace/logs/supervisord.log)
for the RunPod Supervisor instance. Fixes the logs page error at https://supervisor.ai.pivoine.art/logs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 22:49:33 +01:00
87216ab26a fix: remove healthcheck from supervisor-ui service 2025-11-23 20:38:37 +01:00
9e2b19e7f6 feat: replace nginx supervisor proxy with modern supervisor-ui
- Replaced nginx:alpine proxy with dev.pivoine.art/valknar/supervisor-ui:latest
- Modern Next.js UI with real-time SSE updates, batch operations, and charts
- Changed service port from 80 (nginx) to 3000 (Next.js)
- Removed supervisor-nginx.conf (no longer needed)
- Kept same URL (supervisor.ai.pivoine.art) and Authelia SSO protection
- Added health check for /api/health endpoint
- Service connects to RunPod Supervisor via Tailscale (SUPERVISOR_HOST/PORT)
2025-11-23 20:18:29 +01:00
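9e2b19e7f6 spells out most of the service; a compose sketch assembled from the commit text, where label names, the health-check command, and the environment values are assumptions the commit does not state:

```yaml
  supervisor-ui:
    image: dev.pivoine.art/valknar/supervisor-ui:latest
    environment:
      SUPERVISOR_HOST: ${GPU_TAILSCALE_HOST}     # RunPod Supervisor reached over Tailscale
      SUPERVISOR_PORT: "9001"
    healthcheck:                                 # dropped again in 87216ab26a
      test: ["CMD", "wget", "-qO-", "http://localhost:3000/api/health"]
      interval: 30s
    labels:
      - traefik.enable=true
      - traefik.http.routers.supervisor.rule=Host(`supervisor.ai.pivoine.art`)
      - traefik.http.services.supervisor.loadbalancer.server.port=3000
```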
a80c6b931b fix: update compose.yaml to use new GPU_VLLM URLs 2025-11-23 16:22:54 +01:00
64c02228d8 fix: use EMPTY api_key for vLLM servers 2025-11-23 16:17:27 +01:00
55d9bef18a fix: remove api_key from vLLM config to fix authentication error
vLLM servers don't validate API keys, so LiteLLM shouldn't pass them

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 16:16:37 +01:00
7fc945e179 fix: update LiteLLM config for direct vLLM server access
- Replace orchestrator routing with direct vLLM server connections
- Qwen 2.5 7B on port 8000 (GPU_VLLM_QWEN_URL)
- Llama 3.1 8B on port 8001 (GPU_VLLM_LLAMA_URL)
- Simplify architecture by removing orchestrator proxy layer

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 16:10:20 +01:00
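Read together with 64c02228d8 (api_key EMPTY) and 779e76974d (full URLs via os.environ/), the model list ends up roughly as below; the provider prefix and served model ids are assumptions:

```yaml
model_list:
  - model_name: qwen-2.5-7b
    litellm_params:
      model: hosted_vllm/Qwen/Qwen2.5-7B-Instruct         # assumed served model id
      api_base: os.environ/GPU_VLLM_QWEN_URL              # vLLM server on port 8000
      api_key: EMPTY                                      # these vLLM servers do not validate keys
  - model_name: llama-3.1-8b
    litellm_params:
      model: hosted_vllm/meta-llama/Llama-3.1-8B-Instruct # assumed served model id
      api_base: os.environ/GPU_VLLM_LLAMA_URL             # vLLM server on port 8001
      api_key: EMPTY
```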
94ab4ae6dd feat: enable system message support for qwen-2.5-7b 2025-11-23 14:36:34 +01:00
779e76974d fix: use complete URL env var for vLLM API base
- Replace GPU_TAILSCALE_IP interpolation with GPU_VLLM_API_URL
- LiteLLM requires full URL in api_base with os.environ/ syntax

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 13:17:37 +01:00
f3f32c163f feat: consolidate GPU IP with single GPU_TAILSCALE_IP variable
- Replace COMFYUI_BACKEND_HOST and SUPERVISOR_BACKEND_HOST with GPU_TAILSCALE_IP
- Update LiteLLM config to use os.environ/GPU_TAILSCALE_IP for vLLM models
- Add GPU_TAILSCALE_IP env var to LiteLLM service
- Configure qwen-2.5-7b and llama-3.1-8b to route through orchestrator

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 13:05:33 +01:00
e00e959543 Update backend IPs for ComfyUI and Supervisor proxies
- Remove hardcoded default values from compose.yaml
- Backend IPs now managed via environment variables only

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 02:11:19 +01:00
0fd2eacad1 feat: add Supervisor proxy with Authelia SSO
Add an nginx reverse proxy service for the Supervisor web UI at supervisor.ai.pivoine.art, protected by Authelia authentication. It proxies to the RunPod GPU instance via Tailscale (100.121.199.88:9001).

Changes:
- Create supervisor-nginx.conf for nginx proxy configuration
- Add supervisor service to docker-compose with Traefik labels
- Add supervisor.ai.pivoine.art to Authelia protected domains
- Remove deprecated Flux-related files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 13:19:02 +01:00
bf402adb25 Add Llama 3.1 8B model to LiteLLM configuration 2025-11-21 21:30:18 +01:00
ae1c349b55 feat: make ComfyUI backend IP/port configurable via environment variables
- Replace hardcoded IP in comfyui-nginx.conf with env vars
- Add COMFYUI_BACKEND_HOST and COMFYUI_BACKEND_PORT to compose.yaml
- Use envsubst to substitute variables at container startup
- Defaults: 100.121.199.88:8188 (current RunPod Tailscale IP)
2025-11-21 21:24:51 +01:00
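One common way to wire the envsubst step in ae1c349b55 is the stock nginx image's template mechanism (its entrypoint renders /etc/nginx/templates/*.template into conf.d at startup); whether the repo uses exactly this mount layout is an assumption:

```yaml
  comfyui-proxy:
    image: nginx:alpine
    environment:
      COMFYUI_BACKEND_HOST: ${COMFYUI_BACKEND_HOST:-100.121.199.88}  # default RunPod Tailscale IP
      COMFYUI_BACKEND_PORT: "${COMFYUI_BACKEND_PORT:-8188}"
    volumes:
      # comfyui-nginx.conf references ${COMFYUI_BACKEND_HOST}/${COMFYUI_BACKEND_PORT};
      # envsubst writes the rendered file to /etc/nginx/conf.d/default.conf
      - ./comfyui-nginx.conf:/etc/nginx/templates/default.conf.template:ro
```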
66d8c82e47 Remove Flux and MusicGen models from LiteLLM config
ComfyUI now handles Flux image generation directly.
MusicGen is not being used and has been removed.
2025-11-21 21:11:29 +01:00
ea81634ef3 feat: add ComfyUI to Authelia protected domains
- Add comfy.ai.pivoine.art to one_factor authentication policy
- Enables SSO protection for ComfyUI image generation service
2025-11-21 21:05:24 +01:00
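On the Authelia side, these "protected domain" commits (this one, supervisor.ai.pivoine.art in 0fd2eacad1, audiocraft in 2ab43e8fd3) each add one access_control rule, along the lines of:

```yaml
access_control:
  default_policy: deny            # assumed; only the per-domain rules come from the commits
  rules:
    - domain: comfy.ai.pivoine.art
      policy: one_factor
    - domain: supervisor.ai.pivoine.art
      policy: one_factor
```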
25bd020b93 docs: document ComfyUI setup and integration
- Add ComfyUI service to AI stack service list
- Document ComfyUI proxy architecture and configuration
- Include deployment instructions via Ansible
- Explain network topology and access flow
- Add proxy configuration details (nginx, Tailscale, Authelia)
- Document RunPod setup process and model integration
2025-11-21 21:03:35 +01:00