33 Commits

Author SHA1 Message Date
f8694653d0 fix: adjust VRAM for 24K context based on actual usage
All checks were successful
Build and Push RunPod Docker Image / build-and-push (push) Successful in 13s
Based on error output, the model actually uses ~17.5GB (not the 15GB previously estimated).
- Llama: 85% VRAM for 24576 context (3GB KV cache)
- BGE: 6% VRAM (reduced to fit)
- Total: 91%

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-30 22:46:34 +01:00
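The final allocation above maps onto standard vLLM CLI options. A hedged sketch (the flag names are real vLLM options; the model id and port are placeholders, not taken from this repo's start.sh):

```shell
# 85% of a 24 GB RTX 4090 for Llama with a 24576-token context
# (~3 GB of that budget goes to the KV cache, per the commit message).
vllm serve meta-llama/Llama-3.1-8B-Instruct \
  --gpu-memory-utilization 0.85 \
  --max-model-len 24576 \
  --port 8000
```

Note that `--gpu-memory-utilization` is a fraction of total VRAM reserved by this vLLM instance (weights + KV cache + activations), which is why the two services' fractions must sum to under 1.0.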
078043e35a feat: balance Llama 24K context with concurrent BGE
Adjusted VRAM allocation for concurrent operation:
- Llama: 80% VRAM, 24576 context
- BGE: 8% VRAM
- Total: 88% of 24GB RTX 4090

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-30 22:39:08 +01:00
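The 88% figure checks out arithmetically. An illustrative budget calculation (not from the repo; the 80/8 split and 24 GB total come from the commit message, MiB rounding is assumed):

```shell
# Rough VRAM budget for the concurrent Llama + BGE split on a 24 GB card.
TOTAL_MIB=24576                        # 24 GiB RTX 4090
LLAMA_MIB=$(( TOTAL_MIB * 80 / 100 ))  # Llama: weights + 24K-context KV cache
BGE_MIB=$((  TOTAL_MIB * 8 / 100 ))    # BGE embedder
echo "llama=${LLAMA_MIB} bge=${BGE_MIB} used=$(( LLAMA_MIB + BGE_MIB ))"
# → llama=19660 bge=1966 used=21626
```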
c969d10eaf feat: increase Llama context to 32K with 95% VRAM
For larger input requirements, increased max-model-len from 20480 to 32768.
BGE remains available but cannot run concurrently at this VRAM level.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-30 22:37:11 +01:00
f68bc47915 feat: increase Llama max-model-len to 20480
Adjusted VRAM allocation for larger context window:
- Llama: 90% VRAM, 20480 context (up from 8192)
- BGE: 8% VRAM (down from 10%)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-30 22:15:08 +01:00
b2de3b17ee fix: adjust VRAM allocation for concurrent Llama+BGE
- Llama: 85% GPU, 8K context (model needs ~15GB base)
- BGE: 10% GPU (1.3GB model)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-30 20:16:00 +01:00
f668e06228 feat: add BGE embedding model for concurrent operation with Llama
- Create config_bge.yaml for BAAI/bge-large-en-v1.5 on port 8002
- Reduce Llama VRAM to 70% and context to 16K for concurrent use
- Add BGE service to supervisor with vllm group

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-30 19:55:13 +01:00
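The commit's config_bge.yaml is not shown here, but a vLLM embedding endpoint like the one it describes can be sketched as a serve invocation. Hedged sketch: `--task embed` is a real vLLM option in recent versions (older releases spell it `--task embedding`); the utilization fraction and max length are assumptions matching BGE's 512-token limit, not values from the repo:

```shell
# BAAI/bge-large-en-v1.5 served as an OpenAI-compatible embeddings API
# on port 8002, alongside the Llama instance on the same GPU.
vllm serve BAAI/bge-large-en-v1.5 \
  --task embed \
  --gpu-memory-utilization 0.10 \
  --max-model-len 512 \
  --port 8002
```

Running both under supervisor in a shared `vllm` group, as the commit does, lets them be started and stopped together.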
c32340f23c fix: start.sh
2025-11-28 11:11:52 +01:00
f3225f70d6 fix: start.sh
2025-11-28 11:01:36 +01:00
875c25d802 fix: tailscale stale dir
2025-11-27 23:37:36 +01:00
1e1832c166 fix: start.sh
2025-11-27 23:23:08 +01:00
07c6ab50e7 fix: tailscale --force-reauth
2025-11-27 23:17:23 +01:00
977b9c0f4f fix: proxy root path
2025-11-27 15:36:45 +01:00
6e57b21c78 fix: audiocraft HF_HOME env
2025-11-27 14:58:09 +01:00
9e4c3f5a9c fix: supervisor upscale
2025-11-27 12:27:50 +01:00
364bf5f46d chore: supervisor reload
2025-11-27 12:25:36 +01:00
7cde108b88 fix: upscale service ref
2025-11-27 12:05:45 +01:00
44bfa271e3 feat: upscale service
2025-11-27 12:04:19 +01:00
69bf6af419 chore: minimal refactoring
2025-11-27 11:51:21 +01:00
d625bf1c23 feat: SKIP_ARTY_SETUP env
2025-11-27 11:41:21 +01:00
44742a1fb4 feat: FLUX.2 Dev models
2025-11-27 11:35:37 +01:00
a08879625c fix: remove audiocraft autostart
2025-11-27 10:52:57 +01:00
2669ff65a9 feat: HunyuanVideo 1.5
2025-11-27 10:01:28 +01:00
400e534f17 fix: webdav-sync config
2025-11-27 09:33:31 +01:00
b9beef283d fix: remove vllm embedding
2025-11-27 01:24:05 +01:00
90fa8a073c fix: remove vllm embedding
2025-11-27 01:12:57 +01:00
4d7c811a46 fix: vllm gpu utilization 2
2025-11-27 00:57:14 +01:00
eaa8e0ebab fix: vllm gpu utilization
2025-11-27 00:50:42 +01:00
5e9aa8f25d fix: HF_HOME for vllm
2025-11-27 00:20:42 +01:00
874e9a639d fix: models linking
2025-11-26 23:51:21 +01:00
15182802ed fix: arty.yml
2025-11-26 22:03:03 +01:00
73aea85181 docs: README.md
2025-11-26 21:26:19 +01:00
5f8c843b22 Initial commit
2025-11-26 20:20:45 +01:00
5c61ac5c67 Initial commit
2025-11-26 17:15:08 +01:00