docs: remove outdated ai/README.md

Removed outdated AI infrastructure README that referenced GPU services.
VPS AI services (Open WebUI, Crawl4AI, facefusion) are documented in compose.yaml comments.
GPU infrastructure docs are now in dedicated runpod repository.
Commit: a5ed2be933 (parent: d5e37dbd3f)
Date: 2025-11-21 14:42:23 +01:00


@@ -1,170 +0,0 @@
# AI Infrastructure
This directory contains AI-related configurations for the VPS deployment.
## Multi-Modal GPU Infrastructure (Migrated)
**The multi-modal AI orchestration stack (text, image, music generation) has been moved to a dedicated repository:**
**Repository**: https://dev.pivoine.art/valknar/runpod
The RunPod repository contains:
- Model orchestrator for automatic switching between text, image, and music models
- vLLM + Qwen 2.5 7B (text generation)
- Flux.1 Schnell (image generation)
- MusicGen Medium (music generation)
- RunPod template creation scripts
- Complete deployment documentation
This separation allows for independent management of:
- **VPS Services** (this repo): Open WebUI, Crawl4AI, AI database
- **GPU Services** (runpod repo): Model inference, orchestration, RunPod templates
## VPS AI Services (ai/compose.yaml)
This compose stack manages the VPS-side AI infrastructure that integrates with the GPU server:
### Services
#### ai_postgres
Dedicated PostgreSQL 16 instance with pgvector extension for AI workloads:
- Vector similarity search support
- Isolated from core database for performance
- Used by Open WebUI for RAG and embeddings
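A minimal sketch for verifying that pgvector is actually installed in the AI database. The container name `ai_postgres` matches the log commands below, but the database user and name are assumptions; adjust them to whatever compose.yaml configures.

```shell
# Check that the pgvector extension is installed and report its version.
# User/database names below are assumptions; match them to compose.yaml.
SQL="SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';"

if command -v docker >/dev/null 2>&1; then
  docker exec ai_postgres psql -U postgres -d postgres -c "$SQL" 2>/dev/null \
    || echo "ai_postgres container not reachable from here"
else
  echo "docker not available; run on the VPS: psql -c \"$SQL\""
fi
```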
#### webui (Open WebUI)
ChatGPT-like interface exposed at `ai.pivoine.art:8080`:
- Claude API integration via Anthropic
- RAG support with document upload
- Vector storage via pgvector
- Web search capability
- SMTP email via IONOS
- User signup enabled
#### crawl4ai
Internal web scraping service for LLM content preparation:
- API on port 11235 (not exposed publicly)
- Optimized for AI/RAG workflows
- Integration with Open WebUI and n8n
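Since the service is internal-only, it has to be called from another container on the same Docker network. A hedged sketch, assuming the container hostname `ai_crawl4ai` (from the log commands below) and a `/crawl` endpoint with a JSON payload; check the Crawl4AI documentation for the exact API of the deployed version.

```shell
# Sketch of a Crawl4AI request from inside the Docker network.
# The /crawl endpoint path and payload shape are assumptions.
PAYLOAD='{"urls": ["https://example.com"]}'

curl -fsS --max-time 5 -X POST "http://ai_crawl4ai:11235/crawl" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" \
  || echo "crawl4ai not reachable (internal-only service)"
```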
## Integration with GPU Server
The VPS AI services connect to the GPU server via Tailscale VPN:
- **VPS Tailscale IP**: 100.102.217.79
- **GPU Tailscale IP**: 100.100.108.13
**LiteLLM Proxy** (port 4000 on VPS) routes requests:
- Claude API for chat completions
- GPU orchestrator for self-hosted models (text, image, music)
See `../litellm-config.yaml` for routing configuration.
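LiteLLM exposes an OpenAI-compatible API, so routing can be smoke-tested from the VPS with a plain chat-completions call. The model alias and key variable below are assumptions; the real model names live in `../litellm-config.yaml`.

```shell
# Smoke-test LiteLLM routing via the OpenAI-compatible endpoint.
# Model alias and LITELLM_MASTER_KEY are assumptions; see litellm-config.yaml.
PAYLOAD='{
  "model": "claude-3-5-sonnet",
  "messages": [{"role": "user", "content": "ping"}]
}'

curl -fsS --max-time 5 "http://localhost:4000/v1/chat/completions" \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" \
  || echo "LiteLLM proxy not reachable from here (run on the VPS)"
```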
## Environment Variables
Required in `.env`:
```bash
# AI Database
AI_DB_PASSWORD=<password>
# Open WebUI
AI_WEBUI_SECRET_KEY=<secret>
# Claude API
ANTHROPIC_API_KEY=<api_key>
# Email (IONOS SMTP)
ADMIN_EMAIL=<email>
SMTP_HOST=smtp.ionos.com
SMTP_PORT=587
SMTP_USER=<smtp_user>
SMTP_PASSWORD=<smtp_password>
```
## Backup Configuration
AI services are backed up daily via Restic:
- **ai_postgres_data**: 3 AM (7 daily, 4 weekly, 6 monthly, 2 yearly)
- **ai_webui_data**: 3 AM (same retention)
- **ai_crawl4ai_data**: 3 AM (same retention)
Repository: `/mnt/hidrive/users/valknar/Backup`
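To confirm the retention policy is being applied, snapshots can be inspected directly with Restic. A sketch, assuming snapshots are tagged by volume name; the backup jobs may tag them differently, and the repository password must be available in the environment.

```shell
# List AI-service snapshots in the Restic repository.
# The --tag value is an assumption about how backup jobs label snapshots.
RESTIC_REPO="/mnt/hidrive/users/valknar/Backup"

if command -v restic >/dev/null 2>&1; then
  restic -r "$RESTIC_REPO" snapshots --tag ai_postgres_data \
    || echo "repository not accessible (check RESTIC_PASSWORD and mount)"
else
  echo "restic not installed here; run on the VPS"
fi
```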
## Management Commands
```bash
# Start AI stack
pnpm arty up ai_postgres webui crawl4ai
# View logs
docker logs -f ai_webui
docker logs -f ai_postgres
docker logs -f ai_crawl4ai
# Check Open WebUI
curl http://ai.pivoine.art:8080/health
# Restart AI services
pnpm arty restart ai_postgres webui crawl4ai
```
## GPU Server Management
For GPU server operations (model orchestration, template creation, etc.):
```bash
# Clone the dedicated repository
git clone ssh://git@dev.pivoine.art:2222/valknar/runpod.git
# See runpod repository for:
# - Model orchestration setup
# - RunPod template creation
# - GPU deployment guides
```
## Documentation
### VPS AI Services
- [GPU_DEPLOYMENT_LOG.md](GPU_DEPLOYMENT_LOG.md) - VPS AI deployment history
### GPU Server (Separate Repository)
- [runpod/README.md](https://dev.pivoine.art/valknar/runpod) - Main GPU documentation
- [runpod/DEPLOYMENT.md](https://dev.pivoine.art/valknar/runpod) - Deployment guide
- [runpod/RUNPOD_TEMPLATE.md](https://dev.pivoine.art/valknar/runpod) - Template creation
## Architecture Overview
```
┌─────────────────────────────────────────────────────────────────┐
│ VPS (Tailscale: 100.102.217.79) │
│ ┌───────────────────────────────────────────────────────────┐ │
│ │ LiteLLM Proxy (Port 4000) │ │
│ │ Routes to: Claude API + GPU Orchestrator │ │
│ └───────┬───────────────────────────────────────────────────┘ │
│ │ │
│ ┌───────▼─────────┐ ┌──────────────┐ ┌─────────────────┐ │
│ │ Open WebUI │ │ Crawl4AI │ │ AI PostgreSQL │ │
│ │ Port: 8080 │ │ Port: 11235 │ │ + pgvector │ │
│ └─────────────────┘ └──────────────┘ └─────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
│ Tailscale VPN
┌──────────────────────────────┼──────────────────────────────────┐
│ RunPod GPU Server (Tailscale: 100.100.108.13) │
│ ┌───────────────────────────▼──────────────────────────────┐ │
│ │ Orchestrator (Port 9000) │ │
│ │ Manages sequential model loading │ │
│ └─────┬──────────────┬──────────────────┬──────────────────┘ │
│ │ │ │ │
│ ┌─────▼──────┐ ┌────▼────────┐ ┌──────▼───────┐ │
│ │vLLM │ │Flux.1 │ │MusicGen │ │
│ │Qwen 2.5 7B │ │Schnell │ │Medium │ │
│ │Port: 8001 │ │Port: 8002 │ │Port: 8003 │ │
│ └────────────┘ └─────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────┘
```
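The topology above can be smoke-tested with a quick reachability sweep. The `/health` paths on LiteLLM and the GPU orchestrator are assumptions; adjust them to the endpoints each service actually exposes, and run from the VPS so the Tailscale and localhost addresses resolve.

```shell
# Quick reachability sweep across the stack sketched above.
# GPU-side /health paths are assumptions; adjust to the real endpoints.
check() {
  if curl -fsS --max-time 3 "$1" >/dev/null 2>&1; then
    echo "OK   $1"
  else
    echo "FAIL $1"
  fi
}

check "http://ai.pivoine.art:8080/health"   # Open WebUI
check "http://localhost:4000/health"        # LiteLLM (run on VPS)
check "http://100.100.108.13:9000/health"   # GPU orchestrator (via Tailscale)
```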
## Support
For issues:
- **VPS AI services**: Check logs via `docker logs`
- **GPU server**: See runpod repository documentation
- **LiteLLM routing**: Review `../litellm-config.yaml`