docs: remove outdated ai/README.md
Removed outdated AI infrastructure README that referenced GPU services. VPS AI services (Open WebUI, Crawl4AI, facefusion) are documented in compose.yaml comments. GPU infrastructure docs are now in dedicated runpod repository.
ai/README.md | 170
@@ -1,170 +0,0 @@
# AI Infrastructure

This directory contains AI-related configurations for the VPS deployment.

## Multi-Modal GPU Infrastructure (Migrated)

**The multi-modal AI orchestration stack (text, image, music generation) has been moved to a dedicated repository:**

**Repository**: https://dev.pivoine.art/valknar/runpod

The RunPod repository contains:

- Model orchestrator for automatic switching between text, image, and music models
- vLLM + Qwen 2.5 7B (text generation)
- Flux.1 Schnell (image generation)
- MusicGen Medium (music generation)
- RunPod template creation scripts
- Complete deployment documentation

This separation allows for independent management of:

- **VPS Services** (this repo): Open WebUI, Crawl4AI, AI database
- **GPU Services** (runpod repo): Model inference, orchestration, RunPod templates
## VPS AI Services (ai/compose.yaml)

This compose stack manages the VPS-side AI infrastructure that integrates with the GPU server.

### Services

#### ai_postgres

Dedicated PostgreSQL 16 instance with the pgvector extension for AI workloads:

- Vector similarity search support
- Isolated from the core database for performance
- Used by Open WebUI for RAG and embeddings
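The vector side of this database can be exercised from the host. A minimal sketch, assuming the `ai_postgres` container name used elsewhere in this README; the database name, user, and `documents` table are illustrative assumptions (Open WebUI manages its own schema), while `<->` is pgvector's L2-distance operator:

```bash
# Sketch: run a pgvector nearest-neighbour query inside the ai_postgres
# container. Database, user, and table names are illustrative assumptions.
nearest_docs() {
  local embedding="$1"   # e.g. '[0.1,0.2,0.3]'
  docker exec ai_postgres psql -U postgres -d ai -t -c \
    "SELECT id FROM documents ORDER BY embedding <-> '${embedding}' LIMIT 5;"
}
```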
#### webui (Open WebUI)

ChatGPT-like interface exposed at `ai.pivoine.art:8080`:

- Claude API integration via Anthropic
- RAG support with document upload
- Vector storage via pgvector
- Web search capability
- SMTP email via IONOS
- User signup enabled
#### crawl4ai

Internal web scraping service for LLM content preparation:

- API on port 11235 (not exposed publicly)
- Optimized for AI/RAG workflows
- Integration with Open WebUI and n8n
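Since the API is only reachable inside the Docker network, other services call it by service name. A minimal sketch; the `crawl4ai` hostname and the `/crawl` endpoint are assumptions based on the compose stack described above and Crawl4AI's documented Docker API:

```bash
# Sketch: call the internal Crawl4AI API from another container on the
# same Docker network. Hostname and endpoint path are assumptions.
crawl_page() {
  local url="$1"
  curl -s -X POST "http://crawl4ai:11235/crawl" \
    -H "Content-Type: application/json" \
    -d "{\"urls\": [\"${url}\"]}"
}
```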
## Integration with GPU Server

The VPS AI services connect to the GPU server via Tailscale VPN:

- **VPS Tailscale IP**: 100.102.217.79
- **GPU Tailscale IP**: 100.100.108.13

The **LiteLLM Proxy** (port 4000 on the VPS) routes requests to:

- Claude API for chat completions
- GPU orchestrator for self-hosted models (text, image, music)

See `../litellm-config.yaml` for the routing configuration.
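LiteLLM exposes an OpenAI-compatible API, so clients on the VPS can talk to either backend through one endpoint. A minimal sketch; the model name is an assumption and must match an entry in `../litellm-config.yaml`, and depending on configuration an `Authorization` header with a LiteLLM key may also be required:

```bash
# Sketch: send a chat completion through the LiteLLM proxy on the VPS.
# The model name is an assumption; it must exist in litellm-config.yaml.
ask_llm() {
  local model="$1" prompt="$2"
  curl -s http://localhost:4000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d "{\"model\": \"${model}\", \"messages\": [{\"role\": \"user\", \"content\": \"${prompt}\"}]}"
}
```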
## Environment Variables

Required in `.env`:

```bash
# AI Database
AI_DB_PASSWORD=<password>

# Open WebUI
AI_WEBUI_SECRET_KEY=<secret>

# Claude API
ANTHROPIC_API_KEY=<api_key>

# Email (IONOS SMTP)
ADMIN_EMAIL=<email>
SMTP_HOST=smtp.ionos.com
SMTP_PORT=587
SMTP_USER=<smtp_user>
SMTP_PASSWORD=<smtp_password>
```
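A missing variable usually only surfaces as a crash-looping container, so a small pre-flight check before starting the stack can save a debugging round. A sketch using the variable names listed above; the helper script itself is not part of the stack:

```bash
#!/usr/bin/env bash
# Sketch: verify all required AI-stack variables are set before bringing
# the services up. Variable names come from the .env list above.
set -u

required_vars=(
  AI_DB_PASSWORD
  AI_WEBUI_SECRET_KEY
  ANTHROPIC_API_KEY
  ADMIN_EMAIL
  SMTP_HOST
  SMTP_PORT
  SMTP_USER
  SMTP_PASSWORD
)

check_env() {
  local missing=0
  for var in "${required_vars[@]}"; do
    if [ -z "${!var:-}" ]; then
      echo "missing: $var" >&2
      missing=1
    fi
  done
  return $missing
}

if check_env; then
  echo "environment ok"
fi
```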
## Backup Configuration

AI services are backed up daily via Restic:

- **ai_postgres_data**: 3 AM (7 daily, 4 weekly, 6 monthly, 2 yearly)
- **ai_webui_data**: 3 AM (same retention)
- **ai_crawl4ai_data**: 3 AM (same retention)

Repository: `/mnt/hidrive/users/valknar/Backup`
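To confirm the nightly jobs actually ran, the repository can be queried directly. A sketch assuming `RESTIC_PASSWORD` (or `RESTIC_PASSWORD_FILE`) is set for the repository above:

```bash
# Sketch: show the most recent snapshot per host/path in the shared
# Restic repository, to spot-check that last night's backups exist.
ai_backup_status() {
  restic -r /mnt/hidrive/users/valknar/Backup snapshots --latest 1
}
```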
## Management Commands

```bash
# Start AI stack
pnpm arty up ai_postgres webui crawl4ai

# View logs
docker logs -f ai_webui
docker logs -f ai_postgres
docker logs -f ai_crawl4ai

# Check Open WebUI
curl http://ai.pivoine.art:8080/health

# Restart AI services
pnpm arty restart ai_postgres webui crawl4ai
```
## GPU Server Management

For GPU server operations (model orchestration, template creation, etc.):

```bash
# Clone the dedicated repository
git clone ssh://git@dev.pivoine.art:2222/valknar/runpod.git

# See the runpod repository for:
# - Model orchestration setup
# - RunPod template creation
# - GPU deployment guides
```
## Documentation

### VPS AI Services

- [GPU_DEPLOYMENT_LOG.md](GPU_DEPLOYMENT_LOG.md) - VPS AI deployment history

### GPU Server (Separate Repository)

- [runpod/README.md](https://dev.pivoine.art/valknar/runpod) - Main GPU documentation
- [runpod/DEPLOYMENT.md](https://dev.pivoine.art/valknar/runpod) - Deployment guide
- [runpod/RUNPOD_TEMPLATE.md](https://dev.pivoine.art/valknar/runpod) - Template creation
## Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│  VPS (Tailscale: 100.102.217.79)                                │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │  LiteLLM Proxy (Port 4000)                              │    │
│  │  Routes to: Claude API + GPU Orchestrator               │    │
│  └───────┬─────────────────────────────────────────────────┘    │
│          │                                                      │
│  ┌───────▼─────────┐  ┌──────────────┐  ┌─────────────────┐     │
│  │ Open WebUI      │  │ Crawl4AI     │  │ AI PostgreSQL   │     │
│  │ Port: 8080      │  │ Port: 11235  │  │ + pgvector      │     │
│  └─────────────────┘  └──────────────┘  └─────────────────┘     │
└─────────────────────────────────────────────────────────────────┘
                               │ Tailscale VPN
┌──────────────────────────────┼──────────────────────────────────┐
│  RunPod GPU Server (Tailscale: 100.100.108.13)                  │
│  ┌───────────────────────────▼──────────────────────────────┐   │
│  │  Orchestrator (Port 9000)                                │   │
│  │  Manages sequential model loading                        │   │
│  └─────┬──────────────┬──────────────────┬──────────────────┘   │
│        │              │                  │                      │
│  ┌─────▼──────┐  ┌────▼────────┐  ┌──────▼───────┐              │
│  │vLLM        │  │Flux.1       │  │MusicGen      │              │
│  │Qwen 2.5 7B │  │Schnell      │  │Medium        │              │
│  │Port: 8001  │  │Port: 8002   │  │Port: 8003    │              │
│  └────────────┘  └─────────────┘  └──────────────┘              │
└─────────────────────────────────────────────────────────────────┘
```
## Support

For issues:

- **VPS AI services**: Check logs via `docker logs`
- **GPU server**: See the runpod repository documentation
- **LiteLLM routing**: Review `../litellm-config.yaml`