
# AudioCraft Studio

A comprehensive web interface for Meta's AudioCraft AI audio generation models, optimized for RunPod deployment with NVIDIA RTX 4090 GPUs.

## Features

### Models Supported
- **MusicGen** - Text-to-music generation with melody conditioning
- **AudioGen** - Text-to-sound effects and environmental audio
- **MAGNeT** - Fast non-autoregressive music generation
- **MusicGen Style** - Style-conditioned music from reference audio
- **JASCO** - Chord and drum-conditioned music generation

### Core Capabilities
- **Gradio Web UI** - Intuitive interface with real-time generation
- **REST API** - Full-featured API with OpenAPI documentation
- **Batch Processing** - Queue system for multiple generations
- **Project Management** - Organize and browse generation history
- **VRAM Management** - Smart model loading/unloading, ComfyUI coexistence
- **Waveform Visualization** - Visual audio feedback

## Quick Start

### Local Development

```bash
# Clone repository
git clone https://github.com/your-username/audiocraft-ui.git
cd audiocraft-ui

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Linux/Mac
# or: venv\Scripts\activate  # Windows

# Install dependencies
pip install -r requirements.txt

# Run application
python main.py
```

Once the application is running, open the UI at `http://localhost:7860`.

### Docker

```bash
# Build and run
docker-compose up --build

# Or build manually
docker build -t audiocraft-studio .
docker run --gpus all -p 7860:7860 -p 8000:8000 audiocraft-studio
```

### RunPod Deployment

1. Build and push the Docker image:
```bash
docker build -t your-dockerhub/audiocraft-studio:latest .
docker push your-dockerhub/audiocraft-studio:latest
```

2. Create a RunPod template, using `runpod.yaml` as a reference.

3. Deploy on an RTX 4090 or equivalent GPU.

## Configuration

The application is configured through environment variables:

| Variable | Default | Description |
|----------|---------|-------------|
| `AUDIOCRAFT_HOST` | `0.0.0.0` | Server bind address |
| `AUDIOCRAFT_GRADIO_PORT` | `7860` | Gradio UI port |
| `AUDIOCRAFT_API_PORT` | `8000` | REST API port |
| `AUDIOCRAFT_OUTPUT_DIR` | `./outputs` | Directory for generated audio |
| `AUDIOCRAFT_DATA_DIR` | `./data` | Directory for the database and config |
| `AUDIOCRAFT_COMFYUI_RESERVE_GB` | `10` | VRAM (GB) reserved for ComfyUI |
| `AUDIOCRAFT_MAX_LOADED_MODELS` | `2` | Maximum number of models kept in memory |
| `AUDIOCRAFT_IDLE_UNLOAD_MINUTES` | `15` | Minutes of inactivity before a model is unloaded |

See `.env.example` for full configuration options.
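
The table above can be read as a mapping from prefixed environment variables to typed settings. The project's `config/settings.py` uses Pydantic settings; the sketch below is a standard-library approximation of the same idea, and the `Settings` class and `load_settings` function are hypothetical names used only for illustration.

```python
import os
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Settings:
    # Defaults mirror the table above; field names are illustrative.
    host: str = "0.0.0.0"
    gradio_port: int = 7860
    api_port: int = 8000
    output_dir: str = "./outputs"
    data_dir: str = "./data"
    comfyui_reserve_gb: float = 10.0
    max_loaded_models: int = 2
    idle_unload_minutes: int = 15


def load_settings(env=None, prefix="AUDIOCRAFT_"):
    """Build Settings from environment variables, falling back to defaults."""
    env = os.environ if env is None else env
    overrides = {}
    for field in fields(Settings):
        raw = env.get(prefix + field.name.upper())
        if raw is not None:
            # Coerce the string value to the type of the field's default.
            overrides[field.name] = type(field.default)(raw)
    return Settings(**overrides)


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests, for example `load_settings({"AUDIOCRAFT_API_PORT": "9000"})`.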

## API Usage

### Authentication

```bash
# Copy the API key from the Settings page, or regenerate it through the API (requires the current key)
curl -X POST http://localhost:8000/api/v1/system/api-key/regenerate \
  -H "X-API-Key: YOUR_CURRENT_KEY"
```

### Generate Audio

```bash
# Synchronous generation
curl -X POST http://localhost:8000/api/v1/generate \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "musicgen",
    "variant": "medium",
    "prompts": ["upbeat electronic dance music with synth leads"],
    "duration": 10
  }'

# Async (queue) generation
curl -X POST http://localhost:8000/api/v1/generate/async \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "request": {
      "model": "musicgen",
      "prompts": ["ambient soundscape"],
      "duration": 30
    },
    "priority": 5
  }'
```
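
The synchronous call above can also be made from Python. The following is a minimal sketch using only the standard library; it reuses the endpoint, headers, and request fields from the `curl` example. The helper names are hypothetical, and the assumption that the response body is JSON is not confirmed by this README.

```python
import json
import urllib.request

API_BASE = "http://localhost:8000/api/v1"


def build_generate_request(api_key, prompts, model="musicgen", variant="medium",
                           duration=10, base_url=API_BASE):
    """Build (but do not send) a POST request for the /generate endpoint."""
    payload = {
        "model": model,
        "variant": variant,
        "prompts": list(prompts),
        "duration": duration,
    }
    return urllib.request.Request(
        f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def generate(api_key, prompts, timeout=600, **kwargs):
    """Send the request and decode the body (assumed here to be JSON)."""
    request = build_generate_request(api_key, prompts, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the payload easy to inspect or log before any network call is made.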

### Check Job Status

```bash
# Replace {job_id} with the ID of the queued job
curl http://localhost:8000/api/v1/generate/jobs/{job_id} \
  -H "X-API-Key: YOUR_API_KEY"
```
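
Queued jobs usually need to be polled until they finish. The sketch below shows one way to do that. The `status` field and the terminal values `completed`, `failed`, and `cancelled` are assumptions about the response schema, so check the OpenAPI documentation for the actual names. `fetch_status` is a hypothetical callable that performs the GET request shown above and returns the decoded response as a dictionary.

```python
import time


def wait_for_job(fetch_status, job_id, interval=2.0, timeout=300.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status(job_id) until the job reaches a terminal state."""
    terminal = {"completed", "failed", "cancelled"}  # assumed status values
    deadline = clock() + timeout
    while True:
        job = fetch_status(job_id)
        if job.get("status") in terminal:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} s")
        # Wait before asking again so the API is not flooded with requests.
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be tested without real delays.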

Full API documentation is available at `http://localhost:8000/api/docs`.

## Architecture

```
audiocraft-ui/
├── config/
│   ├── settings.py      # Pydantic settings
│   └── models.yaml      # Model registry
├── src/
│   ├── core/
│   │   ├── base_model.py     # Abstract model interface
│   │   ├── gpu_manager.py    # VRAM management
│   │   ├── model_registry.py # Model loading/caching
│   │   └── oom_handler.py    # OOM recovery
│   ├── models/
│   │   ├── musicgen/         # MusicGen adapter
│   │   ├── audiogen/         # AudioGen adapter
│   │   ├── magnet/           # MAGNeT adapter
│   │   ├── musicgen_style/   # Style adapter
│   │   └── jasco/            # JASCO adapter
│   ├── services/
│   │   ├── generation_service.py
│   │   ├── batch_processor.py
│   │   └── project_service.py
│   ├── storage/
│   │   └── database.py       # SQLite storage
│   ├── api/
│   │   ├── app.py            # FastAPI app
│   │   └── routes/           # API endpoints
│   └── ui/
│       ├── app.py            # Gradio app
│       ├── components/       # Reusable UI components
│       ├── tabs/             # Model generation tabs
│       └── pages/            # Projects, Settings
├── main.py                   # Entry point
├── Dockerfile
└── docker-compose.yml
```
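
The layout above places an abstract model interface in `src/core/base_model.py`, with one adapter per model under `src/models/`. The sketch below illustrates that adapter pattern only. The class names, method names, and signatures are hypothetical and may not match the real interface.

```python
from abc import ABC, abstractmethod


class AudioModel(ABC):
    """Hypothetical adapter interface; the real base_model.py may differ."""

    name = "base"

    def __init__(self):
        self.loaded = False

    @abstractmethod
    def load(self, variant):
        """Load weights for the given variant onto the GPU."""

    @abstractmethod
    def unload(self):
        """Release the weights so another model can use the VRAM."""

    @abstractmethod
    def generate(self, prompts, duration):
        """Return one audio clip per prompt."""


class SilenceModel(AudioModel):
    """Toy adapter that returns silent 16-bit PCM, for illustration only."""

    name = "silence"
    sample_rate = 32000  # illustrative value

    def load(self, variant):
        self.loaded = True

    def unload(self):
        self.loaded = False

    def generate(self, prompts, duration):
        if not self.loaded:
            raise RuntimeError("model must be loaded before generating")
        n_samples = int(duration * self.sample_rate)
        # Two zero bytes per sample gives silent 16-bit mono audio.
        return [bytes(2 * n_samples) for _ in prompts]
```

With a shared interface like this, the registry and services can load, unload, and call any adapter without knowing which model is behind it.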

## ComfyUI Coexistence

AudioCraft Studio is designed to run alongside ComfyUI on the same GPU:

1. Set `AUDIOCRAFT_COMFYUI_RESERVE_GB` to the amount of VRAM to keep free for ComfyUI.
2. Idle models are unloaded automatically (see `AUDIOCRAFT_IDLE_UNLOAD_MINUTES`).
3. A coordination file at `/tmp/audiocraft_comfyui_coord.json` prevents the two applications from conflicting.
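
The interaction between the VRAM reserve, the loaded-model limit, and the idle timeout can be sketched as a small bookkeeping class. This is a simplified illustration under stated assumptions, not the project's `gpu_manager.py`: the `ModelSlots` class is hypothetical, it uses least-recently-used eviction as one plausible policy, and the model sizes are supplied by the caller rather than measured.

```python
import time
from collections import OrderedDict


class ModelSlots:
    """Track loaded models against a VRAM budget (hypothetical sketch)."""

    def __init__(self, total_gb, reserve_gb=10.0, max_loaded=2,
                 idle_minutes=15, clock=time.monotonic):
        # VRAM left for AudioCraft after setting aside the ComfyUI reserve.
        self.budget_gb = max(total_gb - reserve_gb, 0.0)
        self.max_loaded = max_loaded
        self.idle_seconds = idle_minutes * 60
        self.clock = clock
        self._loaded = OrderedDict()  # name -> (size_gb, last_used)

    def used_gb(self):
        return sum(size for size, _ in self._loaded.values())

    def acquire(self, name, size_gb):
        """Mark a model as in use; return the names evicted to make room."""
        if size_gb > self.budget_gb:
            raise MemoryError(f"{name} needs {size_gb} GB, budget is {self.budget_gb} GB")
        self._loaded.pop(name, None)  # re-inserted below as most recently used
        evicted = []
        while self._loaded and (len(self._loaded) >= self.max_loaded
                                or self.used_gb() + size_gb > self.budget_gb):
            victim, _ = self._loaded.popitem(last=False)  # least recently used
            evicted.append(victim)
        self._loaded[name] = (size_gb, self.clock())
        return evicted

    def unload_idle(self):
        """Drop models that have not been used within the idle timeout."""
        now = self.clock()
        idle = [n for n, (_, used) in self._loaded.items()
                if now - used >= self.idle_seconds]
        for n in idle:
            del self._loaded[n]
        return idle
```

For example, on a 24 GB RTX 4090 with the default 10 GB reserve, `ModelSlots(total_gb=24).budget_gb` is 14 GB, so any model larger than that is rejected outright.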

## Development

```bash
# Install dev dependencies
pip install -r requirements-dev.txt

# Run tests
pytest

# Format code
black src/ config/
ruff check src/ config/

# Type checking
mypy src/
```

## License

This project uses Meta's AudioCraft library. See [AudioCraft License](https://github.com/facebookresearch/audiocraft/blob/main/LICENSE).