audiocraft-ui/README.md
Sebastian Krüger ffbf02b12c Initial implementation of AudioCraft Studio
Complete web interface for Meta's AudioCraft AI audio generation:

- Gradio UI with tabs for all 5 model families (MusicGen, AudioGen,
  MAGNeT, MusicGen Style, JASCO)
- REST API with FastAPI, OpenAPI docs, and API key auth
- VRAM management with ComfyUI coexistence support
- SQLite database for project/generation history
- Batch processing queue for async generation
- Docker deployment optimized for RunPod with RTX 4090

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 19:34:27 +01:00

# AudioCraft Studio
A comprehensive web interface for Meta's AudioCraft AI audio generation models, optimized for RunPod deployment with NVIDIA RTX 4090 GPUs.
## Features
### Models Supported
- **MusicGen** - Text-to-music generation with melody conditioning
- **AudioGen** - Text-to-sound effects and environmental audio
- **MAGNeT** - Fast non-autoregressive music generation
- **MusicGen Style** - Style-conditioned music from reference audio
- **JASCO** - Chord and drum-conditioned music generation
### Core Capabilities
- **Gradio Web UI** - Intuitive interface with real-time generation
- **REST API** - Full-featured API with OpenAPI documentation
- **Batch Processing** - Queue system for multiple generations
- **Project Management** - Organize and browse generation history
- **VRAM Management** - Smart model loading/unloading, ComfyUI coexistence
- **Waveform Visualization** - Visual audio feedback
## Quick Start
### Local Development
```bash
# Clone repository
git clone https://github.com/your-username/audiocraft-ui.git
cd audiocraft-ui
# Create virtual environment
python -m venv venv
source venv/bin/activate # Linux/Mac
# or: venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
# Run application
python main.py
```
Access the UI at `http://localhost:7860`
### Docker
```bash
# Build and run
docker-compose up --build
# Or build manually
docker build -t audiocraft-studio .
docker run --gpus all -p 7860:7860 -p 8000:8000 audiocraft-studio
```
### RunPod Deployment
1. Build and push the Docker image:
   ```bash
   docker build -t your-dockerhub/audiocraft-studio:latest .
   docker push your-dockerhub/audiocraft-studio:latest
   ```
2. Create a RunPod template using `runpod.yaml` as a reference
3. Deploy on an RTX 4090 or equivalent GPU
## Configuration
The application is configured through environment variables:
| Variable | Default | Description |
|----------|---------|-------------|
| `AUDIOCRAFT_HOST` | `0.0.0.0` | Server bind address |
| `AUDIOCRAFT_GRADIO_PORT` | `7860` | Gradio UI port |
| `AUDIOCRAFT_API_PORT` | `8000` | REST API port |
| `AUDIOCRAFT_OUTPUT_DIR` | `./outputs` | Generated audio output |
| `AUDIOCRAFT_DATA_DIR` | `./data` | Database and config |
| `AUDIOCRAFT_COMFYUI_RESERVE_GB` | `10` | VRAM reserved for ComfyUI |
| `AUDIOCRAFT_MAX_LOADED_MODELS` | `2` | Max models in memory |
| `AUDIOCRAFT_IDLE_UNLOAD_MINUTES` | `15` | Auto-unload idle models |
See `.env.example` for full configuration options.
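As a rough illustration of how the variables in the table map to typed settings, here is a minimal sketch that uses only the standard library. It is not the project's `config/settings.py` (which the architecture overview describes as Pydantic-based). The `StudioSettings` class, the `load_settings` helper, and the choice of `float` for the VRAM reserve are assumptions made for this example.

```python
import os
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class StudioSettings:
    """Hypothetical container mirroring the defaults in the table above."""
    host: str = "0.0.0.0"
    gradio_port: int = 7860
    api_port: int = 8000
    output_dir: str = "./outputs"
    data_dir: str = "./data"
    comfyui_reserve_gb: float = 10.0  # assumed to allow fractional values
    max_loaded_models: int = 2
    idle_unload_minutes: int = 15


def load_settings(env=None, prefix="AUDIOCRAFT_"):
    """Build StudioSettings from a mapping of environment variables."""
    env = os.environ if env is None else env
    defaults = StudioSettings()
    values = {}
    for field in fields(StudioSettings):
        default = getattr(defaults, field.name)
        raw = env.get(prefix + field.name.upper())
        # Cast each override to the type of its default (str, int, or float).
        values[field.name] = default if raw is None else type(default)(raw)
    return StudioSettings(**values)


settings = load_settings({"AUDIOCRAFT_API_PORT": "9000"})
print(settings.api_port, settings.gradio_port)  # 9000 7860
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test. Calling `load_settings()` with no arguments reads the real process environment.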
## API Usage
### Authentication
```bash
# Get your API key from the Settings page, or regenerate it from the command line
curl -X POST http://localhost:8000/api/v1/system/api-key/regenerate \
  -H "X-API-Key: YOUR_CURRENT_KEY"
```
### Generate Audio
```bash
# Synchronous generation
curl -X POST http://localhost:8000/api/v1/generate \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "musicgen",
    "variant": "medium",
    "prompts": ["upbeat electronic dance music with synth leads"],
    "duration": 10
  }'

# Async (queue) generation
curl -X POST http://localhost:8000/api/v1/generate/async \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "request": {
      "model": "musicgen",
      "prompts": ["ambient soundscape"],
      "duration": 30
    },
    "priority": 5
  }'
```
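The synchronous call above can also be made from Python. The sketch below only builds the request with the standard library, so it runs without a server. The `build_generate_request` helper is hypothetical, and the response format is not described in this README, so the example stops short of parsing one.

```python
import json
import urllib.request


def build_generate_request(prompts, api_key, *, model="musicgen", variant="medium",
                           duration=10, base_url="http://localhost:8000"):
    """Return a POST request for /api/v1/generate; nothing is sent until urlopen()."""
    payload = {
        "model": model,
        "variant": variant,
        "prompts": list(prompts),
        "duration": duration,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_generate_request(
    ["upbeat electronic dance music with synth leads"], "YOUR_API_KEY"
)
# Sending requires a running server:
# with urllib.request.urlopen(request) as response:
#     body = response.read()
```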
### Check Job Status
```bash
# Replace JOB_ID with the ID of the queued job
curl "http://localhost:8000/api/v1/generate/jobs/JOB_ID" \
  -H "X-API-Key: YOUR_API_KEY"
```
Full API documentation is available at `http://localhost:8000/api/docs`.
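For queued jobs, a client typically polls the status endpoint until the job finishes. The helper below is a minimal sketch of that loop. The `status` key and the state names `completed`, `failed`, and `cancelled` are assumptions, since the response schema is not shown here. Check the OpenAPI docs for the actual fields. `fetch_status` is any callable that returns the decoded JSON body for a job ID.

```python
import time


def wait_for_job(fetch_status, job_id, *, interval_s=2.0, timeout_s=300.0,
                 terminal_states=("completed", "failed", "cancelled")):
    """Poll fetch_status(job_id) until the job reports a terminal state.

    The "status" key and the state names are assumed, not taken from the API docs.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_status(job_id)
        if job.get("status") in terminal_states:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still running after {timeout_s} s")
        time.sleep(interval_s)
```

Injecting `fetch_status` keeps the HTTP details (URL, `X-API-Key` header) out of the loop and lets it be exercised with a stub.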
## Architecture
```
audiocraft-ui/
├── config/
│   ├── settings.py           # Pydantic settings
│   └── models.yaml           # Model registry
├── src/
│   ├── core/
│   │   ├── base_model.py     # Abstract model interface
│   │   ├── gpu_manager.py    # VRAM management
│   │   ├── model_registry.py # Model loading/caching
│   │   └── oom_handler.py    # OOM recovery
│   ├── models/
│   │   ├── musicgen/         # MusicGen adapter
│   │   ├── audiogen/         # AudioGen adapter
│   │   ├── magnet/           # MAGNeT adapter
│   │   ├── musicgen_style/   # Style adapter
│   │   └── jasco/            # JASCO adapter
│   ├── services/
│   │   ├── generation_service.py
│   │   ├── batch_processor.py
│   │   └── project_service.py
│   ├── storage/
│   │   └── database.py       # SQLite storage
│   ├── api/
│   │   ├── app.py            # FastAPI app
│   │   └── routes/           # API endpoints
│   └── ui/
│       ├── app.py            # Gradio app
│       ├── components/       # Reusable UI components
│       ├── tabs/             # Model generation tabs
│       └── pages/            # Projects, Settings
├── main.py                   # Entry point
├── Dockerfile
└── docker-compose.yml
```
## ComfyUI Coexistence
AudioCraft Studio is designed to run alongside ComfyUI on the same GPU:
1. Set `AUDIOCRAFT_COMFYUI_RESERVE_GB` to reserve VRAM for ComfyUI
2. Models are automatically unloaded when idle
3. A coordination file at `/tmp/audiocraft_comfyui_coord.json` prevents conflicts
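To make the first two points concrete: on a 24 GB RTX 4090 with the default 10 GB reserve, about 14 GB remains for AudioCraft models. The sketch below shows that arithmetic, plus one possible way to track which models stay loaded under `AUDIOCRAFT_MAX_LOADED_MODELS` and `AUDIOCRAFT_IDLE_UNLOAD_MINUTES`. It is an illustration only. `LoadedModelCache` and `audiocraft_budget_gb` are hypothetical names, least-recently-used eviction is an assumed policy, and the code does bookkeeping only; it does not free GPU memory or reflect the actual `gpu_manager.py`.

```python
import time
from collections import OrderedDict


def audiocraft_budget_gb(total_vram_gb, comfyui_reserve_gb):
    """VRAM left for AudioCraft models after reserving ComfyUI's share."""
    return max(total_vram_gb - comfyui_reserve_gb, 0.0)


class LoadedModelCache:
    """Tracks loaded model names; evicts least recently used or idle ones."""

    def __init__(self, max_loaded=2, idle_unload_minutes=15, clock=time.monotonic):
        self.max_loaded = max_loaded
        self.idle_seconds = idle_unload_minutes * 60
        self.clock = clock
        self._last_used = OrderedDict()  # model name -> last-use timestamp

    def touch(self, name):
        """Mark a model as used; return names evicted to stay within max_loaded."""
        self._last_used[name] = self.clock()
        self._last_used.move_to_end(name)
        evicted = []
        while len(self._last_used) > self.max_loaded:
            oldest, _ = self._last_used.popitem(last=False)
            evicted.append(oldest)
        return evicted

    def unload_idle(self):
        """Drop and return models unused for at least the idle window."""
        now = self.clock()
        idle = [n for n, t in self._last_used.items() if now - t >= self.idle_seconds]
        for name in idle:
            del self._last_used[name]
        return idle

    def loaded(self):
        """Names of tracked models, least recently used first."""
        return list(self._last_used)


print(audiocraft_budget_gb(24, 10))  # 14
```

With `max_loaded=2`, touching a third model evicts whichever of the first two was used least recently. Calling `unload_idle()` periodically drops anything untouched for 15 minutes.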
## Development
```bash
# Install dev dependencies
pip install -r requirements-dev.txt
# Run tests
pytest
# Format code
black src/ config/
ruff check src/ config/
# Type checking
mypy src/
```
## License
This project uses Meta's AudioCraft library. See [AudioCraft License](https://github.com/facebookresearch/audiocraft/blob/main/LICENSE).