# Scrapy UI
A modern, stylish web interface for managing and monitoring [Scrapyd](https://scrapyd.readthedocs.io/) instances. Built with Next.js 15, shadcn/ui, and Tailwind CSS 4.
## Features
- **Real-time Monitoring** - Live dashboard with job statistics and system status
- **Project Management** - Upload, list, and delete Scrapy projects and versions
- **Spider Management** - Browse spiders and schedule jobs with custom arguments
- **Job Control** - Monitor running/pending/finished jobs with filtering and cancellation
- **System Status** - View Scrapyd daemon health and metrics
- **Modern UI** - Clean, responsive design with dark/light theme support
- **Secure** - Server-side authentication with environment variables
- **Docker Ready** - Multi-stage builds for production deployment
## Tech Stack
- **Next.js 15** (App Router, Server Components)
- **React 19** with Server Actions
- **TypeScript** for type safety
- **Tailwind CSS 4** for styling
- **shadcn/ui** for UI components
- **TanStack Query** for data fetching
- **Zod** for runtime validation
- **Docker** for containerization
## Prerequisites
- Node.js 22+ or Docker
- A running Scrapyd instance
- Basic auth credentials for Scrapyd
## Quick Start
### 1. Clone the repository
```bash
git clone <your-repo-url>
cd scrapy-ui
```
### 2. Configure environment variables
Copy `.env.example` to `.env.local` and update with your credentials:
```bash
cp .env.example .env.local
```
Edit `.env.local`:
```env
SCRAPYD_URL=https://scrapy.pivoine.art
SCRAPYD_USERNAME=your_username
SCRAPYD_PASSWORD=your_password
```
### 3. Run locally
#### Using pnpm:
```bash
pnpm install
pnpm dev
```
#### Using Docker Compose (development):
```bash
docker-compose -f docker-compose.dev.yml up
```
#### Using Docker Compose (production):
```bash
docker-compose up -d
```
Visit [http://localhost:3000/ui](http://localhost:3000/ui)
## Project Structure
```
scrapy-ui/
├── app/                        # Next.js App Router
│   ├── (dashboard)/            # Dashboard route group
│   │   ├── page.tsx            # Dashboard (/)
│   │   ├── projects/           # Projects management
│   │   ├── spiders/            # Spiders listing & scheduling
│   │   ├── jobs/               # Jobs monitoring & control
│   │   └── system/             # System status
│   ├── api/scrapyd/            # API routes (server-side)
│   │   ├── daemon/             # GET /api/scrapyd/daemon
│   │   ├── projects/           # GET/DELETE /api/scrapyd/projects
│   │   ├── versions/           # GET/POST/DELETE /api/scrapyd/versions
│   │   ├── spiders/            # GET /api/scrapyd/spiders
│   │   └── jobs/               # GET/POST/DELETE /api/scrapyd/jobs
│   └── layout.tsx              # Root layout with theme provider
├── components/                 # React components
│   ├── ui/                     # shadcn/ui components
│   ├── sidebar.tsx             # Navigation sidebar
│   ├── header.tsx              # Page header
│   └── theme-toggle.tsx        # Dark/light mode toggle
├── lib/                        # Utilities & API client
│   ├── scrapyd-client.ts       # Scrapyd API wrapper
│   ├── types.ts                # TypeScript types & Zod schemas
│   └── utils.ts                # Helper functions
├── Dockerfile                  # Production build
├── Dockerfile.dev              # Development build
├── docker-compose.yml          # Production deployment
└── docker-compose.dev.yml      # Development deployment
```
## API Endpoints
All Scrapyd endpoints are proxied through Next.js API routes with server-side authentication (a sketch of one such route follows the table):
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/scrapyd/daemon` | GET | Daemon status |
| `/api/scrapyd/projects` | GET | List projects |
| `/api/scrapyd/projects` | DELETE | Delete project |
| `/api/scrapyd/versions` | GET | List versions |
| `/api/scrapyd/versions` | POST | Upload version |
| `/api/scrapyd/versions` | DELETE | Delete version |
| `/api/scrapyd/spiders` | GET | List spiders |
| `/api/scrapyd/jobs` | GET | List jobs |
| `/api/scrapyd/jobs` | POST | Schedule job |
| `/api/scrapyd/jobs` | DELETE | Cancel job |
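A minimal sketch of what one of these proxy routes might look like, assuming an App Router handler at `app/api/scrapyd/projects/route.ts`; `scrapydFetch` is a hypothetical name standing in for the wrapper in `lib/scrapyd-client.ts`:
```typescript
// app/api/scrapyd/projects/route.ts — illustrative sketch, not the project's exact code
import { NextResponse } from "next/server";
import { scrapydFetch } from "@/lib/scrapyd-client"; // hypothetical helper name

export async function GET() {
  // Proxy Scrapyd's listprojects.json; credentials are attached server-side.
  const data = await scrapydFetch("/listprojects.json");
  return NextResponse.json(data);
}

export async function DELETE(request: Request) {
  // Scrapyd deletes a project via a POST to delproject.json.
  const { project } = await request.json();
  const data = await scrapydFetch("/delproject.json", {
    method: "POST",
    body: new URLSearchParams({ project }),
  });
  return NextResponse.json(data);
}
```
Because the browser only ever calls these routes, the Scrapyd URL and credentials never leave the server.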
## Deployment
### Docker
#### Build production image:
```bash
docker build -t scrapy-ui:latest .
```
#### Run container:
```bash
docker run -d \
  -p 3000:3000 \
  -e SCRAPYD_URL=https://scrapy.pivoine.art \
  -e SCRAPYD_USERNAME=your_username \
  -e SCRAPYD_PASSWORD=your_password \
  --name scrapy-ui \
  scrapy-ui:latest
```
### GitHub Actions
The project includes a GitHub Actions workflow (`.github/workflows/docker-build.yml`) that automatically builds and pushes Docker images to GitHub Container Registry on push to `main` or on tagged releases.
To use it:
1. Ensure GitHub Actions has write permissions to packages
2. Push code to trigger the workflow
3. Images will be available at `ghcr.io/<username>/scrapy-ui`
### Reverse Proxy (Nginx)
To deploy under `/ui` path with Nginx:
```nginx
location /ui/ {
    proxy_pass http://localhost:3000/ui/;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection 'upgrade';
    proxy_set_header Host $host;
    proxy_cache_bypass $http_upgrade;
}
```
## Configuration
### Environment Variables
| Variable | Description | Required | Default |
|----------|-------------|----------|---------|
| `SCRAPYD_URL` | Scrapyd base URL | Yes | `https://scrapy.pivoine.art` |
| `SCRAPYD_USERNAME` | Basic auth username | Yes | - |
| `SCRAPYD_PASSWORD` | Basic auth password | Yes | - |
| `NODE_ENV` | Environment mode | No | `production` |
### Next.js Configuration
The `next.config.ts` includes:
- `basePath: "/ui"` - Serves app under `/ui` path
- `output: "standalone"` - Optimized Docker builds
- Optimized imports for `lucide-react`
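Put together, the config looks roughly like this (a sketch consistent with the options listed above):
```typescript
// next.config.ts — sketch based on the settings described in this section
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  basePath: "/ui",      // serve the app under the /ui path
  output: "standalone", // self-contained server bundle for Docker images
  experimental: {
    optimizePackageImports: ["lucide-react"], // tree-shake icon imports
  },
};

export default nextConfig;
```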
## Development
### Install dependencies:
```bash
pnpm install
```
### Run development server:
```bash
pnpm dev
```
### Build for production:
```bash
pnpm build
pnpm start
```
### Lint code:
```bash
pnpm lint
```
### Add shadcn/ui components:
```bash
pnpm dlx shadcn@latest add <component-name>
```
## Features in Detail
### Dashboard
- Real-time job statistics (running, pending, finished)
- System health indicators
- Quick project overview
- Auto-refresh every 30 seconds
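The 30-second refresh can be expressed with TanStack Query's `refetchInterval`; a minimal sketch (hook name and endpoint path are illustrative, not the dashboard's exact code):
```typescript
"use client";
import { useQuery } from "@tanstack/react-query";

// Hypothetical hook polling the jobs proxy route for dashboard statistics.
export function useJobStats() {
  return useQuery({
    queryKey: ["jobs"],
    queryFn: async () => {
      const res = await fetch("/ui/api/scrapyd/jobs"); // note the /ui basePath
      if (!res.ok) throw new Error("Failed to load jobs");
      return res.json();
    },
    refetchInterval: 30_000, // re-fetch every 30 seconds
  });
}
```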
### Projects
- List all Scrapy projects
- Upload new versions (.egg files)
- View version history
- Delete projects/versions
- Drag & drop file upload
### Spiders
- Browse spiders by project
- Schedule jobs with custom arguments
- JSON argument validation (see the sketch after this list)
- Quick schedule dialog
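The argument validation could be implemented with Zod along these lines (the schema shape is an assumption, not the project's exact schema):
```typescript
import { z } from "zod";

// Hypothetical schema: spider arguments as a flat string-to-string map,
// matching the key=value pairs Scrapyd's schedule.json endpoint accepts.
const spiderArgsSchema = z.record(z.string(), z.string());

export function parseSpiderArgs(raw: string) {
  const parsed = spiderArgsSchema.safeParse(JSON.parse(raw));
  if (!parsed.success) {
    throw new Error("Arguments must be a JSON object of string values");
  }
  return parsed.data;
}
```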
### Jobs
- Filter by status (pending/running/finished)
- Real-time status updates (5-second refresh)
- Cancel running/pending jobs
- View job logs and items
- Detailed job information
### System
- Daemon status monitoring
- Job queue statistics
- Environment information
- Auto-refresh every 10 seconds
## Security
- **Server-side authentication**: Credentials are stored in environment variables and never exposed to the client
- **API proxy**: All Scrapyd requests go through Next.js API routes
- **Basic auth**: Automatic injection of credentials in request headers (see the sketch after this list)
- **No client-side secrets**: Zero credential exposure in browser
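A sketch of how that injection might look inside the API wrapper (illustrative names; the real code lives in `lib/scrapyd-client.ts`):
```typescript
// lib/scrapyd-client.ts — illustrative sketch of server-side credential injection
export async function scrapydFetch(path: string, init: RequestInit = {}) {
  const { SCRAPYD_URL, SCRAPYD_USERNAME, SCRAPYD_PASSWORD } = process.env;

  // Build the Basic auth header on the server; it never reaches the browser.
  const auth = Buffer.from(
    `${SCRAPYD_USERNAME}:${SCRAPYD_PASSWORD}`
  ).toString("base64");

  const headers = new Headers(init.headers);
  headers.set("Authorization", `Basic ${auth}`);

  const res = await fetch(`${SCRAPYD_URL}${path}`, {
    ...init,
    headers,
    cache: "no-store", // always hit the live Scrapyd daemon
  });
  if (!res.ok) throw new Error(`Scrapyd request failed: ${res.status}`);
  return res.json();
}
```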
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
MIT License - feel free to use this project for personal or commercial purposes.
## Acknowledgments
- Built with [Next.js](https://nextjs.org/)
- UI components from [shadcn/ui](https://ui.shadcn.com/)
- Icons from [Lucide](https://lucide.dev/)
- Designed for [Scrapyd](https://scrapyd.readthedocs.io/)