feat: add comprehensive PDF support

- Add jsPDF for PDF generation from text/Markdown/HTML
- Add PDF.js for PDF text extraction (read PDFs)
- Support PDF → Text/Markdown conversions
- Support Markdown/HTML/Text → PDF conversions
- Implement page-by-page PDF text extraction
- Add automatic pagination and formatting for generated PDFs
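
A minimal sketch of what page-by-page extraction could look like. It is not the code from this commit: the helper name `extractPdfText` and the `*Like` interfaces are hypothetical, declared locally so the snippet compiles without PDF.js. They mirror only the `numPages`, `getPage`, and `getTextContent` members that PDF.js documents expose.

```typescript
// Structural types mirroring the small part of the PDF.js API used here.
// Declared locally (an assumption for illustration) so no import is needed.
interface TextItemLike {
  str?: string; // marked-content items in PDF.js carry no `str`
}
interface PageLike {
  getTextContent(): Promise<{ items: TextItemLike[] }>;
}
interface PdfDocumentLike {
  numPages: number;
  getPage(pageNumber: number): Promise<PageLike>; // 1-based, as in PDF.js
}

// Hypothetical helper: read each page in order and join the page texts.
async function extractPdfText(pdf: PdfDocumentLike): Promise<string> {
  const pages: string[] = [];
  for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber++) {
    const page = await pdf.getPage(pageNumber);
    const content = await page.getTextContent();
    // Keep only items that actually carry text, separated by single spaces.
    const text = content.items
      .map((item) => item.str ?? "")
      .filter((s) => s.length > 0)
      .join(" ");
    pages.push(text);
  }
  // A blank line between pages keeps page boundaries visible in the output.
  return pages.join("\n\n");
}
```

With PDF.js itself, the argument would typically be the document returned by `pdfjsLib.getDocument({ data }).promise`. Joining items with spaces is a simplification that ignores the layout information PDF.js also provides.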

Supported PDF operations:
- Extract text from PDF files (all pages)
- Convert PDF to Markdown or plain text
- Create formatted PDFs from Markdown, HTML, or plain text
- Automatic text wrapping and page breaks
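
The wrapping and page-break behaviour can be illustrated with a small, library-free sketch. It is an assumption about the general approach, not the implementation in this commit: `wrapLine` and `paginate` are hypothetical names, and line width is counted in characters, whereas a real generator would measure text with font metrics.

```typescript
// Greedy word wrap: split one paragraph into lines of at most maxChars characters.
function wrapLine(text: string, maxChars: number): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const lines: string[] = [];
  let current = "";
  for (const word of words) {
    const candidate = current ? current + " " + word : word;
    if (candidate.length <= maxChars || current === "") {
      current = candidate; // a single overlong word still gets its own line
    } else {
      lines.push(current);
      current = word;
    }
  }
  lines.push(current); // an empty paragraph yields one empty line
  return lines;
}

// Wrap every paragraph, then cut the resulting lines into fixed-size pages.
function paginate(text: string, maxChars: number, linesPerPage: number): string[][] {
  if (linesPerPage < 1) {
    throw new Error("linesPerPage must be at least 1");
  }
  const lines: string[] = [];
  for (const paragraph of text.split("\n")) {
    for (const line of wrapLine(paragraph, maxChars)) {
      lines.push(line);
    }
  }
  const pages: string[][] = [];
  for (let i = 0; i < lines.length; i += linesPerPage) {
    pages.push(lines.slice(i, i + linesPerPage));
  }
  return pages;
}
```

When rendering with jsPDF, each inner array would map to one page: call `doc.addPage()` before every page after the first and `doc.text(line, x, y)` for each line. jsPDF's `splitTextToSize` can replace the character-count wrapping with width-based wrapping.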

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 11:13:09 +01:00
parent 9de639b138
commit b899989b3e
5 changed files with 658 additions and 2 deletions


@@ -7,7 +7,7 @@ A modern, browser-based file conversion application built with Next.js 16, Tailw
- **🎬 Video Conversion** - Convert between MP4, WebM, AVI, MOV, MKV, and GIF
- **🎵 Audio Conversion** - Convert between MP3, WAV, OGG, AAC, and FLAC
- **🖼️ Image Conversion** - Convert between PNG, JPG, WebP, GIF, BMP, TIFF, and SVG
- **📄 Document Conversion** - Convert between Markdown, HTML, and Plain Text
- **📄 Document Conversion** - Convert between PDF, Markdown, HTML, and Plain Text
- **🔒 Privacy First** - All conversions happen locally in your browser, no server uploads
- **⚡ Fast & Efficient** - Powered by WebAssembly for near-native performance
- **🎨 Beautiful UI** - Modern, responsive design with dark/light theme support
@@ -26,6 +26,8 @@ A modern, browser-based file conversion application built with Next.js 16, Tailw
- **Marked** - Markdown to HTML conversion
- **Turndown** - HTML to Markdown conversion
- **DOMPurify** - HTML sanitization
- **jsPDF** - PDF generation
- **PDF.js** - PDF text extraction
- **Fuse.js** - Fuzzy search for format selection
- **Lucide React** - Beautiful icon library
@@ -115,13 +117,21 @@ convert-ui/
- **Input/Output:** PNG, JPG, WebP, GIF, BMP, TIFF, SVG
### Documents
- **PDF → Text/Markdown** - Extract text from PDF files with page-by-page processing
- **Markdown/HTML/Text → PDF** - Generate formatted PDF documents
- **Markdown → HTML** - Full GitHub Flavored Markdown support with styling
- **HTML → Markdown** - Clean conversion with formatting preservation
- **Markdown ↔ Plain Text** - Strip or add basic formatting
- **HTML → Plain Text** - Extract text content
- **Plain Text → HTML** - Convert to formatted HTML document
**Note:** Uses lightweight JavaScript libraries (marked, turndown) instead of Pandoc WASM for fast, reliable conversions.
**Supported PDF Operations:**
- Read PDFs and extract all text content
- Convert extracted text to Markdown or plain text
- Create PDFs from Markdown, HTML, or plain text
- Automatic pagination and formatting
**Note:** Uses PDF.js for reading and jsPDF for generation. Lightweight JavaScript libraries (marked, turndown) used instead of Pandoc WASM for fast, reliable conversions.
## How It Works