feat: add comprehensive PDF support

- Add jsPDF for PDF generation from text/Markdown/HTML
- Add PDF.js for PDF text extraction (read PDFs)
- Support PDF → Text/Markdown conversions
- Support Markdown/HTML/Text → PDF conversions
- Implement page-by-page PDF text extraction
- Automatic pagination and formatting for generated PDFs

Supported PDF operations:
- Extract text from PDF files (all pages)
- Convert PDF to Markdown or plain text
- Create formatted PDFs from Markdown, HTML, or plain text
- Automatic text wrapping and page breaks

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 11:13:09 +01:00
parent 9de639b138
commit b899989b3e
5 changed files with 658 additions and 2 deletions
--- a/README.md
+++ b/README.md
@@ -7,7 +7,7 @@ A modern, browser-based file conversion application built with Next.js 16, Tailw
 - **🎬 Video Conversion** - Convert between MP4, WebM, AVI, MOV, MKV, and GIF
 - **🎵 Audio Conversion** - Convert between MP3, WAV, OGG, AAC, and FLAC
 - **🖼️ Image Conversion** - Convert between PNG, JPG, WebP, GIF, BMP, TIFF, and SVG
- **📄 Document Conversion** - Convert between Markdown, HTML, and Plain Text
+- **📄 Document Conversion** - Convert between PDF, Markdown, HTML, and Plain Text
 - **🔒 Privacy First** - All conversions happen locally in your browser, no server uploads
 - **⚡ Fast & Efficient** - Powered by WebAssembly for near-native performance
 - **🎨 Beautiful UI** - Modern, responsive design with dark/light theme support
@@ -26,6 +26,8 @@ A modern, browser-based file conversion application built with Next.js 16, Tailw
 - **Marked** - Markdown to HTML conversion
 - **Turndown** - HTML to Markdown conversion
 - **DOMPurify** - HTML sanitization
+- **jsPDF** - PDF generation
+- **PDF.js** - PDF text extraction
 - **Fuse.js** - Fuzzy search for format selection
 - **Lucide React** - Beautiful icon library

@@ -115,13 +117,21 @@ convert-ui/
 - **Input/Output:** PNG, JPG, WebP, GIF, BMP, TIFF, SVG

 ### Documents
+- **PDF → Text/Markdown** - Extract text from PDF files with page-by-page processing
+- **Markdown/HTML/Text → PDF** - Generate formatted PDF documents
 - **Markdown → HTML** - Full GitHub Flavored Markdown support with styling
 - **HTML → Markdown** - Clean conversion with formatting preservation
 - **Markdown ↔ Plain Text** - Strip or add basic formatting
 - **HTML → Plain Text** - Extract text content
 - **Plain Text → HTML** - Convert to formatted HTML document

-**Note:** Uses lightweight JavaScript libraries (marked, turndown) instead of Pandoc WASM for fast, reliable conversions.
+**Supported PDF Operations:**
+- Read PDFs and extract all text content
+- Convert extracted text to Markdown or plain text
+- Create PDFs from Markdown, HTML, or plain text
+- Automatic pagination and formatting
+
+**Note:** Uses PDF.js for reading and jsPDF for generation. Lightweight JavaScript libraries (marked, turndown) used instead of Pandoc WASM for fast, reliable conversions.

 ## How It Works