A desktop application for parsing and processing documents (PDF, DOCX, TXT) that extracts images together with their surrounding text context.
Built with Rust, Tauri, and Svelte for performance and a small footprint.
# Install system dependencies (Debian/Ubuntu)
sudo apt install -y libwebkit2gtk-4.1-dev libappindicator3-dev librsvg2-dev patchelf libssl-dev
# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Build from your local clone
cd ~/projects/buildonai/document-processor
npm install
npm run tauri build

# Run in development mode
npm run tauri dev
Each processed document produces the following output directory:
processed/<document-id>/
├── document.md # Human-readable markdown
├── document.json # Structured data for AI
├── images/
│ ├── img_001.png # Extracted images
│ ├── img_001.json # Image metadata + context
│ └── thumb_001.png # Thumbnails
└── original.pdf # Original file copy
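
The per-image JSON files in the tree above pair each image with the text around it, via the `context_before` and `context_after` fields. As a rough illustration only, the sketch below shows one way such a 200-character window could be sliced out of the extracted text. The `ImageContext` struct, the `extract_context` function, and the `image_offset` parameter are hypothetical names chosen for this example; the actual logic in `parser.rs` may differ.

```rust
/// Text surrounding an image (hypothetical type; field names follow the metadata keys).
#[derive(Debug, PartialEq)]
struct ImageContext {
    context_before: String,
    context_after: String,
}

/// Number of characters kept on each side of an image.
const CONTEXT_CHARS: usize = 200;

/// Slice up to `CONTEXT_CHARS` characters before and after `image_offset`.
///
/// Assumption: `image_offset` is a character index (not a byte index) marking
/// where the image sits in the extracted text.
fn extract_context(text: &str, image_offset: usize) -> ImageContext {
    // Collect into chars so multi-byte UTF-8 text is never split mid-character.
    let chars: Vec<char> = text.chars().collect();
    // Clamp so an offset past the end of the text cannot panic.
    let pos = image_offset.min(chars.len());
    let start = pos.saturating_sub(CONTEXT_CHARS);
    let end = pos.saturating_add(CONTEXT_CHARS).min(chars.len());
    ImageContext {
        context_before: chars[start..pos].iter().collect(),
        context_after: chars[pos..end].iter().collect(),
    }
}
```

Calling `extract_context(&text, offset)` near the start or end of a document simply returns a shorter window. Counting characters rather than bytes is a deliberate choice here so that non-ASCII text cannot cause a slicing panic.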
Each image includes:
context_before: 200 characters of text before the image
context_after: 200 characters after
position_marker: Page and position reference
ocr_text: Text extracted from image (if applicable)
ai_description: AI-generated description (when available)

This project includes Claude Code skills for command-line integration:
/parse ~/Documents/contract.pdf
Parses a document and extracts text + images.
/document-upload-analyzer
Analyzes document upload methods in a web application.
document-processor/
├── src/ # Svelte frontend
│ ├── App.svelte # Main component
│ ├── main.js # Entry point
│ └── styles.css # Global styles
├── src-tauri/
│ └── src/
│ ├── main.rs # Tauri entry point
│ ├── parser.rs # Document parsing logic
│ ├── db.rs # SQLite database
│ └── watcher.rs # Folder watching
└── package.json
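
The application handles three input formats (PDF, DOCX, TXT), and `parser.rs` holds the parsing logic. A minimal sketch of how a parser might pick a format from the file extension is shown below. The `DocumentKind` enum and `detect_kind` function are hypothetical names for illustration and are not taken from the project's source.

```rust
use std::path::Path;

/// Supported input formats (hypothetical enum for this sketch).
#[derive(Debug, PartialEq, Clone, Copy)]
enum DocumentKind {
    Pdf,
    Docx,
    Txt,
}

/// Guess the document format from the file extension, case-insensitively.
/// Returns `None` for missing or unsupported extensions.
fn detect_kind(path: &Path) -> Option<DocumentKind> {
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "pdf" => Some(DocumentKind::Pdf),
        "docx" => Some(DocumentKind::Docx),
        "txt" => Some(DocumentKind::Txt),
        _ => None,
    }
}
```

Returning `Option` rather than panicking lets the caller decide how to report an unsupported file, for example by skipping it in a watched folder.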
MIT - BuildOnAI Project