Show HN: Local CPU OCR for images, PDFs, webpages

A developer released textsnap, a command-line tool that performs OCR on images, screenshots, PDFs, and webpages entirely on a local CPU without requiring a GPU or cloud connection. The tool uses a quantized 0.9B vision-language model that runs offline after an initial 890 MB download, supporting clipboard input and output for screenshot-to-text workflows. The single-file Python module is designed for portability, allowing users to copy the tool and its model files to any machine for fully offline, air-gapped operation.

Snap any image, screenshot, or webpage into plaintext. No GPU. No cloud. One command.

    textsnap screenshot.png

That's it. You get a `.txt` next to your shell, recognized on your CPU, from a screenshot, a photo, an image URL, or even a webpage.

- ⚡ **Runs on CPU.** A 0.9B PaddleOCR-VL-1.5 vision-language model, quantized to q4 ONNX, parses full pages on a plain laptop. No CUDA. No M-series-only tricks. Plain old cores, pinned to your physical-core count.
- 🖼️ **Images, screenshots, URLs, webpages.** Point it at a local file, a direct image URL, or a full article URL; for an article, it isolates the main content and OCRs the most prominent image. Or OCR straight from your clipboard with no argument at all, and get the text put back on your clipboard, ready to paste.
- 📴 **Offline after first run.** ~890 MB of ONNX downloads once to your cache and stays there. No API keys. No quotas. Your images never leave your machine.
- 🎒 **Portable.** Drop the model files next to the script and the whole folder becomes a self-contained, copy-anywhere tool: no install, no download, no flags.
- 🪶 **One file.** The whole tool is a single Python module. Dependencies install themselves on first run if missing.
- 📝 **Markdown or plaintext.** Default output is the model's native markdown (tables, headings, structure preserved). Add `--plaintext` to flatten it.
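As a rough illustration of how the four input kinds could be told apart, here is a minimal sketch. It is not textsnap's actual code: the function name `classify_source`, the returned labels, and the extension-based test for image URLs are all assumptions made for this example (a real tool might inspect the HTTP `Content-Type` header instead).

```python
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse

# Assumption: a URL counts as a "direct image URL" if its path ends in a
# common image extension. This list is illustrative, not exhaustive.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".bmp", ".gif", ".tiff"}


def classify_source(arg: Optional[str]) -> str:
    """Return 'clipboard', 'image_url', 'webpage_url', or 'local_file'.

    Hypothetical helper: the names and labels are not part of textsnap.
    """
    if arg is None:
        # No argument at all: read the image from the clipboard.
        return "clipboard"
    parsed = urlparse(arg)
    if parsed.scheme in ("http", "https"):
        suffix = Path(parsed.path).suffix.lower()
        return "image_url" if suffix in IMAGE_SUFFIXES else "webpage_url"
    # Anything that is not an http(s) URL is treated as a local path.
    return "local_file"
```

Checking the scheme first keeps Windows paths such as `C:\img.png` from being mistaken for URLs, since only `http` and `https` are treated as remote sources.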
## Install

    pip install textsnap

## Snap something

    textsnap screenshot.png
    textsnap https://example.com/article --plaintext
    textsnap photo.jpg -o ~/notes/receipt.txt

The first run downloads the model (~890 MB). Every run after is offline.

| Source | Example |
|---|---|
| Clipboard | `textsnap` (no argument) |
| Local image file | `textsnap path/to/img.png` |
| Direct image URL | `textsnap https://example.com/x.png` |
| Webpage URL | `textsnap https://example.com/article` |

Local files cover anything Pillow can decode: `.png`, `.jpg`, `.jpeg`, `.webp`, `.bmp`, `.gif`, `.tiff`, and friends.

For webpage URLs, textsnap uses readability to isolate the main content, then picks the most prominent image on the page and OCRs that.

Run `textsnap` with no argument and it reads the image currently on your clipboard. The recognized text is then copied straight back to the clipboard, so a screenshot-to-text round trip is just: snap → `textsnap` → paste.

The `.txt` file is still written as well (and its path still printed to stdout), so nothing about scripting changes; the clipboard copy is a pure convenience layered on top.

Clipboard-out uses your platform's native tool, `pbcopy` (macOS), `clip` (Windows), or `wl-copy` / `xclip` / `xsel` (Linux), so it needs no extra Python package. If none of those is installed, textsnap simply skips the clipboard copy; the `.txt` file is always there regardless. Run with `-v` to see whether the copy succeeded.

By default textsnap downloads its model files to an OS cache directory (`~/.cache/textsnap/`). But if it finds the model files sitting next to the script, it uses those directly: no download, no `--model-dir` flag, no setup at all.
"Next to the script" means a layout like: textsnap/ ├── textsnap.py ├── onnx/ │ ├── vision encoder q4.onnx │ ├── decoder q4.onnx │ └── embedding.onnx └── tokenizer.json Drop those files in, and you can copy the entire textsnap/ folder to any machine — a USB stick, an air-gapped box, a fresh laptop — and run it immediately, fully offline, with zero install steps. Model-directory resolution order: --model-dir DIR — if you pass it explicitly, it always wins. Portable — model files found next to the script. OS cache — ~/.cache/textsnap/ , downloading on first run if needed. Like --model-dir , portable-mode files arenotSHA-256 verified — files you placed there yourself are trusted by definition. Integrity verification applies to files textsnapdownloads. See Security . pip install textsnap Installs two equivalent commands on your PATH : textsnap canonical and alias, for when the name slips your mind . ocr To install from a local source checkout instead: pip install . For a reproducible install with exact pinned dependency versions: pip install -r requirements-lock.txt pip install . Clipboard note.Reading imagesfromthe clipboard relies on Pillow's ImageGrab ; on Linux you may need xclip or wl-clipboard installed. Writing recognized textbackto the clipboard uses pbcopy / clip / wl-copy / xclip / xsel . macOS and Windows work out of the box. 
    # Clipboard (no argument): text is also copied back to the clipboard
    textsnap

    # Local image file
    textsnap path/to/screenshot.png

    # Direct image URL
    textsnap "https://example.com/diagram.png"

    # Webpage: OCRs the most prominent image on the page
    textsnap "https://example.com/article"

    # Flatten the model's markdown to plain text
    textsnap input.png --plaintext

    # Custom output path
    textsnap input.png -o ./out/extracted.txt

    # Raise the token cap for very dense pages
    textsnap dense-page.png --max-tokens 4096

    # Trade accuracy for speed by shrinking the image budget
    textsnap input.png --max-pixels 250000

    # Use a local model directory instead of downloading
    textsnap input.png --model-dir ~/models/paddleocr-vl

Plaintext, UTF-8. Default location is `./textsnaps/` (created if missing) under the current working directory; override with `-o`. The filename is derived from the image filename stem (for example, `receipt_ocr.txt`), or from the webpage slug for URL inputs.

textsnap is quiet by default, Unix-style: the only thing printed to stdout is the path to the file it wrote, so it composes cleanly:

    OUT=$(textsnap receipt.png)         # capture the path
    textsnap receipt.png | xargs cat    # print the recognized text

When the input is the clipboard, the recognized text is also placed on the clipboard; see [Clipboard in, clipboard out](#clipboard-in-clipboard-out).

Pass `-v` to send progress diagnostics (input type, image size, decode speed, token counts) to stderr; stdout stays just the path either way.

Default file output is the model's native markdown, which preserves tables, headings, and document structure:

    # Quarterly Report

    | Region | Revenue |
    | ------ | ------- |
    | EMEA   | $1.2M   |
    | APAC   | $0.9M   |

With `--plaintext`, markdown is flattened to bare text:

    Quarterly Report

    Region Revenue
    EMEA $1.2M
    APAC $0.9M

| Flag | Description |
|---|---|
| `-o`, `--output` | Output .txt path. Default: ./textsnaps/