Free PDF tool

PDF OCR

Recognize text on scanned PDF pages in your browser. Download a plain-text transcript or a new PDF with an invisible text layer that makes the document searchable and copyable. Runs locally using Tesseract.js, with no upload.

English + multi-lang packs · Plain text or searchable PDF · Page range · 100% in browser

Upload a scanned PDF, choose language and output, then run OCR.

OCR runs in your browser using Tesseract.js. Language data downloads the first time you run it and is cached after.

Two output formats

Plain text (.txt)

Useful for copying into Word, sending to a translation tool, or feeding into search and analysis pipelines. One file with each page separated by a header line.
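As a rough illustration of the plain-text layout described above, the sketch below joins per-page OCR text into one transcript with a header line before each page. The function name `buildTranscript` and the `=== Page N ===` header format are assumptions made for this example; the tool's actual header format may differ.

```typescript
// Hypothetical helper: joins per-page OCR text into a single transcript.
// The "=== Page N ===" header is an assumed format, not the tool's documented one.
interface PageText {
  pageNumber: number; // 1-based page number in the source PDF
  text: string;       // recognized text for that page
}

function buildTranscript(pages: PageText[]): string {
  return pages
    .map((p) => `=== Page ${p.pageNumber} ===\n${p.text.trim()}`)
    .join("\n\n");
}

// Example: pages 1 and 3 were selected by a page range.
const transcript = buildTranscript([
  { pageNumber: 1, text: "First page text.\n" },
  { pageNumber: 3, text: "Third page text." },
]);
console.log(transcript);
```

Keeping the original page number in each header means a transcript produced from a partial page range still points back to the right page of the scan.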

Searchable PDF

The page image stays as-is, with recognized text placed underneath as an invisible layer. PDF viewers can search, select, and copy the text while users still see the original scan.

How to OCR a PDF

  1. Upload your PDF

    Drop a scanned or image-based PDF onto the upload area.

  2. Pick language and output

    Choose a language pack, plain text or searchable PDF, and an optional page range.

  3. Run and download

    Run OCR and download the result. The first run downloads language data into your browser cache; subsequent runs reuse it.
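The optional page range in step 2 could be interpreted along the following lines. This is a minimal sketch that assumes a `1-3, 5` style syntax and 1-based page numbers; the syntax the tool actually accepts is not documented here, and `parsePageRange` is a hypothetical name.

```typescript
// Hypothetical parser for a page range such as "1-3, 5".
// An empty spec selects every page. Pages are 1-based.
function parsePageRange(spec: string, pageCount: number): number[] {
  const trimmed = spec.trim();
  if (trimmed === "") {
    return Array.from({ length: pageCount }, (_, i) => i + 1);
  }
  const selected = new Set<number>();
  for (const part of trimmed.split(",")) {
    // Accept "N" or "N-M", with optional surrounding whitespace.
    const match = /^\s*(\d+)\s*(?:-\s*(\d+)\s*)?$/.exec(part);
    if (!match) {
      throw new Error(`Invalid page range part: "${part}"`);
    }
    const start = Number(match[1]);
    const end = match[2] !== undefined ? Number(match[2]) : start;
    if (start < 1 || end < start || end > pageCount) {
      throw new Error(`Page range out of bounds: "${part.trim()}"`);
    }
    for (let p = start; p <= end; p++) {
      selected.add(p);
    }
  }
  // De-duplicated and sorted, so overlapping parts are processed once.
  return Array.from(selected).sort((a, b) => a - b);
}

console.log(parsePageRange("1-3, 5", 10));
```

Rejecting out-of-bounds input up front avoids starting a long OCR run that would fail partway through.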

Common use cases

Scanned books and articles

Pull selectable text out of scanned chapters for citation, search, and quote extraction.

Photographed whiteboards

Convert meeting whiteboard photos into searchable text so action items become findable.

Receipts and invoices

Extract totals and reference numbers from receipts for expense reports and bookkeeping.

Legal exhibits

Make multi-page exhibits searchable so reviewers can jump to relevant passages.

Historical archives

Add a search layer to scanned letters, certificates, and other heritage documents.

Accessibility

Searchable PDFs work better with screen readers and assistive technology than pure image PDFs.

PDF OCR FAQ

Is the PDF uploaded anywhere?

No. OCR runs entirely in your browser using Tesseract.js. The PDF and the recognized text stay on your device.

Why is the first run slow?

Tesseract downloads language data on first use. The browser caches it so subsequent runs are much faster.

How accurate is the OCR?

Accuracy depends on scan quality, font, and language mix. Sharp scans of standard print typically yield 95% or higher accuracy. Cursive, low-contrast, or poorly scanned pages perform worse.
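To check accuracy on your own documents, one common approach is to compare OCR output against a hand-typed reference for a sample page. The sketch below assumes accuracy means character-level accuracy, computed as one minus the edit distance divided by the reference length. That definition is an assumption for illustration, not necessarily how the figure above was measured, and both function names are hypothetical.

```typescript
// Levenshtein edit distance between two strings, compared by UTF-16 code unit.
function editDistance(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(
        Math.min(
          prev[j] + 1,        // deletion
          curr[j - 1] + 1,    // insertion
          prev[j - 1] + cost  // substitution or match
        )
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumed definition: 1 - (edit distance / reference length), clamped at 0.
function characterAccuracy(reference: string, ocrText: string): number {
  if (reference.length === 0) {
    return ocrText.length === 0 ? 1 : 0;
  }
  const distance = editDistance(reference, ocrText);
  return Math.max(0, 1 - distance / reference.length);
}

// One wrong character ("l" read as "1") in a 13-character reference.
console.log(characterAccuracy("invoice total", "invoice tota1"));
```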

What is a searchable PDF?

A PDF where the visible page image is paired with an invisible text layer in the same position. Users see the scan, but the document is searchable and the text can be selected and copied. The invisible layer is drawn with full transparency, which works in Chrome, Firefox, Adobe Reader, and Preview. A small number of older or minimal PDF viewers may not index it.
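Placing the invisible text "in the same position" involves mapping each recognized word's bounding box from image pixels to PDF coordinates. The sketch below shows one way this could work, assuming boxes in the `{x0, y0, x1, y1}` pixel form with a top-left origin (the shape Tesseract.js reports for words) and a PDF page measured in points with a bottom-left origin. The function and type names are hypothetical, and using the box height as the font size is a simplification.

```typescript
// Word box in image pixels, origin at the top-left corner of the page image.
interface PixelBox {
  x0: number;
  y0: number;
  x1: number;
  y1: number;
}

// Where to draw the invisible text on the PDF page, in points,
// with the origin at the bottom-left corner.
interface PdfTextPlacement {
  x: number;        // left edge of the word
  y: number;        // bottom edge of the word
  fontSize: number; // approximated from the box height
  width: number;    // target width, useful for horizontal scaling
}

function toPdfPlacement(
  box: PixelBox,
  imageWidthPx: number,
  imageHeightPx: number,
  pageWidthPt: number,
  pageHeightPt: number
): PdfTextPlacement {
  const scaleX = pageWidthPt / imageWidthPx;
  const scaleY = pageHeightPt / imageHeightPx;
  return {
    x: box.x0 * scaleX,
    // Flip the y axis: image y grows downward, PDF y grows upward.
    y: pageHeightPt - box.y1 * scaleY,
    fontSize: (box.y1 - box.y0) * scaleY,
    width: (box.x1 - box.x0) * scaleX,
  };
}

// A 1224 x 1584 px scan placed on a US Letter page (612 x 792 pt).
const placement = toPdfPlacement(
  { x0: 100, y0: 200, x1: 600, y1: 250 },
  1224, 1584, 612, 792
);
console.log(placement);
```

A PDF library would then draw each word at `(x, y)` with zero opacity or an invisible text rendering mode, so viewers index the text without painting it over the scan.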

How do I pick the right language?

Match the dominant script in the document. Multi-language entries pair English with a second script, which usually improves accuracy on mixed-language pages.

Can I share the OCR result as a link?

Plain text is downloaded as a .txt file. Searchable PDFs can be uploaded to PDFtoLink for an instant shareable URL with optional password protection.