Extracting Images from Scanned PDFs vs Regular PDFs — What Actually Changes
- Scanned PDFs: pages ARE the images — each page is one scanned image
- Digital PDFs: contain embedded image objects — photos, logos, diagrams stored separately
- Scanned PDF quality is set at scan time — 200 DPI scans cannot be improved after the fact
- Both types work with the browser extractor — but the output differs significantly
There are two fundamentally different types of PDFs, and understanding which one you have changes what you can expect from image extraction. A scanned PDF is essentially a collection of page photographs — each page is a single image of whatever was on the physical paper. A digital PDF contains text as actual text and images as separate embedded objects. The extraction process works differently for each, and the output quality depends on different factors. Here is what you need to know before you extract.
What Is a Scanned PDF?
A scanned PDF is created by passing physical paper through a scanner or photographing it with a phone camera. The scanner captures each page as a raster image (a grid of pixels) and packages those images into a PDF container. The "content" of each page is just one big image — text is not searchable, layout is not structured, everything is pixels.
When you extract images from a scanned PDF, you get the scanned page images. A 10-page scanned document produces 10 image files, one per page. The quality is exactly what the scanner captured:
- A 200 DPI scan produces relatively low-resolution images
- A 300 DPI scan is standard quality — good for reading, adequate for some print uses
- A 600 DPI scan produces large, high-quality images suitable for archival
Phone camera scans vary widely. A well-lit photo from a modern phone can capture more detail than a 200 DPI flatbed scan, but it often introduces perspective distortion and uneven lighting, which scanner apps only partially correct.
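To make the DPI figures above concrete, here is a minimal sketch that converts a paper size and scan resolution into pixel dimensions. The function name `scan_pixel_size` is made up for this illustration, and the example assumes a US Letter page (8.5 x 11 inches).

```python
def scan_pixel_size(width_in: float, height_in: float, dpi: int) -> tuple[int, int]:
    """Return (width_px, height_px) for a page scanned at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


# US Letter is 8.5 x 11 inches (an assumption for this example).
for dpi in (200, 300, 600):
    w, h = scan_pixel_size(8.5, 11, dpi)
    print(f"{dpi} DPI: {w} x {h} px ({w * h / 1e6:.1f} megapixels)")
```

Doubling the DPI quadruples the pixel count, which is why 600 DPI scans produce much larger files than 300 DPI scans of the same page.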
What Is a Digital PDF?
A digital PDF (also called a "born-digital" PDF) is created directly from software — exported from Word, InDesign, PowerPoint, Illustrator, or any application. The PDF contains structured data: text as actual text strings, images as embedded objects with their own resolution, vector graphics as mathematical paths.
When you extract images from a digital PDF, you get the individual embedded objects — photos, logos, charts (if rasterized), diagrams. A product catalog created in InDesign might have 50 product photos embedded at 300 DPI each. Extract those and you get 50 separate high-quality image files, not 50 screenshots of pages.
This is the fundamental difference: scanned PDFs give you page photographs. Digital PDFs give you the individual assets that were assembled to create each page.
How the Browser Extractor Handles Each Type
The extractor reads the PDF's internal structure and handles both types, but produces different output:
Scanned PDFs: The tool detects that page content is a raster image and extracts those page-level images. You get one PNG per page. Quality matches the original scan DPI.
Digital PDFs: The tool extracts embedded image objects from each page separately. A single page might yield 3 or 4 individual images (a hero photo, a company logo, a diagram) rather than one page-level image.
In practice, many PDFs are mixed: a document might be mostly digital text with some scanned sections inserted as images. The extractor handles these consistently — everything that appears as raster image content in the PDF structure gets extracted.
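One way a tool could tell the page types apart is sketched below. This is not the extractor's actual logic, which the article does not describe. It is a heuristic under stated assumptions: the `PageSummary` record, the `classify_page` function, and the 0.9 coverage threshold are all hypothetical, and the per-page facts are assumed to come from whatever PDF parser you use.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class PageSummary:
    """Hypothetical per-page facts a PDF parser could report."""
    has_text: bool                # True if the page draws any real text
    image_coverage: list[float]   # fraction of page area covered by each image (0.0 to 1.0)


def classify_page(page: PageSummary, full_page_threshold: float = 0.9) -> str:
    """Guess the page type from its image layout (heuristic, assumed threshold)."""
    if not page.image_coverage:
        return "no-images"
    # A single image covering almost the whole page suggests a scan.
    is_full_page = (
        len(page.image_coverage) == 1
        and page.image_coverage[0] >= full_page_threshold
    )
    if is_full_page:
        # A scan that was run through OCR also carries a text layer.
        return "scanned-with-text-layer" if page.has_text else "scanned"
    return "digital"


pages = [
    PageSummary(has_text=False, image_coverage=[1.0]),         # full-page scan
    PageSummary(has_text=True, image_coverage=[0.25, 0.05]),   # photo plus logo
    PageSummary(has_text=True, image_coverage=[]),             # text only
]
print(Counter(classify_page(p) for p in pages))
```

Classifying per page rather than per document matches the mixed case: a single file can contain both scanned and born-digital pages.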
Setting the Right Quality Expectations
For scanned PDFs: the ceiling on quality is the scan resolution. A document scanned at 150 DPI produces small, soft images — no extraction tool can make them sharper. If you need higher quality, re-scan the physical document at 300–600 DPI.
For digital PDFs: quality depends on what the PDF creator embedded. A PDF exported from InDesign with "print quality" settings typically preserves images at 300 DPI. A PDF exported at "web" or "email" size often downsamples images to 96 or 150 DPI. If the embedded images are too small, request a higher-quality export from the source.
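The effective resolution of an embedded image follows from its pixel width and the width it occupies on the page. The sketch below assumes the placed width is known in PDF points (the default PDF unit is 1/72 inch). The names `effective_dpi` and `placed_width_pt` are made up for this illustration.

```python
POINTS_PER_INCH = 72  # default PDF user-space unit: 1 point = 1/72 inch


def effective_dpi(pixel_width: int, placed_width_pt: float) -> float:
    """Pixels per inch of an image as it is placed on the page."""
    if placed_width_pt <= 0:
        raise ValueError("placed width must be positive")
    placed_width_in = placed_width_pt / POINTS_PER_INCH
    return pixel_width / placed_width_in


# The same 4-inch-wide photo slot (288 points) at two export qualities.
print(effective_dpi(1200, 288))  # print-quality export
print(effective_dpi(384, 288))   # web-size export
```

The extracted file has the pixel dimensions of the embedded image, so a photo that was downsampled on export comes out small no matter how large it appears on the page.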
The best extraction results come from: print-quality digital PDFs (high-res embedded images), high-DPI scanned PDFs (300+ DPI scans), or original-resolution documents that have not been compressed for distribution.
For more on working with scanned documents, see the OCR guide for extracting text from scanned PDFs — a different but related workflow.
Extract Images from Scanned or Digital PDFs — Free
Drop any PDF — scanned or digital — and extract every embedded image as a PNG. Runs in your browser. No upload needed.
Frequently Asked Questions
How can I tell if a PDF is scanned or digital?
Try selecting text on the page. If you can highlight and copy text, it is most likely a digital PDF with actual text, although a scanned PDF that has been run through OCR also carries a selectable text layer. If text selection is impossible (or the whole page selects as one block), it is a scanned PDF. Another clue: scanned PDFs are often large in file size relative to their content, because each page is stored as a high-resolution image.
Can I make text searchable in a scanned PDF?
That is an OCR (optical character recognition) process, not image extraction. See the separate OCR tool for converting scanned PDFs into searchable text documents.
Why do scanned PDFs sometimes extract as blurry images?
Blurry extracted images from scanned PDFs usually mean the original scan was done at low DPI (150 or 200 DPI) or with a low-quality camera. The extractor preserves exactly what was in the scan — it cannot sharpen or enhance a low-quality original.
Are there any PDFs where image extraction does not work?
Password-protected PDFs that block content access and PDFs with copy-protection DRM may limit or block extraction. Archival formats such as PDF/A are not an obstacle in themselves, since the PDF/A standard forbids encryption. For standard PDFs created by common tools (Word, InDesign, scanners), extraction works reliably.
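A quick pre-check for the password-protected case can be sketched as follows. Encrypted PDFs declare an `/Encrypt` entry in their trailer, so searching the raw bytes for that name is a rough signal. This is a heuristic, not a parser: the function name `looks_encrypted` is made up here, and the check can give a false positive if the same bytes happen to appear elsewhere in the file.

```python
def looks_encrypted(pdf_bytes: bytes) -> bool:
    """Rough check: does the file mention an /Encrypt dictionary entry?"""
    return b"/Encrypt" in pdf_bytes


# Tiny hand-written trailer fragments, used only to show the check.
protected = b"trailer\n<< /Root 1 0 R /Encrypt 5 0 R >>"
plain = b"trailer\n<< /Root 1 0 R >>"
print(looks_encrypted(protected), looks_encrypted(plain))
```

For a real file, read it with `open(path, "rb").read()` and pass the result in. A proper PDF library gives a more reliable answer.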

