How to Extract Phone Numbers from a PDF

PDFs are one of the most common places phone numbers hide — invoices, contracts, event programs, business directories, scanned business cards. Getting those numbers out so you can actually call or message them isn't always straightforward.

Here's how to extract phone numbers from any PDF, whether it's text-based or a scanned image.

Step 1: Determine Your PDF Type

PDFs come in two fundamentally different forms:

Text-based PDFs — Created digitally (exported from Word, generated by software, etc.). You can select and copy text from these. Most modern PDFs are this type.

Scanned/Image PDFs — Created by scanning a physical document. The PDF contains images of text, not actual text. You can't select or copy anything — it's essentially a photograph.

How to tell: Try to click and drag to select text in the PDF. If you can highlight individual words, it's text-based. If nothing highlights or the entire page selects as one block, it's scanned.
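The same check can be approximated in code. This is a rough heuristic rather than a definitive test: it assumes you already have the text a PDF library extracted for each page (for example, from PyPDF2's `extract_text()`). The function name `classify_pdf` and the 20-character threshold are illustrative choices, not part of any library.

```python
def classify_pdf(page_texts, min_chars_per_page=20):
    """Guess whether a PDF is text-based or scanned from per-page extracted text.

    page_texts: one string (or None) per page, as returned by a PDF text
    extractor. min_chars_per_page is an arbitrary, adjustable threshold.
    """
    if not page_texts:
        return "unknown"
    # Count the pages that yielded a meaningful amount of text.
    pages_with_text = sum(
        1 for text in page_texts
        if len((text or "").strip()) >= min_chars_per_page
    )
    if pages_with_text == len(page_texts):
        return "text-based"
    if pages_with_text == 0:
        return "scanned"
    # Some pages have text and some do not, e.g. a scan appended to a report.
    return "mixed"
```

A scanned PDF that has already been through OCR carries a text layer, so it would be reported as text-based here. For extraction purposes that is usually the behavior you want.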

Text-Based PDFs: The Easy Case

Method 1: Copy-Paste into NumSwift

The fastest approach for any text-based PDF:

  1. Open the PDF in any viewer (Preview, Chrome, Adobe Reader)
  2. Select all text (Ctrl/Cmd+A) or select the relevant section
  3. Copy (Ctrl/Cmd+C)
  4. Open NumSwift
  5. Paste the text
  6. Every phone number is automatically extracted with WhatsApp, SMS, and call buttons

This works regardless of how the numbers are formatted in the PDF: mixed in with text, in tables, with or without country codes. It's the same workflow as extracting phone numbers from any other text, just with a PDF as the source.
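For readers curious about what call, SMS, and WhatsApp buttons involve, here is a minimal sketch of how such links could be built from a number in E.164 form. It is not NumSwift's implementation, which this article does not describe. It uses the standard `tel:` and `sms:` URI schemes and WhatsApp's `wa.me` click-to-chat link. The function name `action_links` and the 7-digit minimum length are assumptions.

```python
import re

def action_links(e164_number):
    """Build call, SMS, and WhatsApp links for a number like +15551234567."""
    # E.164 allows at most 15 digits; the 7-digit minimum is an assumption.
    if not re.fullmatch(r"\+[1-9]\d{6,14}", e164_number):
        raise ValueError(f"not an E.164 number: {e164_number!r}")
    digits = e164_number[1:]  # wa.me links take the number without the '+'
    return {
        "call": f"tel:{e164_number}",
        "sms": f"sms:{e164_number}",
        "whatsapp": f"https://wa.me/{digits}",
    }
```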

Method 2: Adobe Reader Find

If you're looking for a specific number rather than extracting all numbers:

  1. Open the PDF in Adobe Acrobat Reader
  2. Press Ctrl/Cmd+F
  3. Search for partial number patterns (area code, country code, or first few digits)

Limitation: You need to know what you're looking for. This doesn't help when you want to find all numbers in a document.

Method 3: Browser Copy

Most browsers can open PDFs:

  1. Open the PDF in Chrome, Firefox, or Edge
  2. Select the text containing phone numbers
  3. Copy and paste wherever you need it

Limitation: Browser PDF rendering sometimes merges table columns or loses formatting, making numbers harder to identify.

Scanned PDFs: The Hard Case

Scanned PDFs require OCR (Optical Character Recognition) to convert images of text into actual text.

Method 1: Adobe Acrobat OCR

Adobe Acrobat Pro (paid) has built-in OCR:

  1. Open the scanned PDF
  2. Go to Tools → Scan & OCR
  3. Click Recognize Text → In This File
  4. Once processed, the text is selectable
  5. Copy the text and paste into NumSwift for number extraction

Method 2: Google Drive OCR (Free)

Google Drive can OCR a scanned PDF when you open it with Google Docs:

  1. Upload the PDF to Google Drive
  2. Right-click → Open with → Google Docs
  3. Google converts the scanned PDF to a text document
  4. Copy the text and paste into NumSwift

Accuracy: Good for clean scans, but it struggles with handwriting, low-resolution scans, and unusual fonts.

Method 3: macOS Preview + Live Text

On macOS Ventura and later:

  1. Open the scanned PDF in Preview
  2. Live Text automatically recognizes text in images
  3. Click on recognized phone numbers to copy them

Limitation: Works best with clear, high-contrast text. May miss numbers in complex layouts.

Method 4: Mobile Phone Camera (for Physical Documents)

If you have a physical document (not yet a PDF):

  1. iPhone: Open Camera, point at the document. Live Text recognizes phone numbers — tap to call or copy.
  2. Android: Use Google Lens. Point at the number, tap to copy or call.

Common PDF Sources and Their Challenges

Invoices and Receipts

Usually text-based with well-formatted numbers. Copy-paste into NumSwift usually works without any cleanup.

Watch for: Numbers that look like phone numbers but aren't — invoice numbers, order references, and account numbers can match phone number patterns.
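If you script the extraction yourself, one way to reduce these false positives is to skip digit runs that directly follow a label such as "Invoice" or "Order". The sketch below is a heuristic using only the standard library. The label list, the 7 to 15 digit range, and the name `likely_phone_numbers` are illustrative assumptions, and a validation library such as libphonenumber would be stricter.

```python
import re

# Loose pattern for phone-like runs: optional '+' or '(', digits, common separators.
CANDIDATE = re.compile(r"(?:\+|\()?\d[\d\s().-]{5,}\d")
# Labels that usually introduce a reference number (illustrative list).
NON_PHONE_LABEL = re.compile(
    r"\b(?:invoice|order|account|ref(?:erence)?)\s*(?:no\.?|number|#)?\s*:?\s*$",
    re.IGNORECASE,
)

def likely_phone_numbers(text):
    """Return phone-like strings that are not preceded by a reference label."""
    results = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if not 7 <= len(digits) <= 15:
            continue  # too short or too long to be a phone number
        preceding = text[max(0, match.start() - 25):match.start()]
        if NON_PHONE_LABEL.search(preceding):
            continue  # looks like "Invoice #: 12345..." rather than a phone
        results.append(match.group())
    return results

sample = "Invoice #: 4155550123 Order 20240115001 Phone: (415) 555-0199"
print(likely_phone_numbers(sample))
```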

Contracts and Legal Documents

Often text-based but lengthy. Phone numbers may appear in headers, footers, signature blocks, or buried in clauses.

Tip: Copy the entire document into NumSwift rather than hunting for numbers page by page. It extracts all numbers regardless of where they appear.

Business Card Scans

Often low-resolution scanned images. Numbers may appear in decorative fonts.

Best approach: Use Google Drive OCR or your phone's Live Text/Google Lens, which tend to cope better with small or stylized text on business cards.

Event Programs and Directories

May contain dozens or hundreds of phone numbers in list or table format.

Tip: If the PDF is text-based, select the entire directory section, copy, and paste into NumSwift. For directories with hundreds of entries, a bulk phone number extractor handles the volume without slowing down.
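Large directories often list the same number more than once in different formats. As a small sketch of how duplicates could be removed in a script, the hypothetical helper below compares numbers by their digits only and keeps the first spelling it sees. It does not reconcile country codes, so `+1 555-123-4567` and `555-123-4567` would still count as different numbers.

```python
import re

def dedupe_numbers(numbers):
    """Drop repeated numbers, comparing by digits and keeping first-seen order."""
    seen = set()
    unique = []
    for number in numbers:
        key = re.sub(r"\D", "", number)  # strip spaces, dashes, parentheses
        if key and key not in seen:
            seen.add(key)
            unique.append(number)
    return unique
```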

Government and Official Documents

Often scanned PDFs of older documents. Quality varies.

Best approach: Adobe Acrobat OCR or Google Drive for the OCR step, then NumSwift for extraction.

Handling Table Data

PDFs with phone numbers in tables present a special challenge. When you copy table data from a PDF, columns often merge together:

Original table:
Name          | Phone
John Smith    | 555-123-4567
Jane Doe      | 555-987-6543

What you might get when copying:
John Smith 555-123-4567 Jane Doe 555-987-6543

NumSwift handles this well — it identifies phone number patterns regardless of surrounding text. Paste the messy copied text and it will still extract the numbers correctly.
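If you also want to keep each name with its number, the merged text can often be split back into rows in a script. The sketch below assumes every number has the `555-123-4567` shape used in this example and that each name comes immediately before its number. Real tables may need a looser pattern. `split_merged_rows` is an illustrative name.

```python
import re

# Matches only the NNN-NNN-NNNN shape used in the example table.
PHONE = re.compile(r"\d{3}-\d{3}-\d{4}")

def split_merged_rows(text):
    """Rebuild (name, phone) pairs from table text that was flattened on copy."""
    rows = []
    previous_end = 0
    for match in PHONE.finditer(text):
        # Everything between the previous number and this one is the name.
        name = text[previous_end:match.start()].strip()
        rows.append((name, match.group()))
        previous_end = match.end()
    return rows

print(split_merged_rows("John Smith 555-123-4567 Jane Doe 555-987-6543"))
```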

For Developers: Programmatic Extraction

If you need to extract phone numbers from PDFs at scale:

  1. Parse the PDF: Use a library like pdf-parse (Node.js), PyPDF2 (Python; its development has since moved to the pypdf package), or Apache PDFBox (Java) to extract raw text
  2. Find numbers: Use Google's libphonenumber to identify and validate phone numbers in the extracted text
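  3. Handle scanned PDFs: Integrate an OCR library like Tesseract for image-based PDFs

# Requires: pip install PyPDF2 phonenumbers
import PyPDF2
import phonenumbers

# Extract raw text from every page of a text-based PDF
with open('document.pdf', 'rb') as f:
    reader = PyPDF2.PdfReader(f)
    # Join pages with a newline so numbers on adjacent pages don't run together;
    # "or ''" guards against pages that yield no text (e.g. image-only pages)
    text = '\n'.join(page.extract_text() or '' for page in reader.pages)

# Find numbers, treating those without a country code as US numbers
for match in phonenumbers.PhoneNumberMatcher(text, 'US'):
    # Print each match in E.164 form ('+', country code, digits, no separators)
    print(phonenumbers.format_number(
        match.number,
        phonenumbers.PhoneNumberFormat.E164
    ))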

Tips for Better Extraction

  1. Select more text than you need. It's easier to paste a full page into NumSwift and let it filter than to carefully select just the numbers.

  2. Try different PDF viewers. Some viewers handle text selection better than others. Chrome's PDF viewer often works well for simple documents.

  3. Check the output. OCR can mistake similar characters: 0 vs O, 1 vs l, 8 vs B. Verify extracted numbers before calling.

  4. Know the expected format. If you know the numbers should be UK mobile numbers, you can quickly spot extraction errors (they should start with 07 or +447).
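Tips 3 and 4 can be combined into a quick sanity check. The sketch below swaps the three look-alike characters listed above inside a single candidate number, then tests whether the result has the shape of a UK mobile number. Both function names are illustrative. The check only looks at the prefix and digit count, so it cannot confirm that a number is actually in service, and it does not handle the `+44 (0)7...` style.

```python
import re

# Character pairs named in tip 3: O/0, l/1, B/8.
OCR_FIXES = str.maketrans({"O": "0", "l": "1", "B": "8"})
# UK mobile shape: 07 or +447, followed by nine more digits.
UK_MOBILE = re.compile(r"(?:\+44|0)7\d{9}")

def repair_ocr_number(candidate):
    """Swap letters that OCR commonly confuses with digits in one candidate number."""
    return candidate.translate(OCR_FIXES)

def looks_like_uk_mobile(number):
    """Return True if the number has the shape 07xxxxxxxxx or +447xxxxxxxxx."""
    compact = re.sub(r"[\s().-]", "", number)  # drop spaces and separators
    return UK_MOBILE.fullmatch(compact) is not None

raw = "O7911 l23456"  # OCR output with two misread characters
fixed = repair_ocr_number(raw)
print(fixed, looks_like_uk_mobile(fixed))
```

Only apply the repair to strings you have already isolated as candidate numbers. Running it over a whole page would also rewrite ordinary words.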

Bottom Line

For text-based PDFs, copy the text and paste into NumSwift — it finds every phone number automatically. For scanned PDFs, use Google Drive's free OCR or Adobe Acrobat to convert to text first, then paste into NumSwift. Either way, you'll have clickable WhatsApp, SMS, and call buttons for every extracted number.