Module pdf_meta


PDF metadata extraction (page count).

Uses lopdf to read the document catalog and return the page count. Designed for bulk extraction: any parse error yields None, so one bad file doesn’t stop a batch job.
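The "None on any error" contract can be illustrated with a minimal, std-only sketch. It is not this module's implementation: `page_count_or_none` and its `parse` parameter are hypothetical names, and `parse` stands in for the lopdf-based parsing step so that the example has no external dependencies. The point is only that I/O failures and parse failures are both collapsed into `None`.

```rust
use std::fs;
use std::path::Path;

/// Hypothetical helper (not part of pdf_meta): reads the file at `path` and
/// hands its bytes to `parse`, a stand-in for the real lopdf-based parser.
/// Any failure, whether reading the file or parsing it, becomes `None`.
pub fn page_count_or_none<P, E>(path: &Path, parse: P) -> Option<u32>
where
    P: Fn(&[u8]) -> Result<u32, E>,
{
    // An unreadable or missing file is treated the same as an unparsable one.
    let bytes = fs::read(path).ok()?;
    // Discard the error detail; callers only see "count" or "no count".
    parse(&bytes).ok()
}
```

Discarding the error keeps the caller's loop simple, at the cost of not knowing why a given file failed; a caller that needs diagnostics would want a `Result`-returning variant instead.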

Functions

extract_page_count
Page count for a single PDF. Returns None if the file can’t be parsed.
extract_pages_batch
Batch page-count extraction with parallel parsing. Returns (path, pages) pairs only for PDFs that parsed successfully.
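How the two functions relate can be sketched as follows, assuming nothing about the real signatures. The source does not say how the parallel parsing is done, so this sketch uses scoped threads from the standard library as one possible approach; `pages_batch`, `workers`, and `extract` are hypothetical names, with `extract` playing the role of a per-file function that returns `None` on failure.

```rust
use std::path::PathBuf;
use std::thread;

/// Hypothetical batch helper (not the module's actual API): runs `extract`
/// over `paths` on up to `workers` threads and keeps only the successes.
pub fn pages_batch<F>(paths: &[PathBuf], workers: usize, extract: F) -> Vec<(PathBuf, u32)>
where
    F: Fn(&PathBuf) -> Option<u32> + Sync,
{
    if paths.is_empty() {
        return Vec::new();
    }
    // Split the input into at most `workers` contiguous chunks.
    let workers = workers.max(1);
    let chunk_len = (paths.len() + workers - 1) / workers;
    let extract = &extract;

    thread::scope(|s| {
        let handles: Vec<_> = paths
            .chunks(chunk_len)
            .map(|chunk| {
                s.spawn(move || {
                    chunk
                        .iter()
                        // Files that fail to parse yield None and are dropped here.
                        .filter_map(|p| extract(p).map(|n| (p.clone(), n)))
                        .collect::<Vec<_>>()
                })
            })
            .collect();

        // Joining in spawn order keeps results in input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Because failed files are filtered out rather than reported, the output can be shorter than the input; comparing the two lengths is a cheap way to see how many files were skipped.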