Man pages for dsidavis/ReadPDF
Extract Information from PDF Documents

columnOf            Determine in which column in a page a node is located
findSectionHeaders  Find the XML nodes corresponding to the section titles of the...
getBBox             Get the bounding box of a collection of nodes
getLinks            Get hyperlink destinations in a PDF document or page
getPages            Get a list of XML nodes for the pages in the PDF document
getTables           Attempt to heuristically locate the tables in the document or...
getTextByCols       Get the text arranged by each column
getTextFonts        Get information about the fonts used for each text node
isCentered          Is the node centered within the text on the page or column
isScanned           Is the document scanned or does it contain real PDF elements
readPDFXML          Read a PDF or XML version of a PDF
showPage            Plot the contents of a PDF page
dsidavis/ReadPDF documentation built on June 12, 2025, 6:39 a.m.