API for dsidavis/ReadPDF
Extract Information from PDF Documents

Global functions
QQuote Source code
addBBoxColors Source code
anyTextToLeft Source code
assembleLine Source code
attrsToDataFrame Source code
bodyLine Source code
cleanAbstract Source code
collapseLine Source code
collapsePageCols Source code
columnOf Man page
combineBBoxLines Source code
combineLines Source code
containsDate Source code
containsFigureCaption Source code
context Source code
dim.PDFToXMLPage Source code
extractDate Source code
f Source code
findAbstract Source code
findEIDAbstract Source code
findNearestVerticalLine Source code
findSectionHeaders Man page
findShortSectionHeaders Source code
findTable Source code
findVol Source code
firstIsolated Source code
gapBetweenSegments Source code
getAbstractBySpan Source code
getBBox Man page
getBBox.XMLInternalNode Source code
getBBox2.XMLInternalNode Source code
getCaption Source code
getColPositions Source code
getCoords Source code
getCoords.PDFToXMLPage Source code
getCoords.list Source code
getCrossPageLines Source code
getDatePublished Source code
getDocTitleString Source code
getDocWords Source code
getEIDAuthors Source code
getEIDHeadMaterialByFont Source code
getFontText Source code
getFooterPos Source code
getHLines Source code
getHeader Source code
getHorizExtremes Source code
getHorizRects Source code
getImages Source code
getLastRealTextNode Source code
getLinePositions Source code
getLinks Man page Source code
getMetaData Source code
getMonthNames Source code
getNodeColors Source code
getNodeFontInfo Source code
getNodePos Source code
getNodesBetween Source code
getNodesWithFont Source code
getNumCols Source code
getNumCols.PDFToXMLDoc Source code
getNumCols.XMLInternalNode Source code
getNumCols.character Source code
getNumPages Source code
getPageFooter Source code
getPageGroups Source code
getPageHeight Source code
getPageLines Source code
getPageText Source code
getPageWidth Source code
getPages Man page Source code
getRotatedDownloadNodes Source code
getRotatedTable Source code
getRotatedText Source code
getRotation Source code
getSubmissionDateInfo Source code
getTableNodes Source code
getTables Man page Source code
getTextAround Source code
getTextBBox Source code
getTextByCols Man page
getTextFonts Man page
getTextNodeColors Source code
getVerticalRects Source code
getVolume Source code
getXPathDocFontQuery Source code
hardCols Source code
hasCoverPage Source code
hasYear Source code
identicalInColumn Source code
inColumn Source code
interNodeDist Source code
isBibSup Source code
isBold Source code
isBold.XMLInternalNode Source code
isBold.character Source code
isBold.data.frame Source code
isCentered Man page
isCenteredMargins Source code
isEmergingInfectDisease Source code
isItalic Source code
isItalic.character Source code
isItalic.data.frame Source code
isLowerCase Source code
isMBio Source code
isNodeIn Source code
isOnLineBySelf Source code
isScanned Man page
isScanned2 Source code
isScannedPage Source code
isTitleBad Source code
isTitleBad.XMLNodeSet Source code
isUpperCase Source code
joinLines Source code
lineSpacing Source code
margins.PDFToXMLDoc Source code
margins.PDFToXMLPage Source code
margins.XMLNodeSet Source code
margins.character Source code
mergeLines Source code
mkDateRegexp Source code
mostCommon Source code
nodesToTable Source code
orderNodesInPage Source code
pageNodesByLine Source code
pageOf Source code
pageOf.XMLInternalElementNode Source code
pageOf.list Source code
pageTitle Source code
parseCoord Source code
pdfText Source code
pdfText.PDFToXMLPage Source code
pdf_text Source code
plot Man page
plot.PDFToXMLDoc Source code
readPDFXML Man page
reassembleLines Source code
removeExtension Source code
removeRotated Source code
renderCoord Source code
renderCoords Source code
renderLinesRects Source code
sameFileName Source code
showNode Source code
showNodes Source code
showPage Man page
showTb Source code
spansColumn Source code
spansColumns Source code
spansColumns2 Source code
trim Source code
xfoo Source code
xmlFile Source code
xpathQ Source code
dsidavis/ReadPDF documentation built on June 12, 2025, 6:39 a.m.