findSectionHeaders: Find the XML nodes corresponding to the section titles of the...

findSectionHeaders    R Documentation

Find the XML nodes corresponding to the section titles of the article/document.

Description

This uses a heuristic approach to find the XML nodes corresponding to the section titles of the article/document.

Usage

findSectionHeaders(doc,
                   sectionName = c("introduction", "background", "conclusions",
                                   "discussion", "materials and methods",
                                   "literature cited", "references cited",
                                   "the study"),
                   otherSectionNames = c("references", "acknowledgements",
                                         "acknowledgments", "results", "methods"),
                   checkCentered = TRUE, discardAfterReferences = TRUE,
                   allowRotated = FALSE, onlyFirst = FALSE, order = TRUE,
                   groupByLine = FALSE)
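
A minimal sketch of a call, shown only to illustrate the signature above. The file name paper.xml is hypothetical, and the sketch assumes that the document is read with ReadPDF's readPDFXML() from XML produced by a PDF-to-XML conversion; neither detail is stated on this page. The guard makes the code a no-op when the package or the file is missing.

```r
# Hypothetical input file: XML generated from a PDF (an assumption, not from this page).
xmlFile <- "paper.xml"

# Run only if ReadPDF is installed and the file exists, so the sketch is safe to source.
if (requireNamespace("ReadPDF", quietly = TRUE) && file.exists(xmlFile)) {
  # Assumption: readPDFXML() is the reader that yields the 'doc' argument.
  doc <- ReadPDF::readPDFXML(xmlFile)

  # Default behaviour: all arguments at the values shown in Usage.
  hdrs <- ReadPDF::findSectionHeaders(doc)

  # Skip the centering check, which the checkCentered entry describes as expensive.
  hdrsAnyAlign <- ReadPDF::findSectionHeaders(doc, checkCentered = FALSE)

  print(hdrs)
}
```

Comparing the two results on the same document is one way to see how much the centering check filters out.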

Arguments

doc
sectionName
otherSectionNames
checkCentered

a logical value. If the nodes we identify as section headers using the "expected" names are centered, then by default, when we look for other text with the same font, we include only centered text. However, if checkCentered = FALSE, we include all text that uses the section header font. Checking whether text is centered is currently expensive.

discardAfterReferences
allowRotated
onlyFirst
order
groupByLine
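
The effect of checkCentered described above can be illustrated on toy data. The following is a sketch of the idea only, not the package's implementation: the data frame, its columns, and the helper pickHeaders() are all made up for illustration and do not mirror ReadPDF's internal structures.

```r
# Toy stand-in for the text nodes of a document (all values are invented).
nodes <- data.frame(
  text     = c("Introduction", "Some body text", "Field Sampling",
               "Results", "Table 1", "References"),
  font     = c("3", "1", "3", "3", "3", "3"),
  centered = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# A few "expected" section names, taken from the sectionName default.
expected <- c("introduction", "background", "conclusions", "discussion")

# Hypothetical helper sketching the heuristic described for checkCentered.
pickHeaders <- function(nodes, expected, checkCentered = TRUE) {
  # Step 1: seed nodes whose text matches an expected section name.
  seed <- tolower(trimws(nodes$text)) %in% expected
  if (!any(seed)) return(nodes[0, ])
  # Step 2: keep all text that shares a font with the seed nodes.
  keep <- nodes$font %in% unique(nodes$font[seed])
  # Step 3: if the seed nodes are centered, optionally keep only centered text.
  if (checkCentered && all(nodes$centered[seed]))
    keep <- keep & nodes$centered
  nodes[keep, ]
}

strict <- pickHeaders(nodes, expected)                         # centered text only
loose  <- pickHeaders(nodes, expected, checkCentered = FALSE)  # any alignment
```

In this toy example the uncentered "Table 1" shares the header font, so it appears in loose but not in strict.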

Author(s)

Duncan Temple Lang


dsidavis/ReadPDF documentation built on June 12, 2025, 6:39 a.m.