heading_search: Function to locate sections of pdf
In lebebr01/pdfsearch: Search Tools for PDF Files

View source: R/heading_search.r

heading_search

R Documentation

Description

Locates section headings in the text of a pdf so that the text can be separated by section. The function returns the headings along with their location in the pdf.

Usage

heading_search(
  x,
  headings,
  path = FALSE,
  pdf_toc = FALSE,
  full_line = FALSE,
  ignore_case = FALSE,
  split_pdf = FALSE,
  convert_sentence = FALSE
)

Arguments

`x`	Either the text of the pdf read in with the pdftools package or a path for the location of the pdf file.
`headings`	A character vector representing the headings to search for. Can be NULL if pdf_toc = TRUE.
`path`	TRUE/FALSE indicating whether `x` is a path to the pdf file to be converted to text. The pdftools package is used for this conversion. Default is FALSE.
`pdf_toc`	TRUE/FALSE indicating whether the pdf_toc function from the pdftools package should be used. This is most useful if the pdf has the table of contents embedded within the pdf. Must specify path = TRUE if pdf_toc = TRUE.
`full_line`	TRUE/FALSE indicating whether the headings should reside on their own line. This can create problems with multicolumn pdfs.
`ignore_case`	TRUE/FALSE/vector of TRUE/FALSE, indicating whether the case of the headings should be ignored. Default is FALSE, meaning that the headings are matched literally, including case. If a vector, it must be the same length as the headings vector.
`split_pdf`	TRUE/FALSE indicating whether to split the pdf using white space. This is most useful with multicolumn pdf files. The split_pdf function attempts to rearrange the column layout of the text into a single column, starting with the left column and proceeding to the right.
`convert_sentence`	TRUE/FALSE indicating whether individual lines of the pdf file should be collapsed into a single large paragraph to perform keyword searching. Default is FALSE.

Examples

file <- system.file('pdf', '1501.00450.pdf', package = 'pdfsearch')

heading_search(file, headings = c('abstract', 'introduction'),
  path = TRUE)
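
As a further sketch, the same search can be run on text that has already been read in with the pdftools package, leaving path at its default of FALSE and passing one ignore_case flag per heading, as described under Arguments. The names headings, ignore_case_flags, pdf_txt and result are illustrative only, and the block assumes the pdfsearch and pdftools packages are installed. No output is shown because it depends on the pdf being searched.

```r
# Illustrative names; they are not part of the package.
headings <- c('abstract', 'introduction')
# One flag per heading: a vector ignore_case must match the length of headings.
ignore_case_flags <- c(TRUE, TRUE)

# Assumption: both packages are installed; otherwise the search is skipped.
if (requireNamespace('pdfsearch', quietly = TRUE) &&
    requireNamespace('pdftools', quietly = TRUE)) {
  file <- system.file('pdf', '1501.00450.pdf', package = 'pdfsearch')

  # Read the pdf text first (one character string per page).
  pdf_txt <- pdftools::pdf_text(file)

  # x is text rather than a file path, so path keeps its default of FALSE.
  result <- pdfsearch::heading_search(pdf_txt, headings = headings,
    ignore_case = ignore_case_flags)
  print(result)
}
```

Reading the text once and reusing it may be convenient when several searches are run against the same pdf, since the conversion to text is not repeated.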

lebebr01/pdfsearch documentation built on June 14, 2025, 6:52 p.m.
