Extract highlighted or underlined text from a PDF file

Description

Extract highlighted text, underlined text, or both from a PDF file.

Usage

extractPDF(file, type = c("Highlight", "Underline", "Both"), collapse = TRUE)

Arguments

file

character, path to a PDF file.

type

Type of text excerpts to extract. If "Both", both highlighted and underlined text are extracted.

collapse

logical; if TRUE, the result is concatenated into a single string.

Details

This function uses the PDF Clown Java library to do the hard work.

For highlighted text to be extracted, the highlights must have been created with a compatible PDF reader. See the Docear project's list of compatible readers and their settings: http://www.docear.org/support/user-manual/#compatible_pdf_readers

Value

character vector of the extracted text.

Author(s)

Ronggui HUANG
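
Examples

The following is a minimal usage sketch based only on the signature shown under Usage, not an example from the package author. The file name annotated-paper.pdf is hypothetical. The sketch assumes that the package providing extractPDF is already loaded, that the PDF was annotated with a compatible reader, and that collapse = FALSE returns the excerpts as separate elements, which the argument description implies but does not state outright.

```r
# Hypothetical path: replace with a PDF annotated in a compatible reader.
pdf_path <- "annotated-paper.pdf"

# Guarded so the sketch does nothing when the function or the file is missing.
if (exists("extractPDF") && file.exists(pdf_path)) {
  # Highlighted text only; collapse = FALSE is assumed to keep excerpts
  # as separate elements of the returned character vector.
  highlights <- extractPDF(pdf_path, type = "Highlight", collapse = FALSE)
  print(highlights)

  # Highlighted and underlined text, concatenated into a single string.
  all_text <- extractPDF(pdf_path, type = "Both", collapse = TRUE)
  cat(all_text, "\n")
}
```

Passing type explicitly avoids relying on which of the three choices is the default, since this page does not say.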
