extractPDF: Extract highlighted or underlined text from a PDF file


Description

Extracts highlighted or underlined text from a PDF file.

Usage

extractPDF(file, type=c("Highlight", "Underline", "Both"), collapse = TRUE)

Arguments

file

character, path to a PDF file.

type

Type of text excerpt to extract: "Highlight", "Underline" or "Both". If "Both", both highlighted and underlined text are extracted.

collapse

logical; if TRUE, the result is concatenated into a single string.
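
A minimal sketch of how these arguments might be combined. The file name annotated-paper.pdf is a hypothetical placeholder, and the guard around the calls is only there so the snippet does nothing harmful when the package or the file is missing.

```r
# Hypothetical path; replace it with a real PDF that contains annotations.
pdf_path <- "annotated-paper.pdf"

if (requireNamespace("rpdfclown", quietly = TRUE) && file.exists(pdf_path)) {
  # Highlighted text only, without concatenating the result.
  highlights <- rpdfclown::extractPDF(pdf_path, type = "Highlight",
                                      collapse = FALSE)
  print(highlights)

  # Highlighted and underlined text, concatenated into a single string.
  everything <- rpdfclown::extractPDF(pdf_path, type = "Both",
                                      collapse = TRUE)
  cat(everything, "\n")
} else {
  message("rpdfclown or the example PDF is not available; nothing extracted.")
}
```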

Details

This function uses the PDF Clown Java library to do the hard work.

To extract highlighted text, the highlights must have been created with a compatible PDF reader. See the Docear project user manual, http://www.docear.org/support/user-manual/#compatible_pdf_readers, for suitable readers and their settings.

Value

character vector of the extracted text.
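
Because the value is a plain character vector, it can be tidied with base R before further use. The sketch below is illustrative only: clean_excerpts is a hypothetical helper, not part of rpdfclown, and the input strings are made up to stand in for a result obtained with collapse = FALSE.

```r
# Hypothetical helper, not part of rpdfclown: tidy a character vector.
clean_excerpts <- function(x) {
  x <- gsub("[[:space:]]+", " ", x)  # turn line breaks and space runs into one space
  x <- trimws(x)                     # strip leading and trailing whitespace
  x[nzchar(x)]                       # drop empty entries
}

# Made-up stand-in for the value of extractPDF(file, collapse = FALSE).
excerpts <- c("  first highlighted\npassage ", "", "second   passage")

cleaned <- clean_excerpts(excerpts)
print(cleaned)

# Join the cleaned excerpts into one string. The "\n" separator is an
# assumption; the separator used by collapse = TRUE is not documented here.
joined <- paste(cleaned, collapse = "\n")
```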

Author(s)

Ronggui HUANG


rpdfclown documentation built on May 2, 2019, 6:40 p.m.