pdf_ocr: OCR text extraction
In pdftools: Text Extraction, Rendering and Converting of PDF Documents

pdf_ocr_text

R Documentation

OCR text extraction

Description

Performs OCR text extraction. This requires the tesseract package to be installed.

Usage

pdf_ocr_text(
  pdf,
  pages = NULL,
  opw = "",
  upw = "",
  dpi = 600,
  language = "eng",
  options = NULL
)

pdf_ocr_data(
  pdf,
  pages = NULL,
  opw = "",
  upw = "",
  dpi = 600,
  language = "eng",
  options = NULL
)

Arguments

`pdf`	file path or raw vector with pdf data
`pages`	which pages of the pdf file to extract
`opw`	string with owner password to open pdf
`upw`	string with user password to open pdf
`dpi`	resolution (in dots per inch) at which each page is rendered to an image; passed to pdf_convert
`language`	passed to tesseract to specify the language of the engine
`options`	passed to tesseract to specify OCR parameters
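
Examples

The arguments above can be combined as in the following minimal sketch. It assumes the pdftools and tesseract packages are installed; the helper name `ocr_pages`, the file name "scan.pdf", and the choice of 300 dpi are hypothetical and used only for illustration. The `tessedit_char_whitelist` entry is a Tesseract parameter, and its effect may depend on the Tesseract engine version.

```r
# Sketch only: assumes the pdftools and tesseract packages are installed.
# `ocr_pages` and "scan.pdf" are hypothetical names used for illustration.
ocr_pages <- function(pdf, pages = NULL, dpi = 300, digits_only = FALSE) {
  # Optionally restrict recognition to digits via a Tesseract parameter
  # (behaviour may vary with the Tesseract engine version).
  opts <- if (digits_only) list(tessedit_char_whitelist = "0123456789") else NULL
  pdftools::pdf_ocr_text(pdf, pages = pages, dpi = dpi,
                         language = "eng", options = opts)
}

# Only run the OCR calls when the hypothetical input file is present.
if (file.exists("scan.pdf")) {
  # OCR the first two pages at a lower resolution than the 600 dpi default.
  text <- ocr_pages("scan.pdf", pages = 1:2)
  cat(text[1])

  # pdf_ocr_data takes the same arguments; inspect what it returns.
  words <- pdftools::pdf_ocr_data("scan.pdf", pages = 1, dpi = 300)
  str(words)
}
```

Rendering at a lower `dpi` is usually faster but may reduce recognition accuracy, so the value is worth tuning per document.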
