pdf_ocr_text | R Documentation |
Perform OCR text extraction. This requires the tesseract package to be
installed.
pdf_ocr_text(
  pdf,
  pages = NULL,
  opw = "",
  upw = "",
  dpi = 600,
  language = "eng",
  options = NULL
)
pdf_ocr_data(
  pdf,
  pages = NULL,
  opw = "",
  upw = "",
  dpi = 600,
  language = "eng",
  options = NULL
)
pdf | file path or raw vector with pdf data |
pages | which pages of the pdf file to extract |
opw | string with owner password to open pdf |
upw | string with user password to open pdf |
dpi | resolution to render image that is passed to pdf_convert. |
language | passed to tesseract to specify the language of the engine. |
options | passed to tesseract to specify OCR parameters |
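A minimal sketch of how the two functions might be called with the arguments above. The file name "scan.pdf" is a hypothetical placeholder, not part of the package, and the calls only run when pdftools, tesseract, and that file are all available. The return shapes described in the comments (one string per page, one data frame of words per page) reflect the usual behaviour of these functions but should be checked against the installed version.

```r
# Hypothetical input file; replace with the path to a scanned PDF.
pdf_path <- "scan.pdf"

if (requireNamespace("pdftools", quietly = TRUE) &&
    requireNamespace("tesseract", quietly = TRUE) &&
    file.exists(pdf_path)) {
  # OCR the first page only, rendering at 300 dpi instead of the default 600.
  text <- pdftools::pdf_ocr_text(pdf_path, pages = 1, dpi = 300, language = "eng")
  cat(text[1])

  # Same page, but returning the recognized words as tabular data.
  words <- pdftools::pdf_ocr_data(pdf_path, pages = 1, dpi = 300, language = "eng")
  print(head(words[[1]]))
}
```

A lower dpi typically renders faster at some cost in recognition accuracy, so the default of 600 is the safer choice for small or faint print.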
Other pdftools: pdftools, qpdf, rendering