| extract | R Documentation |
This function wraps several methods for extracting text from non-scanned PDFs; no OCR is performed. Available methods are xpdf, Ghostscript, and Poppler (via pdftools).
extract(paths, which = "xpdf", ...)
| paths | (character) One or more paths to PDF files |
| which | (character) The extraction method: one of gs, xpdf (default), or pdftools |
| ... | Further arguments passed on |
A single object or a list of objects, of class gs_extr, xpdf_extr, or poppler_extr. All of these also share the common class extr.
## Not run:
path <- system.file("examples", "example1.pdf", package = "extractr")
# xpdf
xpdf <- extract(path, "xpdf")
xpdf$meta
xpdf$data
# Ghostscript
gs <- extract(path, "gs")
gs$meta
gs$data
# pdftools
pdft <- extract(path, "pdftools")
pdft$meta
cat(pdft$data)
# Pass many paths at once
path1 <- system.file("examples", "example1.pdf", package = "extractr")
path2 <- system.file("examples", "example2.pdf", package = "extractr")
path3 <- system.file("examples", "example3.pdf", package = "extractr")
extract(c(path1, path2, path3))
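# Inspect the classes of the results. A sketch only: it assumes that passing
# several paths returns a list with one element per path, which the return
# value description suggests but does not spell out.
res <- extract(c(path1, path2, path3))
inherits(xpdf, "extr")       # single-path result from the xpdf example above
inherits(xpdf, "xpdf_extr")  # method-specific class for which = "xpdf"
vapply(res, inherits, logical(1), what = "extr")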
## End(Not run)