extract | R Documentation |
This function wraps several methods for extracting text from non-scanned PDFs; no OCR is performed. Available methods are xpdf, Ghostscript, and Poppler (via pdftools).
extract(paths, which = "xpdf", ...)
paths |
(character) One or more paths to PDF files |
which |
(character) One of gs, xpdf (default), or pdftools |
... |
Further arguments passed on |
A list or a single object, of class gs_extr, xpdf_extr, or poppler_extr. All share the same global class extr.
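The class layout described above can be illustrated with a minimal sketch. This is not the extractr implementation: `extract_sketch` and `fake_backend` are hypothetical names, the backend is a stand-in that reads no PDF, and returning a single object for a single path is an assumption drawn from the "a list or a single object" wording.

```r
# Hypothetical sketch, NOT the extractr source.
# fake_backend() stands in for a real extraction tool and reads no file.
fake_backend <- function(path, method) {
  list(meta = list(path = path, method = method), data = character(0))
}

extract_sketch <- function(paths, which = c("xpdf", "gs", "pdftools")) {
  # Validate the method name; the first choice ("xpdf") is the default
  which <- match.arg(which)

  # Method-specific class first, shared class "extr" last
  cls <- switch(which,
    xpdf = "xpdf_extr",
    gs = "gs_extr",
    pdftools = "poppler_extr"
  )

  out <- lapply(paths, function(p) {
    structure(fake_backend(p, which), class = c(cls, "extr"))
  })

  # Assumption: one path gives a single object, several paths give a list
  if (length(out) == 1) out[[1]] else out
}

res <- extract_sketch("a.pdf", "gs")
many <- extract_sketch(c("a.pdf", "b.pdf"))
```

Because every result carries the shared class, code that does not care which method produced it can test `inherits(res, "extr")`, while method-specific code can test for `gs_extr`, `xpdf_extr`, or `poppler_extr`.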
## Not run: 
path <- system.file("examples", "example1.pdf", package = "extractr")

# xpdf
xpdf <- extract(path, "xpdf")
xpdf$meta
xpdf$data

# Ghostscript
gs <- extract(path, "gs")
gs$meta
gs$data

# pdftools
pdft <- extract(path, "pdftools")
pdft$meta
cat(pdft$data)

# Pass many paths at once
path1 <- system.file("examples", "example1.pdf", package = "extractr")
path2 <- system.file("examples", "example2.pdf", package = "extractr")
path3 <- system.file("examples", "example3.pdf", package = "extractr")
extract(c(path1, path2, path3))

## End(Not run)