pdfimager

pdfimager - Extract images from pdfs

Docs: https://sckott.github.io/pdfimager/

This package uses the sys R package to "shell out" to pdfimages. Apparently pdfimages is not part of poppler cpp, so it is not available in the pdftools R package.

Install pdfimages

pdfimages is a command-line tool that is installed when you install poppler.

Installation instructions can be found at https://poppler.freedesktop.org/
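
To check whether pdfimages is already available, something like the following may help. It is a minimal sketch using base R only; `pdfimages_version()` is a hypothetical helper written for this example, not a function in pdfimager, and it assumes pdfimages accepts the `-v` flag to print its version.

```r
# Hypothetical helper (not part of pdfimager): returns the first line that
# `pdfimages -v` prints, or NA if the command cannot be found.
pdfimages_version <- function(cmd = "pdfimages") {
  # Sys.which() returns "" when the command is not found
  if (!nzchar(Sys.which(cmd))) {
    return(NA_character_)
  }
  # capture both stdout and stderr, since version info may go to either
  out <- suppressWarnings(system2(cmd, "-v", stdout = TRUE, stderr = TRUE))
  if (length(out) == 0) NA_character_ else out[1]
}

pdfimages_version()
```

If this returns `NA`, pdfimages is probably not installed or not on your PATH.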

Install pdfimager

# install.packages("pak")
pak::pak("sckott/pdfimager")
library("pdfimager")

Set the path

Some users may need to manually set the path to pdfimages.

You can do so with a function in this package like

pdimg_set_path()

Or set the path to pdfimages with an environment variable before starting R:

PDFIMAGER_PATH=C:/some/path/to/poppler/24/bin/pdfimages.exe R

Or set within R like:

Sys.setenv(PDFIMAGER_PATH="C:/some/path/to/poppler/24/bin/pdfimages.exe")
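
If you are not sure where pdfimages lives, a lookup along these lines can help find a value for `PDFIMAGER_PATH`. This is a sketch using base R only; `find_pdfimages()` is a hypothetical helper for illustration and is not how pdfimager itself resolves the path.

```r
# Hypothetical helper (not part of pdfimager): prefer an existing
# PDFIMAGER_PATH value, otherwise fall back to searching the system PATH.
find_pdfimages <- function() {
  current <- Sys.getenv("PDFIMAGER_PATH", unset = "")
  if (nzchar(current)) {
    return(current)
  }
  # Sys.which() returns "" when pdfimages is not on the PATH
  unname(Sys.which("pdfimages"))
}

path <- find_pdfimages()
if (nzchar(path)) {
  Sys.setenv(PDFIMAGER_PATH = path)
}
```

An empty string from `find_pdfimages()` means the path has to be set by hand as shown above.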

help info

pdimg_help()

pdf image metadata

x <- system.file("examples/BachmanEtal2020.pdf", package="pdfimager")
pdimg_meta(x)

pdf images

x <- system.file("examples/BachmanEtal2020.pdf", package="pdfimager")
pdimg_images(x)

filter images

pdimg_filter() does a variety of things to filter images by their metadata; some of the filters are configurable.

x1 <- system.file("examples/Tierney2017JOSS.pdf", package="pdfimager")
x2 <- system.file("examples/vanGemert2018.pdf", package="pdfimager")
res <- pdimg_images(c(x1, x2))
vapply(res, NROW, 1)
out <- pdimg_filter(res)
vapply(out, NROW, 1)
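
To give a rough idea of what filtering by metadata means, here is a toy sketch that does not use pdfimager at all. The data frame, its column names (`page`, `width`, `height`), the size threshold, and the `keep_large()` helper are all assumptions made up for illustration; this is not the implementation of pdimg_filter().

```r
# Toy stand-in for one element of a list of per-file image metadata;
# the column names are assumptions for illustration only.
imgs <- data.frame(
  page   = c(1L, 1L, 2L, 3L),
  width  = c(1200L, 16L, 800L, 1L),
  height = c(900L, 16L, 600L, 1L)
)

# Hypothetical helper: keep images at least `min_px` pixels on each side,
# dropping tiny images such as icons or single-pixel artifacts.
keep_large <- function(df, min_px = 50L) {
  df[df$width >= min_px & df$height >= min_px, , drop = FALSE]
}

# Mimic the list-of-data-frames shape used in the example above
res_toy <- list(a = imgs, b = imgs[1:2, ])
vapply(res_toy, NROW, 1)

out_toy <- lapply(res_toy, keep_large)
vapply(out_toy, NROW, 1)
```

Comparing the row counts before and after filtering, as in the pdimg_filter() example, shows how many images were dropped per file.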

Meta



sckott/pdfimager documentation built on Sept. 15, 2024, 1:19 a.m.