edit_docinfo: Set/get pdf document info dictionary

edit_docinfo    R Documentation

Set/get pdf document info dictionary

Description

get_docinfo() gets pdf document info from a file. set_docinfo() sets pdf document info for a file.

Usage

get_docinfo(filename, use_names = TRUE)

get_docinfo_pdftools(filename, use_names = TRUE)

get_docinfo_exiftool(filename, use_names = TRUE)

set_docinfo_exiftool(docinfo, input, output = input)

get_docinfo_pdftk(filename, use_names = TRUE)

set_docinfo(docinfo, input, output = input)

set_docinfo_gs(docinfo, input, output = input)

set_docinfo_pdftk(docinfo, input, output = input)

Arguments

filename

Filename(s) (pdf) to extract info dictionary entries from.

use_names

If TRUE (the default), use filename as the names of the result.

docinfo

A "docinfo" object (as returned by docinfo() or get_docinfo()).

input

Input pdf filename.

output

Output pdf filename.

Details

get_docinfo() will try to use the following helper functions, in this order:

  1. get_docinfo_pdftk(), which wraps the pdftk command-line tool

  2. get_docinfo_exiftool(), which wraps the exiftool command-line tool

  3. get_docinfo_pdftools(), which wraps pdftools::pdf_info()

set_docinfo() will try to use the following helper functions, in this order:

  1. set_docinfo_exiftool(), which wraps the exiftool command-line tool

  2. set_docinfo_gs(), which wraps the ghostscript command-line tool

  3. set_docinfo_pdftk(), which wraps the pdftk command-line tool
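
The ordered fallback described above can be pictured with a small, self-contained sketch. It is not the package's implementation: first_supported() and on_path() are hypothetical helpers written for illustration, and the Sys.which() checks are only stand-ins for whatever detection the package actually performs.

```r
# Hypothetical helper (not part of xmpdf): return the name of the first
# backend whose availability check returns TRUE, or NA if none does.
first_supported <- function(checks) {
  for (backend in names(checks)) {
    if (isTRUE(checks[[backend]]())) {
      return(backend)
    }
  }
  NA_character_
}

# Hypothetical stand-in check: is a command with this name on the PATH?
# Real detection may differ (e.g. ghostscript is not always named "gs").
on_path <- function(cmd) nzchar(Sys.which(cmd)[[1]])

# Checks listed in the order set_docinfo() is documented to try.
set_order <- list(
  exiftool = function() on_path("exiftool"),
  gs       = function() on_path("gs"),
  pdftk    = function() on_path("pdftk")
)

backend <- first_supported(set_order)
if (is.na(backend)) {
  message("No candidate backend found on the PATH")
} else {
  message("First candidate backend found: ", backend)
}
```

To force a particular backend rather than rely on the fallback order, call the corresponding helper (for example set_docinfo_pdftk()) directly after checking that it is supported.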

Value

docinfo() returns a "docinfo" R6 object. get_docinfo() returns a list of "docinfo" R6 objects. set_docinfo() returns the (output) filename invisibly.

Known limitations

  • Currently does not support arbitrary info dictionary entries.

  • As a side effect, set_docinfo_gs() seems to also update any previously set matching XMP metadata, while set_docinfo_exiftool() and set_docinfo_pdftk() don't update any previously set matching XMP metadata. Some pdf viewers will preferentially use a previously set document title from the XMP metadata, if it exists, instead of the title set in the document info dictionary. Consider also manually setting this XMP metadata using set_xmp().

  • Old metadata information is usually not deleted from the pdf file by these operations. If deleting the old metadata is important one may want to try qpdf::pdf_compress(input, linearize = TRUE).

  • get_docinfo_exiftool() will "widen" datetimes to second precision.

  • get_docinfo_pdftools()'s datetimes may not accurately reflect the embedded datetimes.

  • set_docinfo_pdftk() may not correctly handle document info entries with newlines in them.
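
As a hedged illustration of the XMP limitation above, the sketch below writes the same title to both the document info dictionary and the XMP metadata so that viewers preferring either source agree. It assumes that xmpdf is installed, that supports_set_xmp() exists, and that xmp() accepts title and creator arguments; none of these are described on this page, so check them against the package documentation before relying on this.

```r
title  <- "Two Boring Pages"
author <- "John Doe"

if (requireNamespace("xmpdf", quietly = TRUE) &&
    xmpdf::supports_get_docinfo() &&
    xmpdf::supports_set_docinfo() &&
    xmpdf::supports_set_xmp()) {        # assumed support check for set_xmp()
  # Create a throwaway one-page pdf to work on
  f <- tempfile(fileext = ".pdf")
  grDevices::pdf(f)
  graphics::plot.new()
  invisible(grDevices::dev.off())

  # Update the document info dictionary, as in the Examples section
  d <- xmpdf::get_docinfo(f)[[1]]
  d <- update(d, author = author, title = title)
  xmpdf::set_docinfo(d, f)

  # Write matching XMP metadata (argument names are assumptions)
  x <- xmpdf::xmp(title = title, creator = author)
  xmpdf::set_xmp(x, f)

  unlink(f)
}
```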

See Also

docinfo() for more information about the document info objects. Use supports_get_docinfo(), supports_set_docinfo(), supports_gs(), and supports_pdftk() to detect support for these features. For more info about the pdf document info dictionary see https://opensource.adobe.com/dc-acrobat-sdk-docs/library/pdfmark/pdfmark_Basic.html#document-info-dictionary-docinfo.

Examples

if (supports_set_docinfo() && supports_get_docinfo() && require("grid", quietly = TRUE)) {
  f <- tempfile(fileext = ".pdf")
  pdf(f, onefile = TRUE)
  grid.text("Page 1")
  grid.newpage()
  grid.text("Page 2")
  invisible(dev.off())

  cat("\nInitial document info:\n\n")
  d <- get_docinfo(f)[[1]]
  print(d)

  d <- update(d,
              author = "John Doe",
              title = "Two Boring Pages",
              keywords = c("R", "xmpdf"))
  set_docinfo(d, f)

  cat("\nDocument info after setting it:\n\n")
  print(get_docinfo(f)[[1]])

  unlink(f)
}

xmpdf documentation built on July 4, 2024, 9:08 a.m.