update_PACKAGES: update existing package repository

View source: R/write_PACKAGES2.R

update_PACKAGES    R Documentation

update existing package repository

Description

Update an existing repository by reading the PACKAGES file and only processing built package tarballs which do not match existing entries.

update_PACKAGES can be much faster than write_PACKAGES when a large repository index needs only small to moderate changes.

Usage

update_PACKAGES(
  dir = ".",
  fields = NULL,
  type = c("source", "mac.binary", "win.binary"),
  verbose = dryrun,
  unpacked = FALSE,
  subdirs = FALSE,
  latestOnly = TRUE,
  addFiles = FALSE,
  strict = TRUE,
  dryrun = FALSE,
  logfun = message,
  ...
)

Arguments

dir

See write_PACKAGES

fields

See write_PACKAGES

type

See write_PACKAGES

verbose

logical. Should informative messages be displayed throughout the process? Defaults to the value of dryrun (whose own default is FALSE). NOT passed to write_PACKAGES.

unpacked

See write_PACKAGES

subdirs

See write_PACKAGES

latestOnly

See write_PACKAGES

addFiles

See write_PACKAGES

strict

logical. Should 'strict mode' be used when checking existing PACKAGES entries? See Details. Defaults to TRUE.

dryrun

logical. Should the necessary updates be calculated but NOT applied? Defaults to FALSE.

logfun

function. The function used to emit the informative messages when verbose is TRUE. Defaults to message.

...

Additional arguments to write_PACKAGES - e.g., the relatively new rds_compress argument.
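
A hedged usage sketch follows. The wrapper name refresh_index and its preview argument are hypothetical, and the code assumes that switchr is installed and exports update_PACKAGES with the signature shown under Usage. With preview = TRUE the call is a dry run, so the needed updates are calculated and reported through logfun but not applied.

```r
# Hypothetical convenience wrapper around update_PACKAGES (not part of switchr).
refresh_index <- function(repo_dir, preview = TRUE) {
  if (!requireNamespace("switchr", quietly = TRUE)) {
    stop("the 'switchr' package is required for this sketch")
  }
  switchr::update_PACKAGES(
    dir = repo_dir,
    type = "source",
    strict = TRUE,       # confirm retained entries via MD5 checksum
    dryrun = preview,    # TRUE: calculate the updates but do not apply them
    verbose = TRUE,
    logfun = message
  )
}

# Example call (the path is a placeholder):
# refresh_index("/path/to/repo/src/contrib", preview = TRUE)
```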

Details

Throughout this section, package tarball is taken to mean a tarball file in dir whose name can be interpreted as <package>_<version>.<ext> (or that is pointed to by the File field of an existing PACKAGES entry). Novel package tarballs are those which do not match an existing PACKAGES file entry.
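
As an illustration of the filename convention, the following minimal sketch splits names of the form <package>_<version>.<ext> into their parts. The helper parse_tarball_names is hypothetical, handles only one extension at a time (.tar.gz by default), ignores the File field, and uses a simplified pattern for package names and versions; it is not the code update_PACKAGES uses.

```r
# Hypothetical helper: extract Package and Version from tarball file names.
parse_tarball_names <- function(files, ext = "\\.tar\\.gz$") {
  base <- basename(files)
  # Simplified assumption: letters/digits/dots for the name, dotted digits for the version.
  pattern <- paste0("^[A-Za-z][A-Za-z0-9.]*_[0-9]+([.-][0-9]+)*", ext)
  ok <- grepl(pattern, base)
  stem <- sub(ext, "", base[ok])
  data.frame(
    file = base[ok],
    Package = sub("_.*$", "", stem),
    Version = sub("^[^_]*_", "", stem),
    stringsAsFactors = FALSE
  )
}

# Files that do not follow the convention (here README.md) are skipped.
parse_tarball_names(c("switchr_0.14.5.tar.gz", "README.md", "foo_1.0-2.tar.gz"))
```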

update_PACKAGES avoids (re)processing package tarballs in cases where a PACKAGES file entry already exists and appears to remain valid. The logic for detecting still-valid entries is as follows:

Currently update_PACKAGES calls directly down to write_PACKAGES (and thus no speedup should be expected) if any of the following conditions hold:

  • No PACKAGES file exists under dir

  • unpacked is TRUE

  • subdirs is anything other than FALSE

  • fields is not NULL and one or more specified fields are not present in the existing PACKAGES file
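
The fallback conditions listed above can be sketched as a single predicate. This is an illustrative reconstruction from the list, not the package's own code; the function name needs_full_rebuild is hypothetical.

```r
# Hypothetical predicate: TRUE when update_PACKAGES would simply defer to write_PACKAGES.
needs_full_rebuild <- function(dir, fields = NULL, unpacked = FALSE, subdirs = FALSE) {
  pkgs_file <- file.path(dir, "PACKAGES")
  if (!file.exists(pkgs_file)) return(TRUE)       # no existing index
  if (isTRUE(unpacked)) return(TRUE)              # unpacked packages
  if (!identical(subdirs, FALSE)) return(TRUE)    # anything other than FALSE
  if (!is.null(fields)) {
    existing <- colnames(read.dcf(pkgs_file))     # fields present in the current index
    if (!all(fields %in% existing)) return(TRUE)  # a requested field is missing
  }
  FALSE
}
```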

All package tarballs whose last-modified times are later than that of the existing PACKAGES file are considered novel, and no attempt is made to identify or retain any corresponding PACKAGES entries. Similarly, all PACKAGES entries which have no corresponding package tarball are invalid by definition.
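
A minimal base R sketch of these two rules is given below. Both helper names are hypothetical, only .tar.gz source tarballs directly under dir are considered, and entries are matched purely by <package>_<version> file name (the File field is ignored); the package's real logic may differ.

```r
# Hypothetical helper: tarballs modified after the PACKAGES file are treated as novel.
find_novel_tarballs <- function(dir, pattern = "\\.tar\\.gz$") {
  index_time <- file.mtime(file.path(dir, "PACKAGES"))
  tarballs <- list.files(dir, pattern = pattern, full.names = TRUE)
  tarballs[file.mtime(tarballs) > index_time]
}

# Hypothetical helper: drop entries whose tarball no longer exists in dir.
# `db` is a character matrix as returned by read.dcf().
drop_orphan_entries <- function(db, dir, ext = ".tar.gz") {
  expected <- paste0(db[, "Package"], "_", db[, "Version"], ext)
  db[file.exists(file.path(dir, expected)), , drop = FALSE]
}
```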

When strict = TRUE, PACKAGES entries which appear to match a package tarball are confirmed via MD5 checksum; those that pass are retained as valid. All novel package tarballs are fully processed by the standard write_PACKAGES machinery, and the resulting entries are added. Finally, if latestOnly = TRUE, package-version pruning is performed across the entries.
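
The checksum confirmation used in strict mode can be sketched with tools::md5sum. The helper md5_still_valid is hypothetical; it assumes the existing entries carry an MD5sum field and that each tarball is named <package>_<version>.tar.gz directly under dir.

```r
# Hypothetical helper: TRUE for entries whose recorded MD5sum matches the file on disk.
# `db` is a character matrix with Package, Version and MD5sum columns.
md5_still_valid <- function(db, dir, ext = ".tar.gz") {
  files <- file.path(dir, paste0(db[, "Package"], "_", db[, "Version"], ext))
  actual <- unname(tools::md5sum(files))   # NA for files that do not exist
  !is.na(actual) & actual == unname(db[, "MD5sum"])
}
```

Entries for which the result is FALSE would be discarded and their tarballs reprocessed as if novel.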

When strict = FALSE, package tarballs are assumed to encode correct metadata in their filenames. PACKAGES entries which appear to match a package tarball are retained as valid (no MD5 checksum checking occurs). If latestOnly = TRUE, package-version pruning is performed across the full set of retained entries and novel package tarballs before the novel tarballs are processed, which can save significant computation and time in some situations. After the optional pruning, any relevant novel package tarballs are processed via write_PACKAGES and added to the set of retained entries.
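
Because non-strict mode trusts filenames, version pruning can be done on names alone, before any tarball is opened. The sketch below shows one way to do this with package_version; the function latest_only is hypothetical and takes package names and version strings already extracted from the filenames.

```r
# Hypothetical helper: flag, per package, the entry with the highest version.
latest_only <- function(pkg, ver) {
  v <- package_version(ver)
  keep <- logical(length(pkg))
  for (p in unique(pkg)) {
    idx <- which(pkg == p)
    keep[idx] <- v[idx] == max(v[idx])   # version-aware comparison, so 1.10 > 1.9
  }
  keep
}

# Only tarballs flagged TRUE would need to be processed further.
latest_only(c("a", "a", "b"), c("1.9", "1.10", "0.9"))
```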

After the above process concludes, the final database of PACKAGES entries is written to all three PACKAGES files (PACKAGES, PACKAGES.gz and PACKAGES.rds), overwriting the existing files.
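
For orientation, the final write step can be approximated in base R as follows. This is a sketch assuming the three files are PACKAGES, PACKAGES.gz and PACKAGES.rds, as produced by tools::write_PACKAGES; the helper write_index_files is hypothetical, and the fixed "xz" compression is an assumption standing in for the rds_compress argument.

```r
# Hypothetical helper: write a DCF-style entry matrix to the three index files.
write_index_files <- function(db, dir) {
  write.dcf(db, file.path(dir, "PACKAGES"))            # plain-text DCF
  con <- gzfile(file.path(dir, "PACKAGES.gz"), "wt")   # gzip-compressed DCF
  write.dcf(db, con)
  close(con)
  saveRDS(db, file.path(dir, "PACKAGES.rds"), compress = "xz")  # serialized matrix
  invisible(nrow(db))
}
```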

Note

While both strict and non-strict modes offer speedups when updating a small percentage of a large repository, non-strict mode is much faster and is recommended in situations where the assumptions it makes are safe.

Author(s)

Gabriel Becker

See Also

write_PACKAGES


gmbecker/switchr documentation built on Feb. 24, 2023, 12:59 p.m.