raadfiles-admin: Raadfiles administration tools

raadfiles-admin    R Documentation

Raadfiles administration tools

Description

Administration tools for managing a data library.

Usage

get_raad_data_roots()

get_raad_filenames(all = FALSE)

set_raad_data_roots(
  ...,
  replace_existing = TRUE,
  use_known_candidates = FALSE,
  verbose = TRUE
)

raad_filedb_path(...)

set_raad_filenames(clobber = FALSE)

run_build_raad_cache()

Arguments

all

if 'TRUE', include 'data_deprecated'; expert use only

...

input file paths to set

replace_existing

replace existing paths, defaults to TRUE

use_known_candidates

apply internal logic for known candidates (for internal use at raad-hq), defaults to FALSE

verbose

issue warnings?

clobber

by default (FALSE) an existing file cache is not ignored; set to TRUE to ignore it and set the file cache anyway

Details

These management functions are aimed at raadtools users, but can be used for any file collection. The administration tools consist of **data roots** and control over the building, reading, and caching of the available file list. No interpretation of the underlying files is provided in the administration tools.

A typical user won't use these functions but may want to investigate the contents of the raw file list, with 'get_raad_filenames()'.
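A minimal sketch of that kind of inspection follows. It assumes raadfiles is installed and a file cache has already been built. The helper name 'peek_raad_files' is hypothetical and not part of the package, and no column names are assumed for the returned file list.

```r
# Hypothetical helper (not part of raadfiles): summarise the raw file list.
peek_raad_files <- function(n = 6) {
  if (!requireNamespace("raadfiles", quietly = TRUE)) {
    message("raadfiles is not installed; nothing to inspect")
    return(invisible(NULL))
  }
  files <- raadfiles::get_raad_filenames()
  # No column names are assumed here; str() shows whatever is present.
  utils::str(files)
  utils::head(files, n)
}

# Usage (requires an existing file cache):
# peek_raad_files()
```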

A user setting up a raadfiles collection will typically set the root directory/directories with 'set_raad_data_roots()', then run the file cache list builder with 'run_build_raad_cache()', and then 'set_raad_filenames()' to actually load the file cache into memory.
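The three setup steps might be wrapped as in the sketch below. The wrapper name 'setup_raad_collection' and the path in the usage comment are hypothetical; only the three raadfiles calls come from this page, and the directory check is an added assumption about what a sensible root looks like.

```r
# Hypothetical wrapper (not part of raadfiles) around the three documented steps.
setup_raad_collection <- function(roots) {
  # Assumption: every root should be an existing directory.
  stopifnot(is.character(roots), length(roots) >= 1, all(dir.exists(roots)))
  if (!requireNamespace("raadfiles", quietly = TRUE)) {
    stop("raadfiles is not installed")
  }
  # 1. Point raadfiles at the data root(s); each path is passed through '...'.
  do.call(raadfiles::set_raad_data_roots, as.list(roots))
  # 2. Scan the roots and update the file listing in each.
  raadfiles::run_build_raad_cache()
  # 3. Load the file listing into memory.
  raadfiles::set_raad_filenames()
  invisible(raadfiles::get_raad_data_roots())
}

# Usage, with a placeholder path:
# setup_raad_collection("/path/to/data")
```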

In a new R session there is no need to run 'set_raad_filenames()' directly as this will be done as the package loads. To disable this automatic behaviour use 'options(raadfiles.file.cache.disable = TRUE)' *before* the package is used or loaded. This is typically done when calling 'run_build_raad_cache()' in a cron task.
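A cron script along these lines is one way to arrange that. It is a sketch only: it assumes the data roots are already visible to the session (how they are configured is not covered here), and the guard around the build call is an illustrative precaution rather than package behaviour.

```r
# Set this before raadfiles is loaded or used, so the on-load
# setting of the in-memory file cache is skipped.
options(raadfiles.file.cache.disable = TRUE)

if (requireNamespace("raadfiles", quietly = TRUE) &&
    length(raadfiles::get_raad_data_roots()) > 0) {
  # Scan all root directories and update the file listing in each.
  raadfiles::run_build_raad_cache()
} else {
  message("raadfiles not installed or no data roots set; skipping cache build")
}
```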

Every raadfiles file collection function (e.g. 'oisst_daily_files') runs 'get_raad_filenames' to obtain the full raw list of available files from the global in-memory option 'getOption("raadfiles.env")$raadfiles.filename.database', and there is a low threshold probability that this also triggers a re-read of the file listing from the root directories. To avoid this trigger, either use that option directly to get the in-memory file list, or set 'options(raadfiles.file.refresh.threshold = 0)'. (Set the threshold to 1 to force a re-read every time; this is also controlled by 'set_raad_filenames(clobber = TRUE)'.)
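Both ways of avoiding the re-read can be sketched with base R alone. The last function, 'should_refresh', is purely illustrative: it shows one way a probability threshold can gate a refresh so that 0 means never and 1 means always, and it is not taken from the package source.

```r
# Option 1: set the threshold to 0 so get_raad_filenames() never triggers a re-read.
options(raadfiles.file.refresh.threshold = 0)

# Option 2: read the in-memory file list directly.
# 'files' is NULL if the cache has not been loaded in this session.
raad_env <- getOption("raadfiles.env")
files <- if (is.environment(raad_env)) raad_env$raadfiles.filename.database else NULL

# Illustration only (NOT the package's code): a probability-gated refresh.
# runif() never returns exactly 0 or 1, so 0 means never and 1 means always.
should_refresh <- function(threshold) stats::runif(1) < threshold
```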

There is a family of functions and global options used for administration.

Administration functions

set_raad_data_roots     set data root paths; for normal use only one data root is needed
set_raad_filenames      runs the system to update the file listing and refresh it
get_raad_data_roots     returns the current list of visible root directories
get_raad_filenames      returns the entire list of all files found in visible root directories
run_build_raad_cache    scan all root directories and update the file listing in each

Options for use by administrators

raadfiles.data.roots                the list of paths to root directories
raadfiles.file.cache.disable        disable on-load setting of the in-memory file cache (never set automatically by the package)
raadfiles.file.refresh.threshold    threshold probability of how often to refresh in-memory file cache (0 = never, 1 = every time 'get_raad_filenames()' is called)
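When debugging a setup it can help to see the three administrator options together. The helper below is a hypothetical convenience built only on 'getOption()'; it reads the options and changes nothing.

```r
# Hypothetical helper (not part of raadfiles): snapshot of the administrator options.
raad_admin_options <- function() {
  list(
    data_roots        = getOption("raadfiles.data.roots"),
    # Treated as disabled only when the option is exactly TRUE.
    cache_disabled    = isTRUE(getOption("raadfiles.file.cache.disable")),
    refresh_threshold = getOption("raadfiles.file.refresh.threshold")
  )
}

str(raad_admin_options())
```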

Internal options, used by the package

Options used internally, and subject to control by administrator options and the running of admin functions (they may not be set).

raadfiles.env                an environment with the data frame of all file names from the data roots in an object named 'raadfiles.filename.database'
raadfiles.database.status    a status record of the in-memory filename database (timestamp)
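These internal options can be read, without modifying them, to check whether a file list is currently held in memory. The function name 'raad_cache_state' is hypothetical, and the sketch assumes only what is stated above: an environment holding a data frame, plus a status record.

```r
# Hypothetical read-only check of the internal state (not part of raadfiles).
raad_cache_state <- function() {
  env <- getOption("raadfiles.env")
  db <- if (is.environment(env)) env$raadfiles.filename.database else NULL
  list(
    loaded  = !is.null(db),                                  # is a file list in memory?
    n_files = if (is.data.frame(db)) nrow(db) else NA_integer_,
    status  = getOption("raadfiles.database.status")         # NULL if never set
  )
}

str(raad_cache_state())
```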

AustralianAntarcticDivision/raadfiles documentation built on Feb. 15, 2024, 6:14 p.m.