Set and get tracking options on a tracked environment

Description

Set and get tracking options on a tracked environment. Each tracked environment has its own set of tracking options, which can be changed independently. Global default values can be set in options("global.track.options").

Usage

track.options(..., pos = 1, envir = as.environment(pos),
              values=list(...), save = FALSE, clear=FALSE, delete=FALSE,
              trackingEnv, only.preprocess = FALSE, old.options = list())

Arguments

...

Either option names as character data, or specifications for setting options as named arguments or in a named list. See DETAILS for descriptions of options.

pos

The search path position of the environment being tracked (default is 1 for the global environment)

envir

The environment being tracked. This is an alternate way (to the use of pos=) of specifying the environment being tracked, but should be rarely needed.

values

A named list of option values to set. track.options(readonly=TRUE) is equivalent to track.options(values=list(readonly=TRUE)).

save

If TRUE, current options are saved to disk and will be used in future. Note that all current options settings are saved, not just the new settings made in this call.

clear

If TRUE, and the option can have multiple values (e.g., autoTrackExcludeClass), the current values are cleared prior to using the supplied values. The default behavior, with clear=FALSE and delete=FALSE is to add supplied values to multi-valued options, and to replace the value for single-valued options.

delete

If TRUE, and the option can have multiple values, the supplied values are removed from the current values (if they are not in the current values, they are silently ignored.)

trackingEnv

The hidden environment in which tracked objects are stored. It is not necessary to supply this in normal use.

only.preprocess

If TRUE, process any option specifications and return the full list of option settings with the values as specified, and defaults for all other options. Stored options are neither accessed nor changed. Intended for internal use.

old.options

A list of old options to use. Can only be supplied when only.preprocess=TRUE. Intended for internal use.
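The interaction of clear= and delete= for multi-valued options can be pictured with a small base R sketch. The helper merge_multi_valued() below is hypothetical and is not part of the track package; it only mirrors the behavior described above for the arguments. Whether track removes duplicate values, and what happens if both clear and delete are TRUE, are assumptions here.

```r
# Hypothetical helper illustrating the documented semantics; NOT track's code.
merge_multi_valued <- function(current, supplied, clear = FALSE, delete = FALSE) {
  if (delete) {
    # remove supplied values; values not currently present are silently ignored
    return(setdiff(current, supplied))
  }
  if (clear) {
    # discard the current values and keep only the supplied ones
    return(unique(supplied))
  }
  # default (clear=FALSE, delete=FALSE): add supplied values to the current ones
  union(current, supplied)
}

current <- "RODBC"  # documented default of autoTrackExcludeClass
added   <- merge_multi_valued(current, "DBIConnection")
cleared <- merge_multi_valued(current, "DBIConnection", clear = TRUE)
deleted <- merge_multi_valued(added, c("RODBC", "notPresent"), delete = TRUE)
print(list(added = added, cleared = cleared, deleted = deleted))
```

With the real function, the corresponding calls would presumably take the form track.options(autoTrackExcludeClass="DBIConnection", clear=TRUE), where "DBIConnection" is only an example class name.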

Details

Valid option names and values are as follows:

alwaysCache:

character (default ".Last"): vector of objects to always keep in memory. ".Last" is here to avoid difficulties quitting R if the tracking DB becomes unavailable.

alwaysCacheClass:

character (default "ff"): vector of classes whose objects are always kept in memory. "ff" is here by default because "ff" objects generally occupy only a small amount of memory, and flushing the object from memory causes unnecessary finalization calls on the external pointers in "ff" objects, which changes their behavior.

alwaysSaveSummary:

logical (default TRUE) if TRUE, always save the summary on any change to the summary. Summaries are not saved for databases attached in a readonly mode.

autoTrackExcludeClass:

character vector. Variables whose class is in this vector are not auto-tracked. The default is "RODBC", because variables of that class do not work after being saved and reloaded.

autoTrackExcludePattern:

character vector (default c("^\\.track", "^\\.required")): variables whose name matches any of these regular expressions are not auto-tracked

autoTrackFullSyncWait:

(default -1) auto track will wait at least this many seconds between doing a full sync at the end of a top level task. If equal to zero, do a full sync at the end of each top level task. If less than zero, don't do a full sync. Doing a full sync can be slow, so this is off by default.

cache:

logical (default TRUE): keep objects in memory?

cacheKeepFun:

A function that specifies which objects to keep in memory at the end of a top-level task. See track.plugins for further info. Can be "none" or NULL.

cachePolicy:

The higher-level policy to follow regarding keeping objects in memory. Currently has two possible values - one of them allows special action at the end of a top-level-task:

"none":

No special action at end of task, i.e., follow option cache

"eotPurge":

Purge objects from memory at the end of a top-level task

Also affects when changes to objects are written to disk - see option writeToDisk below.

clobberVars:

character vector specifying variables to be clobbered silently when attaching a tracking db

compress:

character or logical (default TRUE) passed to save(). Possible character values are "none", "gzip", "xz", "bzip2". save() currently uses gzip by default (i.e., when compress=TRUE), which according to the documentation for save() offers the best tradeoff between file size and compression and decompression times.

compression_level:

numeric (default 1) passed to save()

debug:

integer (default 0) if > 0, print some diagnostic debugging messages

maintainSummary:

logical (default TRUE) if TRUE, record time & number of changes and accesses

RDataSuffix:

character (default "rda") suffix to use for files containing saved R objects

readonly:

logical (default TRUE for track.attach() and FALSE for track.start()) should any changes be allowed to the files? Note that this option is a function of how a tracking database is accessed; it is not a property of the database itself. A particular tracking database can be attached in one R session with readonly=TRUE and at the same time be attached in another R session with readonly=FALSE. To unconditionally protect a tracking database from modification, use file permissions.

recordAccesses:

logical (default TRUE) if TRUE, record counts and times for access ("get") operations on tracked variables

summaryAccess:

logical, or integer value 0,1,2,3,4; controls what info about accesses is output by track.summary()

summaryTimes:

logical, or integer value 0,1,2,3 (see track.summary() for the effect of these settings)

writeToDisk:

logical (default TRUE): always write changed objects to disk? If TRUE, when objects are written to disk depends on cachePolicy: cachePolicy="none": write objects immediately on a change; cachePolicy="eotPurge": write changed objects at the end of a top-level task
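As an illustration of the autoTrackExcludePattern option above, the following base R sketch shows how a vector of regular expressions can be tested against variable names so that a name matching any pattern is excluded. It does not call the track package; the variable names are made up for the example, and the patterns are written with doubled backslashes as R string literals require.

```r
# Patterns as given for the autoTrackExcludePattern default
patterns <- c("^\\.track", "^\\.required")

# Made-up variable names for illustration
var_names <- c(".trackingOptions", ".required.pkgs", "x", "my.track")

# A name is excluded if it matches ANY of the patterns
excluded <- Reduce(`|`, lapply(patterns, grepl, x = var_names))

# Names that would remain candidates for auto-tracking under this rule
candidates <- var_names[!excluded]
print(candidates)
```

Because both patterns are anchored with "^", a name such as "my.track" that merely contains the text is not excluded.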

The option settings are saved as a list in an object called .trackingOptions in the tracking environment (with a copy mirrored to a file in the tracking dir if save=TRUE.)

The options can be used to tune performance to resource availability (time & memory) and robustness in the face of machine or user error. Some possible settings are:

maximize robustness and speed:

cache=TRUE and writeToDisk=TRUE (the default): always write an object to disk when it is changed, and keep a copy in memory, so that an object only needs to be read once

minimize memory usage and maximize robustness:

writeToDisk=TRUE, cache=FALSE: always write an object to disk when it is changed, and don't keep a copy in memory – need to read from disk whenever the object is referred to

maximize speed:

writeToDisk=FALSE, cache=TRUE: don't write the object to disk when it changes; just keep a copy in memory after it is first accessed, and only write it when track.stop() or one of track.save() or its friends is called. This combination is less robust because changed variables can be lost if R crashes, or if the user quits R without remembering to call track.stop(). This mode of operation is like the g.data package, but it automatically keeps track of which variables have been changed and need to be written to disk, and writes all changed variables with one call to track.save() or track.stop().

The combination writeToDisk=FALSE and cache=FALSE is possible, but is unlikely to be desirable – this will keep changed objects in memory, but will not keep merely fetched objects in memory.
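The combinations above could be collected as named presets and applied through the documented values= argument. This is only a sketch: the preset names and the apply_preset() helper are hypothetical conveniences, not part of the track package, and the helper assumes that track is installed and that a tracking session has already been started on the environment at pos.

```r
# Named presets mirroring the combinations described above.
# The preset names are hypothetical labels, not track identifiers.
presets <- list(
  robust_and_fast = list(cache = TRUE,  writeToDisk = TRUE),   # the default
  low_memory      = list(cache = FALSE, writeToDisk = TRUE),
  fastest         = list(cache = TRUE,  writeToDisk = FALSE)
)

# Hypothetical helper: apply one preset to the tracked environment at `pos`.
# Needs the track package and an active tracking session (see track.start()).
apply_preset <- function(name, pos = 1) {
  stopifnot(name %in% names(presets))
  track::track.options(values = presets[[name]], pos = pos)
}
```

After track.start(), a call such as apply_preset("low_memory") would change the settings for the current session only, since save=FALSE is the default.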

The options maintainSummary, recordAccesses, and alwaysSaveSummary control when the object summary is updated and when it is saved to disk (the default is for it to be updated and saved to disk for every read and write access to an object, whether or not the object is cached in memory).

Global default values can be set in options("global.track.options") as a list like options(global.track.options=list(cache=TRUE, cachePolicy='eotPurge')).
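A minimal sketch of setting, inspecting, and restoring the global defaults uses only base R's options() and getOption(). Exactly when track consults these defaults (for example, when a tracking session is started) is not stated here and should be treated as an assumption.

```r
# Set global defaults; options() returns the previous value for later restoring
old <- options(global.track.options = list(cache = TRUE, cachePolicy = "eotPurge"))

# Inspect what is currently stored
defaults <- getOption("global.track.options")
str(defaults)

# Restore whatever was set before (or unset the option if it was not set)
options(old)
```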

Value

The value returned is a list of option values. If options were specified as arguments, the old values of those options are returned (unless only.preprocess=TRUE was supplied). If no options were specified as arguments, the full list of current option values is returned.

Cache plugin functions

track allows users to supply their own plugin functions that specify cache rules. The plugin function is called at the end of a top-level command. The default plugin function implements a rule that flushes the least-recently accessed large objects from the cache when memory usage is over a threshold. See track.plugins for further info.

Author(s)

Tony Plate <tplate@acm.org>

See Also

Overview and design of the track package. See track.plugins for a description of cache plugin functions.

Examples

##############################################################
# Warning: running this example will cause variables currently
# in the R global environment to be written to .RData files
# in a tracking database on the filesystem under R's temporary
# directory, and will cause the variables to be removed temporarily
# from the R global environment.
# It is recommended to run this example with a fresh R session
# with no important variables in the global environment.
##############################################################

library(track)
track.start(dir=file.path(tempdir(), 'rdatadir6'))
x <- 33
X <- array(1:24, dim=2:4)
track.status()
track.options(cache=TRUE, writeToDisk=FALSE) # change for just this session
# different ways of retrieving option values
track.options(c("cache", "writeToDisk"))
track.options("cache", "writeToDisk")
track.options("cache")
track.options()
# see the effect of the changed options on the status of X (X is not saved to disk)
track.status()
X[1,1,1] <- 0
track.status()
track.flush()
track.status()
track.stop(pos=1)
track.start(dir=file.path(tempdir(), 'rdatadir6'))
# note that options previously changed are back at defaults (because the
# default for track.options() is save=FALSE)
track.options(c("cache", "writeToDisk"))
track.options(cache=TRUE, writeToDisk=FALSE, save=TRUE) # change the options on disk
track.options(c("cache", "writeToDisk"))
track.stop(pos=1)
track.start(dir=file.path(tempdir(), 'rdatadir6'))
# now options previously changed are remembered (because track.options(..., save=TRUE) was used)
track.options(c("cache", "writeToDisk"))
track.stop(pos=1, keepVars=TRUE)