Potential future features of the track package, in some vague order of feasibility and priority ('easy', 'medium' and 'hard' are an estimate of design and coding difficulty):
(medium) with cachePolicy="tltPurge", changed objects are only written to file at the end of a top level task. However, with cachePolicy="none", objects are written to file on each change – is better control over this needed?
(easy) would this be useful? wouldn't need to cache these, mark with an asterisk in a special column? Compute these each time track.summary is called.
is this redundant option with cache and cachePolicy?
(easy) would it be useful to allow an environment other than the global environment to be the default tracking environment? This could be implemented by using options("tracked.environment") as the default environment for all the tracking functions (rather than the currently hardcoded pos=1)
(easy) provide an integrated quitting function that saves all tracked vars and history before quitting (and maybe also saves untracked vars in an RData file)
(hard) allow rule-based decisions for caching, e.g., only cache objects under a certain size, or only cache objects of certain classes, or enforce a limit on memory for caching tracked variables, and flush out least-recently used variables
(easy) record each time a file is read or written in the summary. Could be useful for smarter caching.
(easy) when rebuilding an active tracking environment, decide whether to use the summary row from the file or from the environment based on which has the more recent dates in it (whole data frame, or row by row?)
(medium) check the mod time on filemap.txt when getting the filemap obj, and if the file on disk appears to have changed, reread it instead of just getting it from memory. This would allow working together better with other sessions that are simultaneously using this tracking dir. Don't know how much it would slow things down – do some timings. Note that to make this work in a fool-proof manner would require locks.
doing subset-replacement (e.g., X[2] <- ...) retrieves X twice (see example below)
(hard) to allow linking tracking dirs that might be in use by other R processes – would require not recording gets – this would require adding a new env on the search path and tracking it
(hard) automatic flushing of variables that haven't been used frequently (triggered automatically when memory runs low?) – this is why the summary records fetches as well as writes
(medium) check that we will be able to restart before doing the stop (check for masked variables or other potential clobber problems)
(hard) write files in a safe way so that the original file is not removed until the new file is written – not sure if this is necessary, because objects are in memory, and can be rewritten if there is a failure
(hard) automatically track new variables? (would require hooks in base-R that get called when a new var is created)
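The idea above of using options("tracked.environment") as the default tracking environment could look roughly like the following. This is only a sketch: the helper name default_track_env is hypothetical and not part of the track package, and it assumes the option either holds an environment or is unset (in which case the current pos=1 behaviour is kept).

```r
# Hypothetical helper (not part of the track package): resolve the default
# tracking environment from the "tracked.environment" option proposed above.
default_track_env <- function() {
  env <- getOption("tracked.environment")
  if (is.null(env)) {
    # Option unset: fall back to position 1 on the search path, i.e. the
    # global environment, matching the currently hardcoded pos=1.
    return(as.environment(1))
  }
  if (!is.environment(env)) {
    stop("option 'tracked.environment' must be an environment")
  }
  env
}

# Usage: point the default at a private environment, then restore the option.
my_env <- new.env()
old <- options(tracked.environment = my_env)
identical(default_track_env(), my_env)
options(old)
```

Tracking functions could then take something like pos=default_track_env() as their default argument, so existing calls that pass pos explicitly would be unaffected.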
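The modification-time check suggested above for filemap.txt might be sketched as follows. The function make_cached_reader and its read counter are hypothetical names used for illustration, the file is read with readLines() purely as a stand-in for however the filemap is actually parsed, and no locking is attempted, so this is not safe against a concurrent writer.

```r
# Hypothetical sketch (not the track package's implementation): keep the
# contents of a file in memory and reread it only when its modification
# time on disk differs from the one seen at the last read.
make_cached_reader <- function(path) {
  cached <- NULL
  cached_mtime <- NULL
  n_reads <- 0L  # number of actual disk reads, handy for timings
  list(
    get = function() {
      mtime <- file.mtime(path)
      if (is.null(cached) || !identical(mtime, cached_mtime)) {
        cached <<- readLines(path)
        cached_mtime <<- mtime
        n_reads <<- n_reads + 1L
      }
      cached
    },
    reads = function() n_reads
  )
}

# Usage with a temporary file standing in for filemap.txt.
path <- tempfile()
writeLines("one", path)
reader <- make_cached_reader(path)
reader$get()  # first call reads from disk
reader$get()  # modification time unchanged: served from memory
```

Each call costs one file.mtime() lookup, which is the overhead the timings mentioned above would need to measure. File timestamps can be coarse on some filesystems, so two writes in quick succession may go unnoticed.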
Example of the "double-get" when assigning a subset (using the example from the help page for makeActiveBinding). Note that it works correctly, but retrieving the object twice seems unnecessary and could be slow with very large objects.
> f <- local( {
+     x <- 1
+     function(v) {
+         if (missing(v))
+             cat("get\n")
+         else {
+             cat("set\n")
+             x <<- v
+         }
+         x
+     }
+ })
> makeActiveBinding("X", f, .GlobalEnv)
NULL
> bindingIsActive("X", .GlobalEnv)
[1] TRUE
> X
get
[1] 1
> X <- 2
set
> X
get
[1] 2
>
> X[1]
get
[1] 2
> X[2] <- 1 # 'X' is fetched twice
get
get
set
> X
get
[1] 2 1
>
# Example (transcript shown above) of how subset-assignment
# results in two retrievals when the object is an active binding.
f <- local( {
    x <- 1
    function(v) {
        if (missing(v)) {
            cat("get\n")
        } else {
            cat("set\n")
            x <<- v
        }
        x
    }
})
makeActiveBinding("X", f, .GlobalEnv)
bindingIsActive("X", .GlobalEnv)
X
X <- 2
X
X[1]
X[2] <- 1 # 'X' is fetched twice
X
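Until the double retrieval is addressed, one way a user might sidestep it is to fetch the object once into an ordinary variable, modify that copy, and assign the whole object back. The sketch below is an illustration rather than a feature of the track package: it uses the same style of active binding as the example, but counts calls instead of printing them, and the names n_get, n_set and env are made up for the purpose.

```r
# Counters for how often the active binding is read and written
# (illustrative names, not part of any package).
n_get <- 0L
n_set <- 0L
f <- local({
  x <- c(2, 1)
  function(v) {
    if (missing(v)) {
      n_get <<- n_get + 1L
    } else {
      n_set <<- n_set + 1L
      x <<- v
    }
    x
  }
})

# Bind X in a private environment so the sketch leaves .GlobalEnv alone.
env <- new.env()
makeActiveBinding("X", f, env)

# Workaround: one explicit fetch, modify the plain copy, one explicit store.
evalq({
  tmp <- X       # single get through the active binding
  tmp[2] <- 10   # ordinary variable, the binding is not involved
  X <- tmp       # single set through the active binding
}, env)
```

After this runs, n_get and n_set are both 1. The cost is an extra name holding the object while it is being modified, which is worth weighing for very large objects.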