| Tensor | R Documentation |
Stores array data on the hard drive and can read slices of GB-level data in seconds.
self
the sliced data
a data frame with the dimension names as index columns and value_name as the value column
original array
the collapsed data
dim: dimension of the array
dimnames: dimension names of the array
use_index: whether to use one dimension as index when storing data as multiple files
hybrid: whether to allow data to be written to disk
last_used: timestamp of when the object was last read
temporary: whether to remove the files once the object is garbage collected
varnames: dimension names (read-only)
read_only: whether to protect the swap files from being changed
swap_file: file or files to save data to
finalize(): release resources and remove files for temporary instances
Tensor$finalize()
print(): print out the data dimensions and a snapshot
Tensor$print(...)
...: ignored
.use_multi_files(): internally used; whether to use multiple files to cache data instead of one
Tensor$.use_multi_files(mult)
mult: logical
new(): constructor
Tensor$new( data, dim, dimnames, varnames, hybrid = FALSE, use_index = FALSE, swap_file = temp_tensor_file(), temporary = TRUE, multi_files = FALSE )
data: numeric array
dim: dimension of the array
dimnames: dimension names of the array
varnames: characters, names of dimnames
hybrid: whether to enable hybrid mode
use_index: whether to use the last dimension for indexing
swap_file: where to store the data in hybrid mode; default stores in raveio_getopt('tensor_temp_path')
temporary: whether to remove temporary files when exiting
multi_files: if use_index is true, whether to use multiple files to save data by index
subset(): subset the tensor
Tensor$subset(..., drop = FALSE, data_only = FALSE, .env = parent.frame())
...: dimension slices
drop: whether to apply drop to the subset data
data_only: whether to return just the data values or wrap them in a Tensor instance
.env: environment where ... is evaluated
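The example further down this page passes slices as formulas such as C ~ C < 10 & C > 5. How Tensor evaluates them internally is not shown here; the following is only a base R sketch of one way such a formula could be turned into an index, assuming the left-hand side names the dimension and the right-hand side is evaluated against that dimension's names. The variable names (slice, dim_values, keep) are illustrative and not part of the package.

```r
# A two-sided formula: left side names the dimension, right side is the filter
slice <- C ~ C < 10 & C > 5

# Hypothetical dimension names for dimension "C"
dim_values <- list(C = 1:20)

# slice[[2]] is the left-hand side, slice[[3]] the right-hand side
dim_name <- as.character(slice[[2]])
keep <- which(eval(slice[[3]], dim_values))

dim_name
keep
```

Evaluating the right-hand side in a chosen environment is presumably why subset() exposes the .env argument.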
flatten(): converts the tensor (array) to a table (data frame)
Tensor$flatten(include_index = FALSE, value_name = "value")
include_index: logical, whether to include dimension names
value_name: character, column name of the value
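To make the shape of a flattened result concrete, here is a base R sketch that turns a small named array into a long data frame with one index column per dimension plus a value column. It is an analogue written with expand.grid, not the code Tensor$flatten() runs, and the array is made up for illustration.

```r
# Small 2 x 3 x 4 array with named dimensions
arr <- array(1:24, dim = c(2, 3, 4),
             dimnames = list(A = 1:2, B = 1:3, C = 1:4))

# expand.grid varies the first dimension fastest, which matches the
# column-major order returned by as.vector()
idx <- expand.grid(dimnames(arr), KEEP.OUT.ATTRS = FALSE,
                   stringsAsFactors = FALSE)
tbl <- cbind(idx, value = as.vector(arr))

head(tbl)
```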
to_swap(): serialize the tensor to a file and store it via write_fst
Tensor$to_swap(use_index = FALSE, delay = 0)
use_index: whether to use one of the dimensions as index for faster loading
delay: if greater than 0, check when the tensor was last used. If that was less than delay seconds ago, do not swap to the hard drive; if more than delay seconds have passed, swap immediately.
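The delay rule can be sketched as a small predicate. This is a reading of the description above, not the package's code: should_swap() is a hypothetical helper, and treating delay = 0 as "swap without checking" is an assumption.

```r
# Hypothetical helper: decide whether a tensor should be swapped to disk
should_swap <- function(last_used, delay, now = Sys.time()) {
  # Assumption: a non-positive delay skips the check entirely
  if (delay <= 0) return(TRUE)
  elapsed <- as.numeric(difftime(now, last_used, units = "secs"))
  elapsed > delay
}

t0 <- as.POSIXct("2020-01-01 00:00:00", tz = "UTC")
should_swap(t0, delay = 5, now = t0 + 1)   # used 1 second ago
should_swap(t0, delay = 5, now = t0 + 10)  # used 10 seconds ago
```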
to_swap_now(): serialize the tensor to a file and store it via write_fst immediately
Tensor$to_swap_now(use_index = FALSE)
use_index: whether to use one of the dimensions as index for faster loading
get_data(): restore data from the hard drive to memory
Tensor$get_data(drop = FALSE, gc_delay = 3)
drop: whether to apply drop to the data
gc_delay: seconds to delay the garbage collection
set_data(): set or replace the data with the given array
Tensor$set_data(v)
v: the value to replace the old one; must have the same dimension
notice: a tensor is an environment. If you change it in one place, the data seen through every other reference changes too, so use it carefully.
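The notice follows from how R environments behave in general: assignment copies the reference, not the contents. The base R sketch below shows this with a plain environment rather than a Tensor.

```r
e1 <- new.env()
e1$data <- 1:3

# Assigning an environment does not copy it; e2 and e1 are the same object
e2 <- e1
e2$data <- 4:6

# The change made through e2 is visible through e1
e1$data
identical(e1, e2)
```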
collapse(): apply mean, sum, or median to collapse the data
Tensor$collapse(keep, method = "mean")
keep: which dimensions to keep
method: "mean", "sum", or "median"
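For intuition about keep, a base R analogue is apply() with the kept dimensions as the margin: every other dimension is reduced by the chosen function. This sketch uses a made-up array and is not the implementation behind Tensor$collapse().

```r
arr <- array(1:24, dim = c(2, 3, 4),
             dimnames = list(A = 1:2, B = 1:3, C = 1:4))

# Keep dimensions 1 and 2, average over dimension 3
collapsed_mean <- apply(arr, c(1, 2), mean)
dim(collapsed_mean)

# Keep only dimension 3, sum over dimensions 1 and 2
collapsed_sum <- apply(arr, 3, sum)
```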
operate(): apply the tensor by anything along the given dimension
Tensor$operate(
  by,
  fun = .Primitive("/"),
  match_dim,
  mem_optimize = FALSE,
  same_dimension = FALSE
)
by: R object
fun: function to apply
match_dim: which dimensions to match with the data
mem_optimize: optimize memory
same_dimension: whether the return value has the same dimension as the original instance
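A rough base R analogue of this kind of operation is sweep(), which applies a function between an array and a summary object aligned along chosen dimensions. Whether operate() aligns by in exactly this way is an assumption; the sketch only illustrates the idea of dividing along a matched dimension, with by_vals as a made-up stand-in for by.

```r
arr <- array(1:24, dim = c(2, 3, 4),
             dimnames = list(A = 1:2, B = 1:3, C = 1:4))

# One value per level of dimension 1 (hypothetical stand-in for `by`)
by_vals <- c(1, 2)

# Divide every slice along dimension 1 by its matching value
out <- sweep(arr, 1, by_vals, FUN = "/")

dim(out)
```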
if(!is_on_cran()){

  # Create a tensor
  ts <- Tensor$new(
    data = 1:18000000, c(3000,300,20),
    dimnames = list(A = 1:3000, B = 1:300, C = 1:20),
    varnames = c('A', 'B', 'C'))

  # Size of tensor when in memory is usually large
  # `lobstr::obj_size(ts)` -> 8.02 MB

  # Enable hybrid mode
  ts$to_swap_now()

  # Hybrid mode, usually less than 1 MB
  # `lobstr::obj_size(ts)` -> 814 kB

  # Subset data
  start1 <- Sys.time()
  subset(ts, C ~ C < 10 & C > 5, A ~ A < 10)
  #> Dimension: 9 x 300 x 4
  #> - A: 1, 2, 3, 4, 5, 6,...
  #> - B: 1, 2, 3, 4, 5, 6,...
  #> - C: 6, 7, 8, 9
  end1 <- Sys.time(); end1 - start1
  #> Time difference of 0.188035 secs

  # Join tensors
  ts <- lapply(1:20, function(ii){
    Tensor$new(
      data = 1:9000, c(30,300,1),
      dimnames = list(A = 1:30, B = 1:300, C = ii),
      varnames = c('A', 'B', 'C'), use_index = 2)
  })
  ts <- join_tensors(ts, temporary = TRUE)

}