Tensor | R Documentation |
A tensor (array) that can be stored on the hard drive, so that slices of GB-level data can be read in seconds
self
the sliced data
a data frame with the dimension names as index columns and value_name as the value column
original array
the collapsed data
dim
dimension of the array
dimnames
dimension names of the array
use_index
whether to use one dimension as index when storing data as multiple files
hybrid
whether to allow data to be written to disk
last_used
timestamp of when the object was last read
temporary
whether to remove the files once garbage collected
varnames
dimension names (read-only)
read_only
whether to protect the swap files from being changed
swap_file
file or files to save data to
finalize()
release resource and remove files for temporary instances
Tensor$finalize()
print()
print out the data dimensions and snapshot
Tensor$print(...)
...
ignored
.use_multi_files()
Internally used, whether to use multiple files to cache data instead of one
Tensor$.use_multi_files(mult)
mult
logical
new()
constructor
Tensor$new( data, dim, dimnames, varnames, hybrid = FALSE, use_index = FALSE, swap_file = temp_tensor_file(), temporary = TRUE, multi_files = FALSE )
data
numeric array
dim
dimension of the array
dimnames
dimension names of the array
varnames
characters, names of dimnames
hybrid
whether to enable hybrid mode
use_index
whether to use the last dimension for indexing
swap_file
where to store the data in hybrid mode; file or files to save data by index. By default, files are stored in raveio_getopt('tensor_temp_path')
temporary
whether to remove temporary files when exiting
multi_files
if use_index is true, whether to use multiple files
subset()
subset tensor
Tensor$subset(..., drop = FALSE, data_only = FALSE, .env = parent.frame())
...
dimension slices
drop
whether to apply drop on the subset data
data_only
whether to return just the data values, or wrap them as a Tensor instance
.env
environment where ... is evaluated
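The effect of drop can be illustrated with a plain base-R array. This is only a sketch of the concept; it does not call the Tensor class, and the array x is made up for illustration.

```r
# Base-R illustration of what `drop` controls when slicing an array.
x <- array(1:24, dim = c(2, 3, 4),
           dimnames = list(A = 1:2, B = 1:3, C = 1:4))

# drop = FALSE keeps every dimension, even those of length 1
kept <- x[1, , , drop = FALSE]
dim(kept)

# drop = TRUE discards length-1 dimensions
dropped <- x[1, , , drop = TRUE]
dim(dropped)
```

Keeping all dimensions (the default for subset() here) means the result can still be addressed by the same dimension names as the original.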
flatten()
converts tensor (array) to a table (data frame)
Tensor$flatten(include_index = FALSE, value_name = "value")
include_index
logical, whether to include dimension names
value_name
character, column name of the value
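As a rough picture of the array-to-table conversion, the following base-R sketch builds a data frame with one index column per dimension and one value column. It is an assumption about the shape of the result, not the package's implementation; the array x and its dimension names are invented.

```r
# Base-R sketch: turn a named array into a long data frame.
x <- array(1:6, dim = c(2, 3),
           dimnames = list(A = c("a1", "a2"), B = c("b1", "b2", "b3")))
value_name <- "value"

# expand.grid varies the first variable fastest, which matches
# R's column-major storage order for arrays
flat <- expand.grid(dimnames(x), stringsAsFactors = FALSE)
flat[[value_name]] <- as.vector(x)

flat
```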
to_swap()
Serialize tensor to a file and store it via write_fst
Tensor$to_swap(use_index = FALSE, delay = 0)
use_index
whether to use one of the dimensions as an index for faster loading
delay
if greater than 0, check when the tensor was last used; if it was used recently, do not swap to the hard drive. If the elapsed time is greater than delay (in seconds), swap immediately.
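The delay rule can be read as a simple elapsed-time check. The sketch below uses a hypothetical helper, should_swap(), which is not part of the package; it also assumes that a delay of 0 means "swap without checking", which is one reading of the description above.

```r
# Hypothetical helper (not part of the package): decide whether a swap
# should happen, given when the object was last used.
should_swap <- function(last_used, delay = 0, now = Sys.time()) {
  # Assumption: a non-positive delay means no check, swap right away
  if (delay <= 0) {
    return(TRUE)
  }
  elapsed <- as.numeric(difftime(now, last_used, units = "secs"))
  elapsed > delay
}

now <- Sys.time()
recent <- should_swap(now - 1, delay = 5, now = now)   # used 1 second ago
stale  <- should_swap(now - 10, delay = 5, now = now)  # used 10 seconds ago
```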
to_swap_now()
Serialize tensor to a file and store it via write_fst immediately
Tensor$to_swap_now(use_index = FALSE)
use_index
whether to use one of the dimensions as an index for faster loading
get_data()
restore data from hard drive to memory
Tensor$get_data(drop = FALSE, gc_delay = 3)
drop
whether to apply drop to the data
gc_delay
seconds to delay the garbage collection
set_data()
set/replace data with given array
Tensor$set_data(v)
v
the value to replace the old one, must have the same dimension
notice
a tensor is an environment. If you change the data in one place, the change is visible everywhere else the same tensor is referenced, so use it carefully.
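The warning about environments can be reproduced with base R alone. The sketch below uses a bare environment rather than a Tensor, so the names a, b, v1, and v2 are purely illustrative.

```r
# Environments are passed by reference: assignment does not copy them.
a <- new.env()
a$data <- c(1, 2, 3)

b <- a                  # b refers to the same environment as a
b$data <- c(0, 0, 0)    # modifies the shared environment
a$data                  # the change is visible through a as well

# Ordinary vectors, by contrast, are copied on modification.
v1 <- c(1, 2, 3)
v2 <- v1
v2[1] <- 0
v1                      # unchanged
```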
collapse()
apply mean, sum, or median to collapse data
Tensor$collapse(keep, method = "mean")
keep
which dimensions to keep
method
"mean", "sum", or "median"
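Conceptually, collapsing keeps some dimensions and summarizes over the rest. A base-R equivalent, offered only as a sketch of the idea and not as the package's implementation, uses apply() with the kept dimensions as the margin.

```r
# Base-R sketch: keep dimensions 1 and 3, average over dimension 2.
x <- array(1:24, dim = c(2, 3, 4))
keep <- c(1, 3)

collapsed_mean <- apply(x, keep, mean)
collapsed_sum  <- apply(x, keep, sum)

dim(collapsed_mean)   # one cell per combination of the kept dimensions
```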
operate()
apply a function (division by default) to the tensor and another object along the given dimensions
Tensor$operate( by, fun = .Primitive("/"), match_dim, mem_optimize = FALSE, same_dimension = FALSE )
by
R object
fun
function to apply
match_dim
which dimensions to match with the data
mem_optimize
optimize memory
same_dimension
whether the return value has the same dimension as the original instance
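The general idea of applying a function between an array and a smaller object matched along one dimension resembles base R's sweep(). The sketch below is an analogy under that assumption; how operate() matches by to the data internally is not described here.

```r
# Base-R analogy: divide an array by a per-level statistic of dimension 2.
x <- array(1:24, dim = c(2, 3, 4))

# One value per level of dimension 2 (length 3)
by <- apply(x, 2, mean)

# Match `by` against dimension 2 and divide
res <- sweep(x, MARGIN = 2, STATS = by, FUN = "/")

dim(res)   # same dimensions as x
```

After the division, the mean over each level of dimension 2 is 1, which is a quick way to check that by was matched against the intended dimension.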
if(!is_on_cran()){

  # Create a tensor
  ts <- Tensor$new(
    data = 1:18000000, dim = c(3000,300,20),
    dimnames = list(A = 1:3000, B = 1:300, C = 1:20),
    varnames = c('A', 'B', 'C'))

  # Size of tensor when in memory is usually large
  # `lobstr::obj_size(ts)` -> 8.02 MB

  # Enable hybrid mode
  ts$to_swap_now()

  # Hybrid mode, usually less than 1 MB
  # `lobstr::obj_size(ts)` -> 814 kB

  # Subset data
  start1 <- Sys.time()
  subset(ts, C ~ C < 10 & C > 5, A ~ A < 10)
  #> Dimension: 9 x 300 x 4
  #> - A: 1, 2, 3, 4, 5, 6,...
  #> - B: 1, 2, 3, 4, 5, 6,...
  #> - C: 6, 7, 8, 9
  end1 <- Sys.time(); end1 - start1
  #> Time difference of 0.188035 secs

  # Join tensors
  ts <- lapply(1:20, function(ii){
    Tensor$new(
      data = 1:9000, dim = c(30,300,1),
      dimnames = list(A = 1:30, B = 1:300, C = ii),
      varnames = c('A', 'B', 'C'), use_index = 2)
  })
  ts <- join_tensors(ts, temporary = TRUE)

}