mvl_write_extent_index | R Documentation |
This function computes a hash-based index that makes it possible to find the indices of rows whose hashes match query values. While it can be applied to arbitrary data, it is optimized for the common case where vectors contain stretches of repeated values describing row groups to be processed. This is particularly relevant for R because vectorized processing of row batches is the only practical way to scan very large tables using pure-R code.
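To illustrate what vectorized processing of row batches means, here is a minimal sketch in base R. It does not use the MVL index at all: `split()` stands in for the index, and the names `rows_by_key` and `batch_means` are hypothetical, chosen only for this illustration.

```r
# Conceptual sketch in base R only; this is not how the MVL index is stored.
set.seed(1)
df <- data.frame(x = runif(100), y = (1:100) %% 10)

# Group row indices by key value once (a stand-in for an index lookup).
rows_by_key <- split(seq_len(nrow(df)), df$y)

# Process each batch of rows with a single vectorized call
# instead of looping over individual rows.
batch_means <- vapply(rows_by_key, function(idx) mean(df$x[idx]), numeric(1))
```

Each element of `rows_by_key` holds the row indices for one key, so the function passed to `vapply()` sees a whole batch at a time.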
mvl_write_extent_index(MVLHANDLE, L, name = NULL)
MVLHANDLE | a handle to an MVL file produced by mvl_open() |
L | a list of vector-like MVL_OBJECTs |
name | if specified, add a named entry to the MVL file directory |
mvl_write_extent_index() creates the index in memory and then writes it out. The memory usage is proportional to the number of repeat stretches. Sorting tables improves performance, but is not a requirement.
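The idea of a repeat stretch, and why sorting reduces their number, can be sketched in base R with `rle()`. This is only an illustration under the assumption that a stretch is a run of identical consecutive values; it does not reproduce the index layout written to the MVL file, and `find_rows` is a hypothetical helper.

```r
# Conceptual sketch only: plain-R extents, not the on-disk MVL index format.
y <- c(3, 3, 3, 7, 7, 1, 1, 1, 1, 3, 3)

# Run-length encode the column: one entry per stretch of repeated values.
runs <- rle(y)
extent_end <- cumsum(runs$lengths)
extent_start <- extent_end - runs$lengths + 1L

# Hypothetical helper: row indices whose value equals `query`.
find_rows <- function(query) {
  hit <- which(runs$values == query)
  as.integer(unlist(Map(seq.int, extent_start[hit], extent_end[hit])))
}

rows_for_3 <- find_rows(3)

# Storage grows with the number of stretches, not the number of rows.
n_extents_unsorted <- length(runs$lengths)
n_extents_sorted <- length(rle(sort(y))$lengths)
```

In this toy vector the value 3 appears in two separate stretches, so the unsorted data needs one more extent than the sorted data.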
An object of class MVL_OFFSET that describes an offset into this MVL file. MVL offsets are vectors and can be concatenated. They can be written to an MVL file directly, or as part of another object such as a list.
mvl_order_vectors, mvl_index_lapply, mvl_find_matches, mvl_group, mvl_indexed_copy, mvl_merge, mvl_hash_vectors, mvl_get_groups
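## Not run:
# Create (or append to) an MVL file and store a small data frame in it
Mtmp<-mvl_open("tmp_a.mvl", append=TRUE, create=TRUE)
mvl_write_object(Mtmp, data.frame(x=runif(100), y=(1:100) %% 10), "df1")
# Remap so that the newly written data becomes accessible
Mtmp<-mvl_remap(Mtmp)
# Build an extent index on column y and store it under a directory name
mvl_write_extent_index(Mtmp, list(Mtmp$df1[,"y",ref=TRUE]), "df1_extent_index_y")
Mtmp<-mvl_remap(Mtmp)
# Look up rows with y equal to 2 or 3
mvl_index_lapply(Mtmp["df1_extent_index_y", ref=TRUE], list(c(2, 3)),
    function(i, idx) { return(list(i, idx))})
# Example of full scan (the query argument is left empty)
mvl_index_lapply(Mtmp["df1_extent_index_y", ref=TRUE], ,
    function(i, idx) { return(list(i, idx))})
## End(Not run)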