map + arrow: apply a function to each element of a vector or list and collate the results into an Arrow dataset. The whole dataset is never held in memory at once, so this is suitable for large data objects. The function must return a data.frame or tibble. marrow_dir returns the path to the directory containing the Arrow dataset; marrow_ds and marrow_files return the dataset itself and the paths to its files, respectively.
marrow_dir(.x, .f, ..., .path, .partitioning = c(), .format = "parquet")
marrow_ds(.x, .f, ..., .path, .partitioning = c(), .format = "parquet")
marrow_files(.x, .f, ..., .path, .partitioning = c(), .format = "parquet")
.x: vector or list of values for .f to iterate over.
.f: function; must return a data.frame or tibble.
...: other arguments passed to .f.
.path: path to the directory where the collated Arrow dataset will be stored. The directory will be created if it does not exist.
.partitioning: character vector of columns to use for partitioning. The columns must exist in the output of .f.
.format: "parquet" (the default) or "arrow".
marrow_dir: path to the new dataset directory; a character string of length one.
marrow_ds: an Arrow Dataset.
marrow_files: character vector containing the paths to all files in the dataset directory.
marrow_dir: Return path to directory containing dataset
marrow_ds: Return Arrow Dataset
marrow_files: Return paths to all files in dataset dir
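Because .f must return a data.frame or tibble, and every column named in .partitioning must exist in that output, it can help to test .f on a single element before running the full iteration. The following is a minimal base R sketch of such a check. The helper check_chunk and the variable partition_cols are hypothetical names used for illustration and are not part of the package.

```r
# Candidate .f: returns one month's rows of the built-in airquality data
part_of_aq <- function(month) {
  airquality[airquality$Month == month, ]
}

# Columns intended for .partitioning (an illustrative choice)
partition_cols <- c("Month")

# Hypothetical helper, not part of the package: validate one result of .f
check_chunk <- function(chunk, partition_cols) {
  # .f must return a data.frame (a tibble also passes this test)
  stopifnot(is.data.frame(chunk))
  # every partitioning column must be present in the output of .f
  missing_cols <- setdiff(partition_cols, names(chunk))
  if (length(missing_cols) > 0) {
    stop("missing partitioning columns: ",
         paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

months <- unique(airquality$Month)
check_chunk(part_of_aq(months[1]), partition_cols)
```

Checking only the first element keeps the test cheap; it does not guarantee that later elements return the same columns.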
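months <- unique(airquality$Month)
td <- tempdir()
# Return the rows of airquality for a single month
part_of_aq <- function(month) {
  airquality[airquality$Month == month, ]
}
# Call part_of_aq for each month and collate the results into a dataset in td
aq_arrow <- purrrow:::marrow_dir(months, part_of_aq,
                                 .path = td)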
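Since marrow_dir returns a directory path, one plausible next step is to open that directory lazily with the arrow package. The sketch below assumes that both purrrow and arrow are installed and skips the work otherwise. The subdirectory name aq_dataset is an illustrative choice, used so that the dataset does not share a directory with unrelated temporary files.

```r
# Dedicated subdirectory for the dataset (name chosen for illustration)
ds_dir <- file.path(tempdir(), "aq_dataset")

# Return the rows of airquality for a single month
part_of_aq <- function(month) {
  airquality[airquality$Month == month, ]
}

# Assumption: purrrow and arrow are installed; do nothing otherwise
if (requireNamespace("purrrow", quietly = TRUE) &&
    requireNamespace("arrow", quietly = TRUE)) {
  # Write the collated dataset in the default parquet format
  purrrow:::marrow_dir(unique(airquality$Month), part_of_aq, .path = ds_dir)

  # Open the directory as an Arrow Dataset without loading it into memory
  ds <- arrow::open_dataset(ds_dir)
  print(ds)
}
```

If the Dataset object is what is needed in the first place, marrow_ds returns it directly and the separate open_dataset call is unnecessary.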