shard: Shard a data.frame/data.table or disk.frame into chunk and...

View source: R/shard.r

shard R Documentation

Shard a data.frame/data.table or disk.frame into chunks and save them to a disk.frame

Description

Shard a data.frame/data.table or disk.frame into chunks and save them to a disk.frame

'distribute' is an alias for 'shard'

Usage

shard(
  df,
  shardby,
  outdir = tempfile(fileext = ".df"),
  ...,
  nchunks = recommend_nchunks(df),
  overwrite = FALSE
)

distribute(...)

Arguments

df

A data.frame/data.table or disk.frame. If df is a disk.frame, then rechunk(df, ...) is run

shardby

The column(s) to shard the data by.

outdir

The output directory of the disk.frame

...

Not used

nchunks

The number of chunks

overwrite

If TRUE, existing chunks are overwritten
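
To illustrate what sharding by a column means, here is a minimal base-R sketch that places rows with the same shardby value in the same chunk. It is a conceptual illustration only, not disk.frame's implementation: shard_sketch is a hypothetical helper, the key-to-chunk mapping is a simple stand-in for whatever hashing the package uses, and nothing is written to disk.

```r
# Conceptual sketch only; shard_sketch is a hypothetical helper,
# not a disk.frame function, and it keeps everything in memory.
shard_sketch <- function(df, shardby, nchunks) {
  # Build one key string per row from the shardby column(s)
  cols <- unname(as.list(df[, shardby, drop = FALSE]))
  key <- do.call(paste, c(cols, sep = "\r"))

  # Map each distinct key to a chunk id in 1..nchunks
  # (a stand-in for a real hash function)
  chunk_id <- (match(key, unique(key)) - 1L) %% nchunks + 1L

  # Return a list with one data.frame per chunk
  split(df, factor(chunk_id, levels = seq_len(nchunks)))
}

chunks <- shard_sketch(iris, "Species", nchunks = 2)

# Every Species value should be present in exactly one chunk
chunks_per_species <- sapply(levels(iris$Species), function(s) {
  sum(sapply(chunks, function(ch) any(ch$Species == s)))
})
```

Because the chunk is determined by the key alone, a later group-by on the shardby column(s) can be computed within each chunk without moving rows between chunks.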

Examples


# shard the iris data.frame by Species so that rows with the same Species are in the same chunk
iris.df = shard(iris, "Species")

# clean up iris.df
delete(iris.df)

disk.frame documentation built on Aug. 24, 2023, 5:09 p.m.