group_by: Time GROUP BY operation

Description

Internally this uses data.table for speed.
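
To make the per-group step concrete, here is a minimal sketch of a grouped summary in data.table. It is not taken from the package source: the toy table and the column names g and x are assumptions for illustration, and median is used only because it is the documented default for group_fun.

```r
library(data.table)

# Toy data (hypothetical): a grouping column g and a numeric column x
dt <- data.table(g = c("a", "b", "a", "b", "a"),
                 x = c(1, 10, 3, 20, 5))

# Summary function applied to each group; median mirrors the default group_fun
group_fun <- median

# data.table's by= argument evaluates the expression once per group
result <- dt[, .(value = group_fun(x)), by = g]
print(result)
```

Any function that reduces a numeric vector to a single value could stand in for median in this sketch.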

Usage

group_by_local_shuffle(dir, nworkers = 3L,
  assign_groups = data_local_group_assign, group_fun = median)

greedy_group_assign(P, nworkers)

data_local_group_assign(P, w)

Arguments

dir

directory where data can be found

nworkers

number of workers

assign_groups

function that assigns files and groups to workers; its signature must match that of the default, data_local_group_assign(P, w)

group_fun

function to apply to each group
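
The idea behind a greedy assignment of groups to workers can be sketched as follows. This is an illustration only and does not reproduce greedy_group_assign: the structure of its argument P is not described on this page, so the sketch assumes a simpler input, a named vector of group sizes. The function name greedy_assign_sketch and the example sizes are hypothetical.

```r
# Hypothetical helper, not the package's implementation.
# sizes: named numeric vector with the amount of data in each group (assumed input).
# Returns an integer vector mapping each group to a worker in 1..nworkers.
greedy_assign_sketch <- function(sizes, nworkers) {
  load <- numeric(nworkers)                  # running total of data per worker
  assignment <- integer(length(sizes))
  names(assignment) <- names(sizes)
  # Visit groups from largest to smallest
  for (i in order(sizes, decreasing = TRUE)) {
    w <- which.min(load)                     # least-loaded worker; ties go to the lowest index
    assignment[i] <- w
    load[w] <- load[w] + sizes[i]
  }
  assignment
}

sizes <- c(a = 8, b = 7, c = 6, d = 5, e = 4)
assignment <- greedy_assign_sketch(sizes, nworkers = 3L)
print(assignment)
print(tapply(sizes, assignment, sum))        # total load per worker
```

Placing the largest groups first tends to keep worker loads close to each other, though it ignores where the data already resides, which a data-local strategy would presumably take into account.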

Functions


clarkfitzg/RDataBenchmarks documentation built on June 29, 2019, 11:38 p.m.