group_by: Time GROUP BY operation

Description

Internally, this uses data.table because it is fast.
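
As a rough illustration of the kind of grouped computation data.table makes fast, the sketch below applies a summary function to each group of a small table. The column names `g` and `x` and the toy data are assumptions made up for this example; they are not taken from the package or its data files.

```r
library(data.table)

# Toy data (hypothetical): a grouping column `g` and a numeric column `x`.
dt <- data.table(
  g = c("a", "a", "a", "b", "b"),
  x = c(1, 2, 10, 4, 6)
)

# Apply a summary function to each group, analogous to passing
# group_fun = median in the usage below.
group_fun <- median
result <- dt[, .(value = group_fun(x)), by = g]
print(result)
```

Any function that reduces a numeric vector to a single value (for example `mean` or `max`) could stand in for `median` here.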

Usage

group_by_local_shuffle(dir, nworkers = 3L,
  assign_groups = data_local_group_assign, group_fun = median)

greedy_group_assign(P, nworkers)

data_local_group_assign(P, w)

Arguments

dir

directory where data can be found

nworkers

number of workers

assign_groups

function that assigns files and groups to workers; its signature must match that of the default, data_local_group_assign(P, w)

group_fun

function to apply to each group
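
To make the idea of assigning groups to workers more concrete, here is a minimal sketch of one greedy load-balancing strategy in base R. It is not the package's greedy_group_assign(): the helper name `greedy_assign_sketch` and its input (a plain numeric vector of group sizes) are assumptions for illustration only, and the meaning of the real `P` argument is not documented on this page.

```r
# Hypothetical helper, NOT the package's greedy_group_assign().
# sizes:    numeric vector of group sizes (an assumed input format)
# nworkers: number of workers
greedy_assign_sketch <- function(sizes, nworkers) {
  load <- numeric(nworkers)             # running total of work per worker
  assignment <- integer(length(sizes))  # worker index chosen for each group
  # Visit groups from largest to smallest.
  for (i in order(sizes, decreasing = TRUE)) {
    w <- which.min(load)                # least-loaded worker (first on ties)
    assignment[i] <- w
    load[w] <- load[w] + sizes[i]
  }
  assignment
}

sizes <- c(10, 7, 5, 3, 2)
assignment <- greedy_assign_sketch(sizes, nworkers = 3L)

# Total work per worker under this assignment.
loads <- tapply(sizes, factor(assignment, levels = 1:3), sum)
```

Placing the largest groups first tends to keep the per-worker totals close to each other, which is one reason a greedy rule is a common starting point for this kind of assignment.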

Functions


clarkfitzg/r_data_benchmarks documentation built on July 1, 2019, 9:41 a.m.