troop: group by - apply - multiprocess data.table
In tejaslodaya/troop: Groupby and Apply Function to a data.table Using Parallel Processing

View source: R/troop.R

troop

R Documentation

group by - apply - multiprocess data.table

Description

Groups a data.table by the given columns, divides it into chunks, runs a function on the chunks in parallel, and combines the results.

Usage

troop(data, by, apply_func, preprocess_func = function() { },
  postprocess_func = function() { }, num_chunks = detectCores(logical =
  TRUE), preprocess_args = list(), postprocess_args = list(),
  packages = c(), export = c(), combine = "c", files_to_source = c())

Arguments

`data`	input data of type data.table
`by`	character vector giving columns to group by
`apply_func`	function to be run in parallel
`preprocess_func`	function that will be run before apply_func. useful to open file/db handles
`postprocess_func`	function that will be run after apply_func. useful to close file/db handles
`num_chunks`	number of chunks to divide the data into. defaults to number of logical cores available
`preprocess_args`	a list of args to be passed to preprocess_func
`postprocess_args`	a list of args to be passed to postprocess_func
`packages`	character vector of package names to be loaded on each core. NOTE: every package used by apply_func should be listed
`export`	character vector of variable names to be exported on each core. NOTE: each variable name to be accessed inside apply_func should be exported
`combine`	the way results should be combined. accepts: c, +, rbind. defaults to c (character vector)
`files_to_source`	character vector of file names to be sourced on each core. the user should have permission to read each file
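
The arguments above follow a common group / chunk / apply / combine pattern. The sketch below illustrates that pattern with base R and the `parallel` package only; it is not troop's implementation, and the chunking rule (whole groups assigned round-robin to chunks), the toy data, and the names `df`, `offset`, and `chunk_id` are assumptions made for illustration. It also shows why arguments like `export` and `packages` are typically needed: each worker is a fresh R session that does not see variables or packages from the calling session.

```r
library(parallel)

# Toy data; column names are hypothetical.
df <- data.frame(
  g1    = c("a", "a", "b", "b", "c", "c"),
  g2    = c("x", "x", "y", "y", "z", "z"),
  value = 1:6
)

# Assumed chunking rule: assign every (g1, g2) group to one of
# `num_chunks` chunks so that no group is split across chunks.
num_chunks <- 2
group_id <- as.integer(interaction(df$g1, df$g2, drop = TRUE))
chunk_id <- (group_id - 1) %% num_chunks + 1
chunks   <- split(df, chunk_id)

# A variable that the worker function reads from outside its own body.
offset <- 10
apply_func <- function(data_chunk) sum(data_chunk$value) + offset

cl <- makeCluster(num_chunks)
# Comparable in spirit to export = c('offset'): copy the variable to each worker.
clusterExport(cl, "offset", envir = environment())
# Comparable in spirit to packages = c(...): load packages on each worker, e.g.
# clusterEvalQ(cl, library(stats))
results <- parLapply(cl, chunks, apply_func)
stopCluster(cl)

# Comparable in spirit to combine = 'c': concatenate the per-chunk results.
combined <- do.call(c, unname(results))
```

With this toy data, groups `a.x` and `c.z` land in the first chunk and `b.y` in the second, so `combined` holds one value per chunk rather than one per group.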

Value

result of apply_func after combining results from each core using combine parameter above
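
To make the effect of the documented `combine` options concrete, the sketch below folds a list of per-chunk results with each of `c`, `+`, and `rbind` in base R. It assumes that `combine` names a function applied across the per-chunk results; the values in `chunk_results` are made up, and troop's internal mechanism may differ.

```r
# Hypothetical per-chunk results, e.g. row counts returned by apply_func.
chunk_results <- list(3L, 5L, 2L)

# Look up each combine option by name and fold it over the results.
combined_c     <- Reduce(match.fun("c"), chunk_results)      # vector of length 3
combined_sum   <- Reduce(match.fun("+"), chunk_results)      # single total
combined_rbind <- Reduce(match.fun("rbind"), chunk_results)  # one row per chunk
```

`c` suits scalar or vector results, `+` suits results that should be totalled, and `rbind` suits results that are rows or tables to be stacked.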

Examples

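library(data.table)
library(troop)

dt <- fread('sample.csv')  # fread() already returns a data.table
v <- 10                    # passed to the workers below via export = c('v')
foo <- function(data_chunk){
  # some complex operations
  nrow(data_chunk)
}
# default settings
troop(dt, by = c('column1','column2'), apply_func = foo)
# source helper files on each core
troop(dt, by = c('column1','column2'), apply_func = foo, files_to_source = c('somefile.R','anotherfile.R'))
# explicit chunk count, packages, exported variables and combine function
troop(dt, by = c('column1','column2'), apply_func = foo, num_chunks = 10, packages = c('RODBC','xgboost'), export = c('v'), combine = 'c')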

tejaslodaya/troop documentation built on March 6, 2023, 11:44 p.m.
