dssRandomForest: Distributed random forests based on the randomForest package.
In sib-swiss/dsSwissKnifeClient: DataSHIELD Tools and Utilities - client side

dssRandomForest

R Documentation

Description

Builds a random forest on each server node. The resulting forests serve as a model for classifying new data of the same type (for example new patients described by the same variables). The randomForest package must be installed on all nodes and on the client.

Usage

dssRandomForest(
  train = list(what = NULL, dep_var = NULL, expl_vars = NULL),
  test = list(forest = NULL, testData = NULL),
  async = TRUE,
  datasources = NULL,
  ...
)

Arguments

`train`	a list of parameters for the training phase. The elements are: what - name of the training data frame on the server; dep_var [string] - name of the response factor ("y"), i.e. the categories; expl_vars [vector[string]] - names of the explanatory variables used for classification
`test`	a list of parameters for the validation phase. The elements are: forest [list] - a list of forests obtained in the training phase; testData - new data to classify using the forests. If testData is a character string, it is treated as the name of a remote data frame and the testing phase takes place on the remote servers. If testData is a local data frame, testing and prediction take place in the client session. testData must contain at least the columns listed in 'expl_vars'; the value of 'dep_var' is what gets predicted for it.
`...`	further arguments passed to the randomForest function, for the training phase only
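
Examples

The following is a minimal sketch of how the `train` and `test` arguments described above might be assembled. It is not taken from the package documentation. The data frame name "patients", the column names, and the connection object `conns` are hypothetical placeholders. Passing the result of the training call directly as `forest` is an assumption based on the argument description. The calls to dssRandomForest are left as comments because they need an active DataSHIELD login.

```r
# Argument list for the training phase.
# "patients", "diagnosis", "age" and "bmi" are hypothetical names.
train_args <- list(
  what      = "patients",        # name of the training data frame on the servers
  dep_var   = "diagnosis",       # response factor to predict
  expl_vars = c("age", "bmi")    # explanatory variables
)

# New local data to classify. It needs at least the columns in expl_vars.
new_patients <- data.frame(age = c(54, 61), bmi = c(27.1, 31.4))
stopifnot(all(train_args$expl_vars %in% colnames(new_patients)))

# With an active DataSHIELD login (assumed here to be stored in `conns`),
# the two phases might then be run like this:
#
# forests <- dssRandomForest(train = train_args, datasources = conns,
#                            ntree = 200)  # ntree is passed on to randomForest
# pred <- dssRandomForest(test = list(forest = forests, testData = new_patients),
#                         datasources = conns)
```

Because `new_patients` is a local data frame, the prediction in this sketch would run in the client session. Passing the name of a remote data frame as a character string instead would move the testing phase to the servers.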