dssRandomForest: Distributed random forests based on the randomForest package.


View source: R/dssRandomForest.R

Description

Builds a random forest on each server node and uses these forests as a model to classify new data of the same type (for example, new patients described by the same variables). The randomForest package must be installed on all nodes and on the client.

Usage

dssRandomForest(
  what,
  dep_var,
  expl_vars = NULL,
  testData = NULL,
  async = TRUE,
  wait = TRUE,
  datasources = NULL
)

Arguments

what:

[string] name of the training data frame on the server.

dep_var:

[string] the response factor ("y"), i.e. the categories that will be the leaves of each tree.

expl_vars:

[vector[string]] the explanatory variables (predictors) used for the classification.

testData:

[data frame] new data to classify using the forests. It must have at least the columns in 'expl_vars'. (We want to predict the value of 'dep_var' for it.)
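
To show how these arguments fit together, here is a minimal sketch of a call. The wrapper name 'classify_new_patients', the server-side data frame 'patients_train' and the column names are all hypothetical, and passing the list of logged-in connections as 'datasources' is an assumption based on the usual DataSHIELD client convention; 'async' and 'wait' are left at their defaults.

```r
# Hypothetical wrapper around dssRandomForest(); nothing is contacted until
# the function is actually called with live connections.
#   conns        - assumed: list of logged-in DataSHIELD connections
#   new_patients - local data frame holding at least the 'expl_vars' columns
classify_new_patients <- function(conns, new_patients) {
  dsSwissKnifeClient::dssRandomForest(
    what        = "patients_train",            # assumed server-side data frame
    dep_var     = "diagnosis",                 # assumed response factor
    expl_vars   = c("age", "bmi", "glucose"),  # assumed predictor columns
    testData    = new_patients,
    datasources = conns
  )
}
```

Called with live connections, the result would be expected to carry the members described under Value.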

Value

a randomForest object.

a list with members '$forests' (the individual forests from the nodes) and '$prediction' (the average prediction for 'testData', if given, across all nodes).
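
The idea of averaging predictions from several per-node forests can be illustrated locally. The sketch below is not the package's implementation: it assumes that each node returns a matrix of class probabilities and that nodes are weighted equally, and it simulates two nodes by splitting the built-in 'iris' data. The helper 'average_predictions' is a hypothetical name.

```r
# Average class-probability matrices (rows = new cases, columns = classes)
# coming from several nodes. Equal node weights are an assumption.
average_predictions <- function(prob_list) {
  Reduce(`+`, prob_list) / length(prob_list)
}

# The demonstration only runs if the randomForest package is installed.
if (requireNamespace("randomForest", quietly = TRUE)) {
  set.seed(1)
  expl_vars <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
  test_rows <- seq(1, nrow(iris), by = 10)   # held out as "new" cases
  train     <- iris[-test_rows, ]
  testData  <- iris[test_rows, expl_vars]

  # Pretend the training rows live on two separate nodes.
  nodes <- split(train, rep(1:2, length.out = nrow(train)))

  # One forest per simulated node.
  forests <- lapply(nodes, function(d) {
    randomForest::randomForest(x = d[, expl_vars], y = d$Species, ntree = 100)
  })

  # Per-node class probabilities for the new cases, then their average.
  probs      <- lapply(forests, predict, newdata = testData, type = "prob")
  prediction <- average_predictions(probs)

  # Most likely class for each new case.
  predicted_class <- colnames(prediction)[max.col(prediction)]
}
```

Averaging probabilities rather than hard class labels keeps the information about how confident each forest is; whether dssRandomForest combines node results this way is not stated in this page.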


IulianD/dsSwissKnifeClient documentation built on June 23, 2020, 4:38 p.m.