rjungle: rjungle


Description

Calls the rjungle C++ routines and grows a Random Jungle.

Usage

  rjungle(depVarName = "", data = NULL,
    dataFileName = NULL, ntree = 500, mtry = NULL,
    treeType = 1, importance = 1, replace = FALSE,
    proximity = FALSE, keepJungle = TRUE, nthread = 0,
    seed = 123, fileNameOut = character(),
    fileNameIn = character(), balanceData = FALSE,
    verbose = FALSE, convertdata = FALSE,
    inDir = character(), outDir = character(),
    options = "", ...)

Arguments

depVarName

A character string providing the name of the dependent variable, if any. If no variable is specified, rjungle is run unsupervised.

data

A data.frame containing the data to be analysed.

dataFileName

A character string defining the name of (or the path to) the dataset to be used for analysis. Not implemented or needed at the moment. See fileNameIn.

ntree

A numeric giving the number of trees to be grown in the jungle. Defaults to 500.

mtry

A numeric giving the size of the randomly chosen variable sets. Defaults to NULL.

treeType

A numeric taking values 1 to 5. Defines what type of regression or classification should be performed:

1: y (response) nominal, x (input) numeric
2: y nominal, x nominal
3: y numeric, x numeric
4: y numeric, x nominal
5: as 1, but recommended when the input variables take many distinct values (e.g. many floating point numbers)

Defaults to 1.
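As a rough illustration of the treeType codes, the following base R sketch picks a code from the column types of a data.frame. The helper name guess_tree_type is hypothetical and not part of rjungle. It assumes that factor and character columns count as nominal, and it never returns 5, which has to be chosen by hand.

```r
# Hypothetical helper, not part of rjungle: choose treeType 1 to 4 from column types.
guess_tree_type <- function(data, depVarName) {
  stopifnot(is.data.frame(data), depVarName %in% names(data))

  # Assumption: factor and character columns are treated as nominal.
  is_nominal <- function(x) is.factor(x) || is.character(x)

  y <- data[[depVarName]]
  x <- data[setdiff(names(data), depVarName)]
  stopifnot(ncol(x) > 0)

  x_numeric <- all(vapply(x, is.numeric, logical(1)))
  x_nominal <- all(vapply(x, is_nominal, logical(1)))
  if (!x_numeric && !x_nominal) {
    stop("mixed or unsupported input column types; recode the inputs first")
  }

  y_nominal <- is_nominal(y)
  if (!y_nominal && !is.numeric(y)) {
    stop("unsupported response type")
  }

  if (y_nominal) {
    if (x_numeric) 1 else 2
  } else {
    if (x_numeric) 3 else 4
  }
}

guess_tree_type(iris, "Species")            # nominal response, numeric inputs
guess_tree_type(iris[1:4], "Sepal.Length")  # numeric response, numeric inputs
```

The helper stops on mixed numeric and nominal inputs because the codes listed here do not cover that case.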

importance

A numeric taking values 1 to 5. If 1, only the GINI-index importance measure is evaluated. If 2 to 5, the four other measures are computed as well, and the results are sorted by the measure corresponding to the specified value:

1: GINI-Index
2: Breiman-Score
3: Liaw-Score
4: raw values, no normalization
5: Meng-Score

Defaults to 1.

replace

A logical. If TRUE, the set of variables to choose the split from is drawn with replacement, otherwise without replacement. Defaults to FALSE.

proximity

A logical indicating whether proximities should be computed. Defaults to FALSE.

keepJungle

A logical indicating whether to keep the jungle (e.g. for future use in the call of predict). Defaults to TRUE.

nthread

A numeric giving the number of threads to use, from 1 to the total number of CPUs. Defaults to 0, which is equivalent to 1.

seed

A numeric specifying the seed to be used for computation. Defaults to 123.

fileNameOut

A character specifying the prefix of the files produced by rjungle. If missing, the files are written to tempfile('rjungledata'). If specified, the files are written to getwd() with fileNameOut as their prefix. To write the files to another directory, specify the outDir parameter.

fileNameIn

A character giving the name of the dataset to use if data is not specified. Must be a .dat file and must be stored in getwd() if inDir is not given.

balanceData

to add

verbose

A logical indicating whether a progress file shall be created. If FALSE no file is created. Defaults to FALSE.

convertdata

to add

inDir

A character giving the directory of the input data. Only applies if fileNameIn was specified. Defaults to getwd.

outDir

A character giving the directory where rjungle output data should be stored. Only applies if fileNameOut was specified. Defaults to getwd.
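The interplay of fileNameOut and outDir described above can be sketched in base R as follows. The function resolve_out_prefix is a hypothetical illustration of the documented rules only; how rjungle builds its paths internally is not shown on this page and may differ.

```r
# Hypothetical helper mirroring the documented rules; rjungle's internals may differ.
resolve_out_prefix <- function(fileNameOut = character(), outDir = character()) {
  if (length(fileNameOut) == 0) {
    # No prefix given: fall back to a temporary file name.
    return(tempfile("rjungledata"))
  }
  # Prefix given: use outDir if supplied, otherwise the working directory.
  out_dir <- if (length(outDir) > 0) outDir else getwd()
  file.path(out_dir, fileNameOut)
}

resolve_out_prefix()                    # temporary path starting with "rjungledata"
resolve_out_prefix("run1")              # "run1" inside getwd()
resolve_out_prefix("run1", "/tmp/out")  # "run1" inside /tmp/out
```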

options

Further arguments to be passed to the rjungle call.

...

Further arguments to be passed to the system call.

Details

to add

Value

Returns an object of class rjungle. The object contains the settings used in the call to rjungle and the path to the temporary files produced by rjungle.
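A minimal sketch of a supervised call, using only arguments from the Usage section. It assumes the package is installed under the name Rjungle and exports rjungle, and that a factor response can be passed as it is; neither is stated on this page, so recoding may be needed in practice. The call is skipped when the package is not available.

```r
# Arguments taken from the Usage section; iris is used purely for illustration.
call_args <- list(
  depVarName = "Species",  # nominal response
  data       = iris,
  ntree      = 100,
  treeType   = 1,          # y nominal, x numeric
  importance = 1,          # GINI-index only
  seed       = 123
)

# Assumption: the package is installed under the name "Rjungle".
if (requireNamespace("Rjungle", quietly = TRUE)) {
  fit <- do.call(Rjungle::rjungle, call_args)
  print(class(fit))
} else {
  message("Rjungle is not installed; skipping the example.")
}
```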

Author(s)

Daniel F. Schwarz with modifications by Andreas Bender and Jochen Kruppa

References

www.randomjungle.de

See Also

importance, confusion


jkruppa/Rjungle documentation built on May 19, 2019, 12:45 p.m.