synthetic_forest: Grow a tree ensemble on synthetic data
In talegari/forager: Compute auxiliary information (proximity, dissimilarity, outlyingness, depth) and imputation from tree ensembles on new data


Builds a random forest model to classify actual vs synthetic data, where the synthetic data is created by sampling each covariate independently, as suggested in Understanding random forests by Breiman.

synthetic_forest(dataset, prop = TRUE, seed = 1L, implementation = "ranger", ...)

`dataset`	A dataframe
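`prop`	(flag) If TRUE, synthetic values for each covariate are sampled at random from its observed values, so that its proportions or distribution are taken into account. If FALSE, values are sampled assuming a uniform distribution.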
`seed`	(a positive integer) Seed for sampling.
`implementation`	(string) Implementation to use to build the model. The following are supported: 'ranger', 'randomForest'.
`...`	Arguments to be passed to implementation.

The approach described in Understanding random forests by Breiman involves creating synthetic data by sampling randomly from the univariate distribution of each covariate (feature). Two methods are supported. In the first, the proportions or distribution of each covariate are taken into account when sampling at random; in the second, the data is sampled assuming a uniform distribution. The former corresponds to "Addcl1" in Shi and Horvath's paper and the latter to "Addcl2". A random forest model is then built using ranger or randomForest to learn what separates the actual data from the synthetic data. By default, 1000 trees are grown and the minimum node size to split is set to 5.
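The two sampling schemes can be sketched in base R as below. This is a minimal illustration under stated assumptions, not forager's actual code: the helper names make_synthetic and label_actual_vs_synthetic and the label column name source are hypothetical, and the treatment of non-numeric columns under uniform sampling (drawing uniformly from the distinct observed values) is an assumption. The final step assumes the defaults mentioned above correspond to ranger's num.trees and min.node.size arguments.

```r
# Hypothetical helpers for illustration; these are not forager functions.

# Draw a synthetic copy of `dataset` one column at a time, which breaks
# any dependence between columns while keeping (or flattening) the marginals.
make_synthetic <- function(dataset, prop = TRUE, seed = 1L) {
  set.seed(seed)
  n <- nrow(dataset)
  synth <- dataset
  synth[] <- lapply(dataset, function(col) {
    if (prop) {
      # "Addcl1"-style: resample the observed values with replacement
      col[sample.int(n, n, replace = TRUE)]
    } else if (is.numeric(col)) {
      # "Addcl2"-style: uniform draw between the observed minimum and maximum
      runif(n, min = min(col, na.rm = TRUE), max = max(col, na.rm = TRUE))
    } else {
      # Assumption: uniform draw over the distinct observed values
      seen <- unique(col)
      seen[sample.int(length(seen), n, replace = TRUE)]
    }
  })
  synth
}

# Stack actual and synthetic rows and add a two-level label to classify on.
# Assumes `dataset` has no existing column called `source`.
label_actual_vs_synthetic <- function(dataset, prop = TRUE, seed = 1L) {
  combined <- rbind(dataset, make_synthetic(dataset, prop, seed))
  combined$source <- factor(rep(c("actual", "synthetic"), each = nrow(dataset)))
  combined
}

combined <- label_actual_vs_synthetic(iris, prop = TRUE, seed = 1L)

# Fit the classifier only if ranger is installed.
if (requireNamespace("ranger", quietly = TRUE)) {
  fit <- ranger::ranger(source ~ ., data = combined,
                        num.trees = 1000, min.node.size = 5)
  print(fit$prediction.error)  # out-of-bag misclassification rate
}
```

A low out-of-bag error suggests the covariates are strongly dependent on one another, since the forest can readily tell actual rows from rows whose columns were sampled independently.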

A tree ensemble with one of these classes: 'ranger', 'randomForest'

Unsupervised Learning With Random Forest Predictors by Tao Shi & Steve Horvath.
Understanding random forests by Breiman.

# ranger
model_ranger <- synthetic_forest(iris, implementation = "ranger")
oob_error(model_ranger)

# randomForest
model_rf <- synthetic_forest(iris, implementation = "randomForest")
oob_error(model_rf)

# extratrees
model_et <- synthetic_forest(iris, implementation = "ranger", splitrule = "extratrees")
oob_error(model_et)

talegari/forager documentation built on May 3, 2019, 4:01 p.m.
