View source: R/generate_synthetic_data.R
Unsupervised learning with random forests, as suggested by Breiman (https://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#unsup), involves creating synthetic data by sampling randomly from the univariate distribution of each covariate (feature). This function supports two methods. In the first, the observed proportions or distribution of each covariate are taken into account when sampling at random. In the second, the data are sampled assuming a uniform distribution. The former corresponds to "Addcl1" in Shi and Horvath's paper (Unsupervised Learning With Random Forest Predictors: Tao Shi & Steve Horvath) and the latter corresponds to "Addcl2".
generate_synthetic_data(dataset, prop, seed)
dataset | A dataframe.
prop | If TRUE, synthetic data are generated by random sampling of the covariates. Otherwise, uniform sampling is used.
seed | Seed for sampling.
A dataframe with synthetic data.
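The two sampling methods in the description can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper name sketch_synthetic_data, its default values, and the handling of non-numeric columns and missing values are assumptions made for illustration. Each column is sampled independently, which is what removes the dependence between covariates in the synthetic data.

```r
# Hypothetical helper for illustration only; not the source of
# generate_synthetic_data().
sketch_synthetic_data <- function(dataset, prop = TRUE, seed = 1) {
  set.seed(seed)
  n <- nrow(dataset)
  synth <- lapply(dataset, function(column) {
    if (prop) {
      # "Addcl1": resample the observed values with replacement, so each
      # column keeps its empirical marginal distribution.
      column[sample.int(n, n, replace = TRUE)]
    } else if (is.numeric(column)) {
      # "Addcl2": draw uniformly between the observed minimum and maximum.
      # Dropping NAs here is an assumption.
      runif(n, min = min(column, na.rm = TRUE), max = max(column, na.rm = TRUE))
    } else {
      # Assumed treatment of non-numeric columns: every distinct observed
      # value is equally likely.
      vals <- unique(column)
      vals[sample.int(length(vals), n, replace = TRUE)]
    }
  })
  as.data.frame(synth)
}

# Usage on a built-in dataset: one synthetic copy per method.
synth_addcl1 <- sketch_synthetic_data(iris, prop = TRUE, seed = 42)
synth_addcl2 <- sketch_synthetic_data(iris, prop = FALSE, seed = 42)
```

In Breiman's scheme the synthetic rows are then labelled as a second class and a random forest is trained to separate them from the original rows. That step is outside the scope of this function.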