View source: R/mondrian_forest.R
mondrian_forest implements the Mondrian tree algorithms described in
[Lakshminarayanan et al. 2014](https://arxiv.org/abs/1406.2673), with a
modification that allows a more space-efficient dummy variable treatment
of categorical variables.
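For context, conventional dummy variable treatment expands each factor into one 0/1 indicator column per level. The sketch below uses base R's model.matrix to show that expansion on a toy data frame (the names toy and dummies are illustrative). It is not the package's internal encoding, which the description only characterizes as more space-efficient.

```r
# Toy data: one numeric feature and one 3-level factor (illustrative names).
toy <- data.frame(
  num = c(0.1, 0.5, 0.9, 0.3),
  cat = factor(c("a", "b", "c", "a"))
)

# Full indicator expansion of the factor: "- 1" drops the intercept so that
# every level of the factor gets its own 0/1 column.
dummies <- model.matrix(~ cat - 1, data = toy)

# One row per observation, one column per factor level.
dim(dummies)
colnames(dummies)
```

The column count grows with the number of levels, which is the storage cost a more compact treatment of categorical variables would aim to reduce.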
mondrian_forest(X, y_col_num, lambda, f_scale = 1, ntree = 25,
                verbose = FALSE)
X |
Data (matrix or data frame) containing the features and a column of labels. |
y_col_num |
Numeric length 1 vector giving the column number of the labels (defaults to the last column, ncol(X)). |
lambda |
Budget parameter, see [Lakshminarayanan et al. 2014]. |
f_scale |
Numeric vector, either of length 1 (a constant) or of length equal to the
number of categorical variables. |
ntree |
Numeric length 1 vector of number of trees to build in forest. |
verbose |
Logical length 1 vector; if TRUE, prints additional information while the algorithm is running (currently just the time taken to build the trees). |
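To make the role of the lambda budget concrete, here is a minimal sketch of a single split step of the Mondrian process as described in the cited paper: a split time is drawn from an exponential distribution whose rate is the sum of the block's side lengths, and the split is accepted only if that time stays below lambda. The helper name sample_mondrian_split and its arguments are hypothetical, and this is not the package's implementation.

```r
# Hypothetical helper: attempt one Mondrian split of an axis-aligned block.
# lower, upper: numeric vectors with the block's bounds in each dimension.
# parent_time:  split time of the parent node (0 at the root).
# lambda:       budget; a split is accepted only if its time is below lambda.
sample_mondrian_split <- function(lower, upper, parent_time, lambda) {
  extent <- upper - lower
  rate <- sum(extent)
  if (rate <= 0) return(NULL)              # degenerate block, nothing to split

  # Waiting time to the next split is exponential with rate sum(extent).
  split_time <- parent_time + rexp(1, rate = rate)
  if (split_time >= lambda) return(NULL)   # budget exhausted: node is a leaf

  # Split dimension is chosen proportionally to side length,
  # split location uniformly within that side.
  split_dim <- sample.int(length(extent), size = 1, prob = extent)
  location <- runif(1, min = lower[split_dim], max = upper[split_dim])
  list(time = split_time, dim = split_dim, location = location)
}

set.seed(1)
sample_mondrian_split(lower = c(0, 0), upper = c(1, 1),
                      parent_time = 0, lambda = 3)
```

Under this process a larger lambda leaves more budget for further splits, so trees tend to be deeper; a lambda of 0 never splits.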
A mondrianforest object.
library(mondrianforest); library(dplyr); library(purrr); library(magrittr)
set.seed(1)
test <- data.frame(x1 = rnorm(1000),
                   x2 = runif(1000),
                   x3 = rbeta(n = 1000, shape1 = 3, shape2 = 8),  # x3 is noise
                   x4 = rbinom(n = 1000, size = 4, prob = 0.2)) %>%
  # rescale every column to [0, 1]
  map_df(function(x) (x - min(x))/(max(x) - min(x))) %>%
  mutate(y = x1*x2^2 + exp(x1) - x4,
         # cut_number() is exported by ggplot2, not dplyr
         x1 = ggplot2::cut_number(sin(x1), n = 10),
         label = as.factor(case_when(y < 1 ~ "A",
                                     y >= 1 & y < 1.7 ~ "B",
                                     y >= 1.7 ~ "C"))) %>%
  select(-y)
# Train on the first 750 rows; the label is column 5
mf <- mondrian_forest(test[1:750, ], y_col_num = 5, lambda = 3)
# Confusion table on the held-out rows
table(test$label[751:1000], predict(mf, test[751:1000, ], type = "class"))
# Compare to Random Forest, if installed:
# rf <- randomForest::randomForest(test[1:750, -5],
#                                  y = test$label[1:750],
#                                  ntree = 1000)
# table(test$label[751:1000], predict(rf, test[751:1000, ]))
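The example ends with a confusion table of true versus predicted labels. As a small follow-up, the sketch below shows one way to summarize such a table as overall accuracy and per-class recall in base R. The vectors truth and predicted are made-up stand-ins for test$label[751:1000] and the predicted classes, not output from mondrian_forest.

```r
# Illustrative stand-ins for the held-out labels and the predicted classes.
truth <- factor(c("A", "B", "C", "A", "B", "C"), levels = c("A", "B", "C"))
predicted <- factor(c("A", "B", "B", "A", "C", "C"), levels = c("A", "B", "C"))

# Rows are true classes, columns are predicted classes.
confusion <- table(truth = truth, predicted = predicted)

# Overall accuracy: correctly classified cases lie on the diagonal.
accuracy <- sum(diag(confusion)) / sum(confusion)

# Per-class recall: diagonal divided by the row totals (true class counts).
recall <- diag(confusion) / rowSums(confusion)

accuracy
recall
```

Giving both factors the same levels keeps the table square, so diag() lines up true and predicted classes even if a class is never predicted.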