View source: R/MSE_Test_File.R
MSE_compare(m1, m2, X.test, y.test = NULL, B = 1000, return.preds = F, test_stat = if (is.null(y.test)) "KS" else "MSE")
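The default for test_stat in the usage line refers to another argument, y.test. R evaluates default arguments lazily, inside the function, so the default resolves to "KS" when no responses are supplied and to "MSE" otherwise. A minimal sketch of that mechanism, using a hypothetical helper `choose_stat` that only mirrors the two relevant arguments and is not part of the package:

```r
# Hypothetical helper (an assumption for illustration): it copies only the
# y.test and test_stat arguments from the usage line.
choose_stat <- function(y.test = NULL,
                        test_stat = if (is.null(y.test)) "KS" else "MSE") {
  # The default expression is evaluated here, when test_stat is first used,
  # so it sees whatever value y.test has inside the call.
  test_stat
}

choose_stat()                  # "KS": no responses supplied
choose_stat(y.test = 1:3)      # "MSE": responses supplied
choose_stat(test_stat = "KS")  # an explicit value overrides the default
```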
m1 | First model, usually generated with
m2 | Second model, usually generated with
X.test | Covariates of the test set with which the MSE is calculated.
y.test | Responses in the test set with which the MSE is calculated.
B | Number of permutations to use in the test. Note: this is the number of times the trees are permuted between forests to generate the permutation distribution, not the number of times each feature is permuted.
return.preds | Logical. Should model predictions be returned?
test_stat | Not currently useful.
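The note on B describes exchanging trees between the two forests. A minimal base R sketch of that idea, assuming each ensemble predicts by averaging its trees and that the statistic is the difference in test-set MSE. The prediction matrices `preds1` and `preds2`, the helper `ensemble_mse`, and the simulated data are hypothetical stand-ins, not the package's internals:

```r
set.seed(1)
n_test <- 50   # number of test points (illustrative)
ntree  <- 20   # trees per ensemble (illustrative)
y_test <- rnorm(n_test)

# Hypothetical per-tree predictions: one row per test point, one column per tree.
preds1 <- replicate(ntree, y_test + rnorm(n_test, sd = 0.5))
preds2 <- replicate(ntree, y_test + rnorm(n_test, sd = 1.0))

# Ensemble prediction = average over trees; MSE against the test responses.
ensemble_mse <- function(p, y) mean((rowMeans(p) - y)^2)

# Observed difference in MSE (the sign convention is an assumption).
obs_diff <- ensemble_mse(preds1, y_test) - ensemble_mse(preds2, y_test)

B <- 200
pooled <- cbind(preds1, preds2)
perm_diffs <- replicate(B, {
  idx <- sample(ncol(pooled))                          # shuffle trees across both forests
  p1 <- pooled[, idx[seq_len(ntree)], drop = FALSE]    # first ntree trees form forest 1
  p2 <- pooled[, idx[-seq_len(ntree)], drop = FALSE]   # the remaining trees form forest 2
  ensemble_mse(p1, y_test) - ensemble_mse(p2, y_test)
})
```

Each of the B draws reshuffles whole trees, not features, which matches the distinction the argument description makes.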
An object of the S4 class MSE_Test
originalStat | A named vector of two quantities,
PermDiffs | A vector of the differences in permuted MSEs - these make up the permutation distribution.
Importance | A scalar of the SD Importance Z-score.
Pvalue | The p-value for the hypothesis tested.
test_pts | The test data frame.
weak_learner | The base models used in each ensemble - one for model 1 and one for model 2.
model_original | Model 1
model_permuted | Model 2
test_stat | Which test statistic is used. Will always be
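To make the relationship between PermDiffs, Pvalue and Importance concrete, here is a small sketch of how a p-value and a standardized score can be derived from a permutation distribution. The helpers `perm_pvalue` and `perm_zscore` are hypothetical, and the one-sided direction and the "+1" correction are assumptions; the page does not state which conventions the package uses:

```r
# One-sided permutation p-value: share of permuted statistics at least as
# large as the observed one, with the usual +1 correction (an assumption).
perm_pvalue <- function(obs, perm) (1 + sum(perm >= obs)) / (1 + length(perm))

# Standardized score: observed statistic relative to the mean and standard
# deviation of the permutation distribution.
perm_zscore <- function(obs, perm) (obs - mean(perm)) / sd(perm)

perm <- c(-1, 0, 1, 2)   # toy permutation distribution
obs  <- 1.5              # toy observed statistic

perm_pvalue(obs, perm)   # (1 + 1) / (1 + 4) = 0.4
perm_zscore(obs, perm)   # (1.5 - 0.5) / sd(perm)
```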
Tim Coleman
library(dplyr)  # provides mutate() and %>% used below
N <- 1250
Nvar <- 10
N_test <- 150
name_vec <- paste("X", 1:(2*Nvar), sep = "")
# training data:
X <- data.frame(replicate(Nvar, runif(N)),
                replicate(Nvar, cut(runif(N), 3,
                                    labels = as.character(1:3)))) %>%
  mutate(Y = 5*(X3) + .5*X2^2 + ifelse(X6 > 10*X1*X8*X9, 1, 0) + rnorm(N, sd = .05))
names(X) <- c(name_vec, "Y")
# some testing data:
X.t1 <- data.frame(replicate(Nvar, runif(N_test)),
                   replicate(Nvar, cut(runif(N_test), 3,
                                       labels = as.character(1:3)))) %>%
  mutate(Y = 5*(X3) + .5*X2^2 + ifelse(X6 > 10*X1*X8*X9, 1, 0) + rnorm(N_test, sd = .05))
names(X.t1) <- c(name_vec, "Y")
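# Bagged ensembles with different base learners.
# Assumed reconstruction: X holds the covariates only and the response is passed as y.
m_rpart <- bag.s(X = X[, names(X) != "Y"], y = X$Y,
                 base.learner = "rpart", ntree = 100, k = 100^.85, mtry = 10, form = Y~., ranger = FALSE)
m_ctree <- bag.s(X = X[, names(X) != "Y"], y = X$Y,
                 base.learner = "ctree", ntree = 100, k = 100^.85, mtry = 10, form = Y~., ranger = FALSE)
m_ctree_ranger <- bag.s(X = X[, names(X) != "Y"], y = X$Y,
                        base.learner = "ctree", ntree = 100, k = 100^.85, mtry = 10, form = Y~., ranger = TRUE)
m_rtree <- bag.s(X = X[, names(X) != "Y"], y = X$Y,
                 base.learner = "rtree", ntree = 100, k = 100^.85, mtry = 10, form = Y~., ranger = FALSE)
m_rtree_ranger <- bag.s(X = X[, names(X) != "Y"], y = X$Y,
                        base.learner = "rtree", ntree = 100, k = 100^.85, mtry = 10, ranger = TRUE)
m_glm <- bag.s(X = X[, names(X) != "Y"], y = X$Y,
               base.learner = "lm", ntree = 100, k = 100^.85, mtry = 10, ranger = FALSE)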
# Feature importance usage
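# Permute the rows of the covariates so they no longer line up with the response.
X_pm <- X[sample(nrow(X), replace = FALSE), which(names(X) != "Y")]
# Assumed reconstruction: the response is passed as y = X$Y in both calls.
m_reduced <- bag.s(X = X_pm, y = X$Y,
                   base.learner = "rpart", ntree = 100, k = 100^.85, mtry = 10)
m_full <- bag.s(X = X[, names(X) != "Y"], y = X$Y,
                base.learner = "rpart", ntree = 100, k = 100^.85, mtry = 10)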
full_vs_red <- MSE_compare(m_reduced, m_full, X.test = X.t1, y.test = X.t1$Y)
p1 <- data.frame(lapply(m_ctree_ranger, FUN = function(x) predict(x, data = X.t1)["predictions"]))
p2 <- data.frame(lapply(m_rpart, FUN = function(x) predict(x, newdata = X.t1)))
p_glm <- data.frame(lapply(m_glm, predict,
newx = model.matrix(y~., data = data.frame(X.t1, "y" = rep(0, N_test)))))
MSE_comp_rp_ct <- MSE_compare(m_rpart, m_ctree, X.test = X.t1, y.test = X.t1$Y)
MSE_comp_rp_glm <- MSE_compare(m_rpart, m_glm, X.test = X.t1, y.test = X.t1$Y)
MSE_comp_rtr_glm <- MSE_compare(m_rtree_ranger, m_glm, X.test = X.t1)
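The examples above create result objects but do not inspect them. Since the return value is documented as an S4 object, its components are read as slots. The sketch below shows the access pattern on a hypothetical stand-in class, `DemoTest`, that borrows three of the documented slot names; the class, its values and the names inside originalStat are assumptions made for illustration only:

```r
library(methods)

# Hypothetical stand-in class (not the package's MSE_Test definition).
setClass("DemoTest", representation(originalStat = "numeric",
                                    PermDiffs    = "numeric",
                                    Pvalue       = "numeric"))

res <- new("DemoTest",
           originalStat = c(stat1 = 0.12, stat2 = 0.10),  # made-up names and values
           PermDiffs    = c(-0.03, 0.01, 0.02, -0.01),
           Pvalue       = 0.4)

slotNames(res)                  # list the available slots
res@Pvalue                      # direct slot access with @
range(slot(res, "PermDiffs"))   # equivalent access by slot name
```

With the package loaded, the same pattern should presumably apply to the objects created above, for example `full_vs_red@Pvalue` or `hist(full_vs_red@PermDiffs)`.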