MSE_compare: Comparing two pre-trained models.


View source: R/MSE_Test_File.R

Usage

MSE_compare(m1, m2, X.test, y.test = NULL, B = 1000, return.preds = F, test_stat = if (is.null(y.test)) "KS" else "MSE")

Arguments

m1

First model, usually generated with bag.s.

m2

Second model, usually generated with bag.s.

X.test

Covariates of the test set with which the MSE is calculated.

y.test

Responses in the test set with which the MSE is calculated.

B

Number of permutations to use in the test. Note: this is the number of times the trees are permuted between forests to generate the permutation distribution, not the number of times each feature is permuted.

return.preds

Logical. Should model predictions be returned?

test_stat

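The test statistic to use. Per the usage above, it defaults to "KS" when y.test is NULL and to "MSE" otherwise. Not currently useful.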
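The role of B can be illustrated with a small base-R sketch. It is not the package's implementation: the two "ensembles" are toy matrices of per-tree predictions, and the names preds_1, preds_2, mse, and the add-one p-value formula are assumptions made for illustration. Each of the B iterations shuffles the trees between the two ensembles and recomputes the difference in MSE.

```r
set.seed(1)

n_test <- 50   # number of test points (hypothetical)
ntree  <- 40   # base learners per ensemble (hypothetical)
B      <- 200  # number of permutations

y_test <- rnorm(n_test)

# Toy per-tree predictions: rows are test points, columns are trees.
# Ensemble 1 is deliberately biased so that its MSE is larger.
preds_1 <- matrix(y_test + 0.5 + rnorm(n_test * ntree), nrow = n_test)
preds_2 <- matrix(y_test + rnorm(n_test * ntree), nrow = n_test)

# MSE of an ensemble whose prediction is the average over its trees
mse <- function(pred_matrix, y) mean((rowMeans(pred_matrix) - y)^2)

obs_diff <- mse(preds_1, y_test) - mse(preds_2, y_test)

# Pool all trees, then repeatedly reassign them to two groups of size ntree
pooled <- cbind(preds_1, preds_2)
perm_diffs <- replicate(B, {
  idx   <- sample(ncol(pooled))
  grp_1 <- pooled[, idx[seq_len(ntree)], drop = FALSE]
  grp_2 <- pooled[, idx[-seq_len(ntree)], drop = FALSE]
  mse(grp_1, y_test) - mse(grp_2, y_test)
})

# One-sided permutation p-value with the usual add-one correction
p_value <- (1 + sum(perm_diffs >= obs_diff)) / (B + 1)
```

Here perm_diffs plays the role of a permutation distribution, and a larger B gives a finer-grained p-value at the cost of more computation.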

Value

An object of the S4 class MSE_Test with the following slots:

originalStat

A named vector of two quantities: Original MSE, the MSE of the full model, and Permuted MSE, the MSE of the reduced model.

PermDiffs

A vector of the differences in permuted MSEs - these make up the permutation distribution.

Importance

A scalar of the SD Importance Z-score.

Pvalue

The p-value for the hypothesis tested.

test_pts

The test data frame.

weak_learner

The base models used in each ensemble - one for model 1 and one for model 2.

model_original

Model 1

model_permuted

Model 2

test_stat

Which test statistic is used. Will always be "MSE" for this function.
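Because the return value is an S4 object, its components are read with `@` or `slot()` rather than `$`. The sketch below uses a hypothetical stand-in class, Toy_MSE_Test, with only a few of the components listed above and made-up values; it is meant to show the access pattern, not the real class definition.

```r
library(methods)

# Hypothetical stand-in for MSE_Test with a subset of its components
setClass("Toy_MSE_Test",
         slots = c(originalStat = "numeric",
                   PermDiffs    = "numeric",
                   Pvalue       = "numeric",
                   test_stat    = "character"))

# Made-up values, for illustration only
res <- new("Toy_MSE_Test",
           originalStat = c("Original MSE" = 0.30, "Permuted MSE" = 0.42),
           PermDiffs    = c(-0.02, 0.01, 0.03, -0.01),
           Pvalue       = 0.04,
           test_stat    = "MSE")

slotNames(res)                      # list the available components
res@Pvalue                          # direct slot access
slot(res, "test_stat")              # equivalent programmatic access
res@originalStat["Original MSE"]    # index into the named vector
```

The same pattern should apply to an object returned by MSE_compare, for example full_vs_red@Pvalue, assuming the components are stored as slots under the names given above.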

Author(s)

Tim Coleman

Examples

library(dplyr)  # provides %>% and mutate()

N <- 1250
Nvar <- 10
N_test <- 150
name_vec <- paste("X", 1:(2*Nvar), sep = "")

# training data: Nvar numeric and Nvar three-level categorical covariates
X <- data.frame(replicate(Nvar, runif(N)),
                replicate(Nvar, cut(runif(N), 3,
                                    labels = as.character(1:3)))) %>%
  mutate(Y = 5*(X3) + .5*X2^2 + ifelse(X6 > 10*X1*X8*X9, 1, 0) + rnorm(N, sd = .05))
names(X) <- c(name_vec, "Y")

# some testing data, generated the same way:
X.t1 <- data.frame(replicate(Nvar, runif(N_test)),
                   replicate(Nvar, cut(runif(N_test), 3,
                                       labels = as.character(1:3)))) %>%
  mutate(Y = 5*(X3) + .5*X2^2 + ifelse(X6 > 10*X1*X8*X9, 1, 0) + rnorm(N_test, sd = .05))
names(X.t1) <- c(name_vec, "Y")

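# NOTE: in the source, the first line of each call is cut off after "X = X".
# The covariate/response split used here (X[, name_vec] and X$Y) is an
# assumed reconstruction, not the original text.
m_rpart <- bag.s(X = X[, name_vec], y = X$Y,
                 base.learner = "rpart", ntree = 100, k = 100^.85, mtry = 10, form = Y~., ranger = F)
m_ctree <- bag.s(X = X[, name_vec], y = X$Y,
                 base.learner = "ctree", ntree = 100, k = 100^.85, mtry = 10, form = Y~., ranger = F)
m_ctree_ranger <- bag.s(X = X[, name_vec], y = X$Y,
                        base.learner = "ctree", ntree = 100, k = 100^.85, mtry = 10, form = Y~., ranger = T)
m_rtree <- bag.s(X = X[, name_vec], y = X$Y,
                 base.learner = "rtree", ntree = 100, k = 100^.85, mtry = 10, form = Y~., ranger = F)
m_rtree_ranger <- bag.s(X = X[, name_vec], y = X$Y,
                        base.learner = "rtree", ntree = 100, k = 100^.85, mtry = 10, ranger = T)
m_glm <- bag.s(X = X[, name_vec], y = X$Y,
               base.learner = "lm", ntree = 100, k = 100^.85, mtry = 10, ranger = F)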

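# Feature importance usage
# X_pm holds the covariates with their rows shuffled, which breaks their link to Y
X_pm <- X[sample(nrow(X), replace = F), which(names(X) != "Y")]
# NOTE: in the source, the arguments after "y = X" and "X = X" are cut off.
# X$Y and X[, name_vec] are an assumed reconstruction, not the original text.
m_reduced <- bag.s(X = X_pm, y = X$Y,
                   base.learner = "rpart", ntree = 100, k = 100^.85, mtry = 10)
m_full <- bag.s(X = X[, name_vec], y = X$Y,
                base.learner = "rpart", ntree = 100, k = 100^.85, mtry = 10)
full_vs_red <- MSE_compare(m_reduced, m_full, X.test = X.t1, y.test = X.t1$Y)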

p1 <- data.frame(lapply(m_ctree_ranger, FUN = function(x) predict(x, data = X.t1)["predictions"]))
p2 <- data.frame(lapply(m_rpart, FUN = function(x) predict(x, newdata = X.t1)))
p_glm <- data.frame(lapply(m_glm, predict,
                           newx = model.matrix(y~., data = data.frame(X.t1, "y" = rep(0, N_test)))))

MSE_comp_rp_ct <- MSE_compare(m_rpart, m_ctree, X.test = X.t1, y.test = X.t1$Y)
MSE_comp_rp_glm <- MSE_compare(m_rpart, m_glm, X.test = X.t1, y.test = X.t1$Y)
MSE_comp_rtr_glm <- MSE_compare(m_rtree_ranger, m_glm, X.test = X.t1)

tim-coleman/RFtest documentation built on March 10, 2020, 12:28 p.m.