SuperLearner_tidiers: Tidying method(s) for a SuperLearner fit object

SuperLearner_tidiers    R Documentation

Description

This method extends the tidy generic to summarize the results of a SuperLearner fit (screening, prediction, or both) in a data.frame.

Usage

## S3 method for class 'SuperLearner'
tidy(
  x,
  algorithm = c("prediction", "screening", "both"),
  stringsAsFactors = FALSE,
  ...
)

Arguments

x

object of class SuperLearner

algorithm

one of "prediction" (the default), "screening", or "both" (where "both" indicates that information on both "prediction" and "screening" should be reported).

stringsAsFactors

Logical; FALSE by default. Experimental feature: setting this to TRUE may produce warnings or errors, or may have no effect.

...

passed through to the internal functions that summarize the feature selection and/or prediction models (depending on the supplied value for algorithm). See "Optional argument(s)" (below) for the specific arguments accepted.

Details

This method can be used to summarize information related to the screening algorithm(s), prediction algorithm(s), or both.

Value

A data.frame without row names. The columns included depend on the value supplied for algorithm; see "Resulting data.frame" (below).

Optional argument(s)

If algorithm is set to "screening" or "both", the optional argument includeAll can be set to TRUE to include results from the pass-through screening algorithm, "All". By default, includeAll is set to FALSE and these results are excluded.
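A minimal sketch of how includeAll might be passed through `...`. It assumes the SuperLearner, broom, and SLScreenExtra packages are installed (the block does nothing otherwise). The simulated data, the object name sl_all, and the two-element library are illustrative choices, not taken from the package's own examples.

```r
# Run only if the required packages are available (an assumption about the
# reader's setup, not something this help page guarantees).
pkgs <- c("SuperLearner", "broom", "SLScreenExtra")
have_pkgs <- all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))

if (have_pkgs) {
  library(SuperLearner)
  library(broom)
  library(SLScreenExtra)

  # Small simulated data set (illustrative only).
  set.seed(1)
  n <- 100
  p <- 5
  X <- data.frame(matrix(rnorm(n * p), nrow = n, ncol = p))
  Y <- rbinom(n, 1, plogis(0.2 * X[, 1] - 0.2 * X[, 3]))

  # Hypothetical library: one element uses the pass-through screener "All".
  sl_all <- SuperLearner(Y, X, family = binomial(), cvControl = list(V = 2),
                         SL.library = list(c("SL.glm", "All"),
                                           c("SL.glm", "screen.corP")))

  # Default (includeAll = FALSE): results for "All" are excluded.
  print(tidy(sl_all, algorithm = "screening"))

  # includeAll = TRUE: results for "All" are included as well.
  print(tidy(sl_all, algorithm = "screening", includeAll = TRUE))
}
```

Comparing the two printed data.frames should show whether rows with screener equal to "All" are present.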

Resulting data.frame

Columns in the resulting data.frame depend on the selected algorithm:

"prediction"

One row per element in SL.library. Five columns:

"estimate"

Coefficient estimate

"cvRisk"

Estimate of cross-validated risk.

"screener"

Screening algorithm

"predictor"

Prediction algorithm

"discrete"

Logical. Is this the discrete SuperLearner?

"screening"

Number of rows equal to the product of [number of columns in X] and [number of unique screening algorithms in SL.library, not including "All" (by default)]. Three columns:

"screener"

Screening algorithm

"term"

Column of X

"selected"

Logical. Did the screening algorithm select this column of X?

"both"

Number of rows equal to the product of [number of columns in X] and [number of elements in SL.library, not including any elements where the screening algorithm was set to "All" (by default)]. Seven columns constituting the union of the columns returned for "prediction" and "screening" (described above).
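The row counts described above can be worked through in base R for the library used in the Examples section. This is only an arithmetic sketch of the stated rules, not a call into the package. It assumes each library element is a (prediction algorithm, screening algorithm) pair, and the names screeners, keep, and expected_rows are hypothetical.

```r
# Library specification from the Examples section: each element is
# c(prediction algorithm, screening algorithm).
SL.library <- list(c("SL.mean", "screen.wgtd.corP"),
                   c("SL.mean", "screen.wgtd.ttest"),
                   c("SL.glm",  "screen.wgtd.corP"),
                   c("SL.glm",  "screen.wgtd.ttest"))
p <- 10  # number of columns in X, as in the Examples section

# Screening algorithm of each library element (second entry of each pair).
screeners <- vapply(SL.library, function(el) el[2], character(1))

# Mimic the default includeAll = FALSE by dropping the pass-through screener.
keep <- screeners != "All"

expected_rows <- c(
  prediction = length(SL.library),                  # one row per element
  screening  = p * length(unique(screeners[keep])), # columns x unique screeners
  both       = p * sum(keep)                        # columns x library elements
)
expected_rows
# prediction = 4, screening = 20, both = 40
```

The "both" count exceeds the "screening" count here because each screener appears in two library elements, so its per-column results are repeated once per element.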

See Also

tidy.CV.SuperLearner

Examples

# based on an example in the SuperLearner package
set.seed(1)
n <- 100
p <- 10
X <- matrix(rnorm(n*p), nrow = n, ncol = p)
X <- data.frame(X)
Y <- rbinom(n, 1, plogis(.2*X[, 1] + .1*X[, 2] - .2*X[, 3] + .1*X[, 3]*X[, 4] - .2*abs(X[, 4])))

library(SuperLearner)
sl <- SuperLearner(Y, X, family = binomial(), cvControl = list(V = 2),
                  SL.library = list(c("SL.mean", "screen.wgtd.corP"),
                                    c("SL.mean", "screen.wgtd.ttest"),
                                    c("SL.glm", "screen.wgtd.corP"),
                                    c("SL.glm", "screen.wgtd.ttest")))

library(broom)
tidy(sl)
tidy(sl, algorithm = "screening")
tidy(sl, algorithm = "both")


saraemoore/SLScreenExtra documentation built on Nov. 4, 2023, 9:31 p.m.