screen.wgtd.ttest: Weighted t-test screening algorithm

View source: R/twosamptests.R

Description

Performs feature selection by ranking the t statistics or P-values returned from weighted t-tests, which are computed with wtd.t.test.

Usage

screen.wgtd.ttest(
  Y,
  X,
  family,
  obsWeights,
  id,
  selector = c("cutoff.k", "cutoff.k.percent"),
  k = switch(selector, cutoff.k = ceiling(0.5 * ncol(X)), cutoff.k.percent = 0.5, NULL),
  minP = NULL,
  ...
)

Arguments

Y

Outcome (numeric vector). See SuperLearner for specifics.

X

Predictor variable(s) (data.frame or matrix). See SuperLearner for specifics.

family

Error distribution to be used in the model: gaussian or binomial. See SuperLearner for specifics.

obsWeights

Optional numeric vector of observation weights. See SuperLearner for specifics.

id

Cluster identification variable. Currently unused.

selector

A string corresponding to a subset selecting function implemented in the FSelector package. One of: cutoff.k or cutoff.k.percent. Ignored if minP is non-NULL and at least k features have P-values at or below minP. Default: "cutoff.k".

k

Numeric. Minimum number or proportion of features to select. Passed through to the selector. For cutoff.k, this is an integer indicating the number of features to keep from X. For cutoff.k.percent, this is instead the proportion of features to keep. Ignored if minP is non-NULL and at least k features have P-values at or below minP.

minP

Numeric. To pass the screen, resulting P-values must not exceed this number. Ignored if NULL (default) or if fewer than k features have P-values at or below this value.

...

Passed to wtd.t.test. These arguments control bootstrapping of P-values and standard errors as well as forced scaling of weights.
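
The way selector, k, and minP interact, as described in the arguments above, can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper select_by_pvalue is hypothetical, the P-values are made up, and rounding a proportion up with ceiling() is an assumption about how cutoff.k.percent converts k to a count.

```r
# Hypothetical helper (not part of SLScreenExtra): illustrates the documented
# rule for combining `selector`, `k`, and `minP` on a vector of P-values.
select_by_pvalue <- function(pvals,
                             selector = c("cutoff.k", "cutoff.k.percent"),
                             k = NULL, minP = NULL) {
  selector <- match.arg(selector)
  p <- length(pvals)
  # Mirror the documented defaults: half of the features either way.
  if (is.null(k)) {
    k <- switch(selector,
                cutoff.k = ceiling(0.5 * p),
                cutoff.k.percent = 0.5)
  }
  # Assumption: a proportion is converted to a count by rounding up.
  n_keep <- if (selector == "cutoff.k.percent") ceiling(k * p) else k
  n_keep <- min(n_keep, p)

  # minP governs only when at least n_keep features meet the threshold.
  if (!is.null(minP) && sum(pvals <= minP) >= n_keep) {
    return(pvals <= minP)
  }
  # Otherwise keep the n_keep features with the smallest P-values.
  keep <- rep(FALSE, p)
  keep[order(pvals)[seq_len(n_keep)]] <- TRUE
  keep
}

pvals <- c(0.001, 0.20, 0.03, 0.04, 0.60, 0.008)         # made-up values
res_top2 <- select_by_pvalue(pvals, k = 2)               # two smallest P-values
res_minp <- select_by_pvalue(pvals, k = 2, minP = 0.05)  # 4 pass, so minP governs
res_fall <- select_by_pvalue(pvals, k = 5, minP = 0.05)  # only 4 pass, so top 5 by rank
```

In the last call fewer than k features meet minP, so the threshold is ignored and the five best-ranked features are kept, matching the fallback described for minP.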

Value

A logical vector with length equal to ncol(X), in which TRUE marks the features that pass the screen.
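
As a small illustration of how a logical vector of this shape is typically used, the sketch below subsets the columns of a toy data frame. The keep vector is written by hand and merely stands in for a screen's return value; it is an assumption for illustration, not output from screen.wgtd.ttest.

```r
# Toy data: 5 observations, 4 predictors.
X_toy <- data.frame(a = 1:5, b = 6:10, c = 11:15, d = 16:20)

# Hand-written stand-in for the value returned by a screening function:
# one logical per column of X_toy, TRUE for features that pass.
keep <- c(TRUE, FALSE, TRUE, FALSE)
stopifnot(length(keep) == ncol(X_toy))

# drop = FALSE keeps a data.frame even if only one column survives.
X_screened <- X_toy[, keep, drop = FALSE]
selected_names <- names(X_toy)[keep]
```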

Examples

# based on example in SuperLearner package
set.seed(1)
n <- 100
p <- 20
X <- matrix(rnorm(n*p), nrow = n, ncol = p)
X <- data.frame(X)
Y <- rbinom(n, 1, plogis(.2*X[, 1] + .1*X[, 2] - .2*X[, 3] + .1*X[, 3]*X[, 4] - .2*abs(X[, 4])))
obsWeights <- 1/runif(n)
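# screen with the default selector ("cutoff.k"), keeping k = 4 features
screen.wgtd.ttest(Y, X, binomial(), obsWeights, seq_len(n), k = 4)

# wrapper that fixes k = 4 so the screen can be named in SL.library
screen.wgtd.ttest4 <- function(..., k = 4) {
    screen.wgtd.ttest(..., k = k)
}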

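library(SuperLearner)
# compare SL.lm on all predictors against SL.lm on the screened predictors
sl <- SuperLearner(Y, X, family = binomial(), cvControl = list(V = 2),
                   obsWeights = obsWeights,
                   SL.library = list(c("SL.lm", "All"),
                                     c("SL.lm", "screen.wgtd.ttest4")))
sl
# logical matrix showing which predictors each screen retained
sl$whichScreen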

saraemoore/SLScreenExtra documentation built on Nov. 4, 2023, 9:31 p.m.