GeneralStep: General step of panning algorithm


Description

GeneralStep computes the general step of the Panning Algorithm.

Usage

GeneralStep(y, X, Id_1s, pi = 0.5, B = 500L, d, alpha = 0.05,
  seed = 854751L, K = 10L, m = 10L, family, type = NULL, divergence,
  W = NULL, proc = 1L, C0 = 0.5, increasing = FALSE, trace = TRUE,
  ...)

Arguments

y, X, m, K, family, type, divergence, C0, W, increasing, trace, ...

(see function CVmFold)

Id_1s

is the set of indices of promising variables obtained from the models of size d-1.

pi

is the probability of selecting a predictor from Id_1s.

d, alpha, B, seed, proc

(see function InitialStep)

Details

This function computes the m-fold cross-validation (CV) prediction error for B models of size d. Each of those B models is constructed at random with the following scheme: a predictor is drawn from Id_1s with probability pi and from its complement with probability 1-pi; a predictor can appear at most once in a given model (sampling without replacement within a model).

The seed can be fixed for reproducibility.

This function is computationally expensive: its running time grows in proportion to B.
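The sampling scheme described above can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the package's internal code: `sample_model` is a hypothetical helper, and the fallback to the other pool when one pool is exhausted is an assumption made so that the sketch always returns d distinct predictors.

```r
# Hypothetical helper (not part of panning): draw one model of size d
# from p candidate predictors. Each slot is filled from Id_1s with
# probability pi, otherwise from its complement; no predictor is
# picked twice within a model.
sample_model <- function(p, Id_1s, d, pi = 0.5) {
  stopifnot(d <= p)
  complement <- setdiff(seq_len(p), Id_1s)
  model <- integer(0)
  while (length(model) < d) {
    in_left  <- setdiff(Id_1s, model)       # promising predictors not yet used
    out_left <- setdiff(complement, model)  # other predictors not yet used
    # Assumption: fall back to the other pool when one pool is exhausted.
    use_in <- length(out_left) == 0 ||
      (length(in_left) > 0 && runif(1) < pi)
    pool <- if (use_in) in_left else out_left
    # Index into the pool instead of calling sample(pool, 1), which would
    # treat a single number k as the range 1:k.
    model <- c(model, pool[sample.int(length(pool), 1)])
  }
  model
}

set.seed(854751)
B <- 5; d <- 2; p <- 40
# One row per explored model, mimicking the shape of var.mat (B x d).
var.mat <- t(replicate(B, sample_model(p, Id_1s = c(24, 33), d = d)))
```

Each row of `var.mat` here plays the role of one candidate model; in the actual function every such model is then scored by m-fold CV, which is why the cost scales with B.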

Value

GeneralStep returns a list with the following components (exactly the same as in InitialStep):

Ids

is the set I_d^* of indices of predictors belonging to models with prediction error cv.error <= q.alpha.

Sds

is the set S_d^* of models of size d with prediction error cv.error <= q.alpha.

cv.error

is a (B x 1) vector of CV prediction errors.

q.alpha

is the empirical alpha-quantile computed on cv.error.

var.mat

is a (B x d) matrix of indices of the explored models.

The indices returned in Ids are the column numbers of X as supplied, not the column names. They are sorted in increasing order and duplicates are removed. Sds may contain duplicates.
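To make the relationship between the returned components concrete, the following sketch rebuilds q.alpha, Sds and Ids from a toy cv.error vector and var.mat matrix. The input values are made up for illustration, and the use of `quantile()` with its default settings is an assumption; the package may compute the empirical quantile differently.

```r
# Toy inputs standing in for the output of a run (made-up values).
cv.error <- c(0.30, 0.10, 0.25, 0.10, 0.40)   # one CV error per model
var.mat  <- rbind(c(24, 5), c(33, 24), c(7, 12), c(24, 33), c(2, 9))
alpha    <- 0.4

# Empirical alpha-quantile of the CV errors (default quantile type assumed).
q.alpha <- quantile(cv.error, probs = alpha, names = FALSE)

# Models whose CV error does not exceed the quantile.
keep <- cv.error <= q.alpha
Sds  <- var.mat[keep, , drop = FALSE]

# Column indices of X appearing in those models: sorted, duplicates removed.
Ids <- sort(unique(as.vector(Sds)))
```

With these toy values the two retained rows of `Sds` hold the same pair of predictors in a different order, which shows how Sds can contain duplicates while `Ids` (here 24 and 33) does not.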

Author(s)

Samuel Orso Samuel.Orso@unige.ch

References

Guerrier, S., Mili, N., Molinari, R., Orso, S., Avella-Medina, M. and Ma, Y. (2015) A Paradigmatic Regression Algorithm for Gene Selection Problems. submitted manuscript. http://arxiv.org/abs/1511.07662.

See Also

CVmFold, InitialStep

Examples

## Not run: 
#####
# Simulate a logistic regression
n <- 50
set.seed(123)
beta <- c(1, rpois(40, lambda = 0.5))
p <- length(beta)
X <- matrix(rnorm((p-1)*n), nrow=n, ncol=(p-1))
y <- rbinom(n,1,1/(1+exp(-tcrossprod(beta, cbind(1, X)))))
#####
# Assume that Id_1s obtained from the Initial Step is
# (see the example in InitialStep)
Id_1s <- c(24,33)
# (can take several seconds to run)
GStep <- GeneralStep(y = y, X = X, Id_1s = c(24,33), d = 2, B = 50,
                     family = binomial(link = "logit"), type = "response",
                     divergence = "classification", trace = FALSE)

# Run the parallelised version (proc = 2)
GStep <- GeneralStep(y = y, X = X, Id_1s = c(24,33), d = 2, B = 50,
                     family = binomial(link = "logit"), type = "response",
                     divergence = "classification", proc = 2, trace = FALSE)

## End(Not run)

SMAC-Group/panning documentation built on May 9, 2019, 11:19 a.m.