GeneralStep: General step of panning algorithm
In SMAC-Group/panning: An implementation of the Panning Algorithm


GeneralStep computes the general step of the Panning Algorithm.

GeneralStep(y, X, Id_1s, pi = 0.5, B = 500L, d, alpha = 0.05,
  seed = 854751L, K = 10L, m = 10L, family, type = NULL, divergence,
  W = NULL, proc = 1L, C0 = 0.5, increasing = FALSE, trace = TRUE,
  ...)

`y, X, m, K, family, type, divergence, C0, W, increasing, trace, ...`	(see function `CVmFold`)
`Id_1s`	is the set of indices of promising variables from the models of size `d-1`.
`pi`	is the probability of selecting a predictor from `Id_1s`.
`d, alpha, B, seed, proc`	(see function `InitialStep`)

This function computes the m-fold cross-validation (CV) prediction error for B models of size d. Each of these B models is constructed at random according to the following scheme: each predictor is drawn from Id_1s with probability pi and from its complement with probability 1-pi; a predictor can appear at most once in a given model (sampling without replacement within a model).
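The sampling scheme described above can be sketched in a few lines of R. This is only an illustration, not the package's internal code: `sample_model` is a hypothetical helper, and the fallback used when one of the two pools runs out of predictors is an assumption made here so that the sketch always returns d distinct indices.

```r
# Hypothetical helper (not part of the panning package): draw one model of
# size d from p candidate predictors, favouring the indices in Id_1s.
sample_model <- function(p, Id_1s, d, pi = 0.5) {
  stopifnot(d <= p)
  pool_in  <- Id_1s                        # promising predictors
  pool_out <- setdiff(seq_len(p), Id_1s)   # complement of Id_1s
  model <- integer(0)
  for (j in seq_len(d)) {
    # Draw from Id_1s with probability pi. Assumption: if one pool is
    # exhausted, draw from the other one instead.
    from_in <- length(pool_out) == 0L ||
      (length(pool_in) > 0L && runif(1) < pi)
    if (from_in) {
      k <- sample.int(length(pool_in), 1L)
      model <- c(model, pool_in[k])
      pool_in <- pool_in[-k]               # no replacement within a model
    } else {
      k <- sample.int(length(pool_out), 1L)
      model <- c(model, pool_out[k])
      pool_out <- pool_out[-k]
    }
  }
  model
}

set.seed(854751L)
B <- 5L
# Stack the B sampled models row-wise into a (B x d) matrix of indices.
var.mat <- do.call(rbind, lapply(seq_len(B), function(b) {
  sample_model(p = 40, Id_1s = c(24, 33), d = 2)
}))
```

Each row of `var.mat` here plays the role of one explored model; in GeneralStep the CV prediction error would then be computed for each row.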

The seed can be fixed for reproducibility.

This function is computationally expensive; its running time grows in proportion to B.

GeneralStep returns a list with the following components (exactly the same as in InitialStep):

Ids: is the set I_d^* of indices of predictors with prediction errors cv.error<= q.alpha.
Sds: is the set S_d^* of models of size d with prediction errors cv.error<= q.alpha.
cv.error: is a (B x 1) vector of CV prediction errors.
q.alpha: is the empirical alpha-quantile computed on cv.error.
var.mat: is a (B x d) matrix of indices of the explored models.

The indices returned in Ids are the column numbers of X as supplied, not the column names. They are sorted in increasing order and duplicates are removed. Sds may contain duplicates.
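As a rough illustration of how the returned components relate to one another, the sketch below derives q.alpha, Sds and Ids from a toy cv.error vector and var.mat matrix. The numbers are made up, and the use of `quantile()` with its default type is an assumption; the package may compute the empirical quantile differently.

```r
# Toy inputs standing in for the CV results (values are invented).
cv.error <- c(0.30, 0.10, 0.25, 0.10, 0.40)
var.mat  <- rbind(c(24, 5), c(33, 24), c(7, 12), c(33, 24), c(2, 9))
alpha    <- 0.5

# Empirical alpha-quantile of the CV errors (assumed: default quantile type).
q.alpha <- quantile(cv.error, probs = alpha, names = FALSE)

# Keep the models whose CV error does not exceed the quantile.
keep <- cv.error <= q.alpha
Sds  <- var.mat[keep, , drop = FALSE]   # may contain duplicated models

# Indices of the predictors in the retained models, sorted and de-duplicated.
Ids <- sort(unique(as.vector(Sds)))
```

Note that the model (33, 24) appears twice in `Sds`, whereas each index appears only once in `Ids`, matching the description above.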

Samuel Orso Samuel.Orso@unige.ch

Guerrier, S., Mili, N., Molinari, R., Orso, S., Avella-Medina, M. and Ma, Y. (2015) A Paradigmatic Regression Algorithm for Gene Selection Problems. submitted manuscript. http://arxiv.org/abs/1511.07662.

CVmFold, InitialStep

## Not run: 
#####
# Simulate a logistic regression
n <- 50
set.seed(123)
beta <- c(1, rpois(40, lambda = 0.5))
p <- length(beta)
X <- matrix(rnorm((p-1)*n), nrow=n, ncol=(p-1))
y <- rbinom(n,1,1/(1+exp(-tcrossprod(beta, cbind(1, X)))))
#####
# Assume that the Id_1s obtained from the initial step is
# (see the example in InitialStep)
Id_1s <- c(24,33)
# (can take several seconds to run)
GStep <- GeneralStep(y = y, X = X, Id_1s = c(24,33), d = 2, B = 50,
                     family = binomial(link = "logit"), type = "response",
                     divergence = "classification", trace = FALSE)

# Run the parallelised version (proc = 2)
GStep <- GeneralStep(y = y, X = X, Id_1s = c(24,33), d = 2, B = 50,
                     family = binomial(link = "logit"), type = "response",
                     divergence = "classification", proc = 2, trace = FALSE)

## End(Not run)