InitialStep: Initial step of the panning algorithm


Description

InitialStep computes the initial step of the Panning Algorithm.

Usage

InitialStep(y, X, d = 1L, alpha = 0.05, B = NULL, seed = 951L,
  m = 10L, K = 10L, family, type = NULL, divergence, W = NULL,
  proc = 1L, C0 = 0.5, increasing = FALSE, trace = TRUE, ...)

Arguments

y, X, m, K, family, type, divergence, C0, W, increasing, trace, ...

(see function CVmFold)

d

the dimension of the model of interest (intercept is always included).

alpha

the level of the quantile of the prediction errors.

B

the number of bootstrap replicates.

seed

the seed for the random number generator.

proc

number of processor(s) for parallelisation.

Details

This function exhaustively computes the m-fold cross-validation (CV) prediction error for all C(p,d) possible models of size d, where p is the number of columns of X, by calling the CVmFold function. If B=NULL (default), B is set equal to C(p,d).

If B takes a positive integer value smaller than the total number of models C(p,d), then the function computes the CV prediction errors for B models of size d randomly selected. In this case, it is possible to set the seed for reproducibility.
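The choice between the exhaustive and the random search described above can be sketched in base R. This is only an illustration, not the package's internal code: the sizes p, d and B are hypothetical, the names all_models and var_mat are made up for the example, and treating B >= C(p,d) as an exhaustive search is an assumption.

```r
# Hypothetical sizes for illustration (not taken from the package internals)
p <- 6L   # number of candidate predictors (columns of X)
d <- 2L   # model size
B <- 5L   # number of models to explore; NULL would mean "all of them"

n_models   <- choose(p, d)      # C(p, d) candidate models of size d
all_models <- t(combn(p, d))    # one row per model, one column per predictor index

if (is.null(B) || B >= n_models) {
  # Exhaustive search (assumption: B >= C(p, d) is treated like B = NULL)
  var_mat <- all_models
} else {
  # Random search: draw B distinct models; the seed makes the draw reproducible
  set.seed(951L)
  var_mat <- all_models[sample.int(n_models, B), , drop = FALSE]
}

dim(var_mat)   # B rows, d columns
```

Each row of var_mat holds the column indices of one candidate model, which mirrors the (B x d) layout of the var.mat component described under Value.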

At this stage, the algorithm does not allow for interaction terms among variables.

This function is computationally expensive: its run time grows in proportion to B.
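To get a feel for how quickly the default B = C(p,d) grows with the model size, the count can be computed directly with choose(). The value p = 40 below matches the number of columns of X simulated in the Examples section; the variable name counts is only illustrative.

```r
p <- 40L                 # number of columns of X, as in the Examples section
d <- 1:3                 # model sizes to compare
counts <- choose(p, d)   # number of candidate models for each size

data.frame(d = d, n_models = counts)
# d = 1:    40 models
# d = 2:   780 models
# d = 3:  9880 models
```

For larger d, setting B to a value smaller than C(p,d) keeps the number of cross-validated models under control.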

Value

InitialStep returns a list with the following components:

Ids

is the set I_d^* of indices of the predictors that appear in models with prediction error cv.error <= q.alpha.

Sds

is the set S_d^* of models of size d with prediction error cv.error <= q.alpha.

cv.error

is a (B x 1) vector of CV prediction errors.

q.alpha

is the empirical alpha-quantile computed on cv.error.

var.mat

is a (B x d) matrix of indices of the explored models.

The indices returned in Ids are the column numbers of X as supplied, not the column names. They are sorted in increasing order and duplicates are removed. Sds may contain duplicates.
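The relationship between the returned components can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the inputs var_mat and cv_error are made-up values, and the use of R's default quantile() definition is an assumption.

```r
# Hypothetical inputs: 5 explored models of size d = 2 and their CV errors
var_mat  <- rbind(c(1, 2), c(1, 3), c(2, 4), c(3, 4), c(2, 3))
cv_error <- c(0.30, 0.10, 0.25, 0.10, 0.40)
alpha    <- 0.4

# Empirical alpha-quantile of the CV errors (default quantile type assumed)
q_alpha <- quantile(cv_error, probs = alpha, names = FALSE)

# Keep the models whose CV error does not exceed the quantile
keep <- cv_error <= q_alpha

Sds <- var_mat[keep, , drop = FALSE]   # retained models, one per row
Ids <- sort(unique(as.vector(Sds)))    # retained predictor indices, sorted, no duplicates
```

Here Sds keeps the models (1, 3) and (3, 4), so Ids contains the column indices 1, 3 and 4.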

Author(s)

Samuel Orso Samuel.Orso@unige.ch

References

Guerrier, S., Mili, N., Molinari, R., Orso, S., Avella-Medina, M. and Ma, Y. (2015) A Paradigmatic Regression Algorithm for Gene Selection Problems. submitted manuscript. http://arxiv.org/abs/1511.07662.

See Also

CVmFold, GeneralStep

Examples

## Not run: 
#####
# Simulate a logistic regression
n <- 50
set.seed(123)
beta <- c(1, rpois(40, lambda = 0.5))
p <- length(beta)
X <- matrix(rnorm((p-1)*n), nrow=n, ncol=(p-1))
y <- rbinom(n,1,1/(1+exp(-tcrossprod(beta, cbind(1, X)))))
#####

# (can take several seconds to run)
IStep <- InitialStep(y = y, X = X, family = binomial(link = "logit"), type = "response",
                     divergence = "classification", trace = FALSE)

# Run the parallelised version (2 cores, as set by proc = 2)
IStep <- InitialStep(y = y, X = X, family = binomial(link = "logit"), type = "response",
                     divergence = "classification", proc = 2, trace = FALSE)

## End(Not run)

SMAC-Group/panning documentation built on May 9, 2019, 11:19 a.m.