cv.splitSelect: Split Selection Modeling for Low-Dimensional Data -...

View source: R/cv.splitSelect.R

Description

cv.splitSelect performs the best split selection algorithm with cross-validation.

Usage

cv.splitSelect(
  x,
  y,
  intercept = TRUE,
  G,
  use.all = TRUE,
  family = c("gaussian", "binomial")[1],
  group.model = c("glmnet", "LS", "Logistic")[1],
  alphas = 0,
  nsample = NULL,
  fix.partition = NULL,
  fix.split = NULL,
  nfolds = 10,
  parallel = FALSE,
  cores = getOption("mc.cores", 2L)
)

Arguments

x

Design matrix.

y

Response vector.

intercept

Boolean indicating whether the model includes an intercept (default is TRUE).

G

Number of groups into which the variables are split. May be a vector of several values.

use.all

Boolean variable to determine if all variables must be used (default is TRUE).

family

Description of the error distribution and link function to be used for the model. Must be one of "gaussian" or "binomial".

group.model

Model used for the groups. Must be one of "glmnet", "LS" or "Logistic".

alphas

Elastic net mixing parameter. Should be between 0 (default) and 1.

nsample

Number of sample splits for each value of G. If NULL, then all splits will be considered (unless there is overflow).

fix.partition

Optional list with G elements indicating the partitions (in each row) to be considered for the splits.

fix.split

Optional matrix with p columns indicating the groups (in each row) to be considered for the splits.

nfolds

Number of folds for the cross-validation procedure.

parallel

Boolean indicating whether the function should run in parallel. Default is FALSE.

cores

Number of cores used for the parallelization.
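
To give a feel for why nsample can matter, the sketch below counts how many splits exist for a given number of variables p and number of groups G. It assumes that use.all = TRUE, that groups are non-empty, and that the order of the groups is irrelevant, in which case the count is a Stirling number of the second kind. The helper name count_splits is hypothetical and is not part of the package.

```r
# Hypothetical helper (not part of splitSelect): number of ways to divide
# p labelled variables into G non-empty, unlabelled groups, i.e. the
# Stirling number of the second kind, via the inclusion-exclusion formula.
count_splits <- function(p, G) {
  k <- 0:G
  sum((-1)^k * choose(G, k) * (G - k)^p) / factorial(G)
}

count_splits(4, 2)   # 7
count_splits(10, 3)  # 9330
```

The count grows quickly with p, so sampling a limited number of splits through nsample may be the practical choice beyond a handful of variables.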

Value

An object of class cv.splitSelect.

Author(s)

Anthony-Alexander Christidis, anthony.christidis@stat.ubc.ca

See Also

coef.cv.splitSelect, predict.cv.splitSelect

Examples

# Setting the parameters
p <- 4
n <- 30
n.test <- 5000
beta <- rep(5, p)
rho <- 0.1
r <- 0.9
SNR <- 3
# Function to build a p x p correlation matrix with entries r^|i-j|
target_cor <- function(r, p){
  Gamma <- diag(p)
  for(i in 1:(p-1)){
    for(j in (i+1):p){
      Gamma[i,j] <- Gamma[j,i] <- r^(abs(i-j))
    }
  }
  return(Gamma)
}
# AR Correlation Structure
Sigma.r <- target_cor(r, p)
Sigma.rho <- target_cor(rho, p)
sigma.epsilon <- as.numeric(sqrt((t(beta) %*% Sigma.rho %*% beta)/SNR))
# Simulate some data
x.train <- mvnfast::rmvn(n, mu=rep(0,p), sigma=Sigma.r)
y.train <- 1 + x.train %*% beta + rnorm(n=n, mean=0, sd=sigma.epsilon)

# Generating the coefficients for a fixed partition of the variables

split.out <- cv.splitSelect(x.train, y.train, G=2, use.all=TRUE,
                            fix.partition=list(matrix(c(2,2), 
                                               ncol=2, byrow=TRUE)), 
                            fix.split=NULL,
                            intercept=TRUE, group.model="glmnet", alphas=0, nfolds=10)
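
The call above restricts the search through fix.partition. As a complementary sketch, the block below builds a matrix in the shape the fix.split description suggests: p columns, one row per candidate split. The encoding (entry j holds the group label of variable j) is an assumption based on the argument description, and the name candidate.splits is purely illustrative.

```r
p <- 4

# Assumed encoding: each row is one candidate split, and entry j is the
# group label of variable j. These three rows are all the ways to divide
# four variables into two groups of size two.
candidate.splits <- rbind(
  c(1, 1, 2, 2),
  c(1, 2, 1, 2),
  c(1, 2, 2, 1)
)

# Sanity check: one column per variable
stopifnot(ncol(candidate.splits) == p)

# Group sizes for each row (one row per split, one column per group)
group.sizes <- t(apply(candidate.splits, 1, function(g) as.vector(table(g))))
group.sizes
```

Under that assumption, a matrix like this would be passed as fix.split in place of fix.partition; this has not been verified against the package here.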

splitSelect documentation built on Nov. 9, 2021, 9:07 a.m.