tune_sgspls: Compute cross-validated mean squared prediction error for...


Description

Tuning function for finding the number of groups and the sparsity levels for an sgspls object. Offers a sequential way to find the optimal sparsities and number of groups for either block.

Usage

tune_sgspls(pls_obj, sparsities = NULL, group_seq = NULL, block = "X",
  folds = 10, progressBar = TRUE, setseed = 1, scale_resp = TRUE)

Arguments

pls_obj

List of parameters, or an object of class cv.sgspls, used to perform cross validation (see examples below).

sparsities

Matrix of sparsities, with columns corresponding to group, subgroup and individual sparsity levels to tune over. If it is NULL then a preselected set of sparsity levels is used.
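For illustration, a grid of candidate sparsity settings could be built with base R as follows (the column names are illustrative only, not required by the package):

```r
# Hypothetical sparsity grid: each row is one candidate setting, with
# columns for the group, subgroup and individual sparsity levels.
levels <- c(0.1, 0.5, 0.9)
sparsities <- as.matrix(expand.grid(group = levels,
                                    subgroup = levels,
                                    individual = levels))
dim(sparsities)  # 27 candidate settings x 3 sparsity columns
```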

group_seq

A vector containing the number of groups to tune over.

block

A string either "X" or "Y" to indicate which block to tune parameters over.

folds

The number of folds to use in cross validation.
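As a rough illustration of what k-fold cross-validated MSEP measures, the following sketch computes it for a plain least-squares fit on simulated data (illustrative only, not the sgspls implementation):

```r
# Minimal sketch of k-fold cross-validated MSEP: fit by least squares
# on each training split and average the squared prediction error on
# the held-out fold.
set.seed(1)
n <- 40; p <- 3; folds <- 10
X <- matrix(rnorm(n * p), n, p)
y <- X %*% c(1, -1, 0.5) + rnorm(n)
fold_id <- sample(rep(seq_len(folds), length.out = n))
msep <- mean(sapply(seq_len(folds), function(k) {
  train <- fold_id != k
  fit <- lm.fit(X[train, , drop = FALSE], y[train])
  pred <- X[!train, , drop = FALSE] %*% fit$coefficients
  mean((y[!train] - pred)^2)
}))
msep
```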

progressBar

Logical, indicating if a progress bar is shown.

setseed

FALSE, or an integer seed for reproducing the tuning results.

scale_resp

Logical; if TRUE, the MSEP is standardised across responses (see the perf function for details).

Value

tune_sgspls returns a list of class cv.sgspls containing:

results_tuning

A matrix containing the tuning parameters and MSEP values.

best

A vector containing the optimal tuning parameters.

parameters

A list of the parameters for a sgspls object.

tuning_sparsities

A matrix of group, subgroup and individual sparsities tuned over.

folds

Number of folds used in cross validation.

min_cv

Minimum MSEP score.

group_seq

Groups tuned over in cross validation.

References

Liquet, B., Lafaye de Micheaux, P., Hejblum, B., Thiebaut, R. A group and sparse group partial least square approach applied in genomics context. Submitted.

See Also

sgspls. Tuning functions: calc_pve, tune_groups. Model performance and estimation: predict.sgspls, perf.sgspls, coef.sgspls.

Examples

 set.seed(1)
 n <- 50; p <- 510
 size.groups = 30; size.subgroups = 5
 groupX <- ceiling(1:p / size.groups)
 subgroupX <- ceiling(1:p / size.subgroups)
 
 X <- matrix(rnorm(n * p), ncol = p, nrow = n)
 
 beta <- rep(0,p)
 bSG <- -2:2; b0 <- rep(0,length(bSG))
 betaG <- c(bSG, b0, bSG, b0, bSG, b0)
 beta[1:size.groups] <- betaG
 
 y <- X %*% beta + rnorm(n)
 
 #--------------------------------------#
 #-- Set up a basic model to tune --#
 
 cv_pls <- list(X=X, Y=y, groupX=groupX, subgroupX=subgroupX)
 
 #---------------------------------------------#
 #-- Tune over 1 to 2 groups and multiple    --#
 #-- sparsity levels for the first component --#
 
 cv_pls_comp1 <- tune_sgspls(pls_obj = cv_pls, group_seq = 1:2, scale_resp = FALSE)
 
 #-- MSEP is on the original scale for the response --#
 cv_pls_comp1$results_tuning
 cv_pls_comp1$best
 
 ## Not run: 
 # Use the optimal fit for the first component and tune the second component
 cv_pls_comp2 <- tune_sgspls(pls_obj = cv_pls_comp1, group_seq = 1:2, scale_resp = FALSE)
 cv_pls_comp2$results_tuning
 cv_pls_comp2$best
 
 # Use the optimal fit for the second component and tune the third component
 cv_pls_comp3 <- tune_sgspls(pls_obj =  cv_pls_comp2, group_seq = 1:2, scale_resp = FALSE)
 cv_pls_comp3$best
 
 model <- do.call(sgspls, args = cv_pls_comp3$parameters)
 
 model
 
 model_coef <- coef(model, type = "coefficients")
 
 cbind(beta, model_coef$B[,,2])
 
## End(Not run)

matt-sutton/sgspls documentation built on June 22, 2019, 10:21 a.m.