sRDA: sRDA.

Description

sRDA.

Sparse Redundancy Analysis (sRDA) expresses the maximum variance in the predicted data set by a linear combination of variables (a latent variable) of the predictor data set. Elastic net penalization (with its variants UST, Ridge, and Lasso penalization) is implemented for sparsity and smoothness, with a built-in cross-validation procedure to obtain the optimal penalization parameters. Multiple latent variables can be obtained; they are orthogonal to each other, so each explains a different portion of the variance in the predicted data set. sRDA is implemented in a Partial Least Squares framework; for more details see Csala et al. (2017).
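The alternating estimation idea behind the description can be sketched in base R. This is a simplified illustration under stated assumptions, not the package's internal code: it uses only a ridge penalty (no sparsity step), a toy data set, and illustrative variable names and update rules chosen for this sketch.

```r
set.seed(1)
n <- 100; p <- 6; q <- 4

# Toy data: the first two predicted variables depend on the first predictor
X <- scale(matrix(rnorm(n * p), n, p))
Y <- matrix(rnorm(n * q), n, q)
Y[, 1:2] <- Y[, 1:2] * 0.3 + X[, 1]
Y <- scale(Y)

ridge_penalty <- 1
max_iterations <- 100
tolerance <- 1e-10   # illustrative value, not the package default

alpha <- rep(1, p)   # weights of the predictor set's latent variable
beta <- rep(1, q)    # weights of the predicted set's latent variable

for (iteration in seq_len(max_iterations)) {
  # Scores of the predicted set's latent variable
  eta <- scale(Y %*% beta)
  # Ridge regression of eta on all predictors updates the alpha weights
  alpha_new <- solve(crossprod(X) + ridge_penalty * diag(p), crossprod(X, eta))
  # Scores of the predictor set's latent variable
  xi <- scale(X %*% alpha_new)
  # Regressing each predicted variable on xi updates the beta weights
  beta_new <- crossprod(Y, xi) / sum(xi^2)
  # Stop once the weights no longer change
  change <- sum((alpha_new - alpha)^2) + sum((beta_new - beta)^2)
  alpha <- alpha_new
  beta <- beta_new
  if (change < tolerance) break
}

round(drop(alpha), 3)
```

In this toy setting the largest alpha weight should belong to the first predictor, since it is the only one related to the predicted variables.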

Usage

sRDA(predictor, predicted, penalization = "enet", ridge_penalty = 1,
  nonzero = 1, max_iterations = 100, tolerance = 1 * 10^-20,
  cross_validate = FALSE, parallel_CV = FALSE, nr_subsets = 10,
  multiple_LV = FALSE, nr_LVs = 1)

Arguments

predictor

The n*p matrix of the predictor data set

predicted

The n*q matrix of the predicted data set

penalization

The penalization method applied during the analysis (none, enet or ust)

ridge_penalty

The ridge penalty parameter of the predictor set's latent variable used for enet (a single number if cross_validate = FALSE, a vector of candidate values otherwise)

nonzero

The number of non-zero weights of the predictor set's latent variable used for enet or ust (a single integer if cross_validate = FALSE, a vector of candidate integers otherwise)

max_iterations

The maximum number of iterations of the algorithm (integer)

tolerance

Convergence criterion (number, a small positive tolerance)

cross_validate

K-fold cross validation to find the optimal penalty parameters (TRUE or FALSE)

parallel_CV

Run the cross validation in parallel (TRUE or FALSE)

nr_subsets

Number of subsets for k-fold cross validation (integer, the value for k)

multiple_LV

Obtain multiple latent variable pairs (TRUE or FALSE)

nr_LVs

Number of latent variable pairs (components) to be obtained (integer)
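To make the cross-validation arguments concrete, the following base R sketch shows one way a grid of candidate penalty values and k folds could be set up. It is an illustration only: the object names grid, fold, test_idx and train_idx are hypothetical, and the package's internal procedure may differ.

```r
set.seed(42)
n <- 250            # number of observations (rows of predictor and predicted)
nr_subsets <- 10    # k in k-fold cross validation

# Candidate penalty values, as they would be passed when cross_validate = TRUE
ridge_penalty <- c(0.1, 1)
nonzero <- c(5, 10, 15)

# Every combination of candidate values is one setting to evaluate
grid <- expand.grid(ridge_penalty = ridge_penalty, nonzero = nonzero)

# Randomly assign each observation to one of k folds of near-equal size
fold <- sample(rep(seq_len(nr_subsets), length.out = n))

# Row indices for testing and training in the first fold
test_idx <- which(fold == 1)
train_idx <- which(fold != 1)

nrow(grid)
table(fold)
```

Each row of grid would be fitted on the training rows and scored on the held-out rows, once per fold, so the number of model fits grows with both the grid size and nr_subsets.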

Value

An object of class "sRDA".

XI

Predictor set's latent variable scores

ETA

Predicted set's latent variable scores

ALPHA

Weights of the predictor set's latent variable

BETA

Weights of the predicted set's latent variable

nr_iterations

Number of iterations run before convergence (or the maximum number of iterations)

SOLVE_XIXI

Inverse of the predictor set's latent variable variance matrix

iterations_crts

The convergence criterion value (a small positive tolerance)

sum_absolute_betas

Sum of the absolute values of beta weights

ridge_penalty

The ridge penalty parameter used for the model

nr_nonzeros

The number of nonzero alpha weights in the model

nr_latent_variables

The number of latent variable pairs (components) in the model

CV_results

The detailed results of cross validations (if cross_validate is TRUE)
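The relationship between the nonzero argument and the nr_nonzeros value can be illustrated with a small soft-thresholding sketch. The helper soft_threshold_top_k is hypothetical and written for this illustration; it shows the general univariate soft-thresholding idea of shrinking weights until only a fixed number remain nonzero, and is not taken from the package source.

```r
# Shrink all weights toward zero so that only the `nonzero` largest survive
# (illustrative helper, not part of the sRDA package).
soft_threshold_top_k <- function(z, nonzero) {
  stopifnot(nonzero >= 1, nonzero <= length(z))
  magnitudes <- sort(abs(z), decreasing = TRUE)
  # Threshold at the (nonzero + 1)-th largest magnitude, or 0 if all are kept
  lambda <- if (nonzero < length(z)) magnitudes[nonzero + 1] else 0
  sign(z) * pmax(abs(z) - lambda, 0)
}

z <- c(0.9, -0.1, 0.4, -0.7, 0.05)   # unpenalized weights (toy values)
w <- soft_threshold_top_k(z, nonzero = 2)

w
sum(w != 0)   # number of nonzero weights after thresholding
```

Ties in magnitude at the threshold would leave fewer nonzero weights than requested, which is one reason a real implementation may handle this step differently.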

Author(s)

Attila Csala

References

Csala A., Voorbraak F.P.J.M., Zwinderman A.H., and Hof M.H. (2017) Sparse redundancy analysis of high-dimensional genetic and genomic data. Bioinformatics, 33, pp.3228-3234. https://doi.org/10.1093/bioinformatics/btx374

Examples

# generate data with a few highly correlated variables
dataXY <- generate_data(nr_LVs = 2,
                           n = 250,
                           nr_correlated_Xs = c(5,20),
                           nr_uncorrelated_Xs = 250,
                           mean_reg_weights_assoc_X =
                             c(0.9,0.5),
                           sd_reg_weights_assoc_X =
                             c(0.05, 0.05),
                           Xnoise_min = -0.3,
                           Xnoise_max = 0.3,
                           nr_correlated_Ys = c(10,15),
                           nr_uncorrelated_Ys = 350,
                           mean_reg_weights_assoc_Y =
                             c(0.9,0.6),
                           sd_reg_weights_assoc_Y =
                             c(0.05, 0.05),
                           Ynoise_min = -0.3,
                           Ynoise_max = 0.3)



# separate predictor and predicted sets
X <- dataXY$X
Y <- dataXY$Y

# run sRDA
RDA.res <- sRDA(predictor = X, predicted = Y, nonzero = 5,
ridge_penalty = 1, penalization = "ust")


# check first 10 weights of X
RDA.res$ALPHA[1:10]

## Not run: 
# run sRDA with cross-validation to determine best penalization parameters
RDA.res <- sRDA(predictor = X, predicted = Y, nonzero = c(5,10,15),
ridge_penalty = c(0.1,1), penalization = "enet", cross_validate = TRUE,
parallel_CV = TRUE)

# check first 10 weights of X
RDA.res$ALPHA[1:10]

# check the Ridge parameter and the number of nonzeros included in the model
RDA.res$ridge_penalty
RDA.res$nr_nonzeros

# check how much time the cross validation took
RDA.res$CV_results$stime

# obtain multiple latent variables (components)
RDA.res <- sRDA(predictor = X, predicted = Y, nonzero = c(5,10,15),
ridge_penalty = c(0.1,1), penalization = "enet", cross_validate = TRUE,
parallel_CV = TRUE, multiple_LV = TRUE, nr_LVs = 2, max_iterations = 5)

# check first 20 weights of X in the first two components
RDA.res$ALPHA[[1]][1:20]
RDA.res$ALPHA[[2]][1:20]

# components are orthogonal to each other
t(RDA.res$XI[[1]]) %*% RDA.res$XI[[2]]


## End(Not run)

sRDA documentation built on May 2, 2019, 6:43 a.m.
