Multiblock Sparse Multivariable Analysis

Description

This function performs a matrix decomposition that incorporates sparse and supervised modeling for multiblock multivariable data analysis.

Usage

msma(X, ...)

## Default S3 method:
msma(X, Y = NULL, Z = NULL, comp = 2, lambdaX = NULL,
  lambdaY = NULL, eta = 1, type = "lasso", inX = NULL, inY = NULL,
  muX = 0, muY = 0, defmethod = "canonical", scaling = TRUE,
  verbose = FALSE, ...)

## S3 method for class 'msma'
print(x, ...)

## S3 method for class 'msma'
plot(x, ...)

Arguments

X

a matrix or a list of matrices of explanatory variable(s). This argument is required.

...

further arguments passed to or from other methods.

Y

a matrix or a list of matrices of objective variable(s). This is optional. If Y is not supplied, the PCA method is used.

Z

a vector of response variable(s). This is optional. Its length is the number of subjects. If Z is not supplied, the unsupervised PLS/PCA is used.

comp

numeric scalar for the number of components to be considered.

lambdaX

numeric vector of regularization parameters for X, with length equal to the number of blocks. If omitted, no regularization is applied.

lambdaY

numeric vector of regularization parameters for Y, with length equal to the number of blocks. If omitted, no regularization is applied.

eta

numeric scalar, the parameter indexing the penalty family. The only choice in this version is 1.

type

a character string, the penalty family. The only choice in this version is "lasso".

inX

a numeric vector or a list of numeric vectors specifying the variables of X that are always kept in the model.

inY

a numeric vector or a list of numeric vectors specifying the variables of Y that are always kept in the model.

muX

a numeric scalar, the weight of X in the supervised analysis, with 0<=muX<=1.

muY

a numeric scalar, the weight of Y in the supervised analysis, with 0<=muY<=1.

defmethod

a character string, the deflation method. The only choice in this version is "canonical".

scaling

a logical, whether the data are scaled. The default is TRUE.

verbose

a logical, whether information is printed. The default is FALSE.

x

an object of class "msma", usually the result of a call to msma.
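
As a rough illustration of the input shapes described above, the following sketch builds a two-block X, a three-block Y, and a response Z with base R. The block sizes, the seed, and the object names are arbitrary assumptions made for this illustration. The commented call at the end uses only arguments listed above and needs the msma package to run.

```r
set.seed(1)
n <- 50                                   # number of subjects (assumed)

# X: a list with one matrix per block; every block has the same n rows
X <- list(matrix(rnorm(n * 20), n, 20),
          matrix(rnorm(n * 15), n, 15))

# Y: a list of three blocks
Y <- list(matrix(rnorm(n * 10), n, 10),
          matrix(rnorm(n * 12), n, 12),
          matrix(rnorm(n * 15), n, 15))

# Z: one response value per subject
Z <- rnorm(n)

# One regularization parameter per block
lambdaX <- rep(2, length(X))
lambdaY <- 1:3

# Basic consistency checks on the shapes
stopifnot(all(sapply(c(X, Y), nrow) == n),
          length(Z) == n,
          length(lambdaX) == length(X),
          length(lambdaY) == length(Y))

# With msma installed, a supervised sparse fit could then be requested as:
# fit <- msma(X, Y, Z = Z, comp = 2, lambdaX = lambdaX, lambdaY = lambdaY,
#             muX = 0.3, muY = 0.3)
```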

Details

msma requires at least one input X, given as a matrix or a list. If only X is supplied, the (multiblock) principal component analysis is conducted. If Y is also specified, partial least squares is conducted with X as explanatory variables and Y as objective variables. By default, the function scales each data matrix to mean 0 and variance 1. A block structure is represented as a list. If Z is also specified, the supervised version is implemented, and the degree of supervision is controlled by muX and muY, where 0<=muX<=1, 0<=muY<=1, and 0<=muX+muY<1. If a positive lambdaX or lambdaY is specified, sparse estimation based on the L1 penalty is implemented.
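
The constraints on muX and muY and the effect of an L1 penalty can be sketched in base R. Both helpers below are hypothetical and are not part of msma. Soft-thresholding is shown because it is the usual operator associated with a lasso-type penalty; that the package shrinks loadings in exactly this way is an assumption.

```r
# Hypothetical helper: TRUE when the weights satisfy the stated constraints
valid_mu <- function(muX, muY) {
  muX >= 0 && muX <= 1 && muY >= 0 && muY <= 1 && (muX + muY) < 1
}

valid_mu(0.5, 0.25)   # TRUE
valid_mu(0.5, 0.5)    # FALSE, because muX + muY must be strictly below 1

# Hypothetical helper: soft-thresholding, which shrinks every entry toward
# zero by lambda and sets entries with absolute value <= lambda to zero
soft_threshold <- function(w, lambda) {
  sign(w) * pmax(abs(w) - lambda, 0)
}

w <- c(-3, -0.5, 0, 1, 2.5)
soft_threshold(w, lambda = 1)   # -2.0 0.0 0.0 0.0 1.5
```

A larger lambda zeroes more entries, which corresponds to fewer nonzero loadings.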

Value

dmode

the mode that was used, either "PLS" or "PCA".

X

Scaled X which has a list form.

Y

Scaled Y which has a list form.

Xscale

Scaling information for X. The means and standard deviations for each block of X are returned.

Yscale

Scaling information for Y. The means and standard deviations for each block of Y are returned.

comp

the number of components

wbX

block loading for X. The list has the same length as the input list X (the number of blocks), and each element is a matrix with the variables in the rows and the components in the columns.

sbX

block score for X. The list has the same length as the input list X (the number of blocks), and each element is a matrix with the subjects in the rows and the components in the columns.

wbY

block loading for Y. The list has the same length as the input list Y (the number of blocks), and each element is a matrix with the variables in the rows and the components in the columns.

sbY

block score for Y. The list has the same length as the input list Y (the number of blocks), and each element is a matrix with the subjects in the rows and the components in the columns.

ssX

super score for X. The matrix has the number of subjects in the row and the number of components in the column.

wsX

super loading for X. The matrix has the number of blocks in the row and the number of components in the column.

ssY

super score for Y. The matrix has the number of subjects in the row and the number of components in the column.

wsY

super loading for Y. The matrix has the number of blocks in the row and the number of components in the column.

nzwbX

number of nonzeros in block loading for X

nzwbY

number of nonzeros in block loading for Y

selectXnames

names of selected variables for X. This returns the names of X

selectYnames

names of selected variables for Y. This returns the names of Y
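
To make the scaling information concrete, the sketch below standardizes each block with base R and keeps the column means and standard deviations. The list layout used here is an assumption for illustration; this page does not specify the exact structure of Xscale or Yscale.

```r
set.seed(1)
X <- list(matrix(rnorm(50 * 4, mean = 5, sd = 2), 50, 4),
          matrix(rnorm(50 * 3, mean = -1, sd = 10), 50, 3))

# Standardize each block to column mean 0 and variance 1,
# and record the means and standard deviations that were used
scale_block <- function(B) {
  S <- scale(B, center = TRUE, scale = TRUE)
  list(scaled = S,
       means = attr(S, "scaled:center"),
       sds = attr(S, "scaled:scale"))
}
scaled <- lapply(X, scale_block)

# Column means and standard deviations of the first scaled block
round(colMeans(scaled[[1]]$scaled), 10)
apply(scaled[[1]]$scaled, 2, sd)

# The stored information is enough to recover the original block
restored <- sweep(sweep(scaled[[1]]$scaled, 2, scaled[[1]]$sds, "*"),
                  2, scaled[[1]]$means, "+")
all.equal(restored, X[[1]], check.attributes = FALSE)
```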

Examples

##### data #####
tmpdata = simdata(n = 50, rho = 0.8, Yps = c(10, 12, 15), Xps = 20, seed=1)
X = tmpdata$X; Y = tmpdata$Y 

##### One Component #####
fit1 = msma(X, Y, comp=1, lambdaX=2, lambdaY=1:3)
fit1

##### Two Components #####
fit2 = msma(X, Y, comp=2, lambdaX=2, lambdaY=1:3)
fit2

##### Matrix data #####
# rmvnorm() is provided by the mvtnorm package
library(mvtnorm)
sigma = matrix(0.8, 10, 10)
diag(sigma) = 1
X2 = rmvnorm(50, rep(0, 10), sigma)
Y2 = rmvnorm(50, rep(0, 10), sigma)

fit3 = msma(X2, Y2, comp=1, lambdaX=2, lambdaY=2)
fit3

##### Sparse Principal Component Analysis #####
fit5 = msma(X2, comp=5, lambdaX=2.5)
summary(fit5)