msma: Multiblock Sparse Partial Least Squares

View source: R/src.r

msma R Documentation

Multiblock Sparse Partial Least Squares

Description

This function performs a matrix decomposition that incorporates sparse and supervised modeling for multiblock multivariate data analysis.

Usage

msma(X, ...)

## Default S3 method:
msma(
  X,
  Y = NULL,
  Z = NULL,
  comp = 2,
  lambdaX = NULL,
  lambdaY = NULL,
  lambdaXsup = NULL,
  lambdaYsup = NULL,
  eta = 1,
  type = "lasso",
  inX = NULL,
  inY = NULL,
  inXsup = NULL,
  inYsup = NULL,
  muX = 0,
  muY = 0,
  defmethod = "canonical",
  scaling = TRUE,
  verbose = FALSE,
  intseed = 1,
  ceps = 1e-04,
  ...
)

## S3 method for class 'msma'
print(x, ...)

Arguments

X

a matrix or list of matrices indicating the explanatory variable(s). This parameter is required.

...

further arguments passed to or from other methods.

Y

a matrix or list of matrices indicating the objective variable(s). This is optional. If Y is not supplied, PCA is implemented.

Z

a vector, response variable(s) for implementing the supervised version of (multiblock) PCA or PLS. This is optional. The length of Z is the number of subjects. If there is no input for Z, then unsupervised PLS/PCA is implemented.

comp

numeric scalar for the maximum number of components to be considered.

lambdaX

numeric vector of regularization parameters for X, with a length equal to the number of blocks. If lambdaX is omitted, no regularization is conducted.

lambdaY

numeric vector of regularization parameters for Y, with a length equal to the number of blocks. If lambdaY is omitted, no regularization is conducted.

lambdaXsup

numeric vector of regularization parameters for the super weight of X, with a length equal to the number of blocks. If lambdaXsup is omitted, no regularization is conducted.

lambdaYsup

numeric vector of regularization parameters for the super weight of Y, with a length equal to the number of blocks. If lambdaYsup is omitted, no regularization is conducted.

eta

numeric scalar indexing the penalty family. In this version, the only available choice is 1.

type

a character, indicating the penalty family. In this version, only one choice is available: "lasso".

inX

a vector or list of numeric vectors specifying the variables in X that are always included in the model.

inY

a vector or list of numeric vectors specifying the variables in Y that are always included in the model.

inXsup

a numeric vector, or list of numeric vectors, specifying the blocks of X that are always included in the model.

inYsup

a numeric vector, or list of numeric vectors, specifying the blocks of Y that are always included in the model.

muX

a numeric scalar for the weight of X for the supervised case. 0 <= muX <= 1.

muY

a numeric scalar for the weight of Y for the supervised case. 0 <= muY <= 1.

defmethod

a character representing the deflation method. In this version, the only available choice is "canonical".

scaling

a logical, indicating whether or not data scaling is performed. The default is TRUE.

verbose

a logical, indicating whether information is displayed during the computation. The default is FALSE.

intseed

seed for the random number generation used in the parameter estimation algorithm.

ceps

a numeric scalar for the convergence condition of the algorithm.

x

an object of class "msma", usually the result of a call to msma.
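The argument descriptions above imply a few shape constraints on the inputs. The following base R sketch builds hypothetical multiblock inputs and checks them. The data, the block names, and the chosen lambda values are illustrative assumptions, and the requirement that all blocks share the same subjects (rows) is assumed from the multiblock setting rather than stated above.

```r
# Hypothetical data: three blocks of X measured on the same 50 subjects
set.seed(1)
n <- 50
X <- list(
  block1 = matrix(rnorm(n * 20), nrow = n),
  block2 = matrix(rnorm(n * 10), nrow = n),
  block3 = matrix(rnorm(n * 5),  nrow = n)
)

# One regularization parameter per block of X (values chosen arbitrarily)
lambdaX <- c(2, 1, 0.5)
stopifnot(length(lambdaX) == length(X))

# Assumption: every block has one row per subject
stopifnot(length(unique(vapply(X, nrow, integer(1)))) == 1)

# Z is a vector whose length is the number of subjects
Z <- rbinom(n, size = 1, prob = 0.5)
stopifnot(length(Z) == n)

# With the msma package attached, a supervised multiblock PCA could then
# be requested along these lines (not run here):
# fit <- msma(X, Z = Z, comp = 2, lambdaX = lambdaX, muX = 0.5)
```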

Details

msma requires at least one input, X (a matrix or a list of matrices); if only X is given, (multiblock) PCA is conducted. If Y is also specified, PLS is conducted using X as the explanatory variables and Y as the objective variables. By default, this function scales each data matrix to a mean of 0 and a variance of 1. The block structure can be represented as a list. If Z is also specified, a supervised version is implemented, and the degree of supervision is controlled by muX and muY, where 0 <= muX <= 1, 0 <= muY <= 1, and 0 <= muX + muY < 1. If a positive lambdaX or lambdaY is specified, sparse estimation based on the L1 penalty is implemented.
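Two of the ideas in this section, default scaling and L1-penalized sparsity, can be illustrated in base R. This is a minimal sketch and not the package's internal code: scale_blocks and soft_threshold are hypothetical helper names, column-wise standardization is an assumption about how the scaling is applied, and soft-thresholding is shown only as the standard operator associated with an L1 penalty.

```r
# Hypothetical helper: standardize every column of every block
# (mean 0, variance 1)
scale_blocks <- function(blocks) {
  lapply(blocks, scale, center = TRUE, scale = TRUE)
}

# Hypothetical helper: soft-thresholding, the usual building block of
# L1-penalized estimation; entries with |w| <= lambda become exactly zero
soft_threshold <- function(w, lambda) {
  sign(w) * pmax(abs(w) - lambda, 0)
}

set.seed(1)
blocks <- list(
  matrix(rnorm(200, mean = 5, sd = 3), nrow = 50),
  matrix(rnorm(100, mean = -2, sd = 0.5), nrow = 50)
)
scaled <- scale_blocks(blocks)

round(colMeans(scaled[[1]]), 10)  # column means are 0
apply(scaled[[1]], 2, var)        # column variances are 1

soft_threshold(c(-3, -0.5, 0.2, 1.5), lambda = 1)
# -2.0  0.0  0.0  0.5
```

A larger lambda sets more entries to zero, which is why a positive lambdaX or lambdaY yields sparse loadings.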

Value

dmode

The mode that was used, either "PLS" or "PCA".

X

Scaled X, in list form.

Y

Scaled Y, in list form.

Xscale

Scaling information for X. The means and standard deviations for each block of X are returned.

Yscale

Scaling information for Y. The means and standard deviations for each block of Y are returned.

comp

the number of components

wbX

block loading for X

sbX

block score for X

wbY

block loading for Y

sbY

block score for Y

ssX

super score for X

wsX

super loading for X

ssY

super score for Y

wsY

super loading for Y

nzwbX

number of nonzeros in block loading for X

nzwbY

number of nonzeros in block loading for Y

nzwsX

number of nonzeros in super loading for X

nzwsY

number of nonzeros in super loading for Y

selectXnames

names of selected variables for X

selectYnames

names of selected variables for Y

avX

the adjusted variance of the score for X

avY

the adjusted variance of the score for Y

cpevX

the cumulative percentage of the explained variance for X

cpevY

the cumulative percentage of the explained variance for Y

reproduct

Reproductivity. Correlation between Y and the predicted Y

predictiv

Predictivity. Correlation between the score for Y and the outcome Z
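Several of the returned quantities are simple summaries of loadings and scores. The base R sketch below shows how such summaries can be computed from a hypothetical loading matrix and from an ordinary PCA. It does not assume the internal layout of wbX or cpevX in an msma object, and the variance calculation uses plain (unadjusted) PCA variances, which may differ from the package's adjusted version.

```r
# Hypothetical sparse loading matrix: 6 variables (rows) x 2 components
W <- cbind(comp1 = c(0.8, 0, 0.6, 0, 0, 0),
           comp2 = c(0, 0.7, 0, 0, 0.7, 0.1))
rownames(W) <- paste0("v", 1:6)

# Number of nonzero loadings per component
nz <- colSums(W != 0)
nz
# comp1 comp2
#     2     3

# Names of variables with a nonzero loading in at least one component
selected <- rownames(W)[rowSums(W != 0) > 0]
selected
# "v1" "v2" "v3" "v5" "v6"

# Cumulative proportion of explained variance from an ordinary PCA
set.seed(1)
X <- scale(matrix(rnorm(50 * 6), nrow = 50))
pc <- prcomp(X)
cpev <- cumsum(pc$sdev^2) / sum(pc$sdev^2)
cpev  # non-decreasing, ends at 1
```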

Examples

##### data #####
tmpdata = simdata(n = 50, rho = 0.8, Yps = c(10, 12, 15), Xps = 20, seed=1)
X = tmpdata$X; Y = tmpdata$Y 

##### One Component #####
fit1 = msma(X, Y, comp=1, lambdaX=2, lambdaY=1:3)
fit1

##### Two Components #####
fit2 = msma(X, Y, comp=2, lambdaX=2, lambdaY=1:3)
fit2

##### Sparse Principal Component Analysis #####
fit3 = msma(X, comp=5, lambdaX=2.5)
summary(fit3)
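The examples above do not cover the supervised case. The following sketch first checks the constraints on muX and muY stated in Details and then, if the msma package is installed, fits a supervised model. check_mu is a hypothetical helper that is not part of msma, and Z is a simulated outcome created only for illustration, so the fitted result carries no substantive meaning.

```r
##### Supervised version (sketch) #####
# Hypothetical helper: constraints on muX and muY as stated in Details
check_mu = function(muX, muY) {
  muX >= 0 && muX <= 1 && muY >= 0 && muY <= 1 && (muX + muY) < 1
}

check_mu(0.3, 0.3)  # TRUE
check_mu(0.6, 0.5)  # FALSE, because muX + muY must be below 1

if (requireNamespace("msma", quietly = TRUE)) {
  library(msma)
  tmpdata = simdata(n = 50, rho = 0.8, Yps = c(10, 12, 15), Xps = 20, seed = 1)
  X = tmpdata$X; Y = tmpdata$Y
  set.seed(1)
  Z = rnorm(50)  # hypothetical outcome, one value per subject
  fit4 = msma(X, Y, Z = Z, comp = 1, lambdaX = 2, lambdaY = 1:3,
              muX = 0.3, muY = 0.3)
  fit4
}
```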


msma documentation built on Aug. 25, 2023, 9:07 a.m.