sesJIVE: Sparse Exponential Family Supervised JIVE (sesJIVE)


sesJIVE    R Documentation

Sparse Exponential Family Supervised JIVE (sesJIVE)

Description

Given multi-source data and an outcome, sesJIVE can simultaneously identify shared (joint) and source-specific (individual) underlying structure while building a prediction model for an outcome using these structures. These two components are weighted to compromise between explaining variation in the multi-source data and in the outcome, and the method can enforce sparsity in the results if specified. Data and the outcome can follow a normal, Bernoulli, or Poisson distribution.

Usage

sesJIVE(
  X,
  Y,
  rankJ = 1,
  rankA = rep(1, length(X)),
  wts = NULL,
  max.iter = 1000,
  threshold = 0.001,
  family.x = rep("gaussian", length(X)),
  family.y = "gaussian",
  numCores = 1,
  show.error = F,
  sparse = F,
  lambda = NULL,
  intercept = T,
  show.lambda = F,
  initial = "uninformative"
)

Arguments

X

A list of two or more linked data matrices. All matrices must have the same number of columns, and the columns are assumed to correspond to the same observations in every matrix.

Y

A numeric outcome expressed as a vector with length equal to the number of columns in each view of X.

rankJ

An integer specifying the joint rank of the data. If rankJ=NULL, the joint rank will be selected adaptively within the function.

rankA

A vector specifying the individual rank of each data matrix in X. If rankA=NULL, the individual ranks will be selected adaptively within the function.

wts

A value or vector of values between 0 and 1. If wts is a single value, X will be weighted by wts and Y will be weighted by 1-wts. If wts is a vector, 5-fold cross-validation (CV) will pick the value of wts that minimizes the test deviance.

max.iter

The maximum number of iterations for each instance of the sesJIVE algorithm.

threshold

The threshold used to determine convergence of the algorithm.

family.x

A vector with one element per data matrix in X, each specifying which exponential family that matrix follows. Options are "gaussian", "binomial", or "poisson". By default, all X matrices are treated as "gaussian".

family.y

A string specifying which exponential family the outcome follows. Options are "gaussian", "binomial", or "poisson". Default is "gaussian".

numCores

The number of cores to use when determining the optimal lambda. Default is 1.

show.error

A boolean indicating whether or not to display the weighted log-likelihood after each iteration. Default is FALSE.

sparse

A boolean indicating whether or not to enforce sparsity in the loadings. See Details for more information.

lambda

A value or vector indicating which values of lambda to consider. If a vector is supplied, the optimal lambda will be chosen by cross-validation.

intercept

A boolean indicating whether or not there should be an intercept term in the results.

show.lambda

A boolean indicating if an intermediate table should be printed that shows the predictive performance of each candidate lambda value.

initial

One of "uninformative", "svd", "jive", or "no sparsity", indicating how to generate the initial values for the algorithm. See Details for more information.

Details

The method requires the data to be centered and scaled. This can be done prior to running the method or by specifying center.scale=T. The rank of the joint and individual components, the weight between the data and the outcome, and the lambda value for sparsity can be pre-specified or adaptively selected within the function. The method will print the ranks, the weight, the lambda value, and the number of iterations needed to reach convergence.
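As an illustration of preparing the data beforehand, the sketch below standardizes each row of each matrix in base R. It assumes that rows are features and columns are observations, and that "centered and scaled" means each feature has mean 0 and standard deviation 1; the helper standardize_rows is a hypothetical name, not part of the package, and the package may scale differently.

```r
set.seed(1)
n <- 20  # number of observations (columns shared by all matrices)

# Two simulated data sources with different numbers of features (rows)
X <- list(
  matrix(rnorm(30 * n, mean = 5, sd = 2), nrow = 30, ncol = n),
  matrix(rnorm(40 * n, mean = -1, sd = 3), nrow = 40, ncol = n)
)

# scale() standardizes columns, so transpose to standardize each row (feature)
# Hypothetical helper, for illustration only
standardize_rows <- function(m) t(scale(t(m), center = TRUE, scale = TRUE))

X_std <- lapply(X, standardize_rows)

# Largest absolute row mean and largest deviation of a row SD from 1
max_abs_mean <- sapply(X_std, function(m) max(abs(rowMeans(m))))
max_sd_gap <- sapply(X_std, function(m) max(abs(apply(m, 1, sd) - 1)))
```

Standardizing in this way is only meaningful for Gaussian data; count or binary matrices would normally be left on their original scale.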

The sesJIVE algorithm uses an iteratively reweighted least squares (IRLS) approach to solve for the parameters. The starting values of the parameter estimates are set by the initial option. "uninformative" draws random starting values (via the rnorm function). "svd" takes the singular value decomposition (SVD) of the concatenated X matrix to initialize the joint components, and the SVD of each individual X matrix to initialize the individual components. Lastly, "jive" runs the Joint and Individual Variation Explained (JIVE) method of Lock et al. (2013) and uses the resulting model fit to initialize the parameters.
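The "svd" initialization described above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's internal code: the matrices are concatenated by stacking rows (features) because the columns are shared, and all object names (X_all, U_init, S_J_init, ind_init) are hypothetical.

```r
set.seed(2)
n <- 15
X <- list(matrix(rnorm(10 * n), 10, n), matrix(rnorm(8 * n), 8, n))
rankJ <- 2
rankA <- c(1, 1)

# Joint components: SVD of the row-wise concatenation of all sources
X_all <- do.call(rbind, X)
sv <- svd(X_all, nu = rankJ, nv = rankJ)
S_J_init <- diag(sv$d[1:rankJ], nrow = rankJ) %*% t(sv$v)  # rankJ x n scores

# Split the stacked joint loadings back into one block per source
row_ends <- cumsum(sapply(X, nrow))
row_starts <- c(1, head(row_ends, -1) + 1)
U_init <- lapply(seq_along(X), function(i) {
  sv$u[row_starts[i]:row_ends[i], , drop = FALSE]
})

# Individual components: separate SVD of each source
ind_init <- lapply(seq_along(X), function(i) {
  s <- svd(X[[i]], nu = rankA[i], nv = rankA[i])
  list(W = s$u,
       S = diag(s$d[1:rankA[i]], nrow = rankA[i]) %*% t(s$v))
})
```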

sesJIVE extends JIVE and sJIVE to allow for different data distributions and sparsity. It decomposes multi-source data into low-rank, orthogonal joint and individual components in a generalized framework that allows each X dataset to follow its own exponential family distribution. Each component is broken down into the loadings, or left singular vectors, and the scores, the product of the singular values and the right singular vectors. The number of singular vectors is equal to the rank of the component, and the scores are used to predict Y. Sparsity is enforced on the loadings using a LASSO penalty (Tibshirani, 1996), while the fitted score matrices are not penalized.
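To give a sense of how a LASSO penalty produces exact zeros in the loadings, the sketch below applies the soft-thresholding operator, which is the closed-form LASSO solution in the simple orthonormal case. It is an illustration only and does not claim to be the solver used inside sesJIVE; soft_threshold and the example values are hypothetical.

```r
# Soft-thresholding: shrink every entry toward zero by lambda and
# set entries whose magnitude is at most lambda to exactly zero
soft_threshold <- function(x, lambda) sign(x) * pmax(abs(x) - lambda, 0)

# A small made-up loading matrix (3 features, rank 2)
loadings <- matrix(c(0.9, -0.05, 0.3,
                     0.02, -0.6, 0.1), nrow = 3)

sparse_loadings <- soft_threshold(loadings, lambda = 0.1)
n_zero <- sum(sparse_loadings == 0)
```

Larger values of lambda zero out more loadings, which is why lambda is tuned against predictive performance.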

Value

sesJIVE returns an object of class "sesJIVE". The function summary (i.e. summary.sesJIVE) can be used to summarize the model results, including a variance table and testing the significance of the joint and individual components.

An object of class "sesJIVE" is a list containing the following components.

S_J

A matrix capturing the joint scores of the data.

S_I

A list containing matrices that capture the individual scores of the data.

U_I

A list containing matrices that capture the joint loadings of the data.

W_I

A list containing matrices that capture the individual loadings of the data.

theta1

A vector that captures the effect of the joint scores on the outcome.

theta2

A list containing vectors that capture the effect of the individual scores on the outcome.

fittedY

The fitted Y values.

error

The error value at which the model converged.

all.error

The error value at each iteration.

iterations

The number of iterations needed to reach convergence.

rankJ

The rank of the joint structure.

rankA

The rank of the individual structure.

eta

The weight between the data and the outcome.

data

A list containing the centered and scaled data sets, if applicable.
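A minimal usage sketch on simulated Gaussian data follows. It assumes the package is installed under the name sup.r.jive and exports sesJIVE as shown in Usage; the call is guarded so that nothing is fitted when the package is unavailable, and the simulated data carry no real joint structure.

```r
set.seed(3)
n <- 30  # observations, stored as columns

# Two data sources with 20 and 25 features, plus a numeric outcome
X <- list(matrix(rnorm(20 * n), 20, n), matrix(rnorm(25 * n), 25, n))
Y <- rnorm(n)

if (requireNamespace("sup.r.jive", quietly = TRUE)) {
  fit <- sup.r.jive::sesJIVE(
    X, Y,
    rankJ = 1,
    rankA = c(1, 1),
    wts = 0.5,                         # equal weight on X and Y
    family.x = rep("gaussian", length(X)),
    family.y = "gaussian"
  )
  str(fit$fittedY)   # fitted outcome values
  fit$iterations     # iterations needed to reach convergence
}
```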

See Also

predict.sesJIVE, summary.sesJIVE


enorthrop/sup.r.jive documentation built on Nov. 18, 2022, 6:01 p.m.