sesJIVE | R Documentation |
Given multi-source data and an outcome, sesJIVE can simultaneously identify shared (joint) and source-specific (individual) underlying structure while building a prediction model for an outcome using these structures. These two components are weighted to compromise between explaining variation in the multi-source data and in the outcome, and the method can enforce sparsity in the results if specified. Data and the outcome can follow a normal, Bernoulli, or Poisson distribution.
sesJIVE(
  X,
  Y,
  rankJ = 1,
  rankA = rep(1, length(X)),
  wts = NULL,
  max.iter = 1000,
  threshold = 0.001,
  family.x = rep("gaussian", length(X)),
  family.y = "gaussian",
  numCores = 1,
  show.error = F,
  sparse = F,
  lambda = NULL,
  intercept = T,
  show.lambda = F,
  initial = "uninformative"
)
X |
A list of two or more linked data matrices. Each matrix must have the same number of columns, and the columns are assumed to be common (matched) across matrices. |
Y |
A numeric outcome expressed as a vector with length equal to the number of columns in each view of X. |
rankJ |
An integer specifying the joint rank of the data.
If |
rankA |
A vector specifying the individual ranks of the data.
If |
wts |
A value or vector of values between 0 and 1. If |
max.iter |
The maximum number of iterations for each instance of the sesJIVE algorithm. |
threshold |
The threshold used to determine convergence of the algorithm. |
family.x |
A vector of length equal to the number of |
family.y |
A string specifying which exponential family the outcome follows. Options are "gaussian", "binomial", or "poisson". Default is "gaussian". |
numCores |
The number of cores to use when determining the optimal lambda. Default is 1. |
show.error |
A boolean indicating whether or not to display the weighted log-likelihood after each iteration. Default is FALSE. |
sparse |
A boolean indicating whether or not to enforce sparsity in the loadings. See description below for more information. |
lambda |
A value or vector indicating what values of lambda to consider. If a vector of values is given, the optimal lambda will be chosen by cross-validation (CV). |
intercept |
A boolean indicating whether or not there should be an intercept term in the results. |
show.lambda |
A boolean indicating if an intermediate table should be printed that shows the predictive performance of each candidate lambda value. |
initial |
Either "uninformative", "svd", "jive", or "no sparsity" indicating how to generate the initial values for the algorithm. See description for more details. |
The method requires the data to be centered and scaled. This can be done prior to running the method or by specifying center.scale=T. The rank of the joint and individual components, the weight between the data and the outcome, and the lambda value for sparsity can be pre-specified or adaptively selected within the function. The method will print the ranks, the weight, the lambda value, and the number of iterations needed to reach convergence.
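As an illustration of preparing the data beforehand, the sketch below centers and scales each view with base R. It assumes that rows are variables and columns are the shared samples (consistent with the description of X), and that standardizing each row is the intended form of centering and scaling; the helper name center_scale_rows is hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): center and scale each
# variable (row) of a matrix whose columns are the shared samples.
center_scale_rows <- function(mat) {
  # scale() standardizes columns, so transpose, scale, and transpose back.
  t(scale(t(mat), center = TRUE, scale = TRUE))
}

set.seed(1)
# Two simulated views with 20 shared samples (columns).
X <- list(
  matrix(rnorm(30 * 20, mean = 5, sd = 2), nrow = 30, ncol = 20),
  matrix(rnorm(40 * 20, mean = -1, sd = 3), nrow = 40, ncol = 20)
)
X_scaled <- lapply(X, center_scale_rows)
```

After this step every row of each view has mean 0 and standard deviation 1, and the number of columns is unchanged.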
The sesJIVE algorithm uses an iteratively reweighted least squares (IRLS) approach to solve for the parameters. The parameter estimates are initialized according to the initial option in the function. "uninformative" will use random values (via the rnorm function) to initialize the starting values. "svd" will take the singular value decomposition (SVD) of the concatenated X matrix to initialize the joint components, and will take the SVD of each individual X matrix to initialize the individual components. Lastly, "jive" will run Lock et al.'s Joint and Individual Variation Explained (JIVE) (2013) method and use the model fit to initialize the parameters.
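The "svd" initialization described above can be sketched with base R as follows. This is a minimal illustration on simulated data, not the package's internal code: it assumes the views are concatenated row-wise (they share columns), and the object names U_joint, S_joint, and init_indiv are hypothetical.

```r
set.seed(2)
n <- 25                                  # shared samples (columns)
X <- list(matrix(rnorm(10 * n), 10, n),
          matrix(rnorm(15 * n), 15, n))
rankJ <- 1
rankA <- c(1, 2)

# Joint start: SVD of the row-wise concatenation of all views.
X_all <- do.call(rbind, X)
sv <- svd(X_all, nu = rankJ, nv = rankJ)
U_joint <- sv$u                                            # stacked joint loadings
S_joint <- diag(sv$d[seq_len(rankJ)], nrow = rankJ) %*% t(sv$v)  # joint scores

# Individual starts: SVD of each view taken separately.
init_indiv <- Map(function(mat, r) {
  s <- svd(mat, nu = r, nv = r)
  list(W = s$u,                                            # individual loadings
       S = diag(s$d[seq_len(r)], nrow = r) %*% t(s$v))     # individual scores
}, X, rankA)
```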
sesJIVE extends JIVE and sJIVE to allow for different data distributions and sparsity. It decomposes multi-source data into low-rank, orthogonal joint and individual components in a generalized framework that allows each X dataset to follow any exponential family distribution. Each component is broken down into the loadings, or left eigenvectors, and the scores, the product of the eigenvalues and the right eigenvectors. The number of eigenvectors is equal to the rank of the component, and the scores are used to predict y. Sparsity is enforced on the loadings using a LASSO penalty (Tibshirani, 1996), but the fitted score matrices do not have any penalization.
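To make the loadings and scores terminology concrete, the sketch below splits a rank-r approximation of a simulated matrix into loadings and scores, then shrinks only the loadings. Soft-thresholding is used here because it is the closed-form LASSO solution in the orthonormal case; whether sesJIVE applies exactly this update is an assumption, and soft_threshold and the value lambda = 0.2 are purely illustrative.

```r
set.seed(3)
mat <- matrix(rnorm(12 * 8), nrow = 12, ncol = 8)
r <- 2

s <- svd(mat, nu = r, nv = r)
loadings <- s$u                                        # 12 x r, orthonormal columns
scores <- diag(s$d[seq_len(r)], nrow = r) %*% t(s$v)   # r x 8
component <- loadings %*% scores                       # rank-r piece of mat

# Illustrative soft-thresholding operator: shrink toward zero by lambda
# and set entries smaller than lambda in absolute value to exactly zero.
soft_threshold <- function(x, lambda) sign(x) * pmax(abs(x) - lambda, 0)

# Only the loadings are penalized; the scores are left untouched.
sparse_loadings <- soft_threshold(loadings, lambda = 0.2)
```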
sesJIVE returns an object of class "sesJIVE". The function summary (i.e. summary.sesJIVE) can be used to summarize the model results, including a variance table and testing the significance of the joint and individual components.
An object of class "sesJIVE" is a list containing the following components.
S_J |
A matrix capturing the joint scores of the data. |
S_I |
A list containing matrices that capture the individual scores of the data. |
U_I |
A list containing matrices that capture the joint loadings of the data. |
W_I |
A list containing matrices that capture the individual loadings of the data. |
theta1 |
A vector that captures the effect of the joint scores on the outcome. |
theta2 |
A list containing vectors that capture the effect of the individual scores on the outcome. |
fittedY |
The fitted Y values. |
error |
The error value at which the model converged. |
all.error |
The error value at each iteration. |
iterations |
The number of iterations needed to reach convergence. |
rankJ |
The rank of the joint structure. |
rankA |
The rank of the individual structure. |
eta |
The weight between the data and the outcome. |
data |
A list containing the centered and scaled data sets, if applicable. |
predict.sesJIVE
summary.sesJIVE
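A minimal usage sketch on simulated Gaussian data is given below. It uses only the arguments listed in the usage above; the simulated dimensions and the choice wts = 0.5 are arbitrary assumptions for illustration. The call is guarded so that it only runs when a function named sesJIVE is already available in the session.

```r
set.seed(4)
n <- 50                                   # shared samples (columns)

# Two simulated views, with each row centered and scaled beforehand.
X <- list(matrix(rnorm(20 * n), 20, n),
          matrix(rnorm(30 * n), 30, n))
X <- lapply(X, function(mat) t(scale(t(mat))))
Y <- rnorm(n)                             # one outcome value per column of X

# Run only if sesJIVE has been loaded into the session.
if (exists("sesJIVE", mode = "function")) {
  fit <- sesJIVE(X, Y,
                 rankJ = 1, rankA = c(1, 1), wts = 0.5,
                 family.x = rep("gaussian", length(X)),
                 family.y = "gaussian")
  print(summary(fit))    # dispatches to summary.sesJIVE
  print(dim(fit$S_J))    # joint scores
  print(fit$theta1)      # effect of the joint scores on the outcome
}
```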