knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
This vignette illustrates how to use the sup.r.jive
package to run supervised Joint and Individual Variation Explained (sJIVE). sJIVE, developed by Palzer et al. (2022), is a multi-source supervised method that simultaneously identifies joint and individual components among the data sources and builds a linear prediction model.
Here, we avoid methodological details and focus on the functionality of the sup.r.jive
package.
If you haven't used this package before, you'll have to install it first. You can install the development version of sup.r.jive
from GitHub with:
# install.packages("devtools")
# devtools::install_github("enorthrop/sup.r.jive")
library(sup.r.jive)
For this example, we will use the SimData.norm
data file contained in this package. After loading the data, we can see that it contains 2 data matrices saved as a list (X
) and contains a continuous outcome vector (Y
). Note that both data matrices have 40 rows (predictors) and 30 columns (observations). The number of rows need not match across sources, but every source and the outcome must be measured on the same set of individuals.
data("SimData.norm")
str(SimData.norm)
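The layout requirement described above can be checked with a few lines of base R. The sketch below does not use SimData.norm; it builds a hypothetical stand-in list (`toy`) with the same shape, so the object and its values are assumptions made purely for illustration.

```r
# Hypothetical stand-in with the layout described above:
# a list of predictor-by-observation matrices plus an outcome vector.
set.seed(1)
toy <- list(
  X = list(
    matrix(rnorm(40 * 30), nrow = 40, ncol = 30),
    matrix(rnorm(40 * 30), nrow = 40, ncol = 30)
  ),
  Y = rnorm(30)
)

# Rows (predictors) and columns (observations) of each source
dims <- sapply(toy$X, dim)
rownames(dims) <- c("predictors", "observations")
print(dims)

# Every source needs one column per element of Y
same_n <- all(dims["observations", ] == length(toy$Y))
print(same_n)
```

The same check could be applied to any list of matrices before fitting, since a mismatch in the number of observations means the sources were not collected on the same individuals.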
Let's say that we want to uncover the joint structure between the data sources, each source's individual structure, as well as predict the outcome Y
. We can do all three by running the sJIVE()
function. By default, the function will center and scale both X
and Y
. Unless they are supplied through rankJ and rankA, as in the call below, the ranks of the joint and individual components are selected by the permutation approach proposed for JIVE by Lock et al. (2013). The tuning parameter, eta
, is selected by 5-fold cross-validation.
fit <- sJIVE(X=SimData.norm$X, Y=SimData.norm$Y, rankJ=1, rankA = c(1,1))
The output shows the final ranks for the joint and individual components as well as the tuning parameter and the number of iterations it took to reach convergence. By printing the summary model output, we can find similar information.
summary(fit)
If we want to make predictions, we can do so using the predict function. Note that the Y predictions are for the centered and scaled outcome. fit$data
contains the centered and scaled data, which should be used when assessing prediction accuracy of the method.
fit.pred <- predict(fit, newdata = SimData.norm$X)
# MSE
sum((fit$data$Y - fit.pred$Ypred)^2) / length(fit$data$Y)
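Because the predictions are on the centered and scaled outcome, it can help to see how a standardized prediction maps back to the original units. The sketch below uses base R only and simulated values; `Y_raw`, `Ypred_std`, and the stored mean and standard deviation are hypothetical names, and it assumes the outcome was standardized by subtracting its mean and dividing by its standard deviation, which may differ in detail from what the package does internally.

```r
set.seed(2)

# Hypothetical outcome on its original scale
Y_raw  <- rnorm(30, mean = 10, sd = 2)
y_mean <- mean(Y_raw)
y_sd   <- sd(Y_raw)

# Center and scale (assumed form of the standardization)
Y_std <- (Y_raw - y_mean) / y_sd

# Hypothetical predictions on the standardized scale
Ypred_std <- Y_std + rnorm(30, sd = 0.3)

# MSE on the standardized scale
mse_std <- mean((Y_std - Ypred_std)^2)

# Back-transform the predictions and recompute the MSE in original units
Ypred_raw <- Ypred_std * y_sd + y_mean
mse_raw   <- mean((Y_raw - Ypred_raw)^2)

# The two MSEs differ by a factor of y_sd^2
print(c(standardized = mse_std, original = mse_raw, ratio = mse_raw / mse_std))
```

Under this assumption the MSE in original units equals the standardized MSE multiplied by the squared standard deviation of the outcome, so the two are not directly comparable.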
The sup.r.jive
package contains 3 visualization tools that help display the results. Each of the 3 functions will work for an object of class JIVE.pred, sJIVE, or sesJIVE as its input, but in this vignette, we will briefly discuss each function using the fitted sJIVE model from above.
plotHeatmap() displays a heatmap of the contributions of the joint and individual components. Note that there is no heatmap for the residual error of the fitted model. The order_by
option in this function allows the user to order the columns by the original data (default), the joint component (order_by=1
), or the i'th individual component (order_by=i
).
plotHeatmap(fit)
plotHeatmap(fit, order_by = 0, ylab = "Outcome", xlab = c("Data1", "Data2"))
plotVarExplained() creates a series of barplots that display the percent of variance explained by the joint and individual components. The col
option allows you to choose the color palette in the graphs. The values graphed in the barplots can be found in the $variance
output from the summary()
function.
plotVarExplained(fit, col=c("grey20", "grey43", "grey65"))
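As a rough illustration of what "percent of variance explained" can mean for a decomposition into joint, individual, and residual parts, the base R sketch below compares squared Frobenius norms. This is an assumption about the form of the calculation, not a statement of how the package computes its $variance output, and the matrices `J`, `A`, and `E` are simulated stand-ins.

```r
set.seed(3)
p <- 40  # predictors (rows)
n <- 30  # observations (columns)

# Hypothetical rank-1 joint and individual structure plus noise
J <- outer(rnorm(p), rnorm(n))
A <- outer(rnorm(p), rnorm(n))
E <- matrix(rnorm(p * n, sd = 0.5), nrow = p)
X <- J + A + E

# Share of the total sum of squares attributed to each part
total <- norm(X, type = "F")^2
pct <- c(
  joint      = norm(J, type = "F")^2,
  individual = norm(A, type = "F")^2,
  residual   = norm(E, type = "F")^2
) / total

# The shares sum to exactly 1 only if the parts are mutually orthogonal,
# which these simulated matrices are not guaranteed to be.
print(round(100 * pct, 1))
```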
plotFittedValues() displays two diagnostic plots for the fitted Y values. The first plot compares the residuals to the fitted values, and the second plot is a Q-Q plot to look at the quantiles.
plotFittedValues(fit)
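For readers who want to see what these two diagnostics look like outside the package, the sketch below draws comparable plots in base R. It fits an ordinary linear model with lm() to simulated data rather than an sJIVE model, and it assumes the Q-Q plot is drawn for the residuals; both choices are illustrative assumptions.

```r
set.seed(4)

# Simulated data and an ordinary linear model (not sJIVE)
x <- rnorm(30)
y <- 1 + 2 * x + rnorm(30, sd = 0.5)
toy_fit <- lm(y ~ x)

fitted_y <- fitted(toy_fit)
resid_y  <- resid(toy_fit)

# Two panels side by side; restore the old settings afterwards
op <- par(mfrow = c(1, 2))

# Residuals against fitted values: look for trends or uneven spread
plot(fitted_y, resid_y,
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs. fitted")
abline(h = 0, lty = 2)

# Normal Q-Q plot of the residuals: points near the line suggest normality
qqnorm(resid_y)
qqline(resid_y)

par(op)
```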
Palzer, EF, C Wendt, R Bowler, CP Hersh, SE Safo, and EF Lock. 2021. "sJIVE: Supervised Joint and Individual Variation Explained." Pre-print on arXiv.
Lock, EF, KA Hoadley, JS Marron, and AB Nobel. 2013. "Joint and Individual Variation Explained (JIVE) for Integrated Analysis of Multiple Data Types." The Annals of Applied Statistics 7 (1): 523–42.