This package is based on the work of Trygg and Wold (2003). It includes the O2PLS fit, some miscellaneous functions, and some cross-validation tools.
Note that the rows of X and Y are the subjects and the columns are the variables.
The number of columns may differ, but the subjects must be the same, and in the same order, in both datasets.
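As a minimal illustration of this layout, the sketch below simulates two datasets with the same subjects in the rows and a different number of variables in the columns. The dimensions, the seed, and the row names are arbitrary assumptions chosen for the example.

```r
# Simulated example data: 50 subjects, 10 variables in X and 6 in Y.
# All sizes and names here are illustrative assumptions.
set.seed(1)
N <- 50
X <- matrix(rnorm(N * 10), nrow = N, ncol = 10)
Y <- matrix(rnorm(N * 6),  nrow = N, ncol = 6)

# Label the rows so that matching subjects can be checked explicitly.
rownames(X) <- rownames(Y) <- paste0("subject", seq_len(N))

# Rows (subjects) must match; columns (variables) may differ.
stopifnot(nrow(X) == nrow(Y), identical(rownames(X), rownames(Y)))
dim(X)
dim(Y)
```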
The O2PLS model (Trygg & Wold, 2003) decomposes two datasets X and Y into three parts:
1. A joint part, representing the relationship between X and Y
2. An orthogonal part, representing the unrelated latent variation in X and Y separately.
3. A noise part capturing all residual variation.
See also the corresponding paper for the interpretation of these parts (el Bouhaddani et al., 2016).
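Written out, the three parts correspond to the decomposition sketched below. The score names follow the elements of the fitted object (fit$Tt, fit$U, fit$T_Yosc, fit$U_Xosc); the loading symbols W, C, P_Yosc, P_Xosc and the noise terms E, F are notation assumed here for illustration, following the usual O2PLS formulation.

```latex
\begin{align*}
X &= T W^\top + T_{\mathrm{Yosc}} P_{\mathrm{Yosc}}^\top + E, \\
Y &= U C^\top + U_{\mathrm{Xosc}} P_{\mathrm{Xosc}}^\top + F.
\end{align*}
```

In each line the first term is the joint part, the second term is the orthogonal part, and the last term is the noise part.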
The O2PLS fit is done with o2m.
For data X and Y you can run o2m(X,Y,n,nx,ny) for an O2PLS fit with n joint and nx, ny orthogonal components.
See the help page of o2m for more information on parameters.
There are four ways to obtain an O2PLS fit, depending on the dimensionality.
1. For the not-too-high dimensional case, you may use o2m with default parameters, e.g. o2m(X,Y,n,nx,ny).
2. In case you don't want the fancy output, but only the parameters, you may add stripped = TRUE to obtain a stripped version of o2m which avoids calculating and storing some matrices, e.g. o2m(X,Y,n,nx,ny,stripped=TRUE).
3. For high dimensional cases, defined by ncol(X)>p_thresh and ncol(Y)>q_thresh, a Power-Method approach is used which avoids storing large matrices, e.g. o2m(X,Y,n,nx,ny,p_thresh=3000,q_thresh=3000). By default both thresholds are set to 3000 variables.
4. If you want a stripped version in the high dimensional case, add stripped = TRUE, e.g. o2m(X,Y,n,nx,ny,stripped=TRUE,p_thresh=3000,q_thresh=3000).
After fitting an O2PLS model, by running e.g. fit = o2m(X,Y,n,nx,ny), the results can be visualised.
Use plot(fit,...) to plot the desired loadings with/without ggplot2.
Use summary(fit,...) to see the relative explained variances in the joint/orthogonal parts.
Plotting the joint scores fit$Tt, fit$U and the orthogonal scores fit$T_Yosc, fit$U_Xosc is also helpful.
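A fit-and-inspect session could look roughly like the sketch below. The simulated data, the seed, and the choice of 2 joint and 1 + 1 orthogonal components are assumptions made for illustration, and the package calls only run if a package named O2PLS is installed; check the help pages for the exact arguments of your installed version.

```r
# Simulated data with one shared latent variable; all sizes are assumptions.
set.seed(2)
N <- 100
latent <- rnorm(N)
X <- outer(latent, runif(20)) + matrix(rnorm(N * 20), N, 20)
Y <- outer(latent, runif(15)) + matrix(rnorm(N * 15), N, 15)

# Run only when the package is installed, so the sketch does not fail otherwise.
if (requireNamespace("O2PLS", quietly = TRUE)) {
  library(O2PLS)

  # 2 joint components, 1 orthogonal component in X and 1 in Y.
  fit <- o2m(X, Y, 2, 1, 1)

  # Relative explained variances in the joint/orthogonal parts.
  print(summary(fit))

  # First joint score of X against first joint score of Y (base graphics).
  plot(fit$Tt[, 1], fit$U[, 1],
       xlab = "Joint X score 1", ylab = "Joint Y score 1")
}
```

A strong linear trend in the score plot suggests that the first joint component captures variation shared by X and Y.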
Determining the number of components n,nx,ny is an important task. For this we have two methods.
See citation("O2PLS") for our proposed approach for determining the number of components, implemented in crossval_o2m_adjR2.
Cross-validation (CV) is done with crossval_o2m and crossval_o2m_adjR2, both of which have built-in parallelization that relies on the parallel package.
Usage is something like crossval_o2m(X,Y,a,ax,ay), where a,ax,ay are vectors of integers giving the candidate numbers of components. See the help pages.
kcv is the number of folds, with kcv = nrow(X) for Leave-One-Out CV.
For crossval_o2m_adjR2 the same parameters are to be specified. This way of cross-validating is (potentially much) faster than the standard approach.
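To make the role of kcv concrete, the base-R sketch below assigns rows to folds and shows why kcv = nrow(X) amounts to Leave-One-Out CV. The helper make_folds is hypothetical and written only for this illustration; it is not the package's internal code, which may split the data differently.

```r
# Hypothetical helper (not part of the package): assign each of N rows
# to one of kcv folds of (nearly) equal size, in random order.
make_folds <- function(N, kcv) {
  stopifnot(kcv >= 2, kcv <= N)
  sample(rep(seq_len(kcv), length.out = N))
}

set.seed(3)
N <- 12

# 4-fold CV: each fold holds out 3 of the 12 rows.
folds4 <- make_folds(N, kcv = 4)
table(folds4)

# kcv equal to the number of rows: every fold holds out exactly one row,
# which is Leave-One-Out CV.
folds_loo <- make_folds(N, kcv = N)
stopifnot(all(table(folds_loo) == 1))

# Rows used for fitting and for validation in the first fold.
train_rows <- which(folds4 != 1)
test_rows  <- which(folds4 == 1)
```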
Some handy tools are also available:
orth(X) is a function to obtain an orthogonalized version of a matrix or vector X.
ssq(X) is a function to calculate the sum of squares (or squared Frobenius norm) of X. See also vnorm for calculating the norm of each column in X.
mse(x, y) returns the mean squared difference between two matrices/vectors. By default y=0.
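The quantities these tools compute can be sketched in base R as follows. The *_sketch functions are hypothetical stand-ins written for this illustration, not the package's own code; in particular, using a QR decomposition for the orthogonalization is an assumption, and the package may use a different method.

```r
# Hypothetical base-R stand-ins, named *_sketch to avoid clashing with the package.
ssq_sketch   <- function(X) sum(X^2)                       # squared Frobenius norm
vnorm_sketch <- function(X) sqrt(colSums(as.matrix(X)^2))  # norm of each column
mse_sketch   <- function(x, y = 0) mean((x - y)^2)         # mean squared difference
orth_sketch  <- function(X) qr.Q(qr(as.matrix(X)))         # orthonormal columns via QR

A <- matrix(c(3, 4, 0, 5), nrow = 2)   # columns (3, 4) and (0, 5)
ssq_sketch(A)      # 9 + 16 + 0 + 25 = 50
vnorm_sketch(A)    # 5 5
mse_sketch(A)      # 50 / 4 = 12.5

# The columns of the orthogonalized matrix are orthonormal.
Q <- orth_sketch(A)
stopifnot(isTRUE(all.equal(crossprod(Q), diag(2))))
```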
If you use the R package in your research, please cite the corresponding paper:
Bouhaddani, S., Houwing-Duistermaat, J., Jongbloed, G., Salo, P., Perola, M., & Uh, H.-W. (2016). Evaluation of O2PLS in Omics data integration. BMC Bioinformatics BMTL Supplement. doi:10.1186/s12859-015-0854-z
The BibTeX entry can be obtained with the command citation("O2PLS").
Thank You!
The original paper proposing O2PLS is
Trygg, J., & Wold, S. (2003). O2-PLS, a two-block (X-Y) latent variable regression (LVR) method with an integral OSC filter. Journal of Chemometrics, 17(1), 53-64. http://doi.org/10.1002/cem.775
Said el Bouhaddani (s.el_bouhaddani@lumc.nl), Jeanine Houwing-Duistermaat (J.J.Houwing@lumc.nl), Geurt Jongbloed (G.Jongbloed@tudelft.nl), Szymon Kielbasa (S.M.Kielbasa@lumc.nl), Hae-Won Uh (H.Uh@lumc.nl).
Maintainer: Said el Bouhaddani (s.el_bouhaddani@lumc.nl).