o2m: Perform O2PLS data integration with two-way orthogonal...

View source: R/OmicsPLS_o2m.R

o2m    R Documentation

Perform O2PLS data integration with two-way orthogonal corrections

Description

NOTE THAT THIS FUNCTION DOES NOT CENTER OR SCALE THE MATRICES! You have to do any normalization yourself. It is best practice to at least center the variables.
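Because preprocessing is left to the caller, a minimal sketch of centering (and optionally scaling) with base R's scale() could look like the following. The names X_raw and Y_raw are placeholders for your own data and are not part of the package.

```r
# Hypothetical raw data; X_raw and Y_raw are illustrative names only
set.seed(1)
X_raw <- matrix(rnorm(100 * 10, mean = 5, sd = 2), 100, 10)
Y_raw <- matrix(rnorm(100 * 11, mean = -3, sd = 4), 100, 11)

# Center each column of X; for Y, center and also scale to unit variance
X <- scale(X_raw, center = TRUE, scale = FALSE)
Y <- scale(Y_raw, center = TRUE, scale = TRUE)

# Column means should now be zero up to floating-point error
max(abs(colMeans(X)))
max(abs(colMeans(Y)))
```

Whether to scale as well as center depends on the data; the matrices X and Y above are what would then be passed to o2m.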

Usage

o2m(
  X,
  Y,
  n,
  nx,
  ny,
  stripped = FALSE,
  p_thresh = 3000,
  q_thresh = p_thresh,
  tol = 1e-10,
  max_iterations = 1000,
  sparse = FALSE,
  groupx = NULL,
  groupy = NULL,
  keepx = NULL,
  keepy = NULL,
  max_iterations_sparsity = 1000
)

Arguments

X

Numeric matrix. Vectors will be coerced to matrix with as.matrix (if this is possible)

Y

Numeric matrix. Vectors will be coerced to matrix with as.matrix (if this is possible)

n

Integer. Number of joint PLS components. Must be positive.

nx

Integer. Number of orthogonal components in X. Negative values are interpreted as 0

ny

Integer. Number of orthogonal components in Y. Negative values are interpreted as 0

stripped

Logical. Use the stripped version of o2m (usually when cross-validating)?

p_thresh

Integer. If X has more than p_thresh columns, a power method optimization is used; see o2m2.

q_thresh

Integer. If Y has more than q_thresh columns, a power method optimization is used; see o2m2.

tol

Double. Threshold below which the NIPALS method is deemed converged. Must be positive.

max_iterations

Integer. Maximum number of iterations for the NIPALS method.

sparse

Logical. Default value is FALSE, in which case O2PLS will be fitted. Set to TRUE for GO2PLS.

groupx

Vector. Used when sparse = TRUE. A vector of strings indicating the group name of each X-variable. Its length must be equal to the number of variables in X. The order of the group names must correspond to the order of the variables.

groupy

Vector. Used when sparse = TRUE. A vector of strings indicating the group name of each Y-variable. Its length must be equal to the number of variables in Y. The order of the group names must correspond to the order of the variables.

keepx

Vector. Used when sparse = TRUE. A vector of length n indicating how many variables (or groups if groupx is provided) to keep in each joint component of X. If the input is a single integer, all components will retain the same number of variables or groups.

keepy

Vector. Used when sparse = TRUE. A vector of length n indicating how many variables (or groups if groupy is provided) to keep in each joint component of Y. If the input is a single integer, all components will retain the same number of variables or groups.

max_iterations_sparsity

Integer. Used when sparse = TRUE. Maximum number of iterations for the NIPALS method for GO2PLS.
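As an illustration of the length requirements on the sparsity arguments, the sketch below builds groupx, groupy, keepx and keepy for a hypothetical problem with 10 X-variables, 11 Y-variables and n = 3 joint components. The group labels and counts are invented for the example, and the check that the keep values do not exceed the number of groups is an assumption made here, not a documented rule.

```r
# Hypothetical dimensions and group labels, for illustration only
p <- 10   # number of X-variables
q <- 11   # number of Y-variables
n <- 3    # number of joint components

# One group name per variable, in the same order as the columns
groupx <- rep(c("gx1", "gx2", "gx3", "gx4", "gx5"), each = 2)
groupy <- c(rep(c("gy1", "gy2", "gy3"), each = 3), "gy4", "gy4")

# Number of groups to keep in each joint component (length n)
keepx <- c(3, 2, 2)
keepy <- c(2, 2, 1)

# Sanity checks matching the argument descriptions above
stopifnot(length(groupx) == p, length(groupy) == q)
stopifnot(length(keepx) == n, length(keepy) == n)
# Assumed check: cannot keep more groups than exist
stopifnot(all(keepx <= length(unique(groupx))),
          all(keepy <= length(unique(groupy))))

# With OmicsPLS installed and centered matrices X and Y, these would be
# passed along the lines of:
# o2m(X, Y, n, nx, ny, sparse = TRUE, groupx = groupx, groupy = groupy,
#     keepx = keepx, keepy = keepy)
```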

Details

If both nx and ny are zero, o2m is equivalent to PLS2 with orthonormal loadings. The default implementation of O2PLS uses svd and is ‘slower’ (in terms of memory); use stripped = TRUE for a stripped version with less output. If either ncol(X) > p_thresh or ncol(Y) > q_thresh, the NIPALS method is used, which does not store the entire covariance matrix. The tolerance on the squared error between successive iterates in the NIPALS approach is set with tol, and the maximum number of NIPALS iterations is set with max_iterations.
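The idea of avoiding the full covariance matrix can be illustrated with a power iteration for the leading singular vector pair of t(X) %*% Y that only ever multiplies by X and Y. This is a sketch in base R under simplifying assumptions (one component, a constant start vector), not the package's o2m2 code; the function name power_svd1 is hypothetical, and tol and max_iterations merely mirror the roles described above.

```r
# Hypothetical helper: leading singular vectors of t(X) %*% Y by power iteration
power_svd1 <- function(X, Y, tol = 1e-10, max_iterations = 1000) {
  w <- rep(1 / sqrt(ncol(X)), ncol(X))   # constant unit-length start vector
  for (i in seq_len(max_iterations)) {
    # Proportional to t(Y) %*% X %*% w, without forming t(X) %*% Y
    cvec <- crossprod(Y, X %*% w)
    cvec <- cvec / sqrt(sum(cvec^2))
    # Proportional to t(X) %*% Y %*% cvec
    w_new <- crossprod(X, Y %*% cvec)
    w_new <- w_new / sqrt(sum(w_new^2))
    err <- sum((w_new - w)^2)            # squared error between iterates
    w <- w_new
    if (err < tol) break
  }
  list(w = drop(w), c = drop(cvec), iterations = i)
}

# Simulated data with one strong shared component
set.seed(42)
tt <- rnorm(100)
X <- outer(tt, rnorm(10)) + 0.1 * matrix(rnorm(100 * 10), 100, 10)
Y <- outer(tt, rnorm(11)) + 0.1 * matrix(rnorm(100 * 11), 100, 11)

fit <- power_svd1(X, Y)

# Compare (up to sign) with the SVD of the explicitly formed matrix
sv <- svd(crossprod(X, Y), nu = 1, nv = 1)
abs(sum(fit$w * sv$u[, 1]))   # close to 1 when the directions agree
```

Each iteration costs a few matrix-vector products with X and Y, so memory use stays proportional to the data rather than to ncol(X) * ncol(Y).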

Value

A list containing

Tt

Joint X scores

W.

Joint X loadings

U

Joint Y scores

C.

Joint Y loadings

E

Residuals in X

Ff

Residuals in Y

T_Yosc

Orthogonal X scores

P_Yosc.

Orthogonal X loadings

W_Yosc

Orthogonal X weights

U_Xosc

Orthogonal Y scores

P_Xosc.

Orthogonal Y loadings

C_Xosc

Orthogonal Y weights

B_U

Regression coefficient in Tt ~ U

B_T.

Regression coefficient in U ~ Tt

H_TU

Residuals in Tt in Tt ~ U

H_UT

Residuals in U in U ~ Tt

X_hat

Prediction of X with Y

Y_hat

Prediction of Y with X

R2X

Variation (measured with ssq, the sum of squares) of the modeled part in X (defined by joint + orthogonal variation) as proportion of variation in X

R2Y

Variation (measured with ssq) of the modeled part in Y (defined by joint + orthogonal variation) as proportion of variation in Y

R2Xcorr

Variation (measured with ssq) of the joint part in X as proportion of variation in X

R2Ycorr

Variation (measured with ssq) of the joint part in Y as proportion of variation in Y

R2X_YO

Variation (measured with ssq) of the orthogonal part in X as proportion of variation in X

R2Y_XO

Variation (measured with ssq) of the orthogonal part in Y as proportion of variation in Y

R2Xhat

Variation (measured with ssq) of the predicted X as proportion of variation in X

R2Yhat

Variation (measured with ssq) of the predicted Y as proportion of variation in Y

W_gr

Joint loadings of X at group level (only available when GO2PLS is used)

C_gr

Joint loadings of Y at group level (only available when GO2PLS is used)
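The R2-type entries above are all ratios of sums of squares. The sketch below shows the arithmetic in base R, using a rank-2 SVD reconstruction as a stand-in for a "modeled part"; it is not an O2PLS fit, and the ssq helper is defined locally for the example.

```r
# Local helper for this sketch: sum of squared entries
ssq <- function(A) sum(A^2)

set.seed(7)
X <- scale(matrix(rnorm(100 * 10), 100, 10), scale = FALSE)  # centered only

# Stand-in for a modeled part: rank-2 SVD reconstruction (not O2PLS)
s <- svd(X)
X_model <- s$u[, 1:2] %*% diag(s$d[1:2]) %*% t(s$v[, 1:2])
E <- X - X_model   # residual part

# Proportion of variation in X captured by the modeled part
R2 <- ssq(X_model) / ssq(X)
R2

# For this orthogonal decomposition, modeled and residual parts add up to 1
all.equal(R2 + ssq(E) / ssq(X), 1)
```

In a fitted o2m object the corresponding proportions are read directly from the list, for example fit$R2X or fit$R2Xcorr, where fit is a placeholder name for the returned object.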

See Also

summary.o2m, plot.o2m, crossval_o2m_adjR2, crossval_sparsity

Examples

test_X <- scale(matrix(rnorm(100*10),100,10))
test_Y <- scale(matrix(rnorm(100*11),100,11))
#  --------- Default run ------------ 
o2m(test_X, test_Y, 3, 2, 1)
#  ---------- Stripped version ------------- 
o2m(test_X, test_Y, 3, 2, 1, stripped = TRUE)
#  ---------- High dimensional version ---------- 
o2m(test_X, test_Y, 3, 2, 1, p_thresh = 1)
#  ------ High D and stripped version --------- 
o2m(test_X, test_Y, 3, 2, 1, stripped = TRUE, p_thresh = 1)
#  ------ Now with more iterations -------- 
o2m(test_X, test_Y, 3, 2, 1, stripped = TRUE, p_thresh = 1, max_iterations = 1e6)
#  ---------------------------------- 


selbouhaddani/OmicsPLS documentation built on Aug. 25, 2022, 9:52 p.m.