SBF | R Documentation |
Function to compute Shared Basis Factorization (SBF) and Orthogonal Shared Basis Factorization (OSBF)
SBF(
  matrix_list = NULL,
  check_col_matching = FALSE,
  col_sep = "_",
  col_index = NULL,
  weighted = FALSE,
  orthogonal = FALSE,
  transform_matrix = FALSE,
  minimizeError = TRUE,
  optimizeV = TRUE,
  initial_exact = FALSE,
  max_iter = 10000,
  tol = 1e-10,
  verbose = FALSE
)
matrix_list |
A list containing Di matrices for joint matrix factorization. Column names of each Di matrix may or may not have information about tissue or cell type. |
check_col_matching |
If the column names carry tissue or cell type information and the one-to-one correspondence of tissue types across species has to be checked, set this to TRUE. Default FALSE. |
col_sep |
Separator used in the column names to separate different fields. For example, for column names 'hsapiens_brain', 'hsapiens_heart', etc., the separator is an underscore. Set it to NULL if column matching across species has to be performed and there is no separator in the column names. Only checked if check_col_matching = TRUE. Default underscore. |
col_index |
If a separator separates information in column names, the col_index is the index in the column name corresponding to tissue or cell type. E.g. for column name 'hsapiens_brain', col_index is 2. Only checked if check_col_matching = TRUE. Default NULL. |
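To make the roles of col_sep and col_index concrete, here is a minimal base R sketch of how a separator and a field index pick the tissue name out of a column name. The column names are hypothetical, and this is only an illustration, not the package's internal matching code.

```r
# Hypothetical column names in the 'species_tissue' layout described above
col_names <- c("hsapiens_brain", "hsapiens_heart", "hsapiens_liver")
col_sep <- "_"    # separator between fields
col_index <- 2    # position of the tissue field after splitting

# Split each name on the separator and keep the field at col_index
fields <- strsplit(col_names, col_sep, fixed = TRUE)
tissues <- vapply(fields, function(x) x[col_index], character(1))
print(tissues)
```

With these names the extracted fields are the tissue labels, which is the information a one-to-one check across species would compare.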
weighted |
If TRUE, each Di^TDi is scaled using inverse-variance weights. Default FALSE. |
orthogonal |
TRUE will compute OSBF. Default FALSE. |
transform_matrix |
If TRUE, each Di is transformed to compute a correlation matrix, and V is computed from this instead of Di^TDi. An unbiased estimate of covariance (denominator n-1) is used for computing the correlation. Default FALSE. |
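The difference between Di^TDi and the correlation matrix of Di can be sketched in base R. The matrices below are random stand-ins, and the sketch only shows the per-matrix quantities; how SBF combines them across the list is not shown here.

```r
set.seed(1)
# Hypothetical input: two matrices with the same number of columns
D_list <- list(matrix(rnorm(12), nrow = 4, ncol = 3),
               matrix(rnorm(15), nrow = 5, ncol = 3))

# Di^T Di for each matrix (the quantity used when transform_matrix = FALSE)
gram_list <- lapply(D_list, crossprod)

# Column correlation matrix of each Di, built from the covariance
# estimate with denominator n - 1
cor_list <- lapply(D_list, function(D) {
  Z <- scale(D, center = TRUE, scale = TRUE)  # scale() uses sd with n - 1
  crossprod(Z) / (nrow(D) - 1)
})

# The manual computation agrees with base R's cor()
agrees <- isTRUE(all.equal(cor_list[[1]], cor(D_list[[1]]),
                           check.attributes = FALSE))
print(agrees)
```

Both quantities are k x k for a Di with k columns, so either can serve as the symmetric matrix from which a shared V is derived.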
minimizeError |
If TRUE, the factorization error is minimized for the OSBF by invoking the 'optimizeFactorization' function. Default TRUE. |
optimizeV |
Whether the initial V should be updated when minimizing the OSBF factorization error. Default TRUE. This is an argument for the 'optimizeFactorization' function. |
initial_exact |
Whether the initial values of U, Delta, and V give an exact factorization. Default FALSE. This is an argument for the 'optimizeFactorization' function. |
max_iter |
Maximum number of iterations. In each iteration, u, delta, and v are updated. Default 1e4. This is an argument for the 'optimizeFactorization' function. |
tol |
Tolerance threshold. During the iterations, if the difference between the previous best and the current best factorization error becomes less than tol, no further iterations are performed. Default tol = 1e-10. This is an argument for the 'optimizeFactorization' function. |
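The interplay of max_iter and tol can be illustrated with a toy loop. The error sequence below is a made-up stand-in for the factorization error, and the loop is only a sketch of the stopping rule as described above, not the code of 'optimizeFactorization'.

```r
max_iter <- 10000
tol <- 1e-10

best_error <- Inf
n_iter <- 0
for (i in seq_len(max_iter)) {
  # Toy error that decreases towards 1; a real run would update
  # u, delta, and v here and recompute the factorization error
  current_error <- 1 + 0.5^i
  n_iter <- i
  # Stop once the improvement over the previous best drops below tol
  if (best_error - current_error < tol) break
  best_error <- min(best_error, current_error)
}
print(n_iter)
```

A looser tolerance such as tol = 1e-2 stops the same loop much earlier, which is why a larger tol shortens run time at the cost of a less refined factorization.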
verbose |
If TRUE, print verbose output. Default FALSE. |
A list containing u, delta, v, m, lambda (the eigenvalues of m), and other outputs of the SBF/OSBF factorization.
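As a rough illustration of how u, delta, and v relate to the input, the base R sketch below assumes the factorization has the form Di = Ui Delta_i V^T, with V taken from the eigenvectors of the sum of Di^TDi. That form, the random input, and all variable names are assumptions made for illustration; the sketch does not reproduce the package's scaling, weighting, or OSBF steps.

```r
set.seed(1)
# Hypothetical input: three matrices sharing the same 3 columns
D_list <- list(matrix(rnorm(12), 4, 3),
               matrix(rnorm(15), 5, 3),
               matrix(rnorm(18), 6, 3))

# Shared basis: eigenvectors of the summed Di^T Di (symmetric, so V is orthogonal)
M <- Reduce(`+`, lapply(D_list, crossprod))
V <- eigen(M, symmetric = TRUE)$vectors

# For each Di: Di %*% V = Ui %*% diag(delta_i), with unit-norm columns in Ui
factors <- lapply(D_list, function(D) {
  B <- D %*% V
  delta <- sqrt(colSums(B^2))
  list(u = sweep(B, 2, delta, "/"), delta = delta)
})

# Sum of squared reconstruction errors over all matrices
recon_error <- sum(mapply(function(D, f) {
  sum((D - f$u %*% diag(f$delta) %*% t(V))^2)
}, D_list, factors))
print(recon_error < 1e-20)
```

Under these assumptions the reconstruction is exact up to floating-point error, because V is orthogonal and each Ui Delta_i is simply Di projected onto V.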
# create test dataset
set.seed(1231)
mymat <- createRandomMatrices(n = 4, ncols = 3, nrows = 4:6)

# SBF call. Estimate V using the sum of Di^TDi
sbf <- SBF(matrix_list = mymat)

# SBF call. Estimate V using inverse-variance weighted Di^TDi
sbf <- SBF(matrix_list = mymat, weighted = TRUE)
# calculate decomposition error
decomperror <- calcDecompError(mymat, sbf$u, sbf$delta, sbf$v)

# SBF call using correlation matrix
sbf_cor <- SBF(matrix_list = mymat, transform_matrix = TRUE)
decomperror <- calcDecompError(mymat, sbf_cor$u, sbf_cor$delta, sbf_cor$v)

# SBF call for gene expression dataset using correlation matrix
avg_counts <- SBF::TissueExprSpecies
sbf_cor <- SBF(matrix_list = avg_counts, transform_matrix = TRUE)

# OSBF call for gene expression dataset using correlation matrix
avg_counts <- SBF::TissueExprSpecies
asbf_cor <- SBF(matrix_list = avg_counts, orthogonal = TRUE,
                transform_matrix = TRUE, tol = 1e-2)