View source: R/SBF.R

SBF    R Documentation

Compute Shared Basis Factorization (SBF) and Orthogonal Shared Basis Factorization (OSBF)

Description

Computes the Shared Basis Factorization (SBF) or, optionally, the Orthogonal Shared Basis Factorization (OSBF) of a list of matrices.

Usage

SBF(
  matrix_list = NULL,
  check_col_matching = FALSE,
  col_sep = "_",
  col_index = NULL,
  weighted = FALSE,
  orthogonal = FALSE,
  transform_matrix = FALSE,
  minimizeError = TRUE,
  optimizeV = TRUE,
  initial_exact = FALSE,
  max_iter = 10000,
  tol = 1e-10,
  verbose = FALSE
)

Arguments

matrix_list

A list containing Di matrices for joint matrix factorization. Column names of each Di matrix may or may not have information about tissue or cell type.

check_col_matching

If the column names carry information about tissue or cell type and the one-to-one correspondence of tissue types across species has to be checked, set this parameter to TRUE. Default FALSE.

col_sep

Separator used in column names to separate different fields. For example, for column names 'hsapiens_brain', 'hsapiens_heart', etc., the separator is an underscore. Set it to NULL if column matching across species has to be performed and there is no separator in the column names. Only checked if check_col_matching = TRUE. Default underscore.

col_index

If a separator separates the fields in the column names, col_index is the position of the field that holds the tissue or cell type. E.g. for column name 'hsapiens_brain', col_index is 2. Only checked if check_col_matching = TRUE. Default NULL.
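
As a rough illustration of how col_sep and col_index relate to the column names, the base R sketch below splits names on the separator and compares the extracted tissue field across two species. The column names and the helper extract_field are hypothetical and are not part of the package; the actual check performed by SBF may differ.

```r
# Hypothetical column names for two species; not taken from the package data.
cols_a <- c("hsapiens_brain", "hsapiens_heart", "hsapiens_liver")
cols_b <- c("mmusculus_brain", "mmusculus_heart", "mmusculus_liver")

# `extract_field` is a made-up helper for this illustration only.
# It splits each name on `col_sep` and keeps the field at `col_index`.
extract_field <- function(col_names, col_sep = "_", col_index = 2) {
  parts <- strsplit(col_names, col_sep, fixed = TRUE)
  vapply(parts, function(p) p[col_index], character(1))
}

tissues_a <- extract_field(cols_a)
tissues_b <- extract_field(cols_b)

# One-to-one correspondence here means: same tissue in the same column position.
columns_match <- identical(tissues_a, tissues_b)
print(tissues_a)
print(columns_match)
```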

weighted

If TRUE, each Di^TDi is scaled using inverse variance weights. Default FALSE.

orthogonal

If TRUE, OSBF is computed instead of SBF. Default FALSE.

transform_matrix

If TRUE, each Di is transformed into a correlation matrix, and V is computed from these correlation matrices instead of from Di^TDi. An unbiased estimate of covariance (denominator n-1) is used when computing the correlation. Default FALSE.
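
The difference between the two inputs can be sketched in base R. The toy matrices below are made up, and averaging over the matrices is an assumption of this sketch rather than a statement about the package internals; stats::cor() is used because it is based on the unbiased covariance (denominator n-1).

```r
set.seed(1)
# Three toy matrices with the same number of columns but different row counts.
D <- list(matrix(rnorm(12), 4, 3),
          matrix(rnorm(15), 5, 3),
          matrix(rnorm(18), 6, 3))

# Cross-product style input: combine the Di^T Di matrices
# (averaging is an assumption of this sketch).
M_crossprod <- Reduce(`+`, lapply(D, crossprod)) / length(D)

# Correlation style input: column correlation matrix of each Di.
# cor() standardizes the unbiased covariance, which uses denominator n - 1.
M_cor <- Reduce(`+`, lapply(D, cor)) / length(D)

# Both are symmetric matrices with one row and column per column of Di.
print(dim(M_crossprod))
print(round(M_cor, 3))
```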

minimizeError

If TRUE, the factorization error is minimized for the OSBF by invoking the 'optimizeFactorization' function. Default TRUE.

optimizeV

Whether the initial V should be updated when minimizing the OSBF factorization error. Default TRUE. This is an argument for the 'optimizeFactorization' function.

initial_exact

Whether the initial values of U, Delta, and V give an exact factorization. Default FALSE. This is an argument for the 'optimizeFactorization' function.

max_iter

Maximum number of iterations. In each iteration, u, d, and v are updated. Default 1e4. This is an argument for the 'optimizeFactorization' function.

tol

Tolerance threshold. During the iterations, if the difference between the previous best and the current best factorization error becomes less than tol, no further iterations are performed. Default tol = 1e-10. This is an argument for the 'optimizeFactorization' function.
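
The way max_iter and tol interact can be illustrated with a generic stopping rule. The sketch below uses a toy one-dimensional objective in place of the factorization error; it is not the update performed by 'optimizeFactorization', and all names in it are illustrative.

```r
# Toy objective standing in for the factorization error; it is NOT the OSBF error.
toy_error <- function(x) (x - 3)^2

max_iter <- 10000
tol <- 1e-10

x <- 0
best_error <- toy_error(x)
n_iter <- 0

for (i in seq_len(max_iter)) {
  x <- x - 0.1 * 2 * (x - 3)        # one gradient step on the toy objective
  current_error <- toy_error(x)
  n_iter <- i
  improvement <- best_error - current_error
  best_error <- min(best_error, current_error)
  # Stop once the best error improves by less than tol.
  if (improvement < tol) break
}

print(n_iter)
print(best_error)
```

With a loose tolerance the loop stops early, and with a tight one it runs until either the improvement falls below tol or max_iter is reached.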

verbose

If TRUE, print verbose output. Default FALSE.

Value

A list containing u, delta, v, m, lambda (the eigenvalues of m), and other outputs of the SBF/OSBF factorization.
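
To give a feel for how u, delta, v, m, and lambda relate to each other, here is a minimal base R sketch of one construction that matches the description above. It assumes that m is the average of Di^TDi, that v holds the eigenvectors of m, and that each Di is written as Ui Delta_i V^T with unit-norm columns in Ui. The package may compute these quantities differently, and the variable names are illustrative.

```r
set.seed(1231)
# Toy input: three matrices with 3 columns and 4, 5, and 6 rows.
D <- list(matrix(rnorm(12), 4, 3),
          matrix(rnorm(15), 5, 3),
          matrix(rnorm(18), 6, 3))

# Assumed shared matrix m and its eigen decomposition.
m <- Reduce(`+`, lapply(D, crossprod)) / length(D)
eig <- eigen(m, symmetric = TRUE)
v <- eig$vectors        # shared orthogonal basis
lambda <- eig$values    # eigenvalues of m

# For each Di: project onto v, then split into unit-norm columns and their norms.
factors <- lapply(D, function(Di) {
  DV <- Di %*% v
  delta <- sqrt(colSums(DV^2))
  u <- sweep(DV, 2, delta, "/")
  list(u = u, delta = delta)
})

# Squared Frobenius reconstruction error summed over all matrices.
recon_error <- sum(mapply(function(Di, f) {
  sum((Di - f$u %*% diag(f$delta) %*% t(v))^2)
}, D, factors))

print(lambda)
print(recon_error)
```

Because v is orthogonal in this construction, the reconstruction is exact up to floating-point error; the columns of each u have unit norm but are not forced to be orthogonal to each other.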

Examples

# create test dataset
set.seed(1231)
mymat <- createRandomMatrices(n = 4, ncols = 3, nrows = 4:6)

# SBF call. Estimate V using the sum of Di^TDi
sbf <- SBF(matrix_list = mymat)

# SBF call. Estimate V using inverse-variance weighted Di^TDi
sbf <- SBF(matrix_list = mymat, weighted = TRUE)
# calculate decomposition error
decomperror <- calcDecompError(mymat, sbf$u, sbf$delta, sbf$v)

# SBF call using correlation matrix
sbf_cor <- SBF(matrix_list = mymat, transform_matrix = TRUE)
decomperror <- calcDecompError(mymat, sbf_cor$u, sbf_cor$delta, sbf_cor$v)

# SBF call for gene expression dataset using correlation matrix
avg_counts <- SBF::TissueExprSpecies
sbf_cor <- SBF(matrix_list = avg_counts, transform_matrix = TRUE)

# OSBF call for gene expression dataset using correlation matrix
avg_counts <- SBF::TissueExprSpecies
asbf_cor <- SBF(matrix_list = avg_counts, orthogonal = TRUE,
                transform_matrix = TRUE, tol = 1e-2)

amalthomas111/SBF documentation built on Sept. 2, 2022, 11:27 a.m.