corralm: Multi-table correspondence analysis (list of matrices)

View source: R/corralm.R

corralm_matlistR Documentation

Multi-table correspondence analysis (list of matrices)

Description

This multi-table adaptation of correpondence analysis applies the same scaling technique and enables data alignment by finding a set of embeddings for each dataset within shared latent space.

Usage

corralm_matlist(
  matlist,
  method = c("irl", "svd"),
  ncomp = 30,
  rtype = c("indexed", "standardized", "hellinger", "freemantukey", "pearson"),
  vst_mth = c("none", "sqrt", "freemantukey", "anscombe"),
  rw_contrib = NULL,
  ...
)

corralm_sce(
  sce,
  splitby,
  method = c("irl", "svd"),
  ncomp = 30,
  whichmat = "counts",
  fullout = FALSE,
  rw_contrib = NULL,
  ...
)

corralm(inp, whichmat = "counts", fullout = FALSE, ...)

## S3 method for class 'corralm'
print(x, ...)

Arguments

matlist

(for corralm_matlist) list of input matrices; input matrices should be counts (raw or log). Matrices should be aligned row-wise by common features (either by sample or by gene)

method

character, the algorithm to be used for svd. Default is irl. Currently supports 'irl' for irlba::irlba or 'svd' for stats::svd

ncomp

numeric, number of components; Default is 30

rtype

character indicating what type of residual should be computed; options are '"indexed"', '"standardized"' (or '"pearson"' is equivalent), '"freemantukey"', and '"hellinger"'; defaults to '"standardized"' for corral and '"indexed"' for corralm. '"indexed"', '"standardized"', and '"freemantukey"' compute the respective chi-squared residuals and are appropriate for count data. The '"hellinger"' option is appropriate for continuous data.

vst_mth

character indicating whether a variance-stabilizing transform should be applied prior to calculating chi-squared residuals; defaults to '"none"'

rw_contrib

numeric vector, same length as the matlist. Indicates the weight that each dataset should contribute to the row weights. When set to NULL the row weights are *not* combined and each matrix is scaled independently (i.e., using their observed row weights, respectively). When set to a vector of all the same values, this is equivalent to taking the mean. Another option is to the number of observations per matrix to create a weighted mean. Regardless of input scale, row weights for each table must sum to 1 and thus are scaled. When this option is specified (i.e., not 'NULL'), the 'rtype' argument will automatically be set to 'standardized', and whatever argument is given will be ignored.

...

(additional arguments for methods)

sce

(for corralm_sce) SingleCellExperiment; containing the data to be integrated. Default is to use the counts, and to include all of the data in the integration. These can be changed by passing additional arguments. See sce2matlist function documentation for list of available parameters.

splitby

character; name of the attribute from colData that should be used to separate the SCE.

whichmat

char, when using SingleCellExperiment or other SummarizedExperiment, can be specified. default is 'counts'.

fullout

boolean; whether the function will return the full corralm output as a list, or a SingleCellExperiment; defaults to SingleCellExperiment (FALSE). To get back the corralm_matlist-style output, set this to TRUE.

inp

list of matrices (any type), a SingleCellExperiment, list of SingleCellExperiments, list of SummarizedExperiments, or MultiAssayExperiment. If using SingleCellExperiment or SummarizedExperiment, then include the whichmat argument to specify which slot to use (defaults to counts). Additionally, if it is one SingleCellExperiment, then it is also necessary to include the splitby argument to specify the batches. For a MultiAssayExperiment, it will take the intersect of the features across all the assays, and use those to match the matrices; to use a different subset, select desired subsets then call corral

x

(print method) corralm object; the list output from corralm_matlist

Details

corralm is a wrapper for corralm_matlist and corralm_sce, and can be called on any of the acceptable input types (see inp below).

Value

When run on a list of matrices, a list with the correspondence analysis matrix decomposition result, with indices corresponding to the concatenated matrices (in order of the list):

d

a vector of the diagonal singular values of the input mat (from SVD output)

u

a matrix of with the left singular vectors of mat in the columns (from SVD output)

v

a matrix of with the right singular vectors of mat in the columns. When cells are in the columns, these are the cell embeddings. (from SVD output)

eigsum

sum of the eigenvalues for calculating percent variance explained

For SingleCellExperiment input, returns the SCE with embeddings in the reducedDim slot 'corralm'

For a list of SingleCellExperiments, returns a list of the SCEs with the embeddings in the respective reducedDim slot 'corralm'

.

Examples

listofmats <- list(matrix(sample(seq(0,20,1),1000,replace = TRUE),nrow = 25),
                   matrix(sample(seq(0,20,1),1000,replace = TRUE),nrow = 25))
result <- corralm_matlist(listofmats)
library(DuoClustering2018)
library(SingleCellExperiment)
sce <- sce_full_Zhengmix4eq()[1:100,sample(1:3500,100,replace = FALSE)]
colData(sce)$Method <- matrix(sample(c('Method1','Method2'),100,replace = TRUE))
result <- corralm_sce(sce, splitby = 'Method')


listofmats <- list(matrix(sample(seq(0,20,1),1000,replace = TRUE),nrow = 20),
                   matrix(sample(seq(0,20,1),1000,replace = TRUE),nrow = 20))
corralm(listofmats)

library(DuoClustering2018)
library(SingleCellExperiment)
sce <- sce_full_Zhengmix4eq()[seq(1,100,1),sample(seq(1,3500,1),100,replace = FALSE)]
colData(sce)$Method <- matrix(sample(c('Method1','Method2'),100,replace = TRUE))
result <- corralm(sce, splitby = 'Method')

# default print method for corralm objects

laurenhsu1/corral documentation built on Feb. 19, 2023, 10:37 p.m.