corralm: Multi-table correspondence analysis (list of matrices)
In laurenhsu1/corral: Correspondence Analysis for Single Cell Data

corralm_matlist

R Documentation

Description

This multi-table adaptation of correspondence analysis applies the same scaling technique and enables data alignment by finding a set of embeddings for each dataset within a shared latent space.
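As a rough illustration of the idea, and not the package's internal code, the base R sketch below assumes that each table is converted to indexed residuals, that the tables are then concatenated column-wise, and that a single SVD is taken so every column lands in the same latent space. The helper `indexed_residuals` and the toy matrices are hypothetical.

```r
# Hypothetical helper: indexed residuals (p_ij - r_i * c_j) / (r_i * c_j),
# where p is the table scaled to sum to 1 and r, c are its row and column weights.
indexed_residuals <- function(m) {
  p <- m / sum(m)
  expected <- outer(rowSums(p), colSums(p))
  (p - expected) / expected
}

set.seed(7)
# Two toy count tables sharing the same 25 rows (features); +1 avoids zero margins.
mat_a <- matrix(rpois(25 * 30, lambda = 4) + 1, nrow = 25)
mat_b <- matrix(rpois(25 * 40, lambda = 4) + 1, nrow = 25)

# Scale each table on its own, then concatenate column-wise (assumed workflow).
resid_list <- lapply(list(mat_a, mat_b), indexed_residuals)
concatenated <- do.call(cbind, resid_list)

# One SVD of the concatenated table; rows of v follow the concatenation order.
dec <- svd(concatenated, nu = 5, nv = 5)
embeddings <- dec$v

# Recover the per-table embeddings from the shared space.
batch <- rep(c("a", "b"), times = c(ncol(mat_a), ncol(mat_b)))
emb_a <- embeddings[batch == "a", ]
emb_b <- embeddings[batch == "b", ]
```

Because both tables pass through the same decomposition, `emb_a` and `emb_b` share coordinate axes and can be compared or plotted together.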

Usage

corralm_matlist(
  matlist,
  method = c("irl", "svd"),
  ncomp = 30,
  rtype = c("indexed", "standardized", "hellinger", "freemantukey", "pearson"),
  vst_mth = c("none", "sqrt", "freemantukey", "anscombe"),
  rw_contrib = NULL,
  ...
)

corralm_sce(
  sce,
  splitby,
  method = c("irl", "svd"),
  ncomp = 30,
  whichmat = "counts",
  fullout = FALSE,
  rw_contrib = NULL,
  ...
)

corralm(inp, whichmat = "counts", fullout = FALSE, ...)

## S3 method for class 'corralm'
print(x, ...)

Arguments

`matlist`	(for `corralm_matlist`) list of input matrices; input matrices should be counts (raw or log). Matrices should be aligned row-wise by common features (either by sample or by gene)
`method`	character, the algorithm to be used for SVD. Default is 'irl'. Currently supports 'irl' for irlba::irlba or 'svd' for base::svd
`ncomp`	numeric, number of components; Default is 30
`rtype`	character indicating what type of residual should be computed; options are '"indexed"', '"standardized"' (or '"pearson"' is equivalent), '"freemantukey"', and '"hellinger"'; defaults to '"standardized"' for `corral` and '"indexed"' for `corralm`. '"indexed"', '"standardized"', and '"freemantukey"' compute the respective chi-squared residuals and are appropriate for count data. The '"hellinger"' option is appropriate for continuous data.
`vst_mth`	character indicating whether a variance-stabilizing transform should be applied prior to calculating chi-squared residuals; defaults to '"none"'
`rw_contrib`	numeric vector, same length as the matlist. Indicates the weight that each dataset should contribute to the row weights. When set to NULL, the row weights are not combined and each matrix is scaled independently (i.e., using its own observed row weights). When set to a vector of all the same values, this is equivalent to taking the mean. Another option is to use the number of observations per matrix to create a weighted mean. Regardless of input scale, row weights for each table must sum to 1 and thus are scaled. When this option is specified (i.e., not 'NULL'), the 'rtype' argument will automatically be set to 'standardized', and any value supplied for 'rtype' will be ignored.
`...`	(additional arguments for methods)
`sce`	(for `corralm_sce`) SingleCellExperiment; containing the data to be integrated. Default is to use the counts, and to include all of the data in the integration. These can be changed by passing additional arguments. See `sce2matlist` function documentation for list of available parameters.
`splitby`	character; name of the attribute from `colData` that should be used to separate the SCE.
`whichmat`	char, when using SingleCellExperiment or other SummarizedExperiment, specifies which matrix to use; default is 'counts'.
`fullout`	boolean; whether the function will return the full `corralm` output as a list, or a SingleCellExperiment; defaults to SingleCellExperiment (`FALSE`). To get back the `corralm_matlist`-style output, set this to `TRUE`.
`inp`	list of matrices (any type), a `SingleCellExperiment`, list of `SingleCellExperiment`s, list of `SummarizedExperiment`s, or `MultiAssayExperiment`. If using `SingleCellExperiment` or `SummarizedExperiment`, then include the `whichmat` argument to specify which slot to use (defaults to `counts`). Additionally, if it is one `SingleCellExperiment`, then it is also necessary to include the `splitby` argument to specify the batches. For a `MultiAssayExperiment`, it will take the intersect of the features across all the assays, and use those to match the matrices; to use a different subset, select desired subsets then call `corral`
`x`	(print method) corralm object; the list output from `corralm_matlist`
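To make the `rw_contrib` options more concrete, the base R sketch below combines per-table row weights into a weighted mean. It assumes that a table's row weights are its row sums divided by its grand total; `combine_row_weights` is a hypothetical helper written for illustration and is not part of the package.

```r
set.seed(1)
# Toy count tables sharing the same 4 rows (features); +1 avoids zero margins.
mat_a <- matrix(rpois(4 * 6, lambda = 5) + 1, nrow = 4)   # 6 observations
mat_b <- matrix(rpois(4 * 10, lambda = 5) + 1, nrow = 4)  # 10 observations
matlist <- list(mat_a, mat_b)

# Observed row weights per table (assumed definition: row sums / grand total).
row_weights <- lapply(matlist, function(m) rowSums(m) / sum(m))

# Hypothetical helper: weighted mean of the per-table row weights.
combine_row_weights <- function(row_weights, rw_contrib) {
  stopifnot(length(row_weights) == length(rw_contrib))
  contrib <- rw_contrib / sum(rw_contrib)   # the input scale does not matter
  combined <- Reduce(`+`, Map(`*`, row_weights, contrib))
  combined / sum(combined)                  # rescale so the weights sum to 1
}

# Equal contributions: the plain mean of the two sets of row weights.
equal_w <- combine_row_weights(row_weights, c(1, 1))

# Contributions proportional to the number of observations per table.
sized_w <- combine_row_weights(row_weights, vapply(matlist, ncol, integer(1)))
```

In this sketch, `c(1, 1)` and `c(5, 5)` give identical results, which mirrors the statement that a vector of identical values amounts to taking the mean.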

Details

corralm is a wrapper for corralm_matlist and corralm_sce, and can be called on any of the acceptable input types (see the `inp` argument).

Value

When run on a list of matrices, a list with the correspondence analysis matrix decomposition result, with indices corresponding to the concatenated matrices (in order of the list):

d: a vector of the diagonal singular values of the input mat (from SVD output)
u: a matrix with the left singular vectors of mat in the columns (from SVD output)
v: a matrix with the right singular vectors of mat in the columns. When cells are in the columns, these are the cell embeddings. (from SVD output)
eigsum: sum of the eigenvalues for calculating percent variance explained

For SingleCellExperiment input, returns the SCE with embeddings in the reducedDim slot 'corralm'

For a list of SingleCellExperiments, returns a list of the SCEs with the embeddings in the respective reducedDim slot 'corralm'
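The `d` and `eigsum` elements can be combined to estimate the percent variance explained per component. The sketch below is only an illustration under two assumptions: that the eigenvalues are the squared singular values, and that `eigsum` is their total over all components. It uses a plain `svd` on a toy matrix as a stand-in for the list that `corralm_matlist` returns.

```r
set.seed(42)
mat <- matrix(rnorm(20 * 8), nrow = 20)
dec <- svd(mat)

# Stand-in for the returned list: keep 3 components, but sum eigenvalues over all.
res <- list(
  d = dec$d[1:3],
  u = dec$u[, 1:3],
  v = dec$v[, 1:3],
  eigsum = sum(dec$d^2)
)

# Percent variance explained by each retained component (assumed relationship).
pct_var <- res$d^2 / res$eigsum * 100
```

Since only a subset of components is retained, the values in `pct_var` sum to less than 100.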

Examples

# corralm_matlist: two count matrices sharing the same 25 rows (features)
listofmats <- list(matrix(sample(seq(0,20,1),1000,replace = TRUE),nrow = 25),
                   matrix(sample(seq(0,20,1),1000,replace = TRUE),nrow = 25))
result <- corralm_matlist(listofmats)

# corralm_sce: split one SingleCellExperiment by a colData attribute
library(DuoClustering2018)
library(SingleCellExperiment)
sce <- sce_full_Zhengmix4eq()[1:100,sample(1:3500,100,replace = FALSE)]
colData(sce)$Method <- matrix(sample(c('Method1','Method2'),100,replace = TRUE))
result <- corralm_sce(sce, splitby = 'Method')


# corralm wrapper called on a list of matrices
listofmats <- list(matrix(sample(seq(0,20,1),1000,replace = TRUE),nrow = 20),
                   matrix(sample(seq(0,20,1),1000,replace = TRUE),nrow = 20))
corralm(listofmats)

# corralm wrapper called on a SingleCellExperiment, split by 'Method'
library(DuoClustering2018)
library(SingleCellExperiment)
sce <- sce_full_Zhengmix4eq()[seq(1,100,1),sample(seq(1,3500,1),100,replace = FALSE)]
colData(sce)$Method <- matrix(sample(c('Method1','Method2'),100,replace = TRUE))
result <- corralm(sce, splitby = 'Method')

# default print method for corralm objects

laurenhsu1/corral documentation built on Feb. 19, 2023, 10:37 p.m.