CAMASest: A and S matrix estimation by CAM



Description

This function estimates the A and S matrices based on the marker gene clusters detected by CAM.

Usage

CAMASest(MGResult, PrepResult, data, corner.strategy = 2,
  appro3 = TRUE, generalNMF = FALSE)

Arguments

MGResult

An object of class "CAMMGObj" obtained from CAMMGCluster function.

PrepResult

An object of class "CAMPrepObj" obtained from CAMPrep function.

data

Matrix of mixture expression profiles, which must be the same as the input to CAMPrep. A data frame, SummarizedExperiment or ExpressionSet object will be internally coerced into a matrix. Each row is a gene and each column is a sample. Data should be in non-log linear space with non-negative numerical values (i.e. >= 0). Missing values are not supported. All-zero rows will be removed internally.

corner.strategy

The method used to detect corner clusters. 1: minimum sum of margins of error; 2: minimum sum of reconstruction errors. The default is 2.

appro3

Whether to also estimate the A and S matrices by approach 3. See Details for further information. The default is TRUE.

generalNMF

If TRUE, the decomposed proportion matrix has no sum-to-one constraint on each row. Without this constraint, the scale ambiguity of each column vector in the proportion matrix will not be removed. The default is FALSE.
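The requirements on data listed above can be checked before calling CAMPrep and CAMASest. The following is a minimal base R sketch, not part of debCAM: the helper name check_mixture_data is hypothetical, and it only handles inputs that as.matrix() can coerce (for example a data frame). The package performs its own coercion and all-zero row removal internally, so this is only a convenience for catching problems early.

```r
# Hypothetical helper (not a debCAM function): validate a mixture matrix
# against the documented requirements for the `data` argument.
check_mixture_data <- function(data) {
  data <- as.matrix(data)            # e.g. a data frame becomes a matrix
  stopifnot(is.numeric(data))        # numerical values only
  if (anyNA(data)) stop("Missing values are not supported.")
  if (any(data < 0)) stop("Values must be non-negative (non-log linear space).")
  keep <- rowSums(data) > 0          # flag rows that are not all zero
  data[keep, , drop = FALSE]         # drop all-zero rows
}

# Toy input: 4 genes (rows) x 3 samples (columns); gene2 is all zero.
toy <- matrix(c(5, 0, 2, 1,
                3, 0, 4, 1,
                8, 0, 6, 2),
              nrow = 4,
              dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3)))

clean <- check_mixture_data(toy)
dim(clean)
```

The same cleaned matrix should then be passed to both CAMPrep and CAMASest, since the two functions expect identical input.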

Details

This function is used internally by the CAM function to estimate the proportion matrix (A), the subpopulation-specific expression matrix (S) and the mdl values. It can also be used when you want to perform CAM step by step.

The mdl values are calculated by three approaches: (1) based on the data and A matrix in dimension-reduced space; (2) based on the original data, with the A matrix estimated by transforming the dimension-reduced A matrix back to the original space; (3) based on the original data, with A estimated directly in the original space. The A and S matrices in the original space estimated by the latter two approaches are returned. mdl is the sum of two terms: the code length of the data under the model and the code length of the model. Both the mdl value and the first term (code length of data) are returned.
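The two-term decomposition described above can be written compactly. The notation below is an assumption introduced for illustration: L(.) denotes a code length, and the exact expressions used by the package for each term are not given here.

```latex
\[
\mathrm{mdl}
  \;=\;
  \underbrace{L(\text{data} \mid \text{model})}_{\text{code length of data}}
  \;+\;
  \underbrace{L(\text{model})}_{\text{code length of model}}
\]
```

Each of the three approaches yields one value of the full sum and one value of the first term alone.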

Value

An object of class "CAMASObj" containing the following components:

Aest

Estimated proportion matrix from Approach 2.

Sest

Estimated subpopulation-specific expression matrix from Approach 2.

Aest.proj

Estimated proportion matrix from Approach 2, before removing scale ambiguity.

Ascale

The estimated scales to remove scale ambiguity of each column vector in Aest. Sum-to-one constraint on each row of Aest is used for scale estimation.

AestO

Estimated proportion matrix from Approach 3.

SestO

Estimated subpopulation-specific expression matrix from Approach 3.

AestO.proj

Estimated proportion matrix from Approach 3, before removing scale ambiguity.

AscaleO

The estimated scales to remove scale ambiguity of each column vector in AestO. Sum-to-one constraint on each row of AestO is used for scale estimation.

datalength

Three values for code length of data. The first is calculated based on dimension-reduced data. The second and third are based on the original data.

mdl

Three mdl values. The first is calculated based on dimension-reduced data. The second and third are based on the original data.
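The relationship between a proportion matrix before scale removal (such as Aest.proj) and the estimated scales (such as Ascale) can be illustrated with a small base R sketch. This is not the package's implementation: the variable names, the toy numbers, the use of qr.solve as a least-squares solver, and the convention that scales multiply the columns are all assumptions made for illustration. The package may use a different solver or convention.

```r
# True proportions: 5 samples (rows) x 3 subpopulations (columns);
# every row sums to one.
A_true <- matrix(c(0.6, 0.3, 0.1,
                   0.2, 0.5, 0.3,
                   0.1, 0.1, 0.8,
                   0.4, 0.4, 0.2,
                   0.3, 0.2, 0.5),
                 ncol = 3, byrow = TRUE)

# A factorization only determines each column up to an unknown scale.
hidden_scale <- c(2, 0.5, 4)
A_proj <- sweep(A_true, 2, hidden_scale, "/")   # scale-ambiguous estimate

# Sum-to-one constraint: find s with A_proj %*% s = 1 (least squares).
s_hat <- qr.solve(A_proj, rep(1, nrow(A_proj)))

# Rescale each column by its estimated scale; rows sum to one again.
A_est <- sweep(A_proj, 2, s_hat, "*")
rowSums(A_est)
```

In this noise-free toy case the recovered scales equal the hidden ones. With generalNMF = TRUE no such constraint is applied, so the column scales stay undetermined.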

Examples

#load the package and obtain data
library(debCAM)
data(ratMix3)
data <- ratMix3$X

#preprocess data
rPrep <- CAMPrep(data, dim.rdc = 3, thres.low = 0.30, thres.high = 0.95)

#Marker gene cluster detection with a fixed K
rMGC <- CAMMGCluster(3, rPrep)

#A and S matrix estimation
rASest <- CAMASest(rMGC, rPrep, data)

Lululuella/debCAM documentation built on May 14, 2021, 2:45 p.m.