score_matrix: SCONE Evaluation: Evaluate an Expression Matrix

View source: R/scone_eval.R

score_matrixR Documentation

SCONE Evaluation: Evaluate an Expression Matrix

Description

This function evaluates a (normalized) expression matrix using SCONE criteria, producing 8 metrics based on i) Clustering, ii) Correlations and iii) Relative Expression.

Usage

score_matrix(
  expr,
  eval_pcs = 3,
  eval_proj = NULL,
  eval_proj_args = NULL,
  eval_kclust = NULL,
  bio = NULL,
  batch = NULL,
  qc_factors = NULL,
  uv_factors = NULL,
  wv_factors = NULL,
  is_log = FALSE,
  stratified_pam = FALSE,
  stratified_cor = FALSE,
  stratified_rle = FALSE
)

Arguments

expr

matrix. The expression data matrix (genes in rows, cells in columns).

eval_pcs

numeric. The number of principal components to use for evaluation (Default 3). Ignored if !is.null(eval_proj).

eval_proj

function. Projection function for evaluation (see Details). If NULL, PCA is used for projection

eval_proj_args

list. List of arguments passed to projection function as eval_proj_args (see Details).

eval_kclust

numeric. The number of clusters (> 1) to be used for pam tightness (PAM_SIL) evaluation. If an array of integers, largest average silhouette width (tightness) will be reported in PAM_SIL. If NULL, PAM_SIL will be returned NA.

bio

factor. A known biological condition (variation to be preserved), NA is allowed. If NULL, condition ASW, BIO_SIL, will be returned NA.

batch

factor. A known batch variable (variation to be removed), NA is allowed. If NULL, batch ASW, BATCH_SIL, will be returned NA.

qc_factors

Factors of unwanted variation derived from quality metrics. If NULL, qc correlations, EXP_QC_COR, will be returned NA.

uv_factors

Factors of unwanted variation derived from negative control genes (evaluation set). If NULL, uv correlations, EXP_UV_COR, will be returned NA.

wv_factors

Factors of wanted variation derived from positive control genes (evaluation set). If NULL, wv correlations, EXP_WV_COR, will be returned NA.

is_log

logical. If TRUE the expr matrix is already logged and log transformation will not be carried out prior to projection. Default FALSE.

stratified_pam

logical. If TRUE then maximum ASW is separately computed for each biological-cross-batch stratum (accepts NAs), and a weighted average silhouette width is returned as PAM_SIL. Default FALSE.

stratified_cor

logical. If TRUE then cor metrics are separately computed for each biological-cross-batch stratum (accepts NAs), and weighted averages are returned for EXP_QC_COR, EXP_UV_COR, & EXP_WV_COR. Default FALSE.

stratified_rle

logical. If TRUE then rle metrics are separately computed for each biological-cross-batch stratum (accepts NAs), and weighted averages are returned for RLE_MED & RLE_IQR. Default FALSE.

Details

Users may specify their own eval_proj function that will be used to compute Clustering and Correlation metrics. This eval_proj() function must have 2 input arguments:

  • e matrix. log-transformed (+ pseudocount) expression data (genes in rows, cells in columns).

  • eval_proj_args list. additional function arguments, e.g. prior data weights.

and it must output a matrix representation of the original data (cells in rows, factors in columns). The value of eval_proj_args is passed to the user-defined function from the eval_proj_args argument of the main score_matrix() function call.

Value

A list with the following metrics:

  • BIO_SIL Average silhouette width by biological condition.

  • BATCH_SIL Average silhouette width by batch condition.

  • PAM_SIL Maximum average silhouette width from PAM clustering (see stratified_pam argument).

  • EXP_QC_COR Coefficient of determination between expression pcs and quality factors (see stratified_cor argument).

  • EXP_UV_COR Coefficient of determination between expression pcs and negative control gene factors (see stratified_cor argument).

  • EXP_WV_COR Coefficient of determination between expression pcs and positive control gene factors (see stratified_cor argument).

  • RLE_MED The mean squared median Relative Log Expression (RLE) (see stratified_rle argument).

  • RLE_IQR The variance of the inter-quartile range (IQR) of the RLE (see stratified_rle argument).

Examples


set.seed(141)
bio = as.factor(rep(c(1,2),each = 2))
batch = as.factor(rep(c(1,2),2))
log_expr = matrix(rnorm(20),ncol = 4)

scone_metrics = score_matrix(log_expr,
   bio = bio, batch = batch,
   eval_kclust = 2, is_log = TRUE)


YosefLab/scone documentation built on March 12, 2024, 10:48 p.m.