MetaNeighbor: Runs MetaNeighbor

Description

For each gene set of interest, the function builds a network of rank correlations between all cells. Next, the neighbor voting predictor produces a weighted matrix of predicted labels by performing matrix multiplication between the network and the binary vector indicating cell type membership, then dividing each element by the null predictor (i.e., node degree). That is, each cell is given a score equal to the fraction of its neighbors (including itself) that are part of a given cell type. For cross-validation, we iterate over all possible leave-one-dataset-out folds, and we report how well we can recover cells of the same type as the area under the receiver operating characteristic curve (AUROC). This is repeated for all folds of cross-validation, and the mean AUROC across folds is reported. Calls neighborVoting.
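
The procedure described above can be sketched in base R on toy data. This is a minimal illustration under stated assumptions, not the package's implementation: the expression values, the cell_type and dataset vectors, the rank standardization of the network, and the helper names auroc and fold_auroc are all hypothetical.

```r
# Toy expression matrix: 6 genes x 8 cells (hypothetical values).
set.seed(1)
expr <- matrix(rnorm(6 * 8), nrow = 6)
cell_type <- rep(c("A", "B"), each = 4)
dataset <- rep(c(1, 2), times = 4)
# Shift three genes in type A cells so the two types are separable.
expr[1:3, cell_type == "A"] <- expr[1:3, cell_type == "A"] + 3

# Cell-by-cell network of Spearman (rank) correlations.
network <- cor(expr, method = "spearman")
# Assumption: rank-standardize the weights so they lie in (0, 1].
network[] <- rank(network)
network <- network / max(network)

# Rank-based AUROC (equivalent to the Mann-Whitney U statistic).
auroc <- function(scores, positives) {
  r <- rank(scores)
  n_pos <- sum(positives)
  n_neg <- sum(!positives)
  (sum(r[positives]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# One leave-one-dataset-out fold: hide the labels of the held-out dataset,
# vote with the remaining labels, normalize by node degree, then score
# the held-out cells.
fold_auroc <- function(held_out_id) {
  held_out <- dataset == held_out_id
  train_labels <- as.numeric(cell_type == "A" & !held_out)
  votes <- as.vector(network %*% train_labels)
  degree <- rowSums(network)
  scores <- votes / degree
  auroc(scores[held_out], cell_type[held_out] == "A")
}

fold_aurocs <- sapply(unique(dataset), fold_auroc)
mean_auroc <- mean(fold_aurocs)
```

Here mean_auroc plays the role of the value reported for one gene set and one cell type; the real function repeats this for every gene set and every cell type.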

Usage

MetaNeighbor(
  dat,
  i = 1,
  experiment_labels,
  celltype_labels,
  genesets,
  bplot = TRUE,
  fast_version = FALSE,
  node_degree_normalization = TRUE,
  batch_size = 10,
  detailed_results = FALSE
)

Arguments

dat

A SummarizedExperiment object containing a gene-by-sample expression matrix.

i

default value 1; the non-zero index of the assay that contains the expression matrix.

experiment_labels

A vector that indicates the source/dataset of each sample.

celltype_labels

A character vector or one-hot encoded matrix (cells x cell type) that indicates the cell type of each sample.
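
As an illustration of the two accepted forms, a character vector of labels can be turned into a cells x cell type one-hot matrix in base R. This is a sketch with hypothetical labels and cell names; whether the function expects particular row or column names is not stated here and is an assumption.

```r
# Hypothetical cell type labels, one per cell.
labels <- c("neuron", "glia", "neuron", "microglia")
types <- sort(unique(labels))

# Compare every label with every cell type, then convert TRUE/FALSE to 1/0.
one_hot <- outer(labels, types, FUN = "==") * 1
dimnames(one_hot) <- list(paste0("cell", seq_along(labels)), types)
```

Each row contains a single 1 in the column of that cell's type.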

genesets

Gene sets of interest provided as a list of vectors.

bplot

default value TRUE; a boolean flag indicating whether a beanplot is generated.

fast_version

default value FALSE; a boolean flag indicating whether to use the fast and low memory version of MetaNeighbor

node_degree_normalization

default value TRUE; a boolean flag indicating whether to normalize votes by dividing through total node degree.

batch_size

Optimization parameter. Gene sets are processed in groups of size batch_size. The count matrix is first subset to all genes from these groups, then to each gene set individually.
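
The two-step subsetting described above can be sketched as follows. This is a hypothetical illustration in base R with toy gene sets and a toy matrix, not the package's internal code; the variable names are assumptions.

```r
# Toy inputs: three gene sets and a 5-gene x 3-cell matrix (hypothetical).
genesets <- list(setA = c("g1", "g2"), setB = c("g2", "g3"), setC = c("g4", "g5"))
expr <- matrix(seq_len(15), nrow = 5,
               dimnames = list(paste0("g", 1:5), paste0("c", 1:3)))
batch_size <- 2

# Group the gene sets into batches of at most batch_size.
batches <- split(genesets, ceiling(seq_along(genesets) / batch_size))

n_genes_used <- list()
for (batch in batches) {
  # First subset: all genes that appear in any gene set of this batch.
  batch_genes <- intersect(unique(unlist(batch)), rownames(expr))
  batch_expr <- expr[batch_genes, , drop = FALSE]
  for (name in names(batch)) {
    # Second subset: the genes of one gene set, taken from the smaller matrix.
    genes <- intersect(batch[[name]], rownames(batch_expr))
    sub_expr <- batch_expr[genes, , drop = FALSE]
    n_genes_used[[name]] <- nrow(sub_expr)
  }
}
```

The first subset is done once per batch, so the per-gene-set subsets operate on a smaller matrix than the full data set.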

detailed_results

Should the function return the average AUROC across all test datasets (default) or a detailed table with the AUROC for each test dataset?

Value

A matrix of mean AUROC scores, one for each gene set tested and each cell type, is returned directly (see neighborVoting). If detailed_results is set to TRUE, the function instead returns a table of AUROC scores in each test dataset for each gene set.

See Also

neighborVoting

Examples

data("mn_data")
data("GOmouse")
library(SummarizedExperiment)
AUROC_scores = MetaNeighbor(dat = mn_data,
                            experiment_labels = as.numeric(factor(mn_data$study_id)),
                            celltype_labels = metadata(colData(mn_data))[["cell_labels"]],
                            genesets = GOmouse,
                            bplot = TRUE)

MetaNeighbor documentation built on Nov. 8, 2020, 5:40 p.m.