CosineSimScore: Cosine similarity score
In nmah/CellScore: Tool for Evaluation of Cell Identity from Transcription Profiles

Description

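This function calculates cosine similarity scores for cell transitions, comparing general cell-type groups, subgroups, and individual samples on the basis of their expression profiles.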
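For orientation, cosine similarity between two expression vectors is their dot product divided by the product of their Euclidean norms. The following is a minimal base R sketch of that definition; the helper `cosine_sim` is hypothetical and is not part of CellScore, whose internal implementation may differ.

```r
## Illustrative helper (hypothetical, not a CellScore function):
## cosine similarity between two numeric vectors of equal length
cosine_sim <- function(x, y) {
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

a <- c(1, 2, 3)
b <- c(2, 4, 6)   # same direction as a
d <- c(3, 0, -1)  # orthogonal to a: 1*3 + 2*0 + 3*(-1) = 0

sim.ab <- cosine_sim(a, b)  # vectors pointing the same way score 1
sim.ad <- cosine_sim(a, d)  # orthogonal vectors score 0
```

Because the score depends only on the angle between the vectors, scaling a profile by a constant does not change it.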

Usage

CosineSimScore(eset, cell.change, iqr.cutoff = 0.1)

Arguments

`eset`	an ExpressionSet containing data matrices of normalized expression data, present/absent calls, a gene annotation data frame and a phenotype data frame.
`cell.change`	a data frame containing three columns, one each for the start (donor), test, and target cell types. Each row of the data frame describes one transition from the start to a target cell type.
`iqr.cutoff`	sets the threshold for the most variable genes to include in the cosine similarity calculation, expressed as a fraction. The default, 0.1, keeps the top 10% of genes ranked by interquartile range (IQR). All samples that are annotated as standards are used for the IQR calculation.
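To illustrate what an IQR cutoff of 0.1 means, the sketch below keeps the 10% of genes with the largest interquartile range in a simulated matrix. It is an assumption about the general technique, not the code CellScore uses: the matrix `expr` and the object names are made up, and the package may rank or threshold genes differently.

```r
## Simulated expression matrix: 100 genes (rows) x 6 standard samples (columns)
set.seed(1)
expr <- matrix(rnorm(100 * 6), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("s", 1:6)))

iqr.cutoff <- 0.1                       # keep the top 10% most variable genes

## Interquartile range of each gene across the samples
gene.iqr <- apply(expr, 1, IQR)

## Genes whose IQR lies above the (1 - iqr.cutoff) quantile are retained
threshold <- quantile(gene.iqr, probs = 1 - iqr.cutoff)
expr.iqr <- expr[gene.iqr > threshold, , drop = FALSE]

dim(expr.iqr)
```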

Value

This function returns a list of five objects, as follows:

pdataSub: the phenotype data frame describing the standard samples
esetSub.IQR: the expression value matrix, as filtered by IQR threshold
cosine.general.groups: a numeric matrix of cosine similarity between the centroids of all groups defined by eset@general_cell_types
cosine.subgroups: a numeric matrix of cosine similarity between the centroids of all subgroups defined by eset@sub_cell_types1
cosine.samples: a numeric matrix of cosine similarity between general groups, subgroups and individual samples.
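The group-level matrices described above can be pictured as follows: average the samples of each group into a centroid, then compute the cosine similarity between every pair of centroids. This is a minimal base R sketch on toy data, assuming a gene-by-sample matrix and a vector of group labels. The names `expr`, `groups`, and `cos.mat` are hypothetical, and the package's own computation may differ in detail.

```r
## Toy expression matrix: 3 genes (rows) x 4 samples (columns)
expr <- matrix(c(1,   2, 3,
                 1,   2, 3.5,
                 3,   1, 0,
                 3.5, 1, 0),
               nrow = 3,
               dimnames = list(paste0("gene", 1:3), paste0("s", 1:4)))
groups <- c("A", "A", "B", "B")   # group label of each sample

## Centroid of each group: mean expression per gene over its samples
centroids <- sapply(split(seq_along(groups), groups),
                    function(idx) rowMeans(expr[, idx, drop = FALSE]))

## Pairwise cosine similarity between the centroid columns
norms <- sqrt(colSums(centroids^2))
cos.mat <- crossprod(centroids) / outer(norms, norms)
cos.mat
```

The result is a symmetric group-by-group matrix with ones on the diagonal. Off-diagonal values closer to 1 indicate more similar centroid profiles.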

Examples

## Load the expression set for the standard cell types
library(Biobase)
library(hgu133plus2CellScore) # eset.std

## Locate the external data files in the CellScore package
rdata.path <- system.file("extdata", "eset48.RData", package = "CellScore")
tsvdata.path <- system.file("extdata", "cell_change_test.tsv",
                            package = "CellScore")

if (file.exists(rdata.path) && file.exists(tsvdata.path)) {

   ## Load the expression set with normalized expressions of 48 test samples
   load(rdata.path)

   ## Import the cell change info for the loaded test samples
   cell.change <- read.delim(file= tsvdata.path, sep="\t",
                             header=TRUE, stringsAsFactors=FALSE)

   ## Combine the standards and the test data
   eset <- combine(eset.std, eset48)

   ## Generate cosine similarity for the combined data
   ## NOTE: May take 1-2 minutes on the full eset object
   ## so we subset it for 4 cell types
   pdata <- pData(eset)
   sel.samples <- pdata$general_cell_type %in% c("ESC", "EC", "FIB", "KER")
   eset.sub <- eset[, sel.samples]
   cs <- CosineSimScore(eset.sub, cell.change, iqr.cutoff=0.1)
}

nmah/CellScore documentation built on May 4, 2023, 2:52 p.m.