sracipeHeatmapSimilarity: Calculates the similarity between two gene expression datasets.
In lusystemsbio/sRACIPE: Systems biology tool to simulate gene regulatory circuits

sracipeHeatmapSimilarity

R Documentation

Calculates the similarity between two gene expression datasets.

Description

The comparison is done across columns, i.e., it measures how similar the columns of the two datasets are. For gene expression data, format the data so that genes are in rows and samples are in columns.
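The layout described above can be sketched with toy data. The matrices, gene names, and sample names below are hypothetical. Aligning both matrices on their shared row names is an assumption about what makes a column-wise comparison meaningful, not a documented requirement of the function.

```r
# Hypothetical toy data: 5 genes (rows) x 6 reference samples (columns).
set.seed(1)
genes <- paste0("gene", 1:5)
dataReference <- matrix(rnorm(5 * 6), nrow = 5,
                        dimnames = list(genes, paste0("ref", 1:6)))

# Hypothetical simulated data: same genes in a different row order, 8 columns.
dataSimulation <- matrix(rnorm(5 * 8), nrow = 5,
                         dimnames = list(rev(genes), paste0("sim", 1:8)))

# Align both matrices on the shared genes so that row i refers to the
# same gene in each matrix (assumed to be needed for a fair comparison).
commonGenes <- intersect(rownames(dataReference), rownames(dataSimulation))
dataReference <- dataReference[commonGenes, , drop = FALSE]
dataSimulation <- dataSimulation[commonGenes, , drop = FALSE]

dim(dataReference)
dim(dataSimulation)
```

Matrices prepared this way have the shape expected for the `dataReference` and `dataSimulation` arguments listed under Usage.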

Usage

sracipeHeatmapSimilarity(
  dataReference,
  dataSimulation,
  clusterCut = NULL,
  nClusters = 3,
  pValue = 0.05,
  permutedVar,
  permutations = 1000,
  corMethod = "spearman",
  clusterMethod = "ward.D2",
  method = "pvalue",
  buffer = 0.001,
  permutMethod = "simulation",
  returnData = FALSE
)

Arguments

`dataReference`	Matrix. The reference data matrix, for example, the experimental gene expression values
`dataSimulation`	Matrix. The data matrix to be compared.
`clusterCut`	(optional) Integer vector. Cluster numbers assigned to reference data. If clusterCut is missing, hierarchical clustering using `ward.D2` and `distance = (1-cor(x, method = "spear"))/2` will be used to cluster the reference data.
`nClusters`	(optional) Integer. The number of clusters in which the reference data should be clustered for comparison. Not needed if clusterCut is provided.
`pValue`	(optional) Numeric. p-value threshold for considering two gene expression sets as belonging to the same cluster. Ward's method with Spearman correlation is used to determine if a model belongs to a specific cluster.
`permutedVar`	(optional) Similarity scores computed after permutations.
`permutations`	(optional) Integer. Default `1000`. Number of gene permutations used to generate the null distribution.
`corMethod`	(optional) Character. Correlation method. Default `"spearman"`. For single cell data, use `"kendall"`.
`clusterMethod`	(optional) Character. Default `ward.D2`; other options include `complete`. Clustering method used to cluster the experimental data. See `hclust` for other options.
`method`	(optional) Character. Method used to compare the gene expression data. Default `pvalue`. One can also use `variance`, which assigns each simulated sample to the cluster whose samples have minimum variance with it.
`buffer`	(optional) Numeric. Default `0.001`. The fraction of models to be assigned to clusters to which no samples could be assigned. For example, a minimum of 1 ghost sample in the reference is assigned to the NULL cluster.
`permutMethod`	(optional) Character. "sample" or "reference". Note that Usage lists "simulation" as the default value.
`returnData`	(optional) Logical. Default `FALSE`. Whether to return the sorted and clustered data.
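
The default clustering described for `clusterCut` (Ward linkage on the distance `(1 - cor)/2`) can be sketched in base R. This is a minimal illustration with hypothetical random data, not the package's internal code. It uses only `cor`, `as.dist`, `hclust`, and `cutree` from the stats package.

```r
set.seed(42)
# Hypothetical reference data: 10 genes (rows) x 12 samples (columns).
x <- matrix(rnorm(10 * 12), nrow = 10,
            dimnames = list(paste0("gene", 1:10), paste0("s", 1:12)))

# cor() correlates columns, so this is a 12 x 12 sample-by-sample matrix.
corMat <- cor(x, method = "spearman")

# Map correlation in [-1, 1] to a distance in [0, 1].
distMat <- as.dist((1 - corMat) / 2)

# Hierarchical clustering with Ward's method, cut into nClusters = 3 groups.
tree <- hclust(distMat, method = "ward.D2")
clusterCut <- cutree(tree, k = 3)

table(clusterCut)
```

An integer vector of this form, with one entry per reference column, is what the `clusterCut` argument expects. Supplying it makes `nClusters` unnecessary.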

Value

A list containing the KL distance of the new cluster distribution from the reference data, and the probability of each cluster in the reference and simulated data.
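
As an illustration of the returned quantities, a Kullback-Leibler distance between two cluster distributions can be sketched as follows. The cluster labels, the use of `0` for unassigned samples, the pseudo-probability `eps`, the natural logarithm, and the direction of the divergence are all assumptions made for this sketch. The source does not state how the package computes the value.

```r
# Hypothetical cluster assignments; 0 stands in for "unassigned" (assumption).
refClusters <- c(1, 1, 1, 2, 2, 3)
simClusters <- c(1, 2, 2, 2, 3, 3, 3, 3, 0, 0)
clusterIds <- 0:3

# Probability of each cluster: fraction of samples assigned to it.
clusterProb <- function(assignments, ids) {
  counts <- table(factor(assignments, levels = ids))
  as.numeric(counts) / length(assignments)
}
refProb <- clusterProb(refClusters, clusterIds)
simProb <- clusterProb(simClusters, clusterIds)

# Add a small pseudo-probability so empty clusters do not give log(0),
# then renormalise (eps is an illustrative choice).
eps <- 0.001
refProb <- (refProb + eps) / sum(refProb + eps)
simProb <- (simProb + eps) / sum(simProb + eps)

# KL divergence D(p || q) = sum(p * log(p / q)), natural log.
klDiv <- function(p, q) sum(p * log(p / q))
klDistance <- klDiv(simProb, refProb)

klDistance
```

The divergence is zero when the two distributions are identical and grows as the simulated cluster proportions move away from the reference proportions.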