GroupDistance: Calculates the distance between Categories of single cells
In sbrn3/disscat: Calculates Distances Between Single Cell RNA Seq Categories


Takes a Seurat S4 object with categorically labelled cells and calculates the 'distance' between the cells in each category using a choice of methods. Each category is treated as a cluster, and hierarchical clustering methods are used to quantify the distances between the clusters.

This includes multiple methods for measuring the distance between the groups/categories themselves (e.g. centroid vs. average distance) as well as, more fundamentally, the distance between individual points (e.g. Euclidean vs. Manhattan distance).
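The two choices described above can be illustrated with a minimal base R sketch. It does not call GroupDistance or Seurat; the `embedding` matrix and `labels` vector are hypothetical stand-ins for a dimensionality reduction (cells by dimensions) and a cell-wise category, and the centroid approach is just one of the group-level options.

```r
# Toy stand-in for a cell embedding: 6 cells x 2 dimensions (hypothetical data).
embedding <- rbind(
  c(0, 0), c(0, 2), c(2, 0),   # cells labelled "A"
  c(5, 5), c(5, 7),            # cells labelled "B"
  c(10, 0)                     # cell labelled "C"
)
labels <- c("A", "A", "A", "B", "B", "C")

# Centroid of each category: per-label column sums divided by per-label counts.
# rowsum() and table() both order the labels alphabetically, so the rows line up.
centroids <- rowsum(embedding, labels) / as.vector(table(labels))

# Category-by-category distance matrices under two point-distance choices.
centroid_euclidean <- as.matrix(dist(centroids, method = "euclidean"))
centroid_manhattan <- as.matrix(dist(centroids, method = "manhattan"))

print(round(centroid_euclidean, 3))
print(round(centroid_manhattan, 3))
```

The group-level choice (here, centroids) decides what is being compared, while the point-level choice (Euclidean or Manhattan) decides how the comparison is measured; the two can be varied independently.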

GroupDistance(seurat_object, group, reduction = "pca", dims = 1:30, method, distance = "euclidean", split_by = NULL, output = "seurat")

`seurat_object`	A Seurat object
`group`	Seurat categories or groups between which distances are calculated. Cell-wise data.
`reduction`	Dimensionality reduction data to use
`dims`	Which dimensions to use
`method`	Cluster distance method to use. Options are "single", "complete", "average", "centroid", "ward", "mahalanobis". Further explanation of these methods is given in the details.
`distance`	Point to point distance to use. Options are "euclidean", "maximum", "manhattan", "canberra", "binary", "minkowski"
`split_by`	Second Seurat category to split the calculations over.
`output`	Output type. Default is "seurat". Options are "seurat" (Seurat S4 object), "table" (Table of distance data), "list" (List of tables by split_by), "seurat_list" (List of Seurat objects). Seurat objects returned have the distance data stored internally.

When calculating distances split by a factor, a list of Seurat S4 objects may be given as input.

Cluster distance methods available for method:

single: Shortest distance between any two points, one taken from each cluster
complete: Longest distance between any two points, one taken from each cluster
average: The average distance over all pairs of points, one taken from each cluster
centroid: The distance between the centroids of the two clusters
ward: Distance is defined as the increase in variance if two clusters are merged
mahalanobis: Similar to Mahalanobis distance. Calculates distance between means of each cluster, weighted by the covariance matrices.
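As a rough illustration of the definitions above, the following base R sketch computes several of the cluster distances for two small hypothetical clusters. It is not the package's implementation. The Ward value is written as the increase in the within-cluster sum of squares on merging, which is an assumption about how "increase in variance" is measured; the mahalanobis option is left out because its exact weighting is not specified here.

```r
# Two small hypothetical clusters in 2-D (rows are points).
cluster_a <- rbind(c(0, 0), c(0, 1))
cluster_b <- rbind(c(3, 0), c(4, 0))
n_a <- nrow(cluster_a)
n_b <- nrow(cluster_b)

# Cross-cluster Euclidean distances: rows index cluster_a, columns cluster_b.
all_pairs <- as.matrix(dist(rbind(cluster_a, cluster_b)))
cross <- all_pairs[seq_len(n_a), n_a + seq_len(n_b), drop = FALSE]

# Difference between the two cluster centroids.
centroid_gap <- colMeans(cluster_a) - colMeans(cluster_b)

linkage <- c(
  single   = min(cross),                  # closest cross-cluster pair
  complete = max(cross),                  # farthest cross-cluster pair
  average  = mean(cross),                 # mean over all cross-cluster pairs
  centroid = sqrt(sum(centroid_gap^2)),   # distance between centroids
  # Assumed Ward form: increase in within-cluster sum of squares on merging.
  ward     = (n_a * n_b) / (n_a + n_b) * sum(centroid_gap^2)
)
print(round(linkage, 3))
```

Single, average and complete are always ordered from smallest to largest because they are the minimum, mean and maximum of the same set of cross-cluster distances.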

Point distance methods available for distance:

euclidean: Usual distance between the two vectors, sqrt(sum((x_i - y_i)^2))
maximum: Maximum distance between two components of x and y, max(|x_i - y_i|)
manhattan: Absolute distance between the two vectors, sum(|x_i - y_i|)
canberra: sum(|x_i - y_i| / (|x_i| + |y_i|))
binary: The vectors are regarded as binary bits, so non-zero elements are ‘on’ and zero elements are ‘off’. The distance is the proportion of bits in which only one is on amongst those in which at least one is on.
minkowski: The p norm, the pth root of the sum of the pth powers of the differences of the components.
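The point distance names above match the `method` options of base R's `stats::dist()`. The sketch below writes each formula out by hand for two hypothetical vectors and checks it against `dist()`; whether GroupDistance passes its `distance` argument to `dist()` internally is an assumption, not something stated here. The exponent p = 3 for the Minkowski case is chosen only for illustration.

```r
# Two hypothetical vectors; zeros are included so the binary distance is non-trivial.
x <- c(1, 0, 3)
y <- c(4, 2, 0)
pair <- rbind(x, y)

# Each point distance written out from its formula.
on_x <- x != 0
on_y <- y != 0
by_hand <- c(
  euclidean = sqrt(sum((x - y)^2)),
  maximum   = max(abs(x - y)),
  manhattan = sum(abs(x - y)),
  canberra  = sum(abs(x - y) / (abs(x) + abs(y))),
  binary    = sum(xor(on_x, on_y)) / sum(on_x | on_y),
  minkowski = sum(abs(x - y)^3)^(1 / 3)   # p = 3, illustrative choice
)

# The same quantities from stats::dist(); p is only used by "minkowski".
from_dist <- sapply(names(by_hand), function(m) {
  as.numeric(dist(pair, method = m, p = 3))
})

print(round(rbind(by_hand, from_dist), 4))
```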