View source: R/neighborDistances.R
neighborDistances | R Documentation |
Calculate the distances in high-dimensional space to the neighboring cells.
neighborDistances(
prepared,
neighbors = 50,
downsample = 50,
as.tol = TRUE,
num.threads = 1
)
prepared | A List object produced by prepareCellData. |
neighbors | An integer scalar specifying the number of nearest neighbors for which distances are computed. |
downsample | An integer scalar specifying the frequency with which cells are examined. |
as.tol | A logical scalar specifying if the distances should be reported as tolerance values. |
num.threads | An integer scalar specifying the number of threads to use. |
This function examines each cell at the specified downsampling frequency, and computes the Euclidean distances to its nearest neighbors.
If as.tol=TRUE, these distances are reported on the same scale as tol in countCells.
This allows users to choose a value for tol based on the output of this function.
Otherwise, the distances are reported without modification.
To visualize the distances/tolerances, one option is to use boxplots, as shown below. Each boxplot represents the distribution of tolerances required for hyperspheres to contain a certain number of cells. For example, assume that at least 20 cells in each hypersphere are needed to have sufficient power for hypothesis testing. Now, consider all hyperspheres that are large enough to include the 19th nearest neighbor, so that each contains 20 cells once the central cell itself is counted. The typical distance required to do so is the median of the boxplot generated from the 19th column of the output.
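The reading of the 19th column can be sketched in base R on simulated data. This is only an illustration of the idea, not the function's implementation: the matrix `coords`, the helper `knn_dist` and the brute-force search are all assumptions made for the example, and the distances are left unscaled (comparable to as.tol=FALSE).

```r
# Hypothetical data: 500 cells with 5 markers each (not taken from cydar).
set.seed(100)
coords <- matrix(rnorm(500 * 5), ncol = 5)

# Hypothetical helper: brute-force Euclidean distances from each examined
# cell to its k nearest neighbors, searched among ALL cells. Assumes k > 1.
knn_dist <- function(x, examined, k) {
    full <- as.matrix(dist(x))  # all pairwise distances
    t(apply(full[examined, , drop = FALSE], 1, function(d) {
        # Sort, then skip the first entry (the zero distance to itself).
        unname(sort(d)[seq_len(k) + 1L])
    }))
}

examined <- seq(1, nrow(coords), by = 50)  # examine every 50th cell
distances <- knn_dist(coords, examined, k = 50)

required.count <- 20  # cells wanted per hypersphere
# Column 19 holds the distance to the 19th neighbor; with the central
# cell included, a hypersphere of that radius contains 20 cells.
med <- median(distances[, required.count - 1])
```

By construction, about half of the examined cells have their 19th neighbor within `med`, so a radius of `med` yields at least 20 cells for roughly half of the hyperspheres.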
Another option is to examine the distribution of counts at a given tolerance/distance.
This is done by counting the number of hyperspheres with a particular number of nearest neighbors closer than the specified tolerance.
In this manner, the expected count distribution from setting a particular tolerance can be determined.
Note that the histogram is capped at neighbors to save time.
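The counting step can be illustrated with a small hand-made matrix. The values below are invented for the example and `tol` is an arbitrary threshold; only the rowSums idiom mirrors the example code for this function.

```r
# Hypothetical output: 3 examined cells, distances to their 4 nearest
# neighbors, sorted in increasing order along each row.
distances <- rbind(
    c(0.2, 0.4, 0.6, 0.9),
    c(0.5, 0.8, 1.1, 1.3),
    c(0.1, 0.2, 0.3, 0.4)
)
tol <- 0.7

# Number of neighbors within 'tol', plus 1 for the central cell.
counts <- rowSums(distances <= tol) + 1
counts  # 4 2 5

# The third cell is capped: all 4 of its neighbors fall within 'tol',
# so its true count is at least 5 but cannot be resolved any further.
capped <- counts == ncol(distances) + 1
```

Cells flagged in `capped` pile up in the last bin of the histogram, which is why the distribution appears truncated at the number of neighbors requested.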
Note that, for each examined cell, its neighbors are identified from the full set of cells. Downsampling only changes the rate at which cells are examined, for the sake of computational efficiency. Neighbors are not identified from the downsampled set as this will inflate the reported distances.
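The inflation described above can be checked on simulated data. The sketch below is an illustration in base R under stated assumptions (the `coords` matrix and the brute-force search are invented for the example and do not reflect how the function is implemented). It compares the nearest-neighbor distance for the same examined cells when neighbors are searched among all cells and when they are searched among the downsampled cells only.

```r
set.seed(42)
# Hypothetical data: 1000 cells with 3 markers each.
coords <- matrix(rnorm(1000 * 3), ncol = 3)
examined <- seq(1, nrow(coords), by = 50)  # every 50th cell is examined

full <- as.matrix(dist(coords))  # all pairwise Euclidean distances
diag(full) <- Inf                # ignore each cell's distance to itself

# Nearest neighbor searched among all cells.
nn.full <- apply(full[examined, , drop = FALSE], 1, min)

# Nearest neighbor searched among the downsampled cells only.
nn.down <- apply(full[examined, examined, drop = FALSE], 1, min)

summary(nn.full)
summary(nn.down)
```

Because the downsampled cells are a subset of all cells, `nn.down` can never be smaller than `nn.full`, and in sparse subsets it is usually much larger.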
A numeric matrix of distances where each row corresponds to an examined cell and each column i corresponds to the i-th closest neighbor.
Aaron Lun
prepareCellData, to generate the prepared object.
countCells, where the choice of tol can be guided by the distance distributions.
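# Running the prepareCellData example to obtain the 'cd' object used below.
example(prepareCellData, echo=FALSE)
# Plotting the unmodified Euclidean distances to each neighbor.
distances <- neighborDistances(cd, as.tol=FALSE)
boxplot(distances, xlab="Neighbor", ylab="Distance")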
# Making a plot to choose 'tol' in countCells().
distances <- neighborDistances(cd, as.tol=TRUE)
boxplot(distances, xlab="Neighbor", ylab="Tolerance")
required.count <- 20 # 20 cells per hypersphere
med <- median(distances[,required.count-1])
segments(-10, med, required.count-1, col="dodgerblue")
segments(required.count-1, med, y1=0, col="dodgerblue")
# Examining the distribution of counts at a given 'tol' of 0.7.
# (Adding 1 to account for the cell at the centre of the hypersphere.)
counts <- rowSums(distances <= 0.7) + 1
hist(counts, xlab="Count per hypersphere")