getLineages: Infer Lineage Structure from Clustered Samples

getLineages    R Documentation

Infer Lineage Structure from Clustered Samples

Description

This function constructs the minimum spanning tree(s) on clusters of cells, the first step in Slingshot's trajectory inference procedure. Paths through the MST from an origin cluster to leaf node clusters are interpreted as lineages.

Usage

getLineages(data, clusterLabels, ...)

## S4 method for signature 'matrix,matrix'
getLineages(
  data,
  clusterLabels,
  reducedDim = NULL,
  start.clus = NULL,
  end.clus = NULL,
  dist.method = "slingshot",
  use.median = FALSE,
  omega = FALSE,
  omega_scale = 1.5,
  times = NULL,
  ...
)

## S4 method for signature 'matrix,character'
getLineages(data, clusterLabels, ...)

## S4 method for signature 'matrix,ANY'
getLineages(data, clusterLabels, ...)

## S4 method for signature 'SlingshotDataSet,ANY'
getLineages(data, clusterLabels, ...)

## S4 method for signature 'PseudotimeOrdering,ANY'
getLineages(data, clusterLabels, ...)

## S4 method for signature 'data.frame,ANY'
getLineages(data, clusterLabels, ...)

## S4 method for signature 'matrix,numeric'
getLineages(data, clusterLabels, ...)

## S4 method for signature 'matrix,factor'
getLineages(data, clusterLabels, ...)

## S4 method for signature 'SingleCellExperiment,ANY'
getLineages(data, clusterLabels, reducedDim = NULL, ...)

Arguments

data

a data object containing the matrix of coordinates to be used for lineage inference. Supported types include matrix, SingleCellExperiment, SlingshotDataSet, and PseudotimeOrdering.

clusterLabels

each cell's cluster assignment. This can be a single vector of labels, or a #cells by #clusters matrix representing weighted cluster assignment. Either representation may optionally include a "-1" group meaning "unclustered."
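
As a rough illustration of the two representations, the sketch below builds a label vector and the matching cells by clusters matrix in base R. The labels are made up for the example, and the 0/1 weights are only the simplest case of a weighted assignment.

```r
# Hypothetical labels for six cells; "-1" marks an unclustered cell.
labels <- c("1", "1", "2", "-1", "3", "2")

# Equivalent cells-by-clusters matrix: one column per cluster,
# with a weight of 1 for the assigned cluster and 0 elsewhere.
clusters <- unique(labels)
label_matrix <- outer(labels, clusters, FUN = "==") * 1
colnames(label_matrix) <- clusters

# Every cell carries exactly one unit of weight in this hard assignment.
rowSums(label_matrix)
```

A soft clustering would instead place fractional weights in each row.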

...

Additional arguments to specify how lineages are constructed from clusters.

reducedDim

(optional) the dimensionality reduction to be used. Can be a matrix or a character identifying which element of reducedDim(data) is to be used. If multiple dimensionality reductions are present and this argument is not provided, the first element will be used by default.

start.clus

(optional) character, indicates the starting cluster(s) from which lineages will be drawn.

end.clus

(optional) character, indicates which cluster(s) will be forced to be leaf nodes in the graph.

dist.method

(optional) character, specifies the method for calculating distances between clusters. Default is "slingshot", see createClusterMST for details.

use.median

logical, whether to use the median (instead of mean) when calculating cluster centroid coordinates.
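
To show what this choice changes, here is a minimal base R sketch that computes cluster centers both ways on made-up coordinates with one outlying cell. The helper centroid() is hypothetical and is not part of slingshot.

```r
set.seed(1)
# Hypothetical 2-D coordinates for 10 cells in two clusters.
rd <- matrix(rnorm(20), ncol = 2, dimnames = list(NULL, c("dim1", "dim2")))
cl <- rep(c("a", "b"), each = 5)
rd[1, ] <- c(50, 50)  # an outlying cell in cluster "a"

# Hypothetical helper: per-cluster center using the mean or the median.
centroid <- function(rd, cl, use.median = FALSE) {
  stat <- if (use.median) median else mean
  t(sapply(split(seq_len(nrow(rd)), cl), function(idx) {
    apply(rd[idx, , drop = FALSE], 2, stat)
  }))
}

centers_mean <- centroid(rd, cl)
centers_median <- centroid(rd, cl, use.median = TRUE)
```

The mean-based center of cluster "a" is pulled toward the outlier, while the median-based center stays with the bulk of the cells.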

omega

(optional) numeric or logical. This granularity parameter determines the distance between every real cluster and the artificial cluster, .OMEGA. In practice, this makes omega the maximum allowable distance between two connected clusters. The default, omega = FALSE, imposes no maximum, which is equivalent to omega = Inf. If omega = TRUE, the maximum edge length will be set to the median edge length of the unsupervised MST times a scaling factor (omega_scale, default = 1.5). This value is provided as a potentially useful rule of thumb for datasets with outlying clusters or multiple, distinct trajectories. See outgroup in createClusterMST.

omega_scale

(optional) numeric, scaling factor to use when omega = TRUE. The maximum edge length will be set to the median edge length of the unsupervised MST times omega_scale (default = 1.5). See outscale in createClusterMST.
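
The rule of thumb can be checked by hand. The sketch below uses made-up edge lengths for an unsupervised MST and the omega_scale value from the usage signature; it illustrates the arithmetic only and does not call slingshot.

```r
# Hypothetical edge lengths of an unsupervised MST on five clusters.
mst_edge_lengths <- c(1.0, 1.2, 0.9, 8.0)
omega_scale <- 1.5

# Maximum allowable edge length: median edge length times the scaling factor.
max_edge <- median(mst_edge_lengths) * omega_scale

# Edges longer than the threshold would not survive in the constrained MST.
too_long <- mst_edge_lengths > max_edge
```

Here only the edge of length 8.0 exceeds the threshold, so the cluster it attaches would end up in a separate tree.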

times

numeric, vector of external times associated with either clusters or cells. See defineMSTPaths for details.

Details

Given an n by p reduced-dimension data matrix (n cells in p dimensions) and a set of cluster identities (potentially including a "-1" group for "unclustered"), this function infers a tree (or forest) structure on the clusters. This work is now mostly handled by the more general function, createClusterMST.

The graph of this structure is learned by fitting a (possibly constrained) minimum-spanning tree on the clusters, plus the artificial cluster, .OMEGA, which is a fixed distance away from every real cluster. This effectively limits the maximum branch length in the MST to the chosen distance, meaning that the output may contain multiple trees.
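
The effect of the artificial cluster can be sketched in base R. The example below is an illustration under stated assumptions, not slingshot's implementation: the cluster centers and the omega value are made up, plain Euclidean distance between centers stands in for dist.method, .OMEGA is placed at exactly distance omega from every cluster, and prim_mst() is a hypothetical helper implementing Prim's algorithm.

```r
# Hypothetical cluster centers: A, B, C lie close together, D is far away.
centers <- rbind(A = c(0, 0), B = c(1, 0), C = c(2, 0), D = c(10, 0))
omega <- 3  # assumed distance from every real cluster to .OMEGA

# Pairwise distances between centers, extended with the artificial cluster.
d <- as.matrix(dist(centers))
d <- rbind(cbind(d, .OMEGA = omega), .OMEGA = c(rep(omega, nrow(d)), 0))

# Hypothetical helper: Prim's algorithm, returning a two-column edge list.
prim_mst <- function(d) {
  in_tree <- rownames(d) == rownames(d)[1]
  edges <- NULL
  while (!all(in_tree)) {
    # Cheapest edge from a node inside the tree to a node outside it.
    sub <- d[in_tree, !in_tree, drop = FALSE]
    pos <- which(sub == min(sub), arr.ind = TRUE)[1, ]
    new_node <- colnames(sub)[pos[2]]
    edges <- rbind(edges, c(rownames(sub)[pos[1]], new_node))
    in_tree[rownames(d) == new_node] <- TRUE
  }
  edges
}

mst <- prim_mst(d)

# Dropping .OMEGA and its edges leaves a forest on the real clusters.
forest <- mst[!(mst[, 1] == ".OMEGA" | mst[, 2] == ".OMEGA"), , drop = FALSE]
```

In this toy case D reaches the rest of the graph only through .OMEGA, because its shortest direct edge (length 8) is longer than omega. Removing .OMEGA therefore leaves the tree A-B-C and the isolated cluster D.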

Once the graph is known, lineages are identified in any tree with at least two clusters. For a given tree, if there is an annotated starting cluster, every possible path out of a starting cluster and ending in a leaf that isn't another starting cluster will be returned. If no starting cluster is annotated, one will be chosen by a heuristic method, but this is not recommended.
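
Path enumeration from an annotated starting cluster can be sketched as follows. The tree, the cluster names, and the helpers neighbors() and find_lineages() are hypothetical; the sketch covers a single tree with a single starting cluster and is not the code slingshot runs.

```r
# Hypothetical tree on five clusters, given as an undirected edge list.
edges <- rbind(c("1", "2"), c("2", "3"), c("2", "4"), c("4", "5"))
start.clus <- "1"

# Clusters directly connected to a given cluster.
neighbors <- function(node) {
  c(edges[edges[, 1] == node, 2], edges[edges[, 2] == node, 1])
}

# Depth-first walk away from the start; a path is complete when the
# current cluster has no unvisited neighbor, i.e. it is a leaf.
find_lineages <- function(path) {
  nxt <- setdiff(neighbors(path[length(path)]), path)
  if (length(nxt) == 0) return(list(path))
  do.call(c, lapply(nxt, function(n) find_lineages(c(path, n))))
}

lineages <- find_lineages(start.clus)
names(lineages) <- paste0("Lineage", seq_along(lineages))
```

Each element of lineages is an ordered set of clusters that begins at the starting cluster and ends at a leaf.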

Value

An object of class PseudotimeOrdering. Although the final pseudotimes have not yet been calculated, the assay slot of this object contains two elements: pseudotime, a matrix of NA values; and weights, a preliminary matrix of lineage assignment weights. The reducedDim and clusterLabels matrices will be stored in the cellData. Additionally, the metadata slot will contain an igraph object named mst, a list of parameter values named slingParams, and a list of lineages (ordered sets of clusters) named lineages.

Examples

data("slingshotExample")
rd <- slingshotExample$rd
cl <- slingshotExample$cl
pto <- getLineages(rd, cl, start.clus = '1')

# plotting
sds <- as.SlingshotDataSet(pto)
plot(rd, col = cl, asp = 1)
lines(sds, type = 'l', lwd = 3)


kstreet13/slingshot documentation built on April 6, 2023, 11:12 p.m.