getLineages: Infer Lineage Structure from Clustered Samples

Description

Given an n by p reduced-dimension data matrix and a vector of cluster identities (potentially including -1's for "unclustered"), this function infers a forest structure on the clusters and returns paths through the forest that can be interpreted as lineages.

Usage

getLineages(data, clusterLabels, ...)

## S4 method for signature 'matrix,matrix'
getLineages(
  data,
  clusterLabels,
  reducedDim = NULL,
  start.clus = NULL,
  end.clus = NULL,
  dist.fun = NULL,
  omega = NULL,
  omega_scale = 3
)

## S4 method for signature 'matrix,character'
getLineages(
  data,
  clusterLabels,
  reducedDim = NULL,
  start.clus = NULL,
  end.clus = NULL,
  dist.fun = NULL,
  omega = NULL,
  omega_scale = 3
)

## S4 method for signature 'matrix,ANY'
getLineages(
  data,
  clusterLabels,
  reducedDim = NULL,
  start.clus = NULL,
  end.clus = NULL,
  dist.fun = NULL,
  omega = NULL,
  omega_scale = 3
)

## S4 method for signature 'SlingshotDataSet,ANY'
getLineages(
  data,
  clusterLabels,
  reducedDim = NULL,
  start.clus = NULL,
  end.clus = NULL,
  dist.fun = NULL,
  omega = NULL,
  omega_scale = 3
)

## S4 method for signature 'data.frame,ANY'
getLineages(
  data,
  clusterLabels,
  reducedDim = NULL,
  start.clus = NULL,
  end.clus = NULL,
  dist.fun = NULL,
  omega = NULL,
  omega_scale = 3
)

## S4 method for signature 'matrix,numeric'
getLineages(
  data,
  clusterLabels,
  reducedDim = NULL,
  start.clus = NULL,
  end.clus = NULL,
  dist.fun = NULL,
  omega = NULL,
  omega_scale = 3
)

## S4 method for signature 'matrix,factor'
getLineages(
  data,
  clusterLabels,
  reducedDim = NULL,
  start.clus = NULL,
  end.clus = NULL,
  dist.fun = NULL,
  omega = NULL,
  omega_scale = 3
)

## S4 method for signature 'SingleCellExperiment,ANY'
getLineages(
  data,
  clusterLabels,
  reducedDim = NULL,
  start.clus = NULL,
  end.clus = NULL,
  dist.fun = NULL,
  omega = NULL,
  omega_scale = 3
)

Arguments

data

a data object containing the matrix of coordinates to be used for lineage inference. Supported types include matrix, data.frame, SingleCellExperiment, and SlingshotDataSet.

clusterLabels

character, a vector of length n denoting cluster labels, optionally including -1's for "unclustered." If data is a SlingshotDataSet, cluster labels will be taken from it.

...

Additional arguments to specify how lineages are constructed from clusters.

reducedDim

(optional) identifier to be used if reducedDim(data) contains multiple elements. Otherwise, the first element will be used by default.

start.clus

(optional) character, indicates the cluster(s) *from* which lineages will be drawn.

end.clus

(optional) character, indicates the cluster(s) which will be forced leaf nodes in their trees.

dist.fun

(optional) function, method for calculating distances between clusters. Must take two matrices as input, corresponding to points in reduced-dimensional space. If the minimum cluster size is larger than the number of dimensions, the default is to use the joint covariance matrix to find the squared distance between cluster centers. If not, the default is to use the diagonal of the joint covariance matrix.

omega

(optional) numeric, this granularity parameter determines the distance between every real cluster and the artificial cluster, .OMEGA. In practice, this makes omega the maximum allowable distance between two connected clusters. By default, omega = Inf. If omega = TRUE, the maximum edge length will be set to the median edge length of the unsupervised MST times a scaling factor (omega_scale, default = 3). This value is provided as a potentially useful rule of thumb for datasets with outlying clusters or multiple, distinct trajectories, but it is not otherwise recommended.

omega_scale

(optional) numeric, scaling factor to use when omega = TRUE. The maximum edge length will be set to the median edge length of the unsupervised MST times omega_scale (default = 3).
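
As a rough illustration of the dist.fun contract described above (two matrices in, one distance out), the following base R sketch writes out one plausible reading of the two defaults. The names dist_full and dist_diag are hypothetical, and treating the "joint covariance" as the sum of the two cluster covariances is an assumption; this is not the package's own code.

```r
# Hypothetical helpers illustrating the dist.fun contract: each takes two
# matrices (rows = points, columns = reduced dimensions) and returns a scalar.

# Squared distance between cluster centers, scaled by the joint covariance.
# Assumption: "joint covariance" is taken here as cov(X1) + cov(X2).
dist_full <- function(X1, X2) {
  d <- colMeans(X1) - colMeans(X2)
  joint_cov <- cov(X1) + cov(X2)
  as.numeric(t(d) %*% solve(joint_cov) %*% d)
}

# Same idea using only the diagonal of the joint covariance, which avoids
# inverting a full matrix that may be singular when clusters are small.
dist_diag <- function(X1, X2) {
  d <- colMeans(X1) - colMeans(X2)
  joint_var <- diag(cov(X1) + cov(X2))
  sum(d^2 / joint_var)
}

set.seed(1)
A <- matrix(rnorm(40), ncol = 2)            # 20 points around (0, 0)
B <- matrix(rnorm(40, mean = 3), ncol = 2)  # 20 points around (3, 3)
dist_full(A, B)
dist_diag(A, B)
```

A function with this shape could be passed as dist.fun; anything that accepts two coordinate matrices and returns a single non-negative number fits the description given for the argument.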

Details

The connectivity matrix is learned by fitting a (possibly constrained) minimum-spanning tree on the clusters and the artificial cluster, .OMEGA, which is a fixed distance away from every real cluster. This effectively limits the maximum branch length in the MST to the chosen distance, meaning that the output may contain multiple trees.
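
The omega = TRUE rule of thumb can be sketched in base R on toy cluster centers. The helper mst_edges (Prim's algorithm) and the toy coordinates are hypothetical, and plain Euclidean distance stands in for the cluster distance. Instead of adding the artificial cluster, the sketch mimics its effect by dropping MST edges longer than the threshold.

```r
# mst_edges() is a hypothetical helper (Prim's algorithm), not part of slingshot.
# D is a symmetric matrix of pairwise cluster distances.
mst_edges <- function(D) {
  in_tree <- c(TRUE, rep(FALSE, nrow(D) - 1))
  from <- integer(0); to <- integer(0); len <- numeric(0)
  while (!all(in_tree)) {
    # Cheapest edge joining a cluster inside the tree to one outside it
    sub <- D[in_tree, !in_tree, drop = FALSE]
    idx <- which(sub == min(sub), arr.ind = TRUE)[1, ]
    a <- which(in_tree)[idx[1]]
    b <- which(!in_tree)[idx[2]]
    from <- c(from, a); to <- c(to, b); len <- c(len, D[a, b])
    in_tree[b] <- TRUE
  }
  data.frame(from = from, to = to, length = len)
}

# Toy cluster centers: four nearby clusters and one outlying cluster (row 5)
centers <- rbind(c(0, 0), c(1, 0), c(2, 0), c(2, 1), c(10, 10))
D <- as.matrix(dist(centers))

edges <- mst_edges(D)                        # unsupervised MST
omega_scale <- 3
omega <- omega_scale * median(edges$length)  # rule-of-thumb threshold

# Edges longer than omega are cut, splitting the MST into a forest
kept <- edges[edges$length <= omega, ]
kept
```

In this toy case the only edge reaching cluster 5 is far longer than the threshold, so cluster 5 ends up as its own single-cluster tree.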

Once the connectivity is known, lineages are identified in any tree with at least two clusters. For a given tree, if there is an annotated starting cluster, every possible path out of a starting cluster and ending in a leaf that isn't another starting cluster will be returned. If no starting cluster is annotated, every leaf will be considered as a potential starting cluster and whichever configuration produces the longest average lineage length (in terms of number of clusters included) will be returned.
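
The path enumeration described above can be sketched on a small hand-built tree. The helper paths_to_leaves and the toy adjacency matrix are hypothetical, and the sketch handles only a single starting cluster; it is meant to illustrate the idea, not to reproduce the package's implementation.

```r
# paths_to_leaves() is a hypothetical helper, not the package's implementation.
# adj is a symmetric 0/1 adjacency matrix of a tree, with cluster names as dimnames.
paths_to_leaves <- function(adj, start) {
  walk <- function(path) {
    here <- path[length(path)]
    # Neighbors of the current cluster that are not already on the path
    nxt <- setdiff(colnames(adj)[adj[here, ] == 1], path)
    if (length(nxt) == 0) return(list(path))  # reached a leaf
    do.call(c, lapply(nxt, function(n) walk(c(path, n))))
  }
  walk(start)
}

# Toy tree on five clusters: 1-2, 2-3, 2-4, 4-5
cl <- as.character(1:5)
adj <- matrix(0, 5, 5, dimnames = list(cl, cl))
edges <- rbind(c("1", "2"), c("2", "3"), c("2", "4"), c("4", "5"))
adj[edges] <- 1
adj[edges[, 2:1]] <- 1

# With an annotated starting cluster: every path from it to a leaf
lineages <- paths_to_leaves(adj, "1")

# Without one: try each leaf and keep the one giving the longest
# average lineage length (counted in clusters)
leaves <- cl[rowSums(adj) == 1]
avg_len <- sapply(leaves, function(s) mean(lengths(paths_to_leaves(adj, s))))
best_start <- names(which.max(avg_len))
```

Here lineages holds the two paths 1-2-3 and 1-2-4-5, and best_start is "5", since both paths out of cluster 5 contain four clusters.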

Value

An object of class SlingshotDataSet containing the arguments provided to getLineages together with the inferred lineage structure.

Examples

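library(slingshot)

# Load the example data shipped with slingshot
data("slingshotExample")
rd <- slingshotExample$rd   # reduced-dimension coordinates
cl <- slingshotExample$cl   # cluster labels

# Infer lineages, starting from cluster '1'
sds <- getLineages(rd, cl, start.clus = '1')

# Plot the cells colored by cluster and overlay the inferred lineages
plot(rd, col = cl, asp = 1)
lines(sds, type = 'l', lwd = 3)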

slingshot documentation built on Nov. 8, 2020, 5:51 p.m.