compute.backbone.tree: Backbone Tree construction

Description Usage Arguments Details Value Examples

View source: R/cellTree.R

Description

Builds a ‘backbone tree’ from a fitted LDA model.

Usage

1
2
3
4
compute.backbone.tree(lda.results, grouping = NULL,
  start.group.label = NULL, absolute.width = 0, width.scale.factor = 1.2,
  outlier.tolerance.factor = 0.1, rooting.method = NULL, only.mst = FALSE,
  grouping.colors = NULL, merge.sequential.backbone = FALSE)

Arguments

lda.results

A fitted LDA model, as returned by compute.lda

grouping

An (optional) vector of labels for each cell in the lda.results object. E.g. a sampling times (numeric) or tissue categories.

start.group.label

If a grouping parameter is provided, you can optionally specify the starting group. If no start.group.label is specified and the grouping vector is numeric, the lowest value will automatically be selected. Otherwise, the group with lowest mean-squared-distance between cells is selected.

absolute.width

Numeric (optional). Distance threshold below which a cell vertex is considered to be attached to a backbone vertex (see paper for more details). By default, this threshold is computed dynamically, based on the distance distribution for each branch.

width.scale.factor

Numeric (optional). A scaling factor for the dynamically-computed distance threshold (ignored if absolute.width is provided). Higher values will result in less branches in the backbone tree, while lower values might lead to a large number of backbone branches.

outlier.tolerance.factor

Numeric (optional). Proportion of vertices, out of the total number of vertices divided by the total number of branches, that can be left at the end of the backbone tree-building algorithm.

rooting.method

String (optional). Method used to root the backbone tree. Must be either NULL or one of: ‘longest.path’, ‘center.start.group’ or ‘average.start.group’. ‘longest.path' picks one end of the longest shortest-path between two vertices. 'center.start.group’ picks the vertex in the starting group with lowest mean-square-distance to the others. ‘average.start.group’ creates a new artificial vertex, as the average of all cells in the starting group. If no value is provided, the best method is picked based on the type of grouping and start group information available.

only.mst

If TRUE, returns a simple rooted minimum-spanning tree, instead of a backbone tree.

grouping.colors

(Optional) vector of RGB colors to be used for each grouping.

merge.sequential.backbone

(Optional) whether to merge sequential backbone vertices that are close enough. This will produce a more compact backbone tree, but at the cost of extra computing time.

Details

In order to easily visualise the structural and temporal relationship between cells, we introduced a special type of tree structure dubbed ‘backbone tree’, defined as such:

Considering a set of vertices V and a distance function over all pairs of vertices: d: V <c3><97> V -> R+, we call backbone tree a graph, T with backbone B, such that:

In this instance, we relax the last condition to cover only ‘most’ non-backbone vertices, allowing for a variable proportion of outliers at distance > δ from any vertices in V_B.

We can then define the ‘optimal’ backbone tree to be a backbone tree such that the sum of weighted edges in the backbone subtree E_B is minimal. Finding such a tree can be easily shown to be NP-Complete (by reduction to the Vertex Cover problem), but we developed a fast heuristic relying on Minimum Spanning Tree to produce a reasonable approximation.

The resulting quasi-optimal backbone tree (simply referred to as ‘the’ backbone tree thereafter) gives a clear hierarchical representation of the cell relationship: the objective function puts pressure on finding a (small) group of prominent cells (the backbone) that are good representatives of major steps in the cell evolution (in time or space), while remaining cells are similar enough to their closest representative for their difference to be ignored. Such a tree provides a very clear visualisation of overall cell differentiation paths (including potential differentiation into sub-types).

Value

A igraph object with either a minimum rooted spanning-tree (if only.mst is TRUE) or a quasi-optimal backbone tree connecting all input cells. Cell topic distribution, distances and branch order are added as vertex/edge/graph attributes.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
# Load pre-computed LDA model for skeletal myoblast RNA-Seq data from HSMMSingleCell package:
data(HSMM_lda_model)

# Recover sampling time (in days) for each cell:
library(HSMMSingleCell)
data(HSMM_sample_sheet)
days.factor = HSMM_sample_sheet$Hours
days = as.numeric(levels(days.factor))[days.factor]

# Compute near-optimal backbone tree:
b.tree = compute.backbone.tree(HSMM_lda_model, days)
# Plot resulting tree with sampling time as a vertex group colour:
ct.plot.grouping(b.tree)

Example output

Loading required package: topGO
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, basename, cbind, colMeans, colSums, colnames,
    dirname, do.call, duplicated, eval, evalq, get, grep, grepl,
    intersect, is.unsorted, lapply, lengths, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, rank, rbind,
    rowMeans, rowSums, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which, which.max, which.min

Loading required package: graph
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: GO.db
Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: IRanges
Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following object is masked from 'package:base':

    expand.grid


Loading required package: SparseM

Attaching package: 'SparseM'

The following object is masked from 'package:base':

    backsolve


groupGOTerms: 	GOBPTerm, GOMFTerm, GOCCTerm environments built.

Attaching package: 'topGO'

The following object is masked from 'package:IRanges':

    members

Loading required namespace: maptpx
Using start group: 0 (1)
Using rooting method: center.start.group
Using root vertex: 4
Adding branch #1:
 [1]  65  53  45   2  55  47  57  48  44   7  19  25  69  66   9  63  18  62  51
[20]  56  16  70 136 133 143  89  78 140  94 100 177 194 141 199 201 181 161 204
[39] 225 236 255 247 246 233 229 259 258 146 235 159 185 191 216 166 149  83 168
[58] 158   8
Using branch width: 0.927 (width.scale.factor: 1.2)
Outliers: 1
Total number of branches: 1 (forks: 0)
Backbone fork merge (width: 0.927): 60 ->  60 
Ranking all cells...
Computing tree layout...
IGRAPH 6c9ebac DNW- 271 270 -- 
+ attr: grouping.colors (g/c), grouping.labels (g/n), ordered.branches
| (g/x), ratio (g/n), is.backbone (v/l), is.root (v/l), name (v/n),
| color (v/c), group.idx (v/n), grouping.label (v/n), pie (v/x), shape
| (v/c), cell.name (v/c), x (v/n), y (v/n), size (v/n), weight (e/n),
| arrow.mode (e/n)
+ edges from 6c9ebac (vertex names):
 [1]  2-> 20  2-> 26  2-> 50  2-> 55  4->  1  4-> 10  4-> 13  4-> 15  4-> 24
[10]  4-> 28  4-> 30  4-> 31  4-> 34  4-> 40  4-> 41  4-> 49  4-> 58  4-> 59
[19]  4-> 60  4-> 64  4-> 65  4->150  4->196  4->234  4->242  4->266  4->267
[28]  7-> 19  7-> 73  7->211  8-> 42  9-> 27  9-> 63  9-> 68 16-> 70 18-> 22
+ ... omitted several edges

cellTree documentation built on Nov. 8, 2020, 5:05 p.m.