treeForceDirectedLayout: Generate force-directed layout using tip-walk data.


View source: R/tree-force-layout.R

Description

This function generates a k-nearest neighbor network for cells, based on their visitation frequency by the biased random walks from different tips (and optionally pseudotime), and uses it as input into a force-directed layout (powered by igraph). The force-directed layout is generated in 2 dimensions, and pseudotime is used as a third dimension.
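The idea of building a nearest-neighbor network from tip visitation can be illustrated with a small base R sketch. This is not URD's implementation: the visitation matrix is made up, knn_edges is a hypothetical helper, and Euclidean distance between visitation profiles is an assumption. The optional last step uses igraph's graph_from_data_frame and layout_with_fr and runs only if igraph is installed.

```r
# Toy visitation matrix: 6 cells (rows) x 2 tips (columns); values are made up.
visitation <- matrix(
  c(10,  0,
     9,  1,
     8,  2,
     1,  9,
     0, 10,
     2,  8),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("cell", 1:6), c("tip1", "tip2"))
)

# Hypothetical helper: edge list linking each cell to its k nearest neighbors.
knn_edges <- function(mat, k) {
  d <- as.matrix(dist(mat))   # assumed metric: Euclidean distance between profiles
  diag(d) <- Inf              # a cell is not its own neighbor
  edges <- lapply(rownames(d), function(cell) {
    nn <- names(sort(d[cell, ]))[seq_len(k)]
    data.frame(from = cell, to = nn, dist = unname(d[cell, nn]),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, edges)
}

edges <- knn_edges(visitation, k = 2)

# Optional: 2-D Fruchterman-Reingold layout, only if igraph is installed.
if (requireNamespace("igraph", quietly = TRUE)) {
  g  <- igraph::graph_from_data_frame(edges, directed = FALSE)
  xy <- igraph::layout_with_fr(g, dim = 2)
}
```

In this toy example the cells visited mostly from tip1 connect only to each other, and likewise for tip2, so the layout separates the two groups.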

Usage

treeForceDirectedLayout(
  object,
  num.nn = NULL,
  method = c("fr", "drl", "kk"),
  cells.to.do = NULL,
  cell.minimum.walks = 1,
  cut.outlier.cells = NULL,
  cut.outlier.edges = NULL,
  max.pseudotime.diff = NULL,
  cut.unconnected.segments = 2,
  min.final.neighbors = 2,
  remove.duplicate.cells = T,
  tips = object@tree$tips,
  coords = "auto",
  start.temp = NULL,
  n.iter = NULL,
  density.neighbors = 10,
  plot.outlier.cuts = F,
  verbose = F
)

Arguments

object

An URD object

num.nn

(Numeric) Number of nearest neighbors to use. (NULL will use the square root of the number of cells as default.)

method

(Character: "fr", "drl", or "kk") Which force-directed layout algorithm to use (Fruchterman-Reingold, DrL, or Kamada-Kawai)

cells.to.do

(Character vector) Cells to use in the layout (default NULL is all cells in the tree.)

cell.minimum.walks

(Numeric) Minimum number of times a cell must have been visited by random walks in order to be included in the force-directed layout (Cells that have been visited only a few times are more likely to be assigned incorrectly or poorly.)

cut.outlier.cells

(Numeric) If desired, omit cells with unusual second nearest neighbor distances (i.e. likely outliers that are well connected to only one other cell). Parameter is given as a factor of the interquartile range calculated across all cells' distances to their second nearest neighbor. (Default is not to omit any cells.)

cut.outlier.edges

(Numeric) If desired, cut edges in the nearest neighbor graph with unusually long distances. Parameter is given as a factor of the interquartile range calculated across all edges in the graph. (Default is not to cut any edges based on their length.)

max.pseudotime.diff

(Numeric) If desired, cut edges in the nearest neighbor graph between cells whose difference in pseudotime is larger than this value. (Default is not to cut any edges based on their pseudotime.)

cut.unconnected.segments

(Numeric) Cut connections in the nearest-neighbor graph to cells that are more than this many segments away in the dendrogram structure. For instance, the most aggressive setting (1) will only maintain connections that are at most 1 segment away (so it will maintain connections within a segment and to that segment's parent or children). Higher values permit more distant connections. The default value is 2, which permits connections up to two segments away (i.e. connections within a segment, to that segment's parent, grandparent, children, grandchildren, and siblings). NA or NULL disables this setting and permits all connections.

min.final.neighbors

(Numeric) After trimming outlier and unconnected connections in the nearest neighbor graph, remove any cells that remain connected to fewer than this many other cells.

remove.duplicate.cells

(Logical) If cells have exactly the same visitation during the random walks from each tip, this can create problems in the force-directed layout where cells are pushed to the outside of the layout. To avoid this, if remove.duplicate.cells=T, only one cell from each group of cells with duplicated coordinates will be used in the layout.

tips

(Character vector) Tips for which walk visitation data should be used in the construction of the nearest neighbor graph. (Default is all tips.)

coords

(Matrix: Cells as rows, 2 columns) Starting coordinates for the force directed layout. Default ("auto") takes them from the cell layout of the dendrogram.

start.temp

(Numeric) Starting temperature for the force-directed layout (if method="fr"), which controls how much cells can move in the initial iterations of the algorithm. Default (NULL) is the square root of the number of cells.

n.iter

(Numeric or NULL) Sets the number (or maximum number) of iterations used in the force-directed layout; if NULL, observes defaults suggested by igraph.

density.neighbors

(Numeric) Distance to this nearest neighbor (default is the 10th nearest neighbor) is used as a proxy for local density in the force-directed layout. This is used by plotTreeForce if density.alpha=T to increase transparency in higher-density regions of the layout. This can be adjusted after generating the layout by re-running the fdlDensity function.

plot.outlier.cuts

(Logical) If cut.outlier.cells=T or cut.outlier.edges=T, this displays the

verbose

(Logical) Print progress and time stamps?
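As a rough illustration of the cell-level filters described by cell.minimum.walks and remove.duplicate.cells, the following base R sketch filters a made-up visitation matrix. It is not URD's code: the data are invented, and summing visits across tips to decide whether a cell was visited often enough is an assumption.

```r
# Toy data (made up): visitation counts for 5 cells from 2 tips.
visitation <- matrix(
  c(4, 0,
    4, 0,   # identical to cell1, so it counts as a duplicate
    0, 0,   # never visited
    1, 3,
    2, 2),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("cell", 1:5), c("tip1", "tip2"))
)

cell.minimum.walks <- 1

# Keep cells visited at least cell.minimum.walks times
# (assumption: visits are summed across tips).
visited <- rowSums(visitation) >= cell.minimum.walks

# Keep only the first cell of each group with identical visitation profiles.
unique.profile <- !duplicated(visitation)

cells.kept <- rownames(visitation)[visited & unique.profile]
print(cells.kept)
```

Here cell2 is dropped as a duplicate of cell1 and cell3 is dropped because no walk visited it.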

Details

Several settings adjust the k-nearest neighbor network to assist in the layout. Outlier cells (that are only closely connected to a single other cell) or outlier edges (with unusually long distances) can be eliminated, though these parameters are disabled by default. However, the dendrogram structure recovered by URD is used by default to refine the k-nearest neighbor network, breaking links between cells that are distant in the dendrogram. This emphasizes producing a parseable layout at the expense of representing rare transitions in the data.
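The edge-trimming settings can be sketched on a toy edge list in base R. This is an illustration under stated assumptions, not URD's implementation: the edge list is made up, the outlier rule is assumed to be Tukey-style (third quartile plus a factor of the interquartile range), and poorly connected cells are removed in a single pass.

```r
# Toy kNN edge list (made up): edge length and pseudotime of both endpoints.
edges <- data.frame(
  from    = c("a", "a", "b", "c", "d", "e"),
  to      = c("b", "c", "c", "d", "e", "a"),
  dist    = c(1.0, 1.2, 0.9, 1.1, 9.0, 1.0),
  pt.from = c(0.10, 0.10, 0.12, 0.15, 0.50, 0.90),
  pt.to   = c(0.12, 0.15, 0.15, 0.50, 0.90, 0.10),
  stringsAsFactors = FALSE
)

cut.outlier.edges   <- 1.5   # factor of the interquartile range
max.pseudotime.diff <- 0.5
min.final.neighbors <- 2

# Assumed Tukey-style rule: drop edges longer than Q3 + factor * IQR.
q3       <- unname(quantile(edges$dist, 0.75))
too.long <- edges$dist > q3 + cut.outlier.edges * IQR(edges$dist)

# Drop edges whose endpoints differ too much in pseudotime.
too.far <- abs(edges$pt.from - edges$pt.to) > max.pseudotime.diff

kept <- edges[!too.long & !too.far, ]

# Count remaining neighbors per cell and keep only well-connected cells
# (single pass; an assumption of this sketch).
degree     <- table(c(kept$from, kept$to))
cells.kept <- names(degree)[degree >= min.final.neighbors]
```

In this example the long d-e edge and the e-a edge spanning a large pseudotime gap are cut, which leaves d with a single neighbor and e with none, so only a, b, and c remain.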

Parameters that greatly affect the quality of the layout are the number of nearest neighbors used in the graph (num.nn) and the aggressiveness of the cut.unconnected.segments parameter. Additionally, the layout is largely reproducible if given the same starting conditions, but minor changes can affect it quite a bit, so it is sometimes worth varying num.nn by 1 across a small range (e.g. try a handful of layouts with num.nn between 120 and 125) and choosing the best one. Finally, sometimes different branches can overlap after they have separated, and for a final publication-ready layout, it can make sense to hand-tune the layout somewhat. This can be done using the treeForceRotateCoords and treeForceTranslateCoords functions to move or rotate sections of the tree. (For instance, in the zebrafish layout, we used these functions to move the completely disconnected EVL and PGC cells into place and to increase the angular distance at the first branchpoint in the blastoderm so that the more fine-grained branches from the different germ layers didn't overlap.)
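One way to organize the num.nn sweep suggested above is a small wrapper that produces one layout per candidate value. This is only a sketch: sweep_layouts is a hypothetical helper, not part of URD, and object.built is assumed to be a URD object with a built tree. The call is guarded so the snippet does nothing when URD or the object is unavailable.

```r
# Candidate neighborhood sizes, using the range quoted above.
num.nn.values <- 120:125

# Hypothetical wrapper: returns one laid-out object per candidate value.
# Extra arguments in ... are passed through to treeForceDirectedLayout.
sweep_layouts <- function(object, values, ...) {
  layouts <- lapply(values, function(k) {
    treeForceDirectedLayout(object, num.nn = k, ...)
  })
  names(layouts) <- paste0("nn", values)
  layouts
}

# Only runs when URD is installed and a built tree object is available.
if (requireNamespace("URD", quietly = TRUE) && exists("object.built")) {
  library(URD)
  layouts <- sweep_layouts(object.built, num.nn.values, method = "fr")
  # Inspect each candidate (for example with plotTreeForce) and keep the clearest.
}
```

Keeping every candidate in a named list makes it easy to compare the layouts side by side before committing to one.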

Thanks to Dorde Relic for debugging and creating the remove.duplicate.cells parameter and Simon Cai for debugging that led to the cell.minimum.walks parameter.

Examples

object.built <- treeForceDirectedLayout(object.built, num.nn = 120, method = "fr", cells.to.do = robustly.visited.cells,
  tips = final.tips, cut.unconnected.segments = 2, min.final.neighbors = 4, verbose = T)

farrellja/URD documentation built on June 17, 2020, 4:48 a.m.