In jefferis/nat: NeuroAnatomy Toolbox for Analysis of 3D Image Data

Introduction

Neurons can helpfully be treated as graphs (branching trees) in which nodes connected by edges define the morphology of the neuron. The nat package provides a number of built-in functions that allow you to analyse the branching structure of graphs

https://natverse.org/nat/reference/index.html

or use aspects of the branching structure to manipulate graphs. Examples of such functions include the strahler_order function which calculates the Strahler branch order for each node or segment in the neuron or the spine function that extracts the longest path across the neuron.

More sophisticated analysis and manipulations can be carried out by converting neuron objects into ngraph objects.

Neurons as Graphs

A neuron will typically be a graph in the form of a binary tree. We follow the convention of the Matlab trees toolbox by Hermann Cuntz and colleagues in treating the root (typically the soma) as the origin of the graph and then having directed edges leaving the root.

neuron-as-graph

Nodes will be one of a:

root
branch point
end point
continuation point

Each node will have a numeric identifier or label (an arbitrary integer stored the in the neuron's PointNo field that may have come from an external source) as well an index (an integer starting at 1 and increasing without gaps). Although these two identifiers may often be the same, code should never rely on this being the case. In the wild, one frequently encounters cases where e.g. the numeric labels

have gaps because a neuron was edited
are not in the same order as the vertices
have some other significance e.g. are globally unique across all neurons in a database and therefore have large values (which may be challenging for R to to represent give its maxint of 2^31 - 1=r 2^31 - 1)

library(nat)
n=Cell07PNs[[1]]
summary(n)

We can extract the points as follows:

rootpoints(n)
branchpoints(n)
endpoints(n)

Segments are unbranched connected sequences of nodes that terminate in a branch point or end point.

Built-in neuron graph functions

We will give a few examples of the use of the built-in functions that treat neurons as graphs.

Strahler Order

The branching structure of a neuron is commonly summarised by calculating the Strahler Order.

n=Cell07PNs[[1]]
so=strahler_order(n)
orders=1:max(so$points)
for (i in orders) {
  plot(subset(n, so$points==i), col=i, add = i!=1, boundingbox = boundingbox(n))
}

Note the use of multiple calls to plot.neuron using the add=TRUE argument for all but the first plot. Note also the use of the boundingbox argument/function in order to ensure that the plot is set up with appropriate axes for the whole neuron even if only part of it is plotted in the first call to plot.

Spine

You can find the longest path across a neuron using the spine function.

n=Cell07PNs[[1]]
sp=spine(n)
plot(n, col='grey')
plot(sp, add=T, col='blue')

spine has a variety of options that you can use to control the results.

Segment graph

You can use the segmentgraph function to make a simplified representation of the branching structure of the neuron. In this object (which has class igraph associated with the powerful igraph package) each unbranched segment in the original neuron (which might have contained many vertices) is collapsed to a single edge joining the branch points (which are retained).

sg=segmentgraph(Cell07PNs[[1]])
plot(sg)

It can be useful to plot the graph with a tree layout:

plot(sg, layout=igraph::layout_as_tree, edge.arrow.size=.3, vertex.size=15)

Note that the root of the neuron is placed at the top of the plot (point number 1 in the graph above) and that successive branching orders and leaves are placed on levels further down the plot. Note also that the labels on the plot correspond to the identifiers of the points in the original neuron (aka the PointNo field, see first section).

If you need to work with the original identifiers of the points in the segmentgraph object, they are stored as igraph node attributes. You can access them like this:

igraph::V(sg)$label
igraph::V(sg)$vid

label encodes the PointNo column and vid the raw integer index of the point in the node array. If and only if the PointNo identifier is a sequentially increasing integer starting at 1, then these will be identical. The SWC format is a little vague about whether they should indeed be the same, but is normally understood to imply it. However there are many SWC files in the wild that violate this assumption.

You can also use the endpoints() and rootpoints() functions with the segmentgraph objects as well as any function that expects an igraph object. We can use this to find the branchpoints upstream of all end points using the igraph::adjacent_vertices() function. We use mode="all" to handle the situation where the root node is also an end point (i.e. a leaf node).

endpoints(sg)
ups=unlist(igraph::adjacent_vertices(sg, endpoints(sg), mode='all'))
# this maps the segmentgraph node indices back to indices for the neuron
igraph::V(sg)$vid[ups]

Here we plot those terminal branchpoints in red while internal branches are displayed in blue.

plot(n, WithNodes = F)
terminal_branches=igraph::V(sg)$vid[ups]
other_branches=setdiff(branchpoints(n), terminal_branches)
points(xyzmatrix(n)[terminal_branches,1:2], col='red')
points(xyzmatrix(n)[other_branches,1:2], col='blue')

ngraph objects

The nat package provides a bridge for neurons to the rich cross-platform igraph library. We provide a class ngraph that is a thin wrapper for the igraph class. This looks after things that we might need to know about a neuron (like the 3D coordinates of each node) while still giving access to all of the graph functions in the igraph package.

g=as.ngraph(Cell07PNs[[1]])
class(g)
g

You can use functions such as

igraph::diameter(g)

to find the length of the longest path across the neuron. This is defined in terms of the number of intervening nodes. You can also make a graph in which the edge weights are the euclidean distance between the connected 3D nodes:

gw=as.ngraph(Cell07PNs[[1]], weights=TRUE)
igraph::diameter(gw)

This gives you the longest path length (geodesic) across the graph in units of µm in this case.

Note that although you can do library(igraph), it adds a lot of functions to the search path, some of which have name clashes, so I often just use the package name (igraph::) prepended to the function that I want to call.

Walking along ngraph objects

You can use the graph representation of neurons e.g. to find the path between nodes.

g=as.ngraph(Cell07PNs[[1]], weights=TRUE)
eg=endpoints(g)

p=igraph::shortest_paths(g, from=1, to=180)
p$vpath[[1]]

# fails
p2=igraph::shortest_paths(g, from=180, to=1)
p2$vpath[[1]]

# mode all will find the path irrespective of direction of links, which are
# directed from the soma
p3=igraph::shortest_paths(g, from=180, to=1, mode = 'all')
p3$vpath[[1]]
all.equal(p$vpath[[1]], rev(p3$vpath[[1]]))

# just the distances - by default uses mode=all
igraph::distances(g, v=1, to=180)
igraph::distances(g, v=180, to=1)

Node identifiers and indices

Note that in the previous code block, nodes are identified by their index (i.e. an integer starting at 1). As already discussed, some neurons have an arbitrary numeric identifier for each node (this can be a large integer from a database table e.g. for CATMAID neurons). You access this identifier in all of the above calls by quoting it. For example:

# using raw indices
igraph::distances(g, v=180, to=1)
# using node identifiers
igraph::distances(g, v='180', to='1')

igraph::shortest_paths(g, from='1', to='180')$vpath[[1]]

In this instance, the results are identical since the node identifiers are the same as the raw indices. If we manipulate the node identifiers to add 1000 to each

# make a copy of `ngraph` object and add 1000 to each identifier
g2=g
igraph::V(g2)$name <- igraph::V(g2)$name+1000
# make a neuron with thoose identifiers to see what happened to its structure:
n2=as.neuron(g2)
head(n2$d)

then find path by identifier:

p2=igraph::shortest_paths(g2, from='1001', to='1180')$vpath[[1]]
p2

Note that when the path is printed it shows the node identifiers. But when using the path, it may be necessary to convert to integers. This results in raw indices again.

as.integer(p2)
names(p2)

Downstream nodes

You can also ask for nodes upstream or downstream of a given starting node. For example the neurons in the Cell07PNs set have a tag called AxonLHEP that defines the entry point of the axon into the lateral horn neuropil of the fly brain. Here we defined

n=Cell07PNs[[1]]
g=as.ngraph(n)
# find the nodes distal to this point
# nb you must set unreachable=F if you only want to get downstream nodes
igraph::dfs(g, mode='out', unreachable = FALSE, root=n$AxonLHEP)
# the proximal nodes back to the soma (including any branches)
igraph::dfs(g, mode='in', unreachable = FALSE, root=n$AxonLHEP)

Note that dfs (depth first search) provides a good way to visit all the nodes of the neuron

Let's use this to make a function that prunes neurons downstream of this axon entry point:

prune_from_lhep <- function(n, ...) {
  g=as.ngraph(n)
  downstream_indices=igraph::dfs(g, root = n$AxonLHEP, unreachable = FALSE)$order
  prune_vertices(n, verticestoprune = downstream_indices, invert = TRUE)
}
pruned=nlapply(Cell07PNs[1:3], prune_from_lhep)
plot(Cell07PNs[1:3], col='grey')
plot(pruned, lwd=2, add = T)

The pruned neurons show up in red, green, and blue in the above plot.

Distal nodes

As of nat v1.10.0 there is a distal_to() function to provide a simpler access to nodes defined by graph position.

n=Cell07PNs[[1]]
distal_to(n, node.idx = n$AxonLHEP)

One can find the complement (i.e. all the nodes that are not in the distal set) by comparing with the indices of all vertices.

setdiff(seq_len(nvertices(n)), distal_to(n, node.idx = n$AxonLHEP))

distal_to() allows you to write a slightly simpler version of the function above:

prune_from_lhep2 <- function(n, ...) {
  downstream_indices=distal_to(n, node.idx = n$AxonLHEP)
  prune_vertices(n, verticestoprune = downstream_indices, invert = TRUE)
}

all.equal(nlapply(Cell07PNs[1:3], prune_from_lhep2), pruned)

Notice that distal_to() will help with the situation where you need to find nodes using their identifiers rather than indices. For example tags in neurons loaded from the CATMAID reconstruction tool are defined by ids not indices.

tokeep=distal_to(dl1neuron, node.pointno = dl1neuron$tags$SCHLEGEL_LH)
plot(dl1neuron, WithNodes = F, soma=2000)
plot(prune_vertices(dl1neuron, tokeep, invert = T), add=T, WithNodes = F, col='red', lwd=2)