Identify clusters in a collection of positions or intervals
This function uses tools in the intervals package to quickly identify clusters – contiguous collections of positions or intervals which are separated by no more than a given distance from their neighbors to either side.
x: An appropriate object.
w: Maximum permitted distance between a cluster member and its neighbors to either side.
which: Should indices into the x object be returned instead of subsets of x?
A cluster is defined to be a maximal collection, with at least two members, of components of x which are separated by no more than w. Note that when x represents intervals, an interval must actually contain a point at distance w or less from a neighboring interval to be assigned to the same cluster. If the ends of both intervals in question are open and exactly at distance w, they will not be deemed to be cluster co-members. See the example on open end points below.
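The end point rule can be made concrete with a small base-R sketch. The helper co_members below is hypothetical and is not part of the intervals package. It only encodes the rule as worded above for two disjoint intervals, and it assumes that a gap of exactly w still counts when at least one of the two facing end points is closed.

```r
# Hypothetical helper, not part of the intervals package.
# a and b are disjoint intervals given as c( left, right ), with a to the left of b.
# a_right_closed and b_left_closed describe the two end points that face each other.
co_members <- function( a, b, w, a_right_closed = TRUE, b_left_closed = TRUE ) {
  gap <- b[1] - a[2]
  if ( gap < w ) return( TRUE )
  if ( gap > w ) return( FALSE )
  # Gap is exactly w: the pair fails only when both facing ends are open.
  a_right_closed || b_left_closed
}

# [1, 2) and [3, 4) with w = 1: the closed left end of the second interval
# lies at distance 1 from the first interval.
co_members( c( 1, 2 ), c( 3, 4 ), w = 1, a_right_closed = FALSE, b_left_closed = TRUE )   # TRUE

# (1, 2) and (3, 4) with w = 1: both facing ends are open.
co_members( c( 1, 2 ), c( 3, 4 ), w = 1, a_right_closed = FALSE, b_left_closed = FALSE )  # FALSE
```

The two calls use the same intervals and closure settings as the open end points example for z, so the package output there can be compared against this reading of the rule.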
A list whose components are the clusters. Each component is thus a subset of x, or, if which == TRUE, a vector of indices into the x object. (The indices correspond to row numbers when x represents intervals.)
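To illustrate the two shapes of return value for numeric input, here is a minimal base-R sketch. The function sketch_clusters is hypothetical and is not the implementation used by the intervals package. It assumes non-empty numeric input, and returning each cluster in sorted order is a choice of this sketch rather than documented behavior.

```r
# Hypothetical base-R sketch, not the intervals package implementation.
# Groups numeric positions into maximal runs whose sorted neighbors are at
# most w apart, and keeps only the runs with at least two members.
sketch_clusters <- function( x, w, which = FALSE ) {
  ord <- order( x )
  # A new run starts wherever the gap to the previous sorted value exceeds w.
  run_id <- cumsum( c( TRUE, diff( x[ ord ] ) > w ) )
  runs <- unname( split( ord, run_id ) )
  runs <- runs[ lengths( runs ) >= 2 ]
  # Return indices into x, or the corresponding values of x.
  if ( which ) runs else lapply( runs, function( i ) x[ i ] )
}

x <- c( 50, 3, 10, 52, 200, 12 )
sketch_clusters( x, w = 5 )                # list( c( 10, 12 ), c( 50, 52 ) )
sketch_clusters( x, w = 5, which = TRUE )  # list( c( 3L, 6L ), c( 1L, 4L ) )
```

The isolated positions 3 and 200 are dropped because a cluster needs at least two members.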
Implementation is by a call to reduce followed by a call to [...]. The clusters methods are included to illustrate the utility of the core functions in the intervals package, although they are also useful in their own right.
library( intervals )

# Numeric method
w <- 20
x <- sample( 1000, 100 )
c1 <- clusters( x, w )

# Check results
sapply( c1, function( x ) all( diff(x) <= w ) )
d1 <- diff( sort(x) )
all.equal(
  as.numeric( d1[ d1 <= w ] ),
  unlist( sapply( c1, diff ) )
)

# Intervals method, starting with a reduced object so we know that all
# intervals are disjoint and sorted.
B <- 100
left <- runif( B, 0, 1e4 )
right <- left + rexp( B, rate = 1/10 )
y <- reduce( Intervals( cbind( left, right ) ) )

# Distance from the right end of each interval to the left end of the next
gaps <- function(x) x[-1,1] - x[-nrow(x),2]
hist( gaps(y), breaks = 30 )

w <- 200
c2 <- clusters( y, w )
head( c2 )
sapply( c2, function(x) all( gaps(x) <= w ) )

# Clusters and open end points. See "Details".
z <- Intervals(
  matrix( 1:4, 2, 2, byrow = TRUE ),
  closed = c( TRUE, FALSE )
)
z
clusters( z, 1 )
closed(z) <- FALSE
z
clusters( z, 1 )