clusters | R Documentation |
This function uses tools in the intervals package to quickly identify clusters – contiguous collections of positions or intervals which are separated by no more than a given distance from their neighbors to either side.
## S4 method for signature 'numeric'
clusters(x, w, which = FALSE, check_valid = TRUE)
## S4 method for signature 'Intervals_virtual'
clusters(x, w, which = FALSE, check_valid = TRUE)
x |
An appropriate object. |
w |
Maximum permitted distance between a cluster member and its neighbors to either side. |
which |
Should indices into the x object be returned instead of actual subsets of x? |
check_valid |
Should |
A cluster is defined to be a maximal collection, with at least two
members, of components of x which are separated by no more than w.
Note that when x represents intervals, an interval must actually
contain a point at distance w or less from a neighboring interval to
be assigned to the same cluster. If the ends of both intervals in
question are open and exactly at distance w, they will not be deemed
to be cluster co-members. See the example below.
A list whose components are the clusters. Each component is thus a
subset of x, or, if which == TRUE, a vector of indices into the x
object. (The indices correspond to row numbers when x is of class
"Intervals_virtual".)
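The two return forms can be compared side by side. This is a minimal sketch, assuming the intervals package is installed; the positions in x are made up for illustration.

```r
library(intervals)

x <- c(5, 1, 30, 3, 31)   # illustrative positions, not from the package

vals <- clusters(x, w = 2)                 # each component is a subset of x
idx  <- clusters(x, w = 2, which = TRUE)   # each component indexes into x

# Use the indices to pull the cluster members back out of x.
lapply(idx, function(i) x[i])
```

Indices are convenient when x is only one column of a larger data set and the cluster membership needs to be carried back to the other columns.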
Implementation is by a call to reduce followed by a call to
interval_overlap. The clusters methods are included to illustrate the
utility of the core functions in the intervals package, although they
are also useful in their own right.
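The reduce-then-interval_overlap idea might look roughly like the following. This is a sketch of one way to combine the two functions, not the package's actual source. The helper name clusters_sketch is hypothetical, and the code assumes an object of class "Intervals" over the reals, so that widening each interval by w/2 is meaningful.

```r
library(intervals)

# Hypothetical helper: returns row indices of x, one vector per cluster.
clusters_sketch <- function(x, w) {
  # Widen every interval by w/2 on each side, keeping endpoint closure.
  widened <- Intervals(
    cbind(x[, 1] - w / 2, x[, 2] + w / 2),
    closed = closed(x)
  )
  # Widened intervals that overlap, or that meet at an endpoint
  # contained in at least one of them, collapse into one.
  merged <- reduce(widened)
  # For each merged interval, find the original rows it overlaps.
  hits <- interval_overlap(merged, x)
  # A cluster needs at least two members.
  hits[lengths(hits) >= 2]
}

# Three closed intervals: the first two are 1 apart, the third is far away.
y <- Intervals(cbind(c(0, 2, 10), c(1, 3, 11)))
clusters_sketch(y, w = 1)
```

Widening by w/2 turns "within distance w" into "overlapping", which is the question reduce and interval_overlap already answer.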
# Load the package that provides clusters(), Intervals() and reduce()
library( intervals )
# Numeric method
w <- 20
x <- sample( 1000, 100 )
c1 <- clusters( x, w )
# Check results
sapply( c1, function( x ) all( diff(x) <= w ) )
d1 <- diff( sort(x) )
all.equal(
  as.numeric( d1[ d1 <= w ] ),
  unlist( sapply( c1, diff ) )
)
# Intervals method, starting with a reduced object so we know that all
# intervals are disjoint and sorted.
B <- 100
left <- runif( B, 0, 1e4 )
right <- left + rexp( B, rate = 1/10 )
y <- reduce( Intervals( cbind( left, right ) ) )
gaps <- function(x) x[-1,1] - x[-nrow(x),2]
hist( gaps(y), breaks = 30 )
w <- 200
c2 <- clusters( y, w )
head( c2 )
sapply( c2, function(x) all( gaps(x) <= w ) )
# Clusters and open end points. See "Details".
z <- Intervals(
  matrix( 1:4, 2, 2, byrow = TRUE ),
  closed = c( TRUE, FALSE )
)
z
clusters( z, 1 )
closed(z)[1] <- FALSE
z
clusters( z, 1 )