treebin: Tree-based Binning

Description

treebin bins the provided data using the tree-based binning method, as described in Rahman & Oldford (2018).
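To give a feel for what binning along a tree does to a point configuration, here is a toy sketch in base R. It is not the gap-based method of Rahman & Oldford (2018) and does not use this package. The function name toy_treebin and its rules (split the fullest bin at the median of its widest coordinate, summarise each bin by its mean) are assumptions made only for illustration.

```r
# Toy tree-based binning. NOT the gap-based rules used by treebin.
# Hypothetical rules: split the fullest bin at the median of its widest
# coordinate, stop at `numbins` bins, summarise each bin by its mean.
toy_treebin <- function(X, numbins) {
  X <- as.matrix(X)
  bins <- list(seq_len(nrow(X)))            # each bin holds row indices
  while (length(bins) < numbins) {
    sizes <- lengths(bins)
    i <- which.max(sizes)                   # select the fullest bin
    if (sizes[i] < 2) break                 # nothing left to split
    idx <- bins[[i]]
    spread <- apply(X[idx, , drop = FALSE], 2, function(v) diff(range(v)))
    j <- which.max(spread)                  # widest coordinate of that bin
    cut <- median(X[idx, j])
    left <- idx[X[idx, j] <= cut]
    right <- idx[X[idx, j] > cut]
    if (length(left) == 0 || length(right) == 0) break  # ties: cannot split
    bins[[i]] <- left                       # replace the parent by one child
    bins <- c(bins, list(right))            # and append the other child
  }
  # One representative point (the mean) and one count per bin.
  points <- do.call(rbind, lapply(bins, function(idx) {
    colMeans(X[idx, , drop = FALSE])
  }))
  list(points = points, counts = lengths(bins))
}

set.seed(1)
X <- cbind(x = rnorm(200), y = rnorm(200))
toy <- toy_treebin(X, numbins = 20)
print(dim(toy$points))
print(sum(toy$counts))
```

The select, split, stop and summarise steps in this sketch loosely correspond to the selectBin, splitBin, stopCriteria and makePoint arguments, which in treebin are supplied as functions.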

Usage

treebin(X, stopCriteria = gapStop, binMeasure = gapMeasure,
  boundaryTest = gapBoundaryTest, selectBin = gapSelect,
  splitBin = gapSplit, makePoint = gapPoints,
  binInfo = list(binRange = matrix(rep(c(-Inf, Inf), ncol(X)), 2, ncol(X))),
  inputs = list(tau = 1, numbins = floor(nrow(X)/2)))

Arguments

X

The point configuration to be binned

stopCriteria

A user supplied function that evaluates the stopping criteria for the binning

binMeasure

A user supplied function to compute the measure associated with each bin

boundaryTest

A user supplied function to test if a given point is contained in a given bin

selectBin

A user supplied function for choosing between bins to be split

splitBin

A user supplied function for splitting a bin

makePoint

A user supplied function to turn the contents of a bin into a single point

binInfo

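Additional information to be supplied to the first bin. Default is a list whose binRange element is a 2 by ncol(X) matrix with -Inf in the first row and Inf in the second, as shown in Usage.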

inputs

A list containing additional input parameters required by user supplied functions. Default is list(tau = 1, numbins = floor(nrow(X)/2)).
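The default expressions for binInfo and inputs given in Usage depend only on the shape of X, so they can be evaluated on their own in base R. A minimal sketch, assuming a small numeric matrix X invented here purely for illustration:

```r
# Hypothetical 10 x 2 point configuration, for illustration only.
set.seed(1)
X <- matrix(rnorm(20), ncol = 2)

# Default binInfo from Usage: one column per column of X,
# first row -Inf and second row Inf.
binInfo <- list(binRange = matrix(rep(c(-Inf, Inf), ncol(X)), 2, ncol(X)))

# Default inputs from Usage: tau = 1 and half as many bins as rows of X.
inputs <- list(tau = 1, numbins = floor(nrow(X) / 2))

print(binInfo$binRange)
print(inputs$numbins)
```

Passing a custom inputs list, as the Examples section does with numbins = 500, replaces the whole default list, so tau has to be given again alongside numbins.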

Value

The return value is an object of class treebinr, which contains the following components:

points

A matrix containing the reduced point configuration

counts

A vector containing the number of points in each bin

bins

A list containing bin objects, which detail the contents of each bin

tree

An undirected graph object for the binning tree

Examples

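set.seed(1324567)
# Both columns need the same length; otherwise data.frame() silently recycles y.
X <- data.frame(x = rnorm(1000), y = rnorm(1000))
# Reduce the configuration to 500 bins.
out <- treebin(X, inputs = list(tau = 1, numbins = 500))
Xreduced <- as.data.frame(out@points, col.names = c("x", "y"))
# Simple random sample of the same size, for comparison.
Xsampled <- X[sample(1:nrow(X), 500, replace = FALSE),]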

savePar <- par(mfrow = c(1,3))
xlim <- extendrange(X$x); ylim <- extendrange(X$y)
plot(X, 
     main = paste0("original data (", nrow(X)," points)"),
     xlim = xlim, ylim = ylim)
plot(Xreduced, 
     main = paste0("reduced data (", nrow(Xreduced)," points)"),
     xlim = xlim, ylim = ylim)
plot(Xsampled, 
     main = paste0("sampled data (", nrow(Xsampled)," points)"),
     xlim = xlim, ylim = ylim)
par(savePar)

rwoldford/treebinr documentation built on May 12, 2019, 4:38 a.m.