The objective is to reduce the number, n, of data points in a p-dimensional point configuration to a specified, much smaller number, m, of points of the same dimensionality, while preserving much of both the geometric shape of the configuration and the density of the points in p-dimensional space. Preserving one comes at some cost to preserving the other. The treebinr package uses a recursive tree-binning approach that tries to preserve the geometry at some cost to the density; by contrast, simple random sampling tries to preserve the density at considerable cost to the geometry. Beyond the particular methods it provides, treebinr is extensible by others who wish to develop or implement other methods. Obvious applications include reducing the data size to permit more complex calculations, and data visualizations in which accuracy in high-density regions may matter less than, say, preserving outlying and unusual geometry.
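The trade-off described above can be illustrated with a minimal base-R sketch. This is not the treebinr implementation and does not use its API: the function `reduce_by_binning` and its splitting rule (split the widest bin at the midpoint of its widest dimension, then replace each bin by the mean of its points) are assumptions chosen only to show the idea of recursive binning next to simple random sampling.

```r
# Illustrative only: a hypothetical helper, NOT part of treebinr.
# Reduce an n x p matrix to at most m points by recursive binning.
reduce_by_binning <- function(x, m) {
  x <- as.matrix(x)
  bins <- list(seq_len(nrow(x)))  # each bin is a vector of row indices

  # Largest coordinate range among the points of one bin.
  bin_width <- function(idx) {
    max(apply(x[idx, , drop = FALSE], 2, function(v) diff(range(v))))
  }

  while (length(bins) < m) {
    widths <- vapply(bins, bin_width, numeric(1))
    k <- which.max(widths)
    if (widths[k] == 0) break     # no bin can be split any further

    idx <- bins[[k]]
    block <- x[idx, , drop = FALSE]
    d <- which.max(apply(block, 2, function(v) diff(range(v))))
    v <- block[, d]
    mid <- (min(v) + max(v)) / 2
    left <- v < mid | v == min(v) # guarantees both halves are non-empty

    bins[[k]] <- idx[left]
    bins[[length(bins) + 1]] <- idx[!left]
  }

  # One representative point per bin: the mean of its members.
  do.call(rbind, lapply(bins, function(idx) colMeans(x[idx, , drop = FALSE])))
}

set.seed(1)
x <- rbind(matrix(rnorm(2000), ncol = 2), c(10, 10))  # dense cloud plus one outlier
reduced <- reduce_by_binning(x, 50)   # geometry-oriented reduction
sampled <- x[sample(nrow(x), 50), ]   # density-oriented reduction
```

Because the widest bin is always split first, an isolated point such as (10, 10) tends to end up in a bin of its own, whereas a random sample of 50 rows out of 1001 will usually miss it. Splitting by extent rather than by count is what shifts the balance from density towards geometry in this sketch.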
Package details

Field | Value
---|---
Author | Adam Rahman [aut, cre], Wayne Oldford [aut, cre]
Maintainer | Wayne Oldford <rwoldford@uwaterloo.ca>
License | GPL-2 \| GPL-3
Version | 0.1.0.9000
URL | https://github.com/rwoldford/treebinr

Installation

Install the latest version of this package by entering the following in R:
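
Assuming the package is installed from the GitHub repository listed under URL, one common route is the `remotes` package. Using `remotes` here is an assumption, not something this page confirms; `devtools::install_github()` would serve the same purpose.

```r
# Assumption: install from the GitHub repository rwoldford/treebinr.
install.packages("remotes")                   # helper package for GitHub installs
remotes::install_github("rwoldford/treebinr")
```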