median: Finding Median Values

Description

Computes the median value of the given columns using a recursive binning algorithm.

Usage

median.data(x, ...)
Median(x, ...)

Median(data, number.bins = 1000, sort.threshold = 1000, inputs = AUTO,
outputs = result)

Arguments

x, data

an object of class "data".

number.bins

the number of bins to use in the binning algorithm.

sort.threshold

the threshold for halting the binning algorithm.

inputs

the names of the columns that the GLA is performed on.

outputs

the name of the result. An error is thrown if outputs is not of length 1.

Details

For consistency with the base R implementation of median, an S3 method, median.data, is provided; the default method simply refers to the built-in version. median.data merely passes the call to Median and should not be used within other GLAs such as GroupBy.
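The dispatch described above can be illustrated with a toy sketch. The object layout (a list with a values element) and the function Median_sketch are hypothetical stand-ins invented for this illustration; they are not the gtBase implementation, which operates on query objects of class "data".

```r
# Hypothetical stand-in for Median; NOT the gtBase implementation.
Median_sketch <- function(data, ...) {
  stats::median(data$values, ...)
}

# S3 method for objects of class "data": forwards the call unchanged.
median.data <- function(x, ...) Median_sketch(x, ...)

# Toy object of class "data" (assumed layout, for illustration only).
toy <- structure(list(values = c(5, 1, 9, 3)), class = "data")

median(toy)            # dispatches to median.data
median(c(5, 1, 9, 3))  # dispatches to the built-in default method
```

Both calls go through the same generic, so existing code that calls median keeps working on ordinary vectors.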

The worst-case time complexity of the algorithm is O(n * log(n / s) / log(b) + s log s), where n is the number of rows of x, b is number.bins, and s is sort.threshold. The average-case time complexity is O(n + s log s). The space complexity is O(c * b), where c is the number of attributes used.
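To make the roles of number.bins and sort.threshold concrete, the following is a minimal in-memory sketch of a recursive binning median on a plain numeric vector. It assumes equal-width bins over the current value range and finite inputs. The function names kth_smallest and binned_median are hypothetical, and the sketch is not the gtBase implementation, whose binning strategy and streaming behaviour may differ.

```r
# k-th smallest element of v, found by repeatedly narrowing to a single bin.
kth_smallest <- function(v, k, number.bins, sort.threshold) {
  stopifnot(number.bins >= 2)
  repeat {
    lo <- min(v)
    hi <- max(v)
    # Few values left (or all equal): sort directly and pick the answer.
    if (length(v) <= sort.threshold || lo == hi) {
      return(sort(v)[k])
    }
    # Assign each value to one of number.bins equal-width bins on [lo, hi].
    idx <- pmin(floor((v - lo) / (hi - lo) * number.bins) + 1, number.bins)
    cum <- cumsum(tabulate(idx, nbins = number.bins))
    # First bin whose cumulative count reaches rank k holds the answer.
    target <- which(cum >= k)[1]
    below <- if (target > 1) cum[target - 1] else 0
    k <- k - below           # rank relative to the selected bin
    v <- v[idx == target]    # recurse into that bin only
  }
}

binned_median <- function(x, number.bins = 1000, sort.threshold = 1000) {
  x <- x[!is.na(x)]
  n <- length(x)
  stopifnot(n > 0)
  # Middle rank(s); they coincide when n is odd.
  k_lo <- floor((n + 1) / 2)
  k_hi <- ceiling((n + 1) / 2)
  lower <- kth_smallest(x, k_lo, number.bins, sort.threshold)
  upper <- kth_smallest(x, k_hi, number.bins, sort.threshold)
  (lower + upper) / 2
}

set.seed(1)
x <- rnorm(10001)
all.equal(binned_median(x, number.bins = 10, sort.threshold = 50), median(x))
```

Each pass discards every bin except the one containing the target rank, so on well-spread data the remaining values shrink by roughly a factor of number.bins per pass until at most sort.threshold values remain and are sorted. Unlike this sketch, which keeps the surviving values in memory, an implementation meeting the stated O(c * b) space bound would retain only the bin counts between passes.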

Value

An object of class "data" with a single attribute.

AUTO

In the case of inputs = AUTO, all attributes of the data are used.

Author(s)

Jon Claus, <jonterainsights@gmail.com>, Tera Insights LLC

References

Tibshirani, R. J. (2008) Fast Computation of the Median by Successive Binning. Stanford University.


tera-insights/gtBase documentation built on May 31, 2019, 8:35 a.m.