bins: Binning of Data

bins-methods    R Documentation

Binning of Data

Description

Returns a list of data frames containing the bin means \bar{\bm{y}}_{1}, \ldots, \bar{\bm{y}}_{v} and the frequencies k_{1}, \ldots, k_{v} used in histogram preprocessing.
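To make the idea of bin means and frequencies concrete, the following base R sketch bins a single data frame on an equal-width grid. It is only an illustration under stated assumptions, not the rebmix implementation: the helper bin_sketch and the output column name k are hypothetical, each occupied bin is represented by its midpoint, and ymax is assumed to be strictly greater than ymin in every dimension.

```r
# Hypothetical helper (not part of rebmix): equal-width binning of one data frame.
# data: data frame with d numeric columns
# k:    vector of length d with the number of bins per column
bin_sketch <- function(data, k,
                       ymin = sapply(data, min),
                       ymax = sapply(data, max)) {
  h <- (ymax - ymin) / k                       # bin width per dimension
  mids <- data
  for (i in seq_along(data)) {
    # 1-based bin index; clamp so that observations equal to ymax
    # fall into the last bin instead of a bin of their own
    j <- floor((data[[i]] - ymin[[i]]) / h[[i]]) + 1
    j <- pmin(pmax(j, 1), k[[i]])
    # assumption: represent each bin by its midpoint
    mids[[i]] <- ymin[[i]] + (j - 0.5) * h[[i]]
  }
  # count the observations per occupied bin; empty bins are not listed
  aggregate(list(k = rep(1, nrow(mids))), by = as.list(mids), FUN = sum)
}

toy <- data.frame(y1 = c(0.1, 0.2, 0.9, 1.0),
                  y2 = c(0, 0, 4, 4))
res <- bin_sketch(toy, k = c(2, 2))
res
```

Each row of res describes one occupied bin, and the frequencies in column k sum to the number of observations in toy.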

Usage

## S4 method for signature 'list'
bins(Dataset = list(), K = matrix(),
     ymin = numeric(), ymax = numeric(), ...)
## ... and for other signatures

Arguments

Dataset

a list of length n_{\mathrm{D}} of data frames of size n \times d containing d-dimensional datasets. Each of the d columns represents one random variable. The number of observations n equals the number of rows in a dataset.

K

a matrix of size n_{\mathrm{D}} \times d containing numbers of bins v_{1}, \ldots, v_{d} for the histogram. If, e.g., K = matrix(c(10, 15, 18, 5, 7, 9), byrow = TRUE, ncol = 3), then d = 3 and the list Dataset contains n_{\mathrm{D}} = 2 data frames. Hence, different numbers of bins can be assigned to y_{1}, \ldots, y_{d}. The default value is matrix().

ymin

a vector of length d containing minimum observations. The default value is numeric().

ymax

a vector of length d containing maximum observations. The default value is numeric().

...

currently not used.
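The layout of K described above can be checked in plain R. This short sketch only uses base R and the example matrix from the K argument; the reading that row i holds the bins for the i-th data frame in Dataset follows from the stated n_{\mathrm{D}} \times d size.

```r
# Example matrix from the description of K:
# one row per dataset, one column per random variable.
K <- matrix(c(10, 15, 18,
               5,  7,  9), byrow = TRUE, ncol = 3)

dim(K)     # 2 3: n_D = 2 datasets, d = 3 variables
K[1, ]     # 10 15 18: numbers of bins for the first dataset
K[2, 3]    # 9: number of bins for y_3 in the second dataset
```

Because byrow = TRUE is used, the values are filled row by row, so each dataset's bin counts can be written on one line.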

Methods

signature(x = "list")

a list of data frames.

Author(s)

Branislav Panic, Marko Nagode

References

M. Nagode. Finite mixture modeling via REBMIX. Journal of Algorithms and Optimization, 3(2):14-28, 2015. https://repozitorij.uni-lj.si/Dokument.php?id=127674&lang=eng.

Examples

library(rebmix)

# Generate multivariate normal datasets.

n <- c(7, 10)

Theta <- new("RNGMVNORM.Theta", c = 2, d = 2)

a.theta1(Theta, 1) <- c(8, 6)
a.theta1(Theta, 2) <- c(6, 8)
a.theta2(Theta, 1) <- c(8, 2, 2, 4)
a.theta2(Theta, 2) <- c(2, 1, 1, 4)

sim2d <- RNGMIX(model = "RNGMVNORM", 
  Dataset.name = paste("sim2d_", 1:2, sep = ""),
  rseed = -1,
  n = n,
  Theta = a.Theta(Theta))

# Calculate optimal numbers of bins.

opt.k <- optbins(Dataset = sim2d@Dataset,
  Rule = "Knuth equal",
  kmin = 1, 
  kmax = 20)

opt.k

# Bin the datasets using the optimal numbers of bins.

Y <- bins(Dataset = sim2d@Dataset, K = opt.k)

Y

# Repeat with the "Knuth unequal" rule.

opt.k <- optbins(Dataset = sim2d@Dataset,
  Rule = "Knuth unequal",
  kmin = 1, 
  kmax = 20)

opt.k

Y <- bins(Dataset = sim2d@Dataset, K = opt.k)

Y

rebmix documentation built on Sept. 11, 2024, 6:30 p.m.