ruvk: Random generation from univariate kernel density

View source: R/univar-kd.R

ruvk R Documentation

Random generation from univariate kernel density

Description

Generates random draws from a univariate kernel density estimate.

Usage

ruvk(
  n,
  y,
  bw = bw.nrd0(y),
  kernel = c("gaussian", "epanechnikov", "rectangular", "triangular", "biweight",
    "cosine", "optcosine"),
  weights = NULL,
  adjust = 1,
  shrinked = FALSE
)

Arguments

n

number of observations. If length(n) > 1, the length is taken to be the number required.

y

numeric vector.

bw

the smoothing bandwidth to be used. The kernels are scaled such that this is the standard deviation of the smoothing kernel (see density for details).

kernel

a character string giving the smoothing kernel to be used. This must partially match one of "gaussian", "rectangular", "triangular", "epanechnikov", "biweight", "cosine" or "optcosine", with default "gaussian", and may be abbreviated.

weights

numeric vector of length equal to length(y); must be non-negative. If NULL (the default), uniform weights are used.

adjust

scalar; the bandwidth used is actually adjust*bw. This makes it easy to specify values like 'half the default' bandwidth.

shrinked

if TRUE, the random generation algorithm preserves the mean and variance of the original sample.
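To illustrate what preserving the mean and variance can look like, here is a minimal base R sketch for the Gaussian kernel. It rescales smoothed draws around the sample mean so that the added kernel variance is cancelled out. This is an assumption about the general technique, not the package's actual code: rkde_shrunk is a hypothetical helper, and the choice of the population-style variance (dividing by n) is also an assumption.

```r
# Hypothetical helper: smoothed draws rescaled to keep the sample's
# mean and variance (Gaussian kernel only, uniform weights).
rkde_shrunk <- function(n, y, h = bw.nrd0(y)) {
  m  <- mean(y)
  s2 <- mean((y - m)^2)  # variance of the empirical distribution (assumed choice)
  idx <- sample.int(length(y), n, replace = TRUE)
  # Unshrunk draws have variance s2 + h^2; dividing the centred draws by
  # sqrt(1 + h^2 / s2) brings the variance back to s2.
  m + (y[idx] - m + rnorm(n, mean = 0, sd = h)) / sqrt(1 + h^2 / s2)
}

set.seed(1)
y <- mtcars$mpg
x <- rkde_shrunk(1e5, y)

# Both pairs should agree up to Monte Carlo error
c(mean(x), mean(y))
c(var(x), mean((y - mean(y))^2))
```

Without the rescaling step, the draws would have variance inflated by roughly h^2, which is why unshrunk samples follow the density() curve while shrunk samples are slightly narrower.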

Details

The univariate kernel density estimator is defined as

\hat{f_h}(x) = \sum_{i=1}^n w_i \, K_h(x-y_i)

where w is a vector of weights such that all w_i \ge 0 and \sum_i w_i = 1 (by default, uniform weights 1/n are used), K_h(x) = K(x/h)/h is the kernel K parametrized by bandwidth h, and y is the vector of data points used for estimating the kernel density.
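As a rough illustration, the estimator above can be evaluated directly for the Gaussian kernel. This is a base R sketch, not the package's code: kde_gauss is a hypothetical helper name, and the grid limits are an arbitrary choice.

```r
# Hypothetical helper: direct evaluation of the weighted Gaussian
# kernel density estimator at the points in x.
kde_gauss <- function(x, y, h, w = rep(1 / length(y), length(y))) {
  # K_h(x - y_i) = dnorm((x - y_i) / h) / h, i.e. dnorm(x, mean = y_i, sd = h)
  vapply(x, function(x0) sum(w * dnorm(x0, mean = y, sd = h)), numeric(1))
}

y <- mtcars$mpg
h <- bw.nrd0(y)

# Grid extending five bandwidths beyond the data on both sides
grid <- seq(min(y) - 5 * h, max(y) + 5 * h, length.out = 2001)
fhat <- kde_gauss(grid, y, h)

# Riemann sum of the estimate; should be close to 1 for a proper density
area <- sum(fhat) * diff(grid)[1]
area
```

The density() function computes a binned approximation of the same quantity on a regular grid, so small numerical differences from this direct sum are expected.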

To estimate kernel densities, use the density function.

The random generation algorithm is described in the documentation of the kernelboot function.
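For orientation, sampling from a kernel density is commonly done with a smoothed bootstrap: pick a data point with probability equal to its weight, then add noise drawn from the kernel scaled by the bandwidth. The sketch below shows this for the Gaussian kernel only. It is an assumption about the general approach rather than a copy of the package's implementation, and rkde_gauss is a hypothetical helper.

```r
# Hypothetical helper: smoothed bootstrap with a Gaussian kernel.
rkde_gauss <- function(n, y, h = bw.nrd0(y), w = NULL) {
  if (is.null(w)) w <- rep(1, length(y))  # uniform weights by default
  # Step 1: choose mixture components (data points) according to the weights
  idx <- sample.int(length(y), n, replace = TRUE, prob = w / sum(w))
  # Step 2: add Gaussian kernel noise with standard deviation h
  y[idx] + rnorm(n, mean = 0, sd = h)
}

set.seed(1)
y <- mtcars$mpg
h <- bw.nrd0(y)
x <- rkde_gauss(1e5, y, h)

# The draws keep the sample mean, but their variance is inflated by h^2
c(mean(x), mean(y))
c(var(x), mean((y - mean(y))^2) + h^2)
```

Other kernels would follow the same two steps, with the Gaussian noise replaced by draws from the chosen kernel rescaled so that its standard deviation equals the bandwidth.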

References

Deng, H. and Wickham, H. (2011). Density estimation in R. http://vita.had.co.nz/papers/density-estimation.pdf

See Also

kernelboot, density

Examples


# ruvk() produces samples from kernel densities as estimated by the
# density() function from base R

hist(ruvk(1e5, mtcars$mpg), 100, freq = FALSE, xlim = c(5, 40))
lines(density(mtcars$mpg, bw = bw.nrd0(mtcars$mpg)), col = "red")

# with 'shrinked = TRUE', the samples differ from the density() estimate
# because they are shrunk to have the same variance as the underlying data

hist(ruvk(1e5, mtcars$mpg, shrinked = TRUE), 100, freq = FALSE, xlim = c(5, 40))
lines(density(mtcars$mpg, bw = bw.nrd0(mtcars$mpg)), col = "red")

# Comparison of different univariate kernels under standard parametrization

kernels <- c("gaussian", "epanechnikov", "rectangular", "triangular",
             "biweight", "cosine", "optcosine")

partmp <- par(mfrow = c(2, 4), mar = c(3, 3, 3, 3))
for (k in kernels) {
  hist(ruvk(1e5, 0, 1, kernel = k), 25, freq = FALSE, main = k)
  lines(density(0, 1, kernel = k), col = "red")
}
par(partmp)


kernelboot documentation built on April 14, 2023, 5:14 p.m.