wkde: Compute a Binned Kernel Density Estimate for Weighted Data

View source: R/wkde.R

wkdeR Documentation

Compute a Binned Kernel Density Estimate for Weighted Data


Returns x and y coordinates of the binned kernel density estimate of the probability density of the weighted data.


 wkde(x, w, bandwidth, freq=FALSE, gridsize = 401L, range.x, 
 	 truncate = TRUE, na.rm = TRUE)



vector of observations from the distribution whose density is to be estimated. Missing values are not allowed.


The weights of x. The weight w_i of any observation x_i should be non-negative. If x_i=0, x_i will be removed from the analysis.


the kernel bandwidth smoothing parameter. Larger values of bandwidth make smoother estimates, smaller values of bandwidth make less smooth estimates. Automatic bandwidth selectors are developed. Options include wnrd, wnrd0, wmise,blscv, and awmise.


An indicator showing whether w is a vector of frequecies (counts) or weights.


the number of equally spaced points at which to estimate the density.


vector containing the minimum and maximum values of x at which to compute the estimate. The default is the minimum and maximum data values, extended by the support of the kernel.


logical flag: if TRUE, data with x values outside the range specified by range.x are ignored.


logical flag: if TRUE, NA values will be ignored; otherwise, the program will be halted with error information.


The default bandwidth, "wnrd0", is computed using a rule-of-thumb for choosing the bandwidth of a Gaussian kernel density estimator based on weighted data. It defaults to 0.9 times the minimum of the standard deviation and the interquartile range divided by 1.34 times the sample size to the negative one-fifth power (= Silverman's ‘rule of thumb’, Silverman (1986, page 48, eqn (3.31)) _unless_ the quartiles coincide when a positive result will be guaranteed.

"wnrd" is the more common variation given by Scott (1992), using factor 1.06.

"wmise" is a completely automatic optimal bandwidth selector using the least-squares cross-validation (LSCV) method by minimizing the integrated squared errors (ISE).


a list containing the following components:


vector of sorted x values at which the estimate was computed.


vector of density estimates at the corresponding x.


optimal bandwidth.


sensitivity parameter, none NA if adaptive bandwidth selector is used.


Wand, M. P. and Jones, M. C. (1995). Kernel Smoothing. Chapman and Hall, London.


 mu = 34.5; s=1.5; n = 3000
 x = round(rnorm(n, mu, s),1)
 x0 = seq(min(x)-s,max(x)+s, length=100)
 f0 = dnorm(x0,mu, s)

 xt = table(x); n = length(x)
 x1 = as.numeric(names(xt))
 w1 = as.numeric(xt)
 (h1 <- bw.wnrd0(x1, w1))
 (h2 <- bw.wnrd0(x1,w1,n=n))
 est1 <- wkde(x1,w1, bandwidth=h1)
 est2 <- wkde(x1,w1, bandwidth=h2)
 est3 <- wkde(x1,w1, bandwidth='awmise')
 est4 <- wkde(x1,w1, bandwidth='wmise')
 est5 <- wkde(x1,w1, bandwidth='blscv')

 est0 = density(x1,bw="SJ",weights=w1/sum(w1)); 

 plot(f0~x0, xlim=c(min(x),max(x)), ylim=c(0,.30), type="l")
 lines(est0, col=2, lty=2, lwd=2)

 lines(est1, col=2)
 lines(est2, col=3)
 lines(est3, col=4)
 lines(est4, col=5)
 lines(est5, col=6)
  legend=c("N(34.5,1.5)", "SJ", "wnrd0",
  col = c(1,2,2,3,4,5,6), lty=c(1,2,1,1,1,1,1),

bda documentation built on May 29, 2024, 2:56 a.m.

Related to wkde in bda...