npuniden.sc | R Documentation |
npuniden.sc
computes shape constrained kernel univariate
unconditional density estimates given a vector of continuously
distributed training data and a bandwidth. Lower and upper bounds
[a
,b
] can be supplied (default is [0,1]) and if a
is set to -Inf
there is only one bound on the right, while if
b
is set to Inf
there is only one bound on the left.
npuniden.sc(X = NULL, Y = NULL, h = NULL, a = 0, b = 1, lb = NULL, ub = NULL, extend.range = 0, num.grid = 0, function.distance = TRUE, integral.equal = FALSE, constraint = c("density", "mono.incr", "mono.decr", "concave", "convex", "log-concave", "log-convex"))
X |
a required numeric vector of training data lying in [a,b] |
Y |
an optional numeric vector of evaluation data lying in [a,b] |
h |
a bandwidth (>0) |
a |
an optional lower bound on the support of |
b |
an optional upper bound on the support of |
lb |
a scalar lower bound (≥ 0) to be used in conjunction with
|
ub |
a scalar upper bound (≥ 0 and ≥ |
extend.range |
number specifying the fraction by which the range of the training data
should be extended for the additional grid points (passed to the
function |
num.grid |
number of additional grid points (in addition to |
function.distance |
a logical value that, if |
integral.equal |
a logical value, that, if |
constraint |
a character string indicating whether the estimate is to be
constrained to be monotonically increasing
( |
Typical usages are (see below for a complete list of options and also the examples at the end of this help file)
model <- npuniden.sc(X,a=-2,b=3)
npuniden.sc
implements a methods for estimating a univariate
density function defined over a continuous random variable in the
presence of bounds subject to a variety of shape constraints. The
bounded estimates use the truncated Gaussian kernel function.
Note that for the log-constrained estimates, the derivative estimate returned is that for the log-constrained estimate not the non-log value of the estimate returned by the function. See Example 5 below hat manually plots the log-density and returned derivative (no transformation is needed when plotting the density estimate itself).
If the quadratic program solver fails to find a solution, the
unconstrained estimate is returned with an immediate warning. Possible
causes to be investigated are undersmoothing, sparsity, and the
presence of non-sample grid points. To investigate the possibility of
undersmoothing try using a larger bandwidth, to investigate sparsity
try decreasing extend.range
, and to investigate non-sample grid
points try setting num.grid
to 0
.
Mean square error performance seems to improve generally when using
additional grid points in the empirical support of X
and
Y
(i.e., in the observed range of the data sample) but appears
to deteriorate when imposing constraints beyond the empirical support
(i.e., when extend.range
is positive). Increasing the number of
additional points beyond a hundred or so appears to have a limited
impact.
The option function.distance=TRUE
appears to perform better for
imposing convexity, concavity, log-convexity and log-concavity, while
function.distance=FALSE
appears to perform better for imposing
monotonicity, whether increasing or decreasing (based on simulations
for the Beta(s1,s2) distribution with sample size n=100).
A list with the following elements:
f |
unconstrained density estimate |
f.sc |
shape constrained density estimate |
se.f |
asymptotic standard error of the unconstrained density estimate |
se.f.sc |
asymptotic standard error of the shape constrained density estimate |
f.deriv |
unconstrained derivative estimate (of order 1 or 2 or log thereof) |
f.sc.deriv |
shape constrained derivative estimate (of order 1 or 2 or log thereof) |
F |
unconstrained distribution estimate |
F.sc |
shape constrained distribution estimate |
integral.f |
the integral of the unconstrained estimate over |
integral.f.sc |
the integral of the constrained estimate over |
solve.QP |
logical, if |
attempts |
number of attempts when |
Jeffrey S. Racine racinej@mcmaster.ca
Du, P. and C. Parmeter and J. Racine, “Shape Constrained Kernel Density Estimation”, manuscript.
The logcondens, LogConDEAD, and scdensity packages,
and the function npuniden.boundary
.
## Not run: n <- 100 set.seed(42) ## Example 1: N(0,1), constrain the density to lie within lb=.1 and ub=.2 X <- sort(rnorm(n)) h <- npuniden.boundary(X,a=-Inf,b=Inf)$h foo <- npuniden.sc(X,h=h,constraint="density",a=-Inf,b=Inf,lb=.1,ub=.2) ylim <- range(c(foo$f.sc,foo$f)) plot(X,foo$f.sc,type="l",ylim=ylim,xlab="X",ylab="Density") lines(X,foo$f,col=2,lty=2) rug(X) legend("topleft",c("Constrained","Unconstrained"),lty=1:2,col=1:2,bty="n") ## Example 2: Beta(5,1), DGP is monotone increasing, impose valid ## restriction X <- sort(rbeta(n,5,1)) h <- npuniden.boundary(X)$h foo <- npuniden.sc(X=X,h=h,constraint=c("mono.incr")) par(mfrow=c(1,2)) ylim <- range(c(foo$f.sc,foo$f)) plot(X,foo$f.sc,type="l",ylim=ylim,xlab="X",ylab="Density") lines(X,foo$f,col=2,lty=2) rug(X) legend("topleft",c("Constrained","Unconstrained"),lty=1:2,col=1:2,bty="n") ylim <- range(c(foo$f.sc.deriv,foo$f.deriv)) plot(X,foo$f.sc.deriv,type="l",ylim=ylim,xlab="X",ylab="First Derivative") lines(X,foo$f.deriv,col=2,lty=2) abline(h=0,lty=2) rug(X) legend("topleft",c("Constrained","Unconstrained"),lty=1:2,col=1:2,bty="n") ## Example 3: Beta(1,5), DGP is monotone decreasing, impose valid ## restriction X <- sort(rbeta(n,1,5)) h <- npuniden.boundary(X)$h foo <- npuniden.sc(X=X,h=h,constraint=c("mono.decr")) par(mfrow=c(1,2)) ylim <- range(c(foo$f.sc,foo$f)) plot(X,foo$f.sc,type="l",ylim=ylim,xlab="X",ylab="Density") lines(X,foo$f,col=2,lty=2) rug(X) legend("topleft",c("Constrained","Unconstrained"),lty=1:2,col=1:2,bty="n") ylim <- range(c(foo$f.sc.deriv,foo$f.deriv)) plot(X,foo$f.sc.deriv,type="l",ylim=ylim,xlab="X",ylab="First Derivative") lines(X,foo$f.deriv,col=2,lty=2) abline(h=0,lty=2) rug(X) legend("topleft",c("Constrained","Unconstrained"),lty=1:2,col=1:2,bty="n") ## Example 4: N(0,1), DGP is log-concave, impose invalid concavity ## restriction X <- sort(rnorm(n)) h <- npuniden.boundary(X,a=-Inf,b=Inf)$h foo <- npuniden.sc(X=X,h=h,a=-Inf,b=Inf,constraint=c("concave")) par(mfrow=c(1,2)) ylim <- range(c(foo$f.sc,foo$f)) plot(X,foo$f.sc,type="l",ylim=ylim,xlab="X",ylab="Density") lines(X,foo$f,col=2,lty=2) rug(X) legend("topleft",c("Constrained","Unconstrained"),lty=1:2,col=1:2,bty="n") ylim <- range(c(foo$f.sc.deriv,foo$f.deriv)) plot(X,foo$f.sc.deriv,type="l",ylim=ylim,xlab="X",ylab="Second Derivative") lines(X,foo$f.deriv,col=2,lty=2) abline(h=0,lty=2) rug(X) legend("topleft",c("Constrained","Unconstrained"),lty=1:2,col=1:2,bty="n") ## Example 45: Beta(3/4,3/4), DGP is convex, impose valid restriction X <- sort(rbeta(n,3/4,3/4)) h <- npuniden.boundary(X)$h foo <- npuniden.sc(X=X,h=h,constraint=c("convex")) par(mfrow=c(1,2)) ylim <- range(c(foo$f.sc,foo$f)) plot(X,foo$f.sc,type="l",ylim=ylim,xlab="X",ylab="Density") lines(X,foo$f,col=2,lty=2) rug(X) legend("topleft",c("Constrained","Unconstrained"),lty=1:2,col=1:2,bty="n") ylim <- range(c(foo$f.sc.deriv,foo$f.deriv)) plot(X,foo$f.sc.deriv,type="l",ylim=ylim,xlab="X",ylab="Second Derivative") lines(X,foo$f.deriv,col=2,lty=2) abline(h=0,lty=2) rug(X) legend("topleft",c("Constrained","Unconstrained"),lty=1:2,col=1:2,bty="n") ## Example 6: N(0,1), DGP is log-concave, impose log-concavity ## restriction X <- sort(rnorm(n)) h <- npuniden.boundary(X,a=-Inf,b=Inf)$h foo <- npuniden.sc(X=X,h=h,a=-Inf,b=Inf,constraint=c("log-concave")) par(mfrow=c(1,2)) ylim <- range(c(log(foo$f.sc),log(foo$f))) plot(X,log(foo$f.sc),type="l",ylim=ylim,xlab="X",ylab="Log-Density") lines(X,log(foo$f),col=2,lty=2) rug(X) legend("topleft",c("Constrained-log","Unconstrained-log"),lty=1:2,col=1:2,bty="n") ylim <- range(c(foo$f.sc.deriv,foo$f.deriv)) plot(X,foo$f.sc.deriv,type="l",ylim=ylim,xlab="X",ylab="Second Derivative of Log-Density") lines(X,foo$f.deriv,col=2,lty=2) abline(h=0,lty=2) rug(X) legend("topleft",c("Constrained-log","Unconstrained-log"),lty=1:2,col=1:2,bty="n") ## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.