npuniden.reflect: Kernel Bounded Univariate Density Estimation Via...

npuniden.reflect    R Documentation

Kernel Bounded Univariate Density Estimation Via Data-Reflection

Description

npuniden.reflect computes kernel univariate unconditional density estimates given a vector of continuously distributed training data and, optionally, a bandwidth; if no bandwidth is supplied, it is selected via likelihood cross-validation. Lower and upper bounds [a,b] can be supplied (the default is [0,1]). If a is set to -Inf, the only bound is the upper bound b; if b is set to Inf, the only bound is the lower bound a.

Usage

npuniden.reflect(X = NULL,
                 Y = NULL,
                 h = NULL,
                 a = 0,
                 b = 1,
                 ...)

Arguments

X

a required numeric vector of training data lying in [a,b]

Y

an optional numeric vector of evaluation data lying in [a,b]

h

an optional bandwidth (>0)

a

an optional lower bound (defaults to 0)

b

an optional upper bound (defaults to 1)

...

optional arguments passed to npudensbw and npudens

Details

A typical usage is shown here (see the Arguments section for a complete list of options, and the examples at the end of this help file):

    model <- npuniden.reflect(X,a=-2,b=3)
  

npuniden.reflect implements the data-reflection method for estimating a univariate density function defined over a continuous random variable in the presence of bounds.

Note that data-reflection imposes a zero derivative at the boundary, i.e., f'(a)=f'(b)=0.
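
The idea behind data-reflection can be illustrated with a minimal base-R sketch. It is not the code used by npuniden.reflect: the helper name reflect_density, the Gaussian kernel, and the hand-picked bandwidth are all assumptions made for illustration. The sample is augmented with its mirror images about each finite bound, the kernel contributions are summed over the augmented sample, and the sum is normalised by the original sample size.

```r
# Hypothetical helper (not part of np): Gaussian-kernel density estimate
# using data reflection about the finite bounds a and/or b.
reflect_density <- function(x, X, h, a = 0, b = 1) {
  # Augment the sample with its mirror images about each finite bound.
  X.aug <- X
  if (is.finite(a)) X.aug <- c(X.aug, 2 * a - X)
  if (is.finite(b)) X.aug <- c(X.aug, 2 * b - X)
  # Sum kernel weights over the augmented sample, but divide by the
  # ORIGINAL sample size so the estimate integrates to about one on [a,b].
  f <- sapply(x, function(x0) sum(dnorm((x0 - X.aug) / h))) / (length(X) * h)
  # The estimate is set to zero outside the support.
  f[x < a | x > b] <- 0
  f
}

set.seed(42)
X <- rbeta(100, 5, 1)
h <- 0.1                                # illustrative bandwidth, chosen by hand
grid <- seq(0, 1, length.out = 201)
f.hat <- reflect_density(grid, X, h)

# Trapezoidal check that the estimate integrates to roughly one on [0,1].
area <- sum(diff(grid) * (head(f.hat, -1) + tail(f.hat, -1)) / 2)

# One-sided finite-difference slope at the upper bound b = 1; reflection
# makes the estimate locally symmetric about b, so the slope is near zero.
f.b <- reflect_density(c(1 - 1e-6, 1), X, h)
slope <- diff(f.b) / 1e-6
```

In this sketch the slope at b = 1 is close to zero even though the Beta(5,1) density used to generate the data has a nonzero derivative there, which is the restriction described in the note above. Passing a = -Inf or b = Inf to the helper skips the corresponding reflection, mirroring the one-sided case.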

Value

npuniden.reflect returns the following components:

f

estimated density at the points X

F

estimated distribution at the points X (numeric integral of f)

sd.f

asymptotic standard error of the estimated density at the points X

sd.F

asymptotic standard error of the estimated distribution at the points X

h

bandwidth used

nmulti

number of multi-starts used
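
The component F is described above as a numeric integral of f. The following base-R sketch shows one way such an integral could be formed from density values at sorted points. The cumulative trapezoidal rule and the helper name cum_trapz are assumptions for illustration; the quadrature rule actually used by npuniden.reflect is not stated in this help file.

```r
# Hypothetical helper (not part of np): cumulative trapezoidal integral of
# density values f evaluated at the sorted points x, starting from zero at x[1].
cum_trapz <- function(x, f) {
  c(0, cumsum(diff(x) * (head(f, -1) + tail(f, -1)) / 2))
}

# Check against a known case: the Beta(5,3) density and distribution on [0,1].
x <- seq(0, 1, length.out = 501)
f <- dbeta(x, 5, 3)
F.num <- cum_trapz(x, f)

# Largest absolute discrepancy from the exact distribution function.
max.err <- max(abs(F.num - pbeta(x, 5, 3)))
```

Because the integral starts at the first evaluation point, this sketch implicitly treats x[1] as the lower limit of integration, which is appropriate only when the points begin at (or very near) the lower bound.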

Author(s)

Jeffrey S. Racine racinej@mcmaster.ca

References

Boneva, L. I., Kendall, D., and Stefanov, I. (1971). “Spline transformations: Three new diagnostic aids for the statistical data-analyst,” Journal of the Royal Statistical Society. Series B (Methodological), 33(1):1-71.

Cline, D. B. H. and Hart, J. D. (1991). “Kernel estimation of densities with discontinuities or discontinuous derivatives,” Statistics, 22(1):69-84.

Hall, P. and Wehrly, T. E. (1991). “A geometrical method for removing edge effects from kernel-type nonparametric regression estimators,” Journal of the American Statistical Association, 86(415):665-672.

See Also

The Ake, bde, and Conake packages and the function npuniden.boundary.

Examples

## Not run: 
## Example 1: f(0)=0, f(1)=5, plot boundary corrected density,
## unadjusted density, and DGP
set.seed(42)
n <- 100
X <- sort(rbeta(n,5,1))
dgp <- dbeta(X,5,1)
model <- npuniden.reflect(X)
model.unadjusted <- npuniden.boundary(X,a=-Inf,b=Inf)
ylim <- c(0,max(c(dgp,model$f,model.unadjusted$f)))
plot(X,model$f,ylab="Density",ylim=ylim,type="l")
lines(X,model.unadjusted$f,lty=2,col=2)
lines(X,dgp,lty=3,col=3)
rug(X)
legend("topleft",c("Data-Reflection","Unadjusted","DGP"),col=1:3,lty=1:3,bty="n")

## Example 2: f(0)=0, f(1)=0, plot density, distribution, DGP, and
## asymptotic point-wise confidence intervals
set.seed(42)
X <- sort(rbeta(100,5,3))
model <- npuniden.reflect(X)
par(mfrow=c(1,2))
ylim <- range(c(model$f,model$f+1.96*model$sd.f,model$f-1.96*model$sd.f,dbeta(X,5,3)))
plot(X,model$f,ylim=ylim,ylab="Density",type="l")
lines(X,model$f+1.96*model$sd.f,lty=2)
lines(X,model$f-1.96*model$sd.f,lty=2)
lines(X,dbeta(X,5,3),col=2)
rug(X)
legend("topleft",c("Density","DGP"),lty=c(1,1),col=1:2,bty="n")

plot(X,model$F,ylab="Distribution",type="l")
lines(X,model$F+1.96*model$sd.F,lty=2)
lines(X,model$F-1.96*model$sd.F,lty=2)
lines(X,pbeta(X,5,3),col=2)
rug(X)
legend("topleft",c("Distribution","DGP"),lty=c(1,1),col=1:2,bty="n")


## Example 3: Age for working age males in the cps71 data set bounded
## below by 21 and above by 65
data(cps71)
attach(cps71)
model <- npuniden.reflect(age,a=21,b=65)
par(mfrow=c(1,1))
hist(age,prob=TRUE,main="",ylim=c(0,max(model$f)))
lines(age,model$f)
lines(density(age,bw=model$h),col=2)
legend("topright",c("Data-Reflection","Unadjusted"),lty=c(1,1),col=1:2,bty="n")
detach(cps71)

## End(Not run) 

np documentation built on March 31, 2023, 9:41 p.m.