DeconCdf: Estimating cumulative distribution function from data with...
In XFW/decon: Deconvolution Estimation in Measurement Error Models

Description Usage Arguments Details Value Author(s) References See Also Examples

To compute the cumulative distribution function from data coupled with measurement error. The measurement errors can be either homoscedastic or heteroscedastic.

1 2	DeconCdf(y,sig,x,error="normal",bw="dboot1",adjust=1, n=512,from,to,cut=3,na.rm=FALSE,grid=100,ub=2,...)

`y`	The observed data. It is a vector of length at least 3.
`sig`	The standard deviations σ. If homoscedastic errors, sig is a single value. If heteroscedastic errors, sig is a vector of standard deviations having the same length as y.
`x`	x is user-defined grids where the CDF will be evaluated. FFT method is not applicable if x is given.
`error`	Error distribution types: (1) 'normal' for normal errors; (2) 'laplacian' for Laplacian errors; (3) 'snormal' for a special case of small normal errors.
`bw`	Specifies the bandwidth. It can be a single numeric value which has been pre-determined; or computed with the specific bandwidth selector: 'dnrd' to compute the rule-of-thumb plugin bandwidth as suggested by Fan (1991); 'dmise' to compute the plugin bandwidth by minimizing MISE; 'dboot1' to compute the bootstrap bandwidth selector without resampling (Delaigle and Gijbels, 2004a), which minimizing the MISE bootstrap bandwidth selectors; 'boot2' to compute the smoothed bootstrap bandwidth selector with resampling.
`adjust`	adjust the range there the CDF is to be evaluated. By default, adjust=1.
`n`	number of points where the CDF is to be evaluated.
`from`	the starting point where the CDF is to be evaluated.
`to`	the starting point where the CDF is to be evaluated.
`cut`	used to adjust the starting end ending points where the CDF is to be evaluated.
`na.rm`	is set to FALSE by default: no NA value is allowed.
`grid`	the grid number to search the optimal bandwidth when a bandwidth selector was specified in bw. Default value "grid=100".
`ub`	the upper boundary to search the optimal bandwidth, default value is "ub=2".
`...`	control

FFT is currently not supported for CDF computing.

An object of class “Decon”.

X.F. Wang wangx6@ccf.org

B. Wang bwang@jaguar1.usouthal.edu

Delaigle, A. and Gijbels, I. (2004). Bootstrap bandwidth selection in kernel density estimation from a contaminated sample. Annals of the Institute of Statistical Mathematics, 56(1), 19-47.

Fan, J. (1991). On the optimal rates of convergence for nonparametric deconvolution problems. The Annals of Statistics, 19, 1257-1272.

Fan, J. (1992). Deconvolution with supersmooth distributions. The Canadian Journal of Statistics, 20, 155-169.

Hall, P. and Lahiri, S.N. (2008). Estimation of distributions, moments and quantiles in deconvolution problems. Annals of Statistics, 36(5), 2110-2134.

Stefanski L.A. and Carroll R.J. (1990). Deconvoluting kernel density estimators. Statistics, 21, 169-184.

Wang, X.F., Fan, Z. and Wang, B. (2010). Estimating smooth distribution function in the presence of heterogeneous measurement errors. Computational Statistics and Data Analysis, 54, 25-36.

Wang, X.F. and Wang, B. (2011). Deconvolution estimation in measurement error models: The R package decon. Journal of Statistical Software, 39(10), 1-24.

DeconPdf, DeconNpr.

#####################
## the R function to estimate the smooth distribution function
SDF <- function (x, bw = bw.nrd0(x), n = 512, lim=1){
        dx <- lim*sd(x)/20 
        xgrid <- seq(min(x)-dx, max(x)+dx, length = n)
        Fhat <- sapply(x, function(x) pnorm((xgrid-x)/bw)) 
        return(list(x = xgrid, y = rowMeans(Fhat)))
    }

## Case study: homoscedastic normal errors
n2 <- 1000
x2 <- c(rnorm(n2/2,-3,1),rnorm(n2/2,3,1))
sig2 <- .8
u2 <- rnorm(n2, sd=sig2)
w2 <- x2+u2
# estimate the bandwidth with the bootstrap method with resampling
bw2 <- bw.dboot2(w2,sig=sig2, error="normal")
# estimate the distribution function with measurement error
F2 <-  DeconCdf(w2,sig2,error='normal',bw=bw2)

plot(F2,  col="red", lwd=3, lty=2, xlab="x", ylab="F(x)", main="")
lines(SDF(x2), lwd=3, lty=1)
lines(SDF(w2), col="blue", lwd=3, lty=3)