reg_deconvolve: Compute the measurement error version of the Nadaraya-Watson...

Description

Estimates m(x) = E[Y | X = x] from data (W, Y) where W = X + U.

Usage

reg_deconvolve(Y, W1, W2 = NULL,
  xx = seq(min(W1), max(W1), length.out = 100),
  errortype = NULL, sd_U = NULL, phiU = NULL, bw = NULL, rho = NULL,
  n_cores = NULL, kernel_type = c("default", "normal", "sinc"),
  seed = NULL, use_alt_SIMEX_rep_opt = FALSE)

Arguments

Y

A vector of the response data Y_1, ..., Y_n.

W1

A vector of size n containing the univariate contaminated data.

W2

(optional) A vector of size n containing replicate measurements for the same n individuals (in the same order) as W1. If supplied, the error distribution is estimated from the replicates only when neither phiU nor the pair errortype and sd_U is provided.

xx

A vector of x values on which to compute the regression estimator.

errortype

A single string giving the distribution of U, either "laplace" or "normal". If you define the error distribution this way then you must also provide sd_U but should not provide phiU. Argument is case-insensitive and partially matched.

sd_U

The standard deviation of U. This does not need to be provided if you define your error using phiU and provide bw and rho.

phiU

A function giving the characteristic function of U. You should only define the errors this way if you also provide bw and rho. If you define the errors this way then you should not provide errortype.
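As an illustration of what a phiU function can look like, the sketch below writes out the standard characteristic functions of zero-mean normal and Laplace errors with standard deviation sd_U. The names phiU_normal and phiU_laplace are hypothetical and are not part of the package; how the package builds these internally from errortype and sd_U is not shown in this page.

```r
# Hypothetical helpers (not part of the package): characteristic functions
# of zero-mean errors with standard deviation sd_U.
sd_U <- 0.2

# Normal(0, sd_U^2): phi(t) = exp(-sd_U^2 * t^2 / 2)
phiU_normal <- function(t) exp(-0.5 * sd_U^2 * t^2)

# Laplace with variance sd_U^2, i.e. scale b = sd_U / sqrt(2):
# phi(t) = 1 / (1 + b^2 * t^2) = 1 / (1 + sd_U^2 * t^2 / 2)
phiU_laplace <- function(t) 1 / (1 + 0.5 * sd_U^2 * t^2)
```

A function of this form could then be passed as phiU, together with bw and rho as the argument description requires.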

bw

The bandwidth to use. If you provide this then you should also provide rho.

rho

The ridge parameter to use. If you provide this then you should also provide bw.

n_cores

Number of cores to use when calculating the bandwidth. If NULL, the number of cores to use will be automatically detected.

kernel_type

The deconvolution kernel to use. The default kernel has characteristic function (1-t^2)^3 for t in [-1,1] and 0 otherwise. The normal kernel is the standard normal density. The sinc kernel has characteristic function equal to 1 for t in [-1,1] and 0 otherwise.
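The kernel characteristic functions described above can be written out as follows. This is a minimal sketch: the function names are hypothetical, and the value 0 outside [-1,1] for the default and sinc kernels is the usual compact-support convention, assumed here rather than taken from the package source.

```r
# Hypothetical names; vectorised over t.

# Default kernel: (1 - t^2)^3 on [-1, 1], assumed 0 elsewhere.
phiK_default <- function(t) ifelse(abs(t) <= 1, (1 - t^2)^3, 0)

# Sinc kernel: 1 on [-1, 1], assumed 0 elsewhere.
phiK_sinc <- function(t) ifelse(abs(t) <= 1, 1, 0)

# Normal kernel: the standard normal density has
# characteristic function exp(-t^2 / 2).
phiK_normal <- function(t) exp(-0.5 * t^2)
```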

seed

The seed to use for SIMEX, which makes the SIMEX results reproducible. If not provided, a default seed is set automatically.

use_alt_SIMEX_rep_opt

Only used with SIMEX using replicates. If TRUE, performs SIMEX on W = (W1 + W2)/2 and samples U* from (W1 - W2). The default performs SIMEX on W = (W1, W2) and samples U* from (W1 - W2)/√2.

Details

The function reg_deconvolve chooses between two methods depending on how the error distribution is defined.

Error from Replicates: If both W1 and W2 are supplied and the error distribution is not otherwise specified, then the error distribution is estimated from the replicates. This method was prototyped in Delaigle, Hall, and Meister (2008) and then further refined in Delaigle and Hall (2016) and Camirand, Carroll, and Delaigle (2018).

Homoscedastic Error: If the errors are defined by either a single function phiU, or a single value sd_U along with its errortype, then the method used is the one described in Fan and Truong (1993).
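For reference, the deconvolution-kernel form of the Nadaraya-Watson estimator is commonly written as below. This is a sketch in standard notation rather than a transcription of the package's implementation: h stands for the bandwidth bw, \varphi_K for the characteristic function of the kernel, and \varphi_U for the characteristic function of the error (phiU). How the ridge parameter rho enters is not shown here.

```latex
\hat m(x) =
  \frac{\sum_{j=1}^{n} Y_j \, K_U\!\left(\frac{x - W_j}{h}\right)}
       {\sum_{j=1}^{n} K_U\!\left(\frac{x - W_j}{h}\right)},
\qquad
K_U(x) = \frac{1}{2\pi} \int e^{-itx} \,
         \frac{\varphi_K(t)}{\varphi_U(t/h)} \, dt .
```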

The order in which we choose the methods is as follows:

  1. If provided, use phiU to define the errors, otherwise

  2. If provided, use errortype and sd_U to define the errors, otherwise

  3. If provided, use the vector of replicates W2 to estimate the error distribution.

Note that in both 1 and 2, if a vector of replicates W2 is provided we augment the data in W1 with that in W2.
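The precedence above can be sketched as a small helper. The function choose_error_source is hypothetical and only mirrors the three documented steps; the final error branch is an assumption, since this page does not say what happens when nothing is supplied.

```r
# Hypothetical helper mirroring the documented order of preference.
choose_error_source <- function(phiU = NULL, errortype = NULL,
                                sd_U = NULL, W2 = NULL) {
  if (!is.null(phiU)) {
    "phiU"                        # 1. characteristic function supplied
  } else if (!is.null(errortype) && !is.null(sd_U)) {
    "errortype and sd_U"          # 2. parametric error supplied
  } else if (!is.null(W2)) {
    "replicates"                  # 3. estimate the error from W2
  } else {
    # Assumed behaviour, not documented on this page.
    stop("No error distribution specified.")
  }
}

# phiU takes priority even when replicates are also given.
choose_error_source(phiU = function(t) exp(-0.02 * t^2), W2 = c(1.1, 2.3))
```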

Value

An object of class deconvolve containing the regression estimator, as well as the bandwidth and ridge parameter rho. When not supplied by the user, the smoothing parameters are chosen using SIMEX; see Delaigle and Hall (2008).

Warnings

References

Camirand, F., Carroll, R.J., and Delaigle, A. (2018). Estimating the distribution of episodically consumed food measured with errors. Manuscript.

Delaigle, A. and Gijbels, I. (2007). Frequent problems in calculating integrals and optimizing objective functions: a case study in density deconvolution. Statistics and Computing, 17, 349-355.

Delaigle, A. and Hall, P. (2008). Using SIMEX for smoothing-parameter choice in errors-in-variables problems. Journal of the American Statistical Association, 103, 481, 280-287.

Delaigle, A. and Hall, P. (2016). Methodology for non-parametric deconvolution when the error distribution is unknown. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 78, 1, 231-252.

Delaigle, A., Hall, P., and Meister, A. (2008). On deconvolution with repeated measurements. Annals of Statistics, 36, 665-685.

Fan, J. and Truong, Y. K. (1993). Nonparametric regression with errors in variables. The Annals of Statistics, 21, 1900-1925.

Author(s)

Aurore Delaigle, Timothy Hyndman, Tianying Wang

Examples

## Not run: 
# Error from replicates --------------------------------------------------------
W1 <- (framingham$SBP21 + framingham$SBP22)/2
W2 <- (framingham$SBP31 + framingham$SBP32)/2
Y <- framingham$FIRSTCHD
h <- 1.120537 #Precalculated using SIMEX option from bandwidth()
rho <- 0.0103959 #Precalculated using SIMEX option from bandwidth()
output <- reg_deconvolve(Y, W1, W2, bw = h, rho = rho)

# Error known ------------------------------------------------------------------
n <- 50
X <- stats::rchisq(n, 3)
Y <- 2*X

sd_U <- 0.2
U <- stats::rnorm(n, sd = sd_U)

W <- X + U

# Response first, then the contaminated data, matching reg_deconvolve(Y, W1, ...)
output <- reg_deconvolve(Y, W, errortype = "norm", sd_U = sd_U, n_cores = 2)

## End(Not run)

TimothyHyndman/deconvolve documentation built on May 13, 2019, 11:51 p.m.