prepareKernel | R Documentation |
Checks if the order is 2, 4, or 6, transforms the objects into matrices, checks the dimensions, provides the bandwidth, creates default arguments to pass to the C++ functions, carries out de-duplication for speed-up etc.
prepareKernel(
x,
y = NULL,
xout = NULL,
weights = NULL,
bw = NULL,
kernel = c("gaussian", "uniform", "triangular", "epanechnikov", "quartic"),
order = 2,
convolution = FALSE,
sparse = FALSE,
deduplicate.x = TRUE,
deduplicate.xout = TRUE,
no.dedup = FALSE,
PIT = FALSE
)
x |
A numeric vector, matrix, or data frame containing observations. For density, the points used to compute the density. For kernel regression, the points corresponding to explanatory variables. |
y |
Optional: a vector of dependent variable values. |
xout |
A vector or a matrix of data points with |
weights |
A numeric vector of observation weights (typically counts) to
perform weighted operations. If null, |
bw |
Bandwidth for the kernel: a scalar or a vector of the same length as |
kernel |
Character describing the desired kernel type. NB: due to limited machine precision, even Gaussian has finite support. |
order |
An integer: 2, 4, or 6. Order-2 kernels are the standard kernels that are positive everywhere. Orders 4 and 6 produce some negative values, which reduces bias but may hamper density estimation. |
convolution |
Logical: if FALSE, returns the usual kernel. If TRUE, returns the convolution kernel that is used in density cross-validation. |
sparse |
Logical: TODO (ignored) |
deduplicate.x |
Logical: if TRUE, full duplicates in the input |
deduplicate.xout |
Logical: if TRUE, full duplicates in the input |
no.dedup |
Logical: if TRUE, sets |
PIT |
If TRUE, the Probability Integral Transform (PIT) is applied to all columns
of |
A list of arguments that are taken by [kernelDensity()] and [kernelSmooth()].
# De-duplication facilities
set.seed(1) # Creating a data set with many duplicates
n.uniq <- 10000
n <- 60000
inds <- ceiling(runif(n, 0, n.uniq))
x.uniq <- matrix(rnorm(n.uniq*10), ncol = 10)
x <- x.uniq[inds, ]
y <- runif(n.uniq)[inds]
xout <- x.uniq[ceiling(runif(n.uniq*3, 0, n.uniq)), ]
w <- runif(n)
print(system.time(a1 <- prepareKernel(x, y, xout, w, bw = 0.5)))
print(system.time(a2 <- prepareKernel(x, y, xout, w, bw = 0.5,
deduplicate.x = FALSE, deduplicate.xout = FALSE)))
print(c(object.size(a1), object.size(a2)) / 1024) # Kilobytes used
# Speed-memory trade-off: 4 times smaller, takes 0.2 s, but reduces the
# number of matrix operations by a factor of
1 - prod(1 - a1$duplicate.stats[1:2]) # 95% fewer operations
sum(a1$weights) - sum(a2$weights) # Should be 0 or near machine epsilon
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.