This is the main function in the CVE package. It creates objects of class "cve" to estimate the mean subspace. Helper functions that require a "cve" object can then be applied to the output from this function.
Conditional Variance Estimation (CVE) is a sufficient dimension reduction (SDR) method for regressions studying E(Y|X), the conditional expectation of a response Y given a set of predictors X. This function provides methods for estimating the dimension and the subspace spanned by the columns of a p x k matrix B of minimal rank k such that
E(Y|X) = E(Y|B'X)
or, equivalently,
Y = g(B'X) + ε
where X is independent of ε and has a positive definite variance-covariance matrix Var(X) = Σ_X, ε is a mean-zero random variable with finite variance Var(ε) = E(ε^2), g is an unknown, continuous, non-constant function, and B = (b_1, ..., b_k) is a real p x k matrix of rank k <= p.
Both the dimension k and the subspace span(B) are unknown. The CVE method makes very few assumptions.
A kernel matrix Bhat is estimated such that the column space of Bhat should be close to the mean subspace span(B). The primary output from this method is a set of orthonormal vectors, Bhat, whose span estimates span(B).
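To make "close to span(B)" concrete, the following base R sketch simulates data from the model Y = g(B'X) + ε and compares two subspaces through their orthogonal projections. It does not use the CVE package: the link function g, the noise level, and the helper `subspace_dist` are illustrative assumptions, not part of the package API.

```r
# Minimal sketch in base R; subspace_dist() is a hypothetical helper.
set.seed(1)
p <- 5; k <- 1; n <- 200
B <- matrix(rep(1, p) / sqrt(p), p, k)   # true basis with orthonormal columns
X <- matrix(rnorm(n * p), n, p)          # predictors X ~ N(0, I_p)
g <- function(z) cos(z)                  # assumed link function
Y <- g(X %*% B) + 0.25 * rnorm(n)        # Y = g(B'X) + eps

# Distance between span(A) and span(B): Frobenius norm of the
# difference of the orthogonal projections onto the two spans.
subspace_dist <- function(A, B) {
  proj <- function(M) {
    Q <- qr.Q(qr(M))                     # orthonormal basis of span(M)
    Q %*% t(Q)
  }
  norm(proj(A) - proj(B), type = "F")
}

subspace_dist(B, -2 * B)                 # same span, numerically zero
e1 <- matrix(c(1, 0, 0, 0, 0), p, 1)     # a different direction
subspace_dist(B, e1)                     # clearly positive
```

The distance depends only on the spans, not on the particular basis, which is why an estimate Bhat can be compared with B even though B itself is not unique.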
The method "central" implements the Ensemble Conditional Variance Estimation (ECVE) as described in [2]. It augments the CVE method by applying an ensemble of functions (parameter func_list) to the response to estimate the central subspace. This corresponds to the generalization
F(Y|X) = F(Y|B'X)
or, equivalently,
Y = g(B'X, ε)
where F is the conditional cumulative distribution function.
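As an illustration of what an ensemble of functions might look like, the sketch below builds a small list of transformations and applies each one to a simulated response. The particular functions are arbitrary choices for demonstration and are not the package defaults; the commented call at the end only mirrors the signature of cve.call.

```r
# Hypothetical ensemble; these functions are illustrative, not package defaults.
set.seed(1)
Y <- rnorm(100)
func_list <- list(
  function(y) y,       # identity: the response itself
  function(y) y^2,     # second moment
  function(y) sin(y),
  function(y) cos(y)
)

# Apply every function to the response: one column per transformed response.
FY <- sapply(func_list, function(f) f(Y))
dim(FY)

# With the CVE package loaded and a predictor matrix X available,
# the list would be passed through the func_list argument:
# cve.call(X, Y, method = "central", func_list = func_list)
```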
cve.call(
  X,
  Y,
  method = c("mean", "weighted.mean", "central", "weighted.central"),
  func_list = NULL,
  nObs = sqrt(nrow(X)),
  h = NULL,
  min.dim = 1L,
  max.dim = 10L,
  k = NULL,
  momentum = 0,
  tau = 1,
  tol = 0.001,
  slack = 0,
  gamma = 0.5,
  V.init = NULL,
  max.iter = 50L,
  attempts = 10L,
  nr.proj = 1L,
  logger = NULL
)
X | Design predictor matrix.
Y | n-dimensional vector of responses.
method | Character string specifying the method of fitting. The options are "mean", "weighted.mean", "central" and "weighted.central".
func_list | A list of functions applied to the response Y.
nObs | Parameter for choosing the bandwidth.
h | Bandwidth, or a function to estimate the bandwidth; defaults to an internally estimated bandwidth.
min.dim | Lower bound for k.
max.dim | Upper bound for k.
k | Dimension of the lower-dimensional projection, if
momentum | Number in [0, 1) giving the ratio of momentum for the Euclidean gradient update with a momentum term.
tau | Initial step size.
tol | Tolerance for the break condition.
slack | Positive scaling to allow small increases of the loss while optimizing, i.e.
gamma | Step-size reduction multiple. If gradient step with step size
V.init | Semi-orthogonal matrix of dimensions (ncol(X), ncol(X) - k) used as starting value in the optimization. (If supplied,
max.iter | Maximum number of optimization steps.
attempts | If
nr.proj | The number of projections used for projective resampling for a multivariate response Y (under active development, ignored for a univariate response).
logger | A logger function (only for advanced users; slows down the computation).
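A starting value for V.init must be semi-orthogonal with dimensions (ncol(X), ncol(X) - k). One simple way to obtain such a matrix, sketched here in base R, is to take the Q factor of the QR decomposition of a random Gaussian matrix. This is an assumption about a convenient construction, not a description of how the package draws its own starting values.

```r
# Sketch: build a semi-orthogonal p x (p - k) matrix as a candidate V.init.
set.seed(1)
p <- 5   # number of predictors, i.e. ncol(X)
k <- 2   # dimension of the reduction

# Q factor of a random Gaussian p x (p - k) matrix has orthonormal columns.
V_init <- qr.Q(qr(matrix(rnorm(p * (p - k)), p, p - k)))

dim(V_init)                                  # p rows, p - k columns
max(abs(crossprod(V_init) - diag(p - k)))    # near zero: V_init' V_init = I
```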
An S3 object of class cve with components:
design matrix of predictor vectors used for calculating the CVE estimate,
n-dimensional vector of responses used for calculating the CVE estimate,
name of the method used,
the matched call,
list of components V, L, B, loss, h for each k = min.dim, ..., max.dim. If k was supplied in the call, min.dim = max.dim = k.
B is the CVE estimate with dimension p x k.
V is the orthogonal complement of B.
L is the loss for each sample separately, such that its mean is loss.
loss is the value of the target function that is minimized, evaluated at V.
h is the bandwidth parameter used to calculate B, V, loss, L.
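The relation between V and B can be illustrated in base R: given a semi-orthogonal p x (p - k) matrix V, a basis of its orthogonal complement is available from the complete QR decomposition. This is only one possible way to compute such a basis and is not claimed to be what the package does internally.

```r
# Sketch: recover a p x k basis B of the orthogonal complement of span(V).
set.seed(1)
p <- 5; k <- 2
V <- qr.Q(qr(matrix(rnorm(p * (p - k)), p, p - k)))  # semi-orthogonal p x (p - k)

# In the complete QR of V, the first p - k columns of Q span span(V);
# the remaining k columns span its orthogonal complement.
Q <- qr.Q(qr(V), complete = TRUE)
B <- Q[, (p - k + 1):p, drop = FALSE]

dim(B)                        # p rows, k columns
max(abs(crossprod(V, B)))     # near zero: columns of B are orthogonal to V
```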
[1] Fertl, L. and Bura, E. (2021) "Conditional Variance Estimation for Sufficient Dimension Reduction" <arXiv:2102.08782>
[2] Fertl, L. and Bura, E. (2021) "Ensemble Conditional Variance Estimation for Sufficient Dimension Reduction" <arXiv:2102.13435>
library(CVE)

# create B for simulation (k = 1)
B <- rep(1, 5) / sqrt(5)
set.seed(21)
# create predictor data X ~ N(0, I_p)
X <- matrix(rnorm(500), 100, 5)
# simulate response variable
# Y = f(B'X) + err
# with f(x1) = x1 and err ~ N(0, 0.25^2)
Y <- X %*% B + 0.25 * rnorm(100)
# calculate cve with the default method for k = 1
set.seed(21)
cve.obj.simple1 <- cve(Y ~ X, k = 1)
# same as
set.seed(21)
cve.obj.simple2 <- cve.call(X, Y, k = 1)
# extract the estimated B's
coef(cve.obj.simple1, k = 1)
coef(cve.obj.simple2, k = 1)