Method to estimate the central subspace, using inverse conditional mean and conditional variance functions.
X | Data matrix with
y | Response vector of
numdir | Integer between 1 and p. It is the number of directions of the reduction to estimate. If not provided then it will equal the number of distinct values of the categorical response.
nslices | Integer number of slices. It must be provided if
numdir.test | Boolean. If
... | Other arguments to pass to
Consider a regression in which the response Y is discrete with support S_Y=\{1,2,...,h\}. Following standard practice, a continuous response can be sliced into finitely many categories to meet this condition. Let X_y \in R^p denote a random vector of predictors distributed as X|(Y=y) and assume that X_y \sim N(μ_y, Δ_y), y \in S_Y. Let μ=E(X) and Σ=\mathrm{Var}(X) denote the marginal mean and variance of X, and let Δ=E(Δ_Y) denote the average covariance matrix. Given n_y independent observations of X_y, y \in S_{Y}, the goal is to obtain the maximum likelihood estimate of the d-dimensional central subspace \mathcal{S}_{Y|X}, which is defined informally as the smallest subspace such that Y is independent of X given its projection P_{\mathcal{S}_{Y|X}}X onto \mathcal{S}_{Y|X}.
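The slicing of a continuous response described above can be sketched as follows. This is a minimal Python illustration, not the package's R code: the helper name slice_response and the rank-based, roughly equal-count slicing rule are assumptions made for the example, and the package may discretize differently.

```python
def slice_response(y, nslices):
    """Map a continuous response to slice labels 1..nslices.

    Assumption: slices hold roughly equal numbers of observations,
    assigned by rank. This rule is illustrative only.
    """
    n = len(y)
    if not 1 <= nslices <= n:
        raise ValueError("nslices must be between 1 and len(y)")
    # Rank the observations, then cut the ranks into nslices blocks.
    order = sorted(range(n), key=lambda i: y[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * nslices // n + 1
    return labels


# Hypothetical toy response with six observations.
y = [0.3, 2.1, -1.0, 0.9, 1.5, -0.2]
labels = slice_response(y, nslices=3)
```

The resulting labels play the role of the discrete response with support {1, ..., h}, here with h = 3.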
Let \tilde{Σ} denote the sample covariance matrix of X, let \tilde{Δ}_y denote the sample covariance matrix for the data with Y=y, and let \tilde{Δ}=∑_{y=1}^{h} m_y \tilde{Δ}_y where m_y is the fraction of cases observed with Y=y. The maximum likelihood estimator of \mathcal{S}_{Y|X} maximizes over \mathcal{S} \in \mathcal{G}_{(d,p)} the log-likelihood function
L(\mathcal{S})=\frac{n}{2}\log|P_{\mathcal{S}} \tilde{Σ} P_{\mathcal{S}}|_0 - \frac{n}{2}\log|\tilde{Σ}| - \frac{1}{2}∑_{y=1}^{h} n_y \log|P_{\mathcal{S}} \tilde{Δ}_y P_{\mathcal{S}}|_0,
where |A|_0 denotes the product of the non-zero eigenvalues of a positive semi-definite symmetric matrix A, P_{\mathcal{S}} denotes the projection onto the subspace \mathcal{S} in the usual inner product, and \mathcal{G}_{(d,p)} is the set of all d-dimensional subspaces of R^p, known as the Grassmann manifold. Once the dimension d of the reduction subspace is estimated, the columns of \hat{Γ} form a basis for the maximum likelihood estimate of \mathcal{S}_{Y|X}, and the desired reduction is \hat{Γ}^{T}X.
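To make the objective concrete, the sketch below evaluates L(\mathcal{S}) for a subspace given by a basis matrix. It is a plain Python illustration under stated assumptions, not the package's implementation: the basis gamma is assumed to have orthonormal columns, in which case |P_S A P_S|_0 equals the ordinary determinant of gamma^T A gamma whenever that matrix is positive definite. The function names and the toy covariance matrices are hypothetical, and the maximization over the Grassmann manifold is not shown.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    n, d = len(a), 1.0
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(a[r][k]))
        if a[pivot][k] == 0.0:
            return 0.0
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            d = -d
        d *= a[k][k]
        for r in range(k + 1, n):
            f = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= f * a[k][c]
    return d


def lad_loglik(gamma, sigma, deltas, counts):
    """Evaluate L(S) for S spanned by the orthonormal columns of gamma (p x d)."""
    n = sum(counts)
    gt = transpose(gamma)

    def proj_det(m):
        # Assumes gamma^T m gamma is positive definite.
        return det(matmul(matmul(gt, m), gamma))

    value = 0.5 * n * math.log(proj_det(sigma)) - 0.5 * n * math.log(det(sigma))
    for n_y, delta_y in zip(counts, deltas):
        value -= 0.5 * n_y * math.log(proj_det(delta_y))
    return value


# Hypothetical toy inputs: p = 2, h = 2 classes, 10 observations each.
sigma = [[4.0, 0.0], [0.0, 1.0]]
deltas = [[[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 1.0]]]
counts = [10, 10]
loglik_e1 = lad_loglik([[1.0], [0.0]], sigma, deltas, counts)
loglik_e2 = lad_loglik([[0.0], [1.0]], sigma, deltas, counts)
```

In this toy setting the first coordinate direction attains the larger value, so a one-dimensional search restricted to these two candidates would prefer it.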
The dimension d of the sufficient reduction must itself be estimated. A sequential likelihood ratio test and information criteria (AIC, BIC) are implemented for this purpose, following Cook and Forzani (2009).
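The selection rules can be illustrated with a short Python sketch. The AIC and BIC formulas below are the standard textbook definitions and are assumed, not confirmed, to match the package's conventions. The function names, the 0.05 level, and the stopping rule of the sequential test are likewise assumptions. The criterion values and p-values are copied from the example output on this page.

```python
import math


def aic(loglik, numpar):
    """Standard AIC: -2 log-likelihood + 2 * number of parameters."""
    return -2.0 * loglik + 2.0 * numpar


def bic(loglik, numpar, n):
    """Standard BIC: -2 log-likelihood + log(n) * number of parameters."""
    return -2.0 * loglik + math.log(n) * numpar


def choose_dimension(criterion_by_d):
    """Return the dimension d whose criterion value is smallest."""
    return min(criterion_by_d, key=criterion_by_d.get)


def sequential_lrt_dimension(p_values, alpha=0.05):
    """Return the first d whose test 'd vs. larger' is not rejected.

    p_values[d] is the p-value for testing dimension d; if every test
    rejects, the largest fitted dimension len(p_values) is returned.
    """
    for d, p in enumerate(p_values):
        if p > alpha:
            return d
    return len(p_values)


# Values copied from the printed example output (d = 0, 1, 2).
aic_values = {0: 2843.332, 1: 2641.641, 2: 2535.786}
bic_values = {0: 2905.542, 1: 2724.587, 2: 2639.469}
d_aic = choose_dimension(aic_values)
d_bic = choose_dimension(bic_values)
d_lrt = sequential_lrt_dimension([0.0, 0.0])
```

For these values all three rules point to d = 2, the largest dimension fitted in the example.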
This command returns a list object of class ldr. The output depends on the argument numdir.test. If numdir.test=TRUE, a list of matrices is provided corresponding to the numdir values (1 through numdir) for each of the parameters Γ, Δ, and Δ_y; otherwise, a single list of matrices for a single value of numdir.
The components loglik, aic, bic, and numpar are vectors of numdir elements if numdir.test=TRUE, and scalars otherwise. The following components are returned:
R | The reduced data matrix, obtained from the centered data matrix X. Each column of X is centered at its sample mean.
Gammahat | Estimate of Γ.
Deltahat | Estimate of Δ.
Deltahat_y | Estimate of Δ_y.
loglik | Maximized value of the LAD log-likelihood.
aic | Akaike information criterion value.
bic | Bayesian information criterion value.
numpar | Number of parameters in the model.
Kofi Placid Adragni <kofi@umbc.edu>
Cook RD, Forzani L (2009). Likelihood-based Sufficient Dimension Reduction. Journal of the American Statistical Association, 104(485), 197–208.
Loading required package: GrassmannOptim
Loading required package: Matrix
Call:
lad(X = flea[, -1], y = flea[, 1], numdir = 2, numdir.test = TRUE)
Estimated Basis Vectors for Central Subspace:
Dir1 Dir2
tars1 0.2638 -0.3022
tars2 -0.1372 0.2769
head -0.3662 -0.2546
aede1 -0.2082 0.8174
aede2 0.8503 0.2523
aede3 -0.1053 0.1880
Information Criterion:
d=0 d=1 d=2
aic 2843.332 2641.641 2535.786
bic 2905.542 2724.587 2639.469
Large sample likelihood ratio test
Stat df p.value
0D vs >= 1D 343.5466 18 0
1D vs >= 2D 123.8550 9 0