Provides the sample datasets M1-M7 used in the paper "Conditional Variance Estimation for Sufficient Dimension Reduction" by Lukas Fertl and Efstathia Bura. The general model is given by:
Y = g(B'X) + ε
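The general model can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the package's R implementation: the helper name `simulate` is hypothetical, the predictors are taken to be standard normal, and the link g(u) = 2 log(|u| + 2) with error scale 0.5 is borrowed from the M3 description.

```python
import math
import random

def simulate(n, p, B, g, sd, rng):
    """Draw n samples from Y = g(B'X) + sd * eps.

    Assumptions for this sketch: X ~ N_p(0, I_p) and eps ~ N(0, 1).
    B is a list of k direction vectors of length p; g maps the list of
    k projections B'x to a real number.
    """
    X, Y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(p)]
        proj = [sum(bj * xj for bj, xj in zip(b, x)) for b in B]
        X.append(x)
        Y.append(g(proj) + sd * rng.gauss(0.0, 1.0))
    return X, Y

rng = random.Random(0)
p = 20
# b_1 = (1,1,1,1,1,1,0,...,0)' / sqrt(6), a unit vector
b1 = [1 / math.sqrt(6)] * 6 + [0.0] * (p - 6)
X, Y = simulate(100, p, [b1], lambda u: 2 * math.log(abs(u[0]) + 2), 0.5, rng)
```

Only the k projections B'x enter g, which is what makes span(B) the target of a dimension reduction method.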
name | One of "M1", ..., "M7".
n | Number of samples.
p | Dimension of the random variable X.
sd | Standard deviation of the error term ε.
... | Additional parameters, used only for "M2" (namely pmix and lambda).
List with elements:
X: data, an n x p matrix.
Y: response.
B: the dim-reduction matrix.
name: name of the dataset (the name parameter).
For M1, the predictors are distributed as X ~ N_p(0, Σ) with Σ_ij = 0.5^|i - j| for i, j = 1, ..., p. The subspace dimension is k = 1 and the default sample size is n = 100. p = 20, b_1 = (1,1,1,1,1,1,0,...,0)' / sqrt(6), and Y is given as
Y = cos(b_1'X) + ε
where ε follows a generalized normal distribution with location 0 and shape parameter 0.5, and the scale parameter is chosen such that Var(ε) = 0.5.
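The scale parameter implied by Var(ε) = 0.5 can be worked out from the variance formula of the generalized normal distribution, Var = α² Γ(3/β) / Γ(1/β) for scale α and shape β. The Python sketch below assumes that parameterization (density proportional to exp(-(|x|/α)^β)); the function names are hypothetical and this is not the package's code.

```python
import math
import random

def gen_normal_scale(variance, shape):
    """Scale alpha solving variance = alpha^2 * Gamma(3/shape) / Gamma(1/shape)."""
    return math.sqrt(variance * math.gamma(1.0 / shape) / math.gamma(3.0 / shape))

def sample_gen_normal(rng, scale, shape):
    """One draw with location 0, using |eps / scale|^shape ~ Gamma(1/shape, 1)."""
    g = rng.gammavariate(1.0 / shape, 1.0)
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return sign * scale * g ** (1.0 / shape)

# Shape 0.5: Gamma(2) / Gamma(6) = 1 / 120, so alpha = sqrt(0.5 / 120)
alpha = gen_normal_scale(0.5, 0.5)

rng = random.Random(1)
eps = [sample_gen_normal(rng, alpha, 0.5) for _ in range(200000)]
sample_var = sum(e * e for e in eps) / len(eps)  # should be close to 0.5
```

With shape 0.5 the errors are much heavier-tailed than a normal distribution with the same variance.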
For M2, the predictors are distributed as X ~ Z 1_p λ + N_p(0, I_p) with Z ~ 2 Binom(pmix) - 1, where 1_p is the p-dimensional vector of ones. The subspace dimension is k = 1 and the default sample size is n = 100. p = 20, b_1 = (1,1,1,1,1,1,0,...,0)' / sqrt(6), and Y is given as
Y = cos(b_1'X) + 0.5ε
where ε is standard normal. The parameter pmix defaults to 0.3 and lambda (λ above) defaults to 1.
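The two-component mixture for the predictors can be sketched as follows. This Python snippet reads Binom(pmix) as a single Bernoulli trial with success probability pmix, which is an interpretation of the notation above; the function name is hypothetical.

```python
import random

def sample_mixture_predictors(n, p, pmix, lam, rng):
    """Draw n rows of X = Z * lam * 1_p + N_p(0, I_p), Z = 2 * Bernoulli(pmix) - 1.

    A single Z is drawn per observation, so all p coordinates of a row are
    shifted by the same amount: +lam with probability pmix, -lam otherwise.
    """
    rows = []
    for _ in range(n):
        z = 1.0 if rng.random() < pmix else -1.0
        rows.append([z * lam + rng.gauss(0.0, 1.0) for _ in range(p)])
    return rows

rng = random.Random(2)
X = sample_mixture_predictors(2000, 20, pmix=0.3, lam=1.0, rng=rng)
overall_mean = sum(sum(row) for row in X) / (2000 * 20)  # near (2 * 0.3 - 1) * 1 = -0.4
```

Under this reading the predictors form two clusters centered at +λ 1_p and -λ 1_p, with the negative cluster being the larger one for pmix = 0.3.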
For M3, the predictors are distributed as X ~ N_p(0, I_p). The subspace dimension is k = 1 and the default sample size is n = 100. p = 20, b_1 = (1,1,1,1,1,1,0,...,0)' / sqrt(6), and Y is given as
Y = 2 log(|b_1'X| + 2) + 0.5ε
where ε is standard normal.
For M4, the predictors are distributed as X ~ N_p(0, Σ) with Σ_ij = 0.5^|i - j| for i, j = 1, ..., p. The subspace dimension is k = 2 and the default sample size is n = 100. p = 20, b_1 = (1,1,1,1,1,1,0,...,0)' / sqrt(6), b_2 = (1,-1,1,-1,1,-1,0,...,0)' / sqrt(6), and Y is given as
Y = (b_1'X) / (0.5 + (1.5 + b_2'X)^2) + 0.5ε
where ε is standard normal.
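Predictors with covariance Σ_ij = 0.5^|i - j| can be generated without forming Σ, because a stationary AR(1) sequence with coefficient 0.5 and unit variance has exactly this covariance. The Python sketch below uses that shortcut and then applies the M4 response; it is an illustration with hypothetical function names, and the package itself may generate the data differently.

```python
import math
import random

def sample_ar1_normal(p, rho, rng):
    """One draw of X ~ N_p(0, Sigma) with Sigma_ij = rho^|i - j|.

    Uses the stationary AR(1) recursion x_j = rho * x_{j-1} + sqrt(1 - rho^2) * z_j.
    """
    x = [rng.gauss(0.0, 1.0)]
    innovation_sd = math.sqrt(1.0 - rho * rho)
    for _ in range(p - 1):
        x.append(rho * x[-1] + innovation_sd * rng.gauss(0.0, 1.0))
    return x

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

p = 20
b1 = [1 / math.sqrt(6)] * 6 + [0.0] * (p - 6)
b2 = [s / math.sqrt(6) for s in (1, -1, 1, -1, 1, -1)] + [0.0] * (p - 6)

def m4_response(x, rng):
    """Y = (b_1'X) / (0.5 + (1.5 + b_2'X)^2) + 0.5 * eps, eps ~ N(0, 1)."""
    u1, u2 = dot(b1, x), dot(b2, x)
    return u1 / (0.5 + (1.5 + u2) ** 2) + 0.5 * rng.gauss(0.0, 1.0)

rng = random.Random(4)
X = [sample_ar1_normal(p, 0.5, rng) for _ in range(100)]
Y = [m4_response(x, rng) for x in X]
```

The two directions b_1 and b_2 are orthonormal, so together they span a two-dimensional subspace.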
For M5, the predictors are distributed as X ~ U([0, 1]^p), the uniform distribution with independent components on the p-dimensional hypercube. The subspace dimension is k = 2 and the default sample size is n = 200. p = 20, b_1 = (1,1,1,1,1,1,0,...,0)' / sqrt(6), b_2 = (1,-1,1,-1,1,-1,0,...,0)' / sqrt(6), and Y is given as
Y = cos(π b_1'X)(b_2'X + 1)^2 + 0.5ε
where ε is standard normal.
For M6, the predictors are distributed as X ~ N_p(0, I_p). The subspace dimension is k = 3 and the default sample size is n = 200. p = 20, b_1 = e_1, b_2 = e_2, and b_3 = e_p, where e_j is the j-th unit vector in the p-dimensional space. Y is given as
Y = (b_1'X)^2+(b_2'X)^2+(b_3'X)^2+0.5ε
where ε is standard normal.
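Because the directions are unit vectors, the noise-free part of this response reduces to a sum of three squared coordinates. A short Python check of that reading, with hypothetical helper names and the 1-based indexing of the text mapped to Python's 0-based lists:

```python
def unit_vector(j, p):
    """e_j in R^p, with j the 1-based index used in the text."""
    return [1.0 if i == j - 1 else 0.0 for i in range(p)]

p = 20
# Rows are b_1 = e_1, b_2 = e_2, b_3 = e_p
B = [unit_vector(1, p), unit_vector(2, p), unit_vector(p, p)]

def m6_mean(x):
    """Noise-free part of the response: (b_1'x)^2 + (b_2'x)^2 + (b_3'x)^2."""
    return sum(sum(bi * xi for bi, xi in zip(b, x)) ** 2 for b in B)

x = list(range(1, p + 1))   # x_1 = 1, x_2 = 2, ..., x_20 = 20
value = m6_mean(x)          # 1^2 + 2^2 + 20^2 = 405.0
```

The remaining p - 3 coordinates do not affect the response at all.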
For M7, the predictors are distributed as X ~ t_3(I_p), the standard multivariate t-distribution with 3 degrees of freedom. The subspace dimension is k = 4 and the default sample size is n = 200. p = 20, b_1 = e_1, b_2 = e_2, b_3 = e_3, and b_4 = e_p, where e_j is the j-th unit vector in the p-dimensional space. Y is given as
Y = (b_1'X)(b_2'X)^2+(b_3'X)(b_4'X)+0.5ε
where ε follows a generalized normal distribution with location 0 and shape parameter 1, and the scale parameter is chosen such that Var(ε) = 0.25.
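A draw from the standard multivariate t-distribution can be built from a normal vector and one shared chi-square variable, and a generalized normal with shape 1 is a Laplace distribution, whose variance is 2α² for scale α. The Python sketch below combines both facts; the function names are hypothetical and the construction is one standard way to sample these distributions, not necessarily the one the package uses.

```python
import math
import random

def sample_mvt(p, df, rng):
    """One draw from the standard multivariate t with df degrees of freedom.

    X = Z / sqrt(W / df) with Z ~ N_p(0, I_p) and W ~ chi-square(df);
    a single W is shared by all p coordinates.
    """
    w = rng.gammavariate(df / 2.0, 2.0)  # chi-square(df) = Gamma(shape df/2, scale 2)
    scale = math.sqrt(w / df)
    return [rng.gauss(0.0, 1.0) / scale for _ in range(p)]

# Laplace (shape 1): Var = 2 * alpha^2 = 0.25, so alpha = sqrt(0.125)
LAPLACE_SCALE = math.sqrt(0.25 / 2.0)

def sample_laplace(rng, scale):
    """Symmetric Laplace draw: random sign times an exponential with mean `scale`."""
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return sign * rng.expovariate(1.0 / scale)

def m7_response(x, rng):
    """Y = (b_1'X)(b_2'X)^2 + (b_3'X)(b_4'X) + 0.5 * eps with unit-vector directions."""
    return x[0] * x[1] ** 2 + x[2] * x[-1] + 0.5 * sample_laplace(rng, LAPLACE_SCALE)

rng = random.Random(7)
X = [sample_mvt(20, 3, rng) for _ in range(200)]
Y = [m7_response(x, rng) for x in X]
```

Sharing W across coordinates makes the components of X uncorrelated but dependent, which distinguishes the multivariate t from p independent t variables.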
Fertl, L. and Bura, E. (2021) "Conditional Variance Estimation for Sufficient Dimension Reduction" <arXiv:2102.08782>