Ohit fits a high-dimensional linear regression model in three steps. First, it selects input variables sequentially via the orthogonal greedy algorithm (OGA). Second, it determines the number of OGA iterations using a high-dimensional information criterion (HDIC). Third, it uses HDIC again to remove irrelevant variables that remain after the second step.
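The three steps can be sketched in base R as follows. This is an illustrative sketch, not the package's source code: the function names `hdbic` and `oga_hdic_trim_sketch` are hypothetical, the criterion is assumed to take the BIC-type form n log(RSS/n) + k log(n) log(p), and the data are centred instead of fitting an explicit intercept. It does not guard against zero-variance or exactly collinear columns.

```r
# Illustrative base-R sketch of OGA + HDIC + Trim; NOT the package's source code.
# hdbic() is an assumed criterion: n * log(RSS / n) + k * log(n) * log(p).
hdbic <- function(rss, k, n, p) n * log(rss / n) + k * log(n) * log(p)

oga_hdic_trim_sketch <- function(X, y, Kn, crit = hdbic) {
  n <- nrow(X); p <- ncol(X)
  X <- scale(X, center = TRUE, scale = FALSE)  # centre instead of fitting an intercept
  y <- y - mean(y)
  col_norm <- sqrt(colSums(X^2))
  resid <- y
  Q <- matrix(0, n, 0)     # orthonormal basis spanning the selected columns
  J_OGA <- integer(0)
  HDIC <- numeric(Kn)

  for (k in seq_len(Kn)) {
    # Step 1 (OGA): pick the unselected column most correlated with the residual.
    score <- abs(drop(crossprod(X, resid))) / col_norm
    score[J_OGA] <- -Inf
    j <- unname(which.max(score))
    J_OGA <- c(J_OGA, j)
    # Orthogonalise the new column against the columns already selected,
    # then remove its contribution from the residual.
    q <- drop(X[, j] - Q %*% crossprod(Q, X[, j]))
    q <- q / sqrt(sum(q^2))
    Q <- cbind(Q, q)
    resid <- resid - q * sum(q * resid)
    HDIC[k] <- crit(sum(resid^2), k, n, p)
  }

  # Step 2 (HDIC): stop the OGA path where the criterion is smallest.
  k_hat <- which.min(HDIC)
  J_HDIC <- sort(J_OGA[seq_len(k_hat)])

  # Step 3 (Trim): keep a variable only if dropping it makes the criterion worse.
  rss_of <- function(J) {
    if (length(J) == 0) return(sum(y^2))
    sum(lm.fit(X[, J, drop = FALSE], y)$residuals^2)
  }
  keep <- if (k_hat > 1) {
    vapply(J_HDIC, function(j) {
      J <- setdiff(J_HDIC, j)
      crit(rss_of(J), length(J), n, p) > HDIC[k_hat]
    }, logical(1))
  } else {
    TRUE
  }

  list(J_OGA = J_OGA, HDIC = HDIC, J_HDIC = J_HDIC, J_Trim = J_HDIC[keep])
}

# Small simulated example: only the first three columns are relevant.
set.seed(1)
n <- 100; p <- 200
X <- matrix(rnorm(n * p), n, p)
y <- drop(X[, 1:3] %*% c(3, 4, 5)) + rnorm(n)
fit <- oga_hdic_trim_sketch(X, y, Kn = 10)
fit$J_Trim
```

Updating the residual through an orthonormal basis avoids refitting a full regression at every iteration, which is the main reason greedy selection stays cheap when p is large.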
Argument | Description
X | Input matrix of
y | Response vector of length
Kn | The number of OGA iterations.
c1 | The tuning parameter for the number of OGA iterations. Default is
HDIC_Type | High-dimensional information criterion. The value must be
c2 | The tuning parameter for
c3 | The tuning parameter for
intercept | Should an intercept be fitted? Default is
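For orientation, the criterion in Ing and Lai (2011) has the general form below. It is reproduced here from the paper as background. Here J is a candidate index set, #(J) its size, and c a positive tuning constant; how c maps onto the arguments c2 and c3 is not stated in the text above, so treat that mapping as open.

```latex
\mathrm{HDIC}(J) = n \log \hat{\sigma}_J^{2} + \#(J)\, w_n \log p,
\qquad
\hat{\sigma}_J^{2} = \frac{1}{n} \sum_{t=1}^{n} \bigl( y_t - \hat{y}_{t;J} \bigr)^{2},

w_n =
\begin{cases}
\log n        & \text{(HDBIC)} \\
c \log\log n  & \text{(HDHQ)} \\
c             & \text{(HDAIC)}
\end{cases}
```

The extra factor log p is what distinguishes these criteria from their classical counterparts: each added variable must reduce the residual variance enough to pay for having been picked from among p candidates.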
Component | Description
n | The number of observations.
p | The number of input variables.
Kn | The number of OGA iterations.
J_OGA | The index set of Kn variables sequentially selected by OGA.
HDIC | The HDIC values along the OGA path.
J_HDIC | The index set of variables determined by OGA+HDIC.
J_Trim | The index set of variables determined by OGA+HDIC+Trim.
betahat_HDIC | The estimated regression coefficients of the model determined by OGA+HDIC.
betahat_Trim | The estimated regression coefficients of the model determined by OGA+HDIC+Trim.
Hai-Tang Chiou, Ching-Kang Ing and Tze Leung Lai.
Ing, C.-K. and Lai, T. L. (2011). A stepwise regression method and consistent model selection for high-dimensional sparse linear models. Statistica Sinica, 21, 1473–1513.
# Example setup (Example 3 in Section 5 of Ing and Lai (2011))
n = 400
p = 4000
q = 10
beta_1q = c(3, 3.75, 4.5, 5.25, 6, 6.75, 7.5, 8.25, 9, 9.75)
b = sqrt(3/(4 * q))
x_relevant = matrix(rnorm(n * q), n, q)
d = matrix(rnorm(n * (p - q), 0, 0.5), n, p - q)
x_relevant_sum = apply(x_relevant, 1, sum)
x_irrelevant = apply(d, 2, function(a) a + b * x_relevant_sum)
X = cbind(x_relevant, x_irrelevant)
epsilon = rnorm(n)
y = as.vector((x_relevant %*% beta_1q) + epsilon)
# Fit a high-dimensional linear regression model via OGA+HDIC+Trim
Ohit(X, y, intercept = FALSE)
$n
[1] 400
$p
[1] 4000
$Kn
[1] 34
$J_OGA
[1] 976 2911 10 9 8 7 6 5 4 3 2 1 432 900 1867
[16] 282 77 3532 275 3190 508 2978 1895 37 3937 3792 2457 2254 1841 557
[31] 559 3418 823 3464
$HDIC
[1] 1977.3699 1855.2058 1835.4618 1808.4484 1776.9847 1752.5538 1673.9168
[8] 1600.5852 1536.5116 1421.7832 1286.3834 620.9388 655.9772 692.4024
[15] 729.9630 769.1924 808.4527 848.8927 889.5049 929.5755 966.6255
[22] 1007.0329 1044.2767 1083.2249 1122.3615 1160.7090 1200.2530 1239.7055
[29] 1279.8540 1320.4601 1360.0681 1397.7032 1436.6435 1476.5672
$J_HDIC
[1] 1 2 3 4 5 6 7 8 9 10 976 2911
$J_Trim
[1] 1 2 3 4 5 6 7 8 9 10
$betahat_HDIC
Call:
lm(formula = y ~ . - 1, data = X_HDIC)
Residuals:
Min 1Q Median 3Q Max
-2.9917 -0.6678 -0.0493 0.5657 3.1895
Coefficients:
Estimate Std. Error t value Pr(>|t|)
X1 2.93495 0.06669 44.006 <2e-16 ***
X2 3.58349 0.06738 53.183 <2e-16 ***
X3 4.40725 0.06602 66.760 <2e-16 ***
X4 5.15546 0.07035 73.282 <2e-16 ***
X5 5.97555 0.07099 84.172 <2e-16 ***
X6 6.71770 0.07092 94.718 <2e-16 ***
X7 7.42118 0.07208 102.955 <2e-16 ***
X8 8.11738 0.07111 114.152 <2e-16 ***
X9 8.90003 0.07423 119.900 <2e-16 ***
X10 9.63881 0.07427 129.776 <2e-16 ***
X976 0.08101 0.11937 0.679 0.498
X2911 0.16786 0.11148 1.506 0.133
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.048 on 388 degrees of freedom
Multiple R-squared: 0.9979, Adjusted R-squared: 0.9979
F-statistic: 1.551e+04 on 12 and 388 DF, p-value: < 2.2e-16
$betahat_Trim
Call:
lm(formula = y ~ . - 1, data = X_Trim)
Residuals:
Min 1Q Median 3Q Max
-2.8830 -0.7174 -0.0498 0.6022 3.3112
Coefficients:
Estimate Std. Error t value Pr(>|t|)
X1 2.99865 0.04986 60.14 <2e-16 ***
X2 3.64529 0.05154 70.73 <2e-16 ***
X3 4.46990 0.05033 88.82 <2e-16 ***
X4 5.22402 0.05173 101.00 <2e-16 ***
X5 6.03686 0.05555 108.68 <2e-16 ***
X6 6.78602 0.05266 128.88 <2e-16 ***
X7 7.49202 0.05303 141.27 <2e-16 ***
X8 8.19060 0.05191 157.77 <2e-16 ***
X9 8.97783 0.05265 170.52 <2e-16 ***
X10 9.71184 0.05316 182.68 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.049 on 390 degrees of freedom
Multiple R-squared: 0.9979, Adjusted R-squared: 0.9979
F-statistic: 1.859e+04 on 10 and 390 DF, p-value: < 2.2e-16
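Because the relevant variables occupy columns 1 to q of X by construction, the printed index sets can be compared with the true support directly. The snippet below is a small sketch that copies J_HDIC and J_Trim from the output above; `support_check` is a hypothetical helper, not a function of the package.

```r
# Index sets copied from the printed output above.
q <- 10
truth  <- seq_len(q)             # relevant variables are columns 1..q by construction
J_HDIC <- c(1:10, 976, 2911)
J_Trim <- 1:10

# Hypothetical helper (not part of the package): compare a selected set with the truth.
support_check <- function(selected, truth) {
  list(
    false_positives = setdiff(selected, truth),  # selected but irrelevant
    missed          = setdiff(truth, selected)   # relevant but not selected
  )
}

check_hdic <- support_check(J_HDIC, truth)
check_trim <- support_check(J_Trim, truth)
```

In this run, OGA+HDIC retains two irrelevant variables (976 and 2911), and the Trim step removes both without dropping any relevant variable.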