bestglm-package: bestglm: Best Subset GLM


Description

Provides the new information criterion BICq, along with AIC, BIC and EBIC, for selecting the best model. Several cross-validation (CV) algorithms are also provided.
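As a rough illustration of how these criteria differ only in their penalty term, the sketch below computes AIC, BIC and BICq by hand for one linear model fitted to the built-in mtcars data. The BICq form used, -2 log L + k (log n - 2 log(q / (1 - q))), is our reading of the Xu and McLeod reference and should be treated as an assumption. The parameter count k is taken from logLik(), which includes the error variance; bestglm's own bookkeeping may count parameters differently.

```r
# Hand-computed criteria for a single fitted model (base R only).
fit <- lm(mpg ~ wt + hp, data = mtcars)

n   <- nobs(fit)
ll  <- logLik(fit)
k   <- attr(ll, "df")        # coefficients plus the error variance
dev <- -2 * as.numeric(ll)   # minus twice the maximised log-likelihood

aic <- dev + 2 * k
bic <- dev + k * log(n)

# Assumed BICq form: the BIC penalty shifted by the prior log-odds of inclusion.
bicq <- function(q) dev + k * (log(n) - 2 * log(q / (1 - q)))

c(AIC = aic, BIC = bic, BICq_0.25 = bicq(0.25), BICq_0.5 = bicq(0.5))
```

Under this form, q = 0.5 reproduces BIC exactly, and smaller q penalises each extra parameter more heavily.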

Details

Package: bestglm
Type: Package
Version: 0.33
Date: 2011-11-03
License: GPL 2.0 or greater
LazyData: yes
LazyLoad: yes

bestglm is the main function. All other functions are utility functions and are not normally invoked.
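A minimal call might look like the sketch below. It assumes the bestglm package is installed and relies on the convention that the response is the last column of the data frame; the choice of mtcars columns is ours and purely illustrative.

```r
# Sketch: best subset regression on built-in data.
# The response (mpg) is placed in the last column of the data frame.
Xy <- mtcars[, c("wt", "hp", "qsec", "drat", "mpg")]

if (requireNamespace("bestglm", quietly = TRUE)) {
  out <- bestglm::bestglm(Xy, IC = "BIC")
  print(out)                     # title and coefficient table of the best model
  best_fit <- out$BestModel      # for the gaussian family this is an lm fit
  print(formula(best_fit))
}
```

The returned object also carries the other top-ranked subsets, so the best model can be compared with its nearest competitors.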

Many examples are provided in the vignettes accompanying this package. The vignettes are produced with Sweave, so the R scripts can easily be extracted.
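One way to extract the R code from a Sweave source is utils::Stangle(). The sketch below runs it on a tiny stand-in .Rnw file written to a temporary location, since the exact paths of the installed vignette sources are not given here; the file contents and names are hypothetical.

```r
# Write a minimal Sweave file (stand-in for a real vignette source).
rnw <- tempfile(fileext = ".Rnw")
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<demo>>=",
  "x <- 1 + 1",
  "@",
  "\\end{document}"
), rnw)

# Extract the code chunks into a plain R script.
script <- tempfile(fileext = ".R")
utils::Stangle(rnw, output = script, quiet = TRUE)

code <- readLines(script)
print(code)
```

For an installed package, vignette(package = "bestglm") lists the available vignettes by name.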

The R package xtable is needed for the vignette in SimExperimentBICq.Rnw.

Author(s)

A.I. McLeod and Changjiang Xu

References

Xu, C. and McLeod, A.I. (2009). Bayesian Information Criterion with Bernoulli Prior.

See Also

leaps

Examples

## Not run: 
library(bestglm)
data(zprostate)
# Keep the training rows (column 10 is the logical train indicator), then drop that column
train <- (zprostate[zprostate[, 10], ])[, -10]
# Best subset using AIC
bestglm(train, IC = "AIC")
# Best subset using BIC
bestglm(train, IC = "BIC")
# Best subset using EBIC (IC = "BICg"), default tuning parameter g = 1
bestglm(train, IC = "BICg")
# Best subset using BICg with g = 0.5 (tuning parameter)
bestglm(train, IC = "BICg", t = 0.5)
# Best subset using BICq. Note that q = 0.25 is the default.
bestglm(train, IC = "BICq")
# Best subset using BICq with q = 0.5 (equivalent to BIC)
bestglm(train, IC = "BICq", t = 0.5)
# Remark: set the seed since the CV result depends on it
set.seed(123321123)
# Default CV method (delete-d), with t = 10 replications
bestglm(train, IC = "CV", t = 10)
# CV using the HTF method with K = 10 folds
bestglm(train, IC = "CV", CVArgs = list(Method = "HTF", K = 10, REP = 1))
# Best subset, logistic regression
data(SAheart)
bestglm(SAheart, IC = "BIC", family = binomial)
# Best subset, factor variables with more than 2 levels
data(AirQuality)
bestglm(AirQuality, IC = "BICq")

## End(Not run)
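
The CV examples rely on the package's own routines. The general idea of K-fold cross-validation for comparing candidate subsets can be sketched in base R as follows. This is an illustration on the built-in mtcars data, not bestglm's implementation; the helper cv_mse and the two candidate formulas are hypothetical.

```r
# K-fold cross-validated mean squared prediction error for a linear model.
cv_mse <- function(formula, data, K = 10) {
  y_name <- all.vars(formula)[1]                          # response variable name
  folds  <- sample(rep(seq_len(K), length.out = nrow(data)))
  sq_err <- numeric(nrow(data))
  for (k in seq_len(K)) {
    held_out <- folds == k
    fit  <- lm(formula, data = data[!held_out, , drop = FALSE])
    pred <- predict(fit, newdata = data[held_out, , drop = FALSE])
    sq_err[held_out] <- (data[[y_name]][held_out] - pred)^2
  }
  mean(sq_err)
}

set.seed(1)  # fold assignment is random, so fix the seed for reproducibility
cv_small <- cv_mse(mpg ~ wt, mtcars)
cv_large <- cv_mse(mpg ~ wt + hp + qsec + drat, mtcars)
c(small = cv_small, large = cv_large)
```

The subset with the smaller cross-validated error would be preferred under this criterion.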

Example output

Loading required package: leaps
AIC
BICq equivalent for q in (0.708764213288625, 0.889919748490004)
Best Model:
              Estimate Std. Error   t value     Pr(>|t|)
(Intercept)  2.4668675 0.08760022 28.160516 6.632457e-36
lcavol       0.6764486 0.12383666  5.462426 9.883880e-07
lweight      0.2652760 0.09363348  2.833132 6.298761e-03
age         -0.1450300 0.09756540 -1.486490 1.424742e-01
lbph         0.2095349 0.10128348  2.068796 4.295574e-02
svi          0.3070936 0.12190105  2.519204 1.449125e-02
lcp         -0.2872242 0.15300241 -1.877253 6.543004e-02
pgg45        0.2522850 0.11562030  2.182013 3.310324e-02
BIC
BICq equivalent for q in (0.0176493852011195, 0.512566675362625)
Best Model:
             Estimate Std. Error   t value     Pr(>|t|)
(Intercept) 2.4773573 0.09304738 26.624687 2.475214e-36
lcavol      0.7397137 0.09318316  7.938277 4.141615e-11
lweight     0.3163282 0.08830716  3.582135 6.576173e-04
BICg(g = 1)
BICq equivalent for q in (0.0176493852011195, 0.512566675362625)
Best Model:
             Estimate Std. Error   t value     Pr(>|t|)
(Intercept) 2.4773573 0.09304738 26.624687 2.475214e-36
lcavol      0.7397137 0.09318316  7.938277 4.141615e-11
lweight     0.3163282 0.08830716  3.582135 6.576173e-04
BICg(g = 0.5)
BICq equivalent for q in (0.0176493852011195, 0.512566675362625)
Best Model:
             Estimate Std. Error   t value     Pr(>|t|)
(Intercept) 2.4773573 0.09304738 26.624687 2.475214e-36
lcavol      0.7397137 0.09318316  7.938277 4.141615e-11
lweight     0.3163282 0.08830716  3.582135 6.576173e-04
BICq(q = 0.25)
BICq equivalent for q in (0.0176493852011195, 0.512566675362625)
Best Model:
             Estimate Std. Error   t value     Pr(>|t|)
(Intercept) 2.4773573 0.09304738 26.624687 2.475214e-36
lcavol      0.7397137 0.09318316  7.938277 4.141615e-11
lweight     0.3163282 0.08830716  3.582135 6.576173e-04
BICq(q = 0.5)
BICq equivalent for q in (0.0176493852011195, 0.512566675362625)
Best Model:
             Estimate Std. Error   t value     Pr(>|t|)
(Intercept) 2.4773573 0.09304738 26.624687 2.475214e-36
lcavol      0.7397137 0.09318316  7.938277 4.141615e-11
lweight     0.3163282 0.08830716  3.582135 6.576173e-04
CVd(d = 47, REP = 10)
No BICq equivalent
Best Model:
             Estimate Std. Error   t value     Pr(>|t|)
(Intercept) 2.4627121 0.08901202 27.667185 3.167240e-36
lcavol      0.5566392 0.11360017  4.899985 7.408246e-06
lweight     0.2415963 0.09467037  2.551974 1.323253e-02
lbph        0.1989292 0.10187183  1.952740 5.544293e-02
svi         0.2393565 0.11734589  2.039752 4.571228e-02
pgg45       0.1221447 0.10256941  1.190849 2.383261e-01
CV(K = 10, REP = 1)
BICq equivalent for q in (0.0176493852011195, 0.512566675362626)
Best Model:
             Estimate Std. Error   t value     Pr(>|t|)
(Intercept) 2.4773573 0.09304738 26.624687 2.475214e-36
lcavol      0.7397137 0.09318316  7.938277 4.141615e-11
lweight     0.3163282 0.08830716  3.582135 6.576173e-04
Morgan-Tatar search since family is non-gaussian.
BIC
BICq equivalent for q in (0.190525988534164, 0.90158316218744)
Best Model:
                  Estimate Std. Error   z value     Pr(>|z|)
(Intercept)    -6.44644451 0.92087165 -7.000372 2.552830e-12
tobacco         0.08037533 0.02587968  3.105731 1.898095e-03
ldl             0.16199164 0.05496893  2.946967 3.209074e-03
famhistPresent  0.90817526 0.22575844  4.022774 5.751659e-05
typea           0.03711521 0.01216676  3.050542 2.284290e-03
age             0.05046038 0.01020606  4.944159 7.647325e-07
Morgan-Tatar search since factors present with more than 2 levels.
BICq(q = 0.25)
Best Model:
             Df Sum Sq Mean Sq F value   Pr(>F)    
Wind          1  45694   45694   96.78  < 2e-16 ***
Temp          1  25119   25119   53.20 5.29e-11 ***
Residuals   108  50989     472                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

bestglm documentation built on March 26, 2020, 7:25 p.m.