groupfs: Select a model with forward stepwise.


View source: R/funs.groupfs.R

Description

This function implements forward selection of linear models almost identically to step with direction = "forward". It is kept separate from fs because groups of variables (e.g. dummies encoding the levels of a categorical variable) must be handled differently in the selective inference framework.
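For comparison, the base R behaviour referred to above can be sketched as follows. This is only an illustration of step with direction = "forward" on simulated data; the data frame dat and its columns a, b, f are made up for the example, and nothing here calls groupfs itself. Note that the factor f enters or stays out as a single term, which is the kind of grouping that groupfs expresses through index.

```r
set.seed(1)
n <- 60
# Hypothetical data: two numeric predictors and one 3-level factor
dat <- data.frame(
  a = rnorm(n),
  b = rnorm(n),
  f = factor(sample(c("u", "v", "w"), n, replace = TRUE))
)
dat$y <- 2 * dat$a + rnorm(n)   # only 'a' carries signal

# Start from the intercept-only model and add terms one at a time by AIC
null_fit <- lm(y ~ 1, data = dat)
fwd <- step(null_fit, scope = ~ a + b + f, direction = "forward", trace = 0)

# Terms chosen by forward selection
selected <- attr(terms(fwd), "term.labels")
selected
```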

Usage

groupfs(x, y, index, maxsteps, sigma = NULL, k = 2, intercept = TRUE,
  center = TRUE, normalize = TRUE, aicstop = 0, verbose = FALSE)

Arguments

x

Matrix of predictors (n by p).

y

Vector of outcomes (length n).

index

Group membership indicator of length p. Check that sort(unique(index)) = 1:G where G is the number of distinct groups.
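One possible way to obtain such an index for a design with a categorical variable is sketched below using only base R. The data frame df is invented for the example, and model.matrix uses treatment contrasts (one level dropped), which may differ from the coding produced by the package's own factorDesign helper.

```r
# Hypothetical data: one 3-level factor and one numeric predictor
df <- data.frame(
  color = factor(c("red", "green", "blue", "green", "red", "blue")),
  size  = c(1.2, 0.7, 3.1, 2.2, 0.4, 1.9)
)

mm <- model.matrix(~ color + size, data = df)

# The "assign" attribute maps each column to its term (0 = intercept)
index <- attr(mm, "assign")[-1]   # drop the intercept entry
x <- mm[, -1, drop = FALSE]       # drop the intercept column

# The check described above: groups must be labelled 1, 2, ..., G
G <- length(unique(index))
stopifnot(identical(sort(unique(index)), seq_len(G)))
stopifnot(length(index) == ncol(x))
```

Here the two dummy columns for color share group 1 and size forms group 2.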

maxsteps

Maximum number of steps for forward stepwise.

sigma

Estimate of error standard deviation for use in AIC criterion. This determines the relative scale between RSS and the degrees of freedom penalty. Default is NULL, corresponding to unknown sigma. When NULL, groupfsInf performs truncated F inference instead of truncated χ. See extractAIC for details on the AIC criterion.
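As a rough illustration of how a known sigma changes the trade-off between RSS and the penalty, the two forms of the criterion used by base R's extractAIC for lm fits can be reproduced by hand. This is a sketch on simulated data; it shows what extractAIC computes, and it is an assumption that groupfs follows the same convention internally. Note that extractAIC takes the variance (sigma^2) as its scale argument.

```r
set.seed(1)
n <- 30
x1 <- rnorm(n)
y <- 1 + 2 * x1 + rnorm(n)
fit <- lm(y ~ x1)

rss <- sum(resid(fit)^2)
edf <- n - df.residual(fit)   # number of fitted coefficients
k <- 2

# Unknown sigma: RSS enters on the log scale
aic_unknown <- n * log(rss / n) + k * edf

# Known sigma: RSS is scaled by sigma^2 (Mallows' Cp form)
sigma <- 1
aic_known <- rss / sigma^2 - n + k * edf

# Both agree with extractAIC (second element is the criterion)
stopifnot(isTRUE(all.equal(extractAIC(fit, k = k)[2], aic_unknown)))
stopifnot(isTRUE(all.equal(extractAIC(fit, scale = sigma^2, k = k)[2], aic_known)))
```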

k

Multiplier of model size penalty. The default is k = 2 for AIC. Use k = log(n) for BIC, or k = 2 log(p) for RIC (best for high dimensions, when p > n). If G < p then RIC may be too restrictive and it would be better to use log(G) < k < 2 log(p).
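The penalty choices above can be computed directly. The sketch below uses the dimensions of the example on this page (n = 20, p = 40, G = 20); the midpoint k_mid is just one arbitrary value inside the suggested range, not a recommendation from the package.

```r
n <- 20   # observations
p <- 40   # columns of x
G <- 20   # distinct groups

k_aic <- 2
k_bic <- log(n)
k_ric <- 2 * log(p)

# With G < p, any k strictly between log(G) and 2 * log(p) fits the advice above
k_mid <- mean(c(log(G), 2 * log(p)))

round(c(AIC = k_aic, BIC = k_bic, RIC = k_ric, between = k_mid), 3)
```

Any of these values can then be passed as the k argument of groupfs.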

intercept

Should an intercept be included in the model? Default is TRUE. Does not count as a step.

center

Should the columns of the design matrix be centered? Default is TRUE.

normalize

Should the design matrix be normalized? Default is TRUE.

aicstop

Early stopping if AIC increases. Default is 0 corresponding to no early stopping. Positive integer values specify the number of times the AIC is allowed to increase in a row, e.g. with aicstop = 2 the algorithm will stop if the AIC criterion increases for 2 steps in a row. The default of step corresponds to aicstop = 1.
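The stopping rule can be illustrated on a made-up sequence of AIC values. The helper stop_step below is hypothetical and not part of the package; it assumes each step is compared with the immediately preceding one and returns the step at which the run of consecutive increases reaches aicstop.

```r
# Hypothetical helper: step at which forward selection would stop
stop_step <- function(aic, aicstop = 0) {
  if (aicstop <= 0) return(length(aic))   # no early stopping
  run <- 0
  for (s in seq_along(aic)[-1]) {
    # Count consecutive increases; any decrease resets the count
    run <- if (aic[s] > aic[s - 1]) run + 1 else 0
    if (run >= aicstop) return(s)
  }
  length(aic)
}

aic_path <- c(100, 90, 85, 86, 84, 85, 87, 80)   # made-up values

stop_step(aic_path, aicstop = 0)   # 8: runs all steps
stop_step(aic_path, aicstop = 1)   # 4: first increase (85 -> 86)
stop_step(aic_path, aicstop = 2)   # 7: two increases in a row (84 -> 85 -> 87)
```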

verbose

Print out progress along the way? Default is FALSE.

Value

An object of class "groupfs" containing information about the sequence of models in the forward stepwise algorithm. Call the function groupfsInf on this object to compute selective p-values.

See Also

groupfsInf, factorDesign.

Examples

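library(selectiveInference)

# 20 observations, 40 predictors arranged in 20 groups of 2 columns each
x <- matrix(rnorm(20*40), nrow = 20)
index <- sort(rep(1:20, 2))
# Only columns 1 and 4 (groups 1 and 2) carry signal
y <- rnorm(20) + 2 * x[,1] - x[,4]
fit <- groupfs(x, y, index, maxsteps = 5)
# Selective p-values for each group added along the path
out <- groupfsInf(fit)
out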

Example output

Loading required package: glmnet
Loading required package: Matrix
Loaded glmnet 4.0-2
Loading required package: intervals

Attaching package: 'intervals'

The following object is masked from 'package:Matrix':

    expand

Loading required package: survival
Loading required package: adaptMCMC
Loading required package: parallel
Loading required package: coda
Loading required package: MASS
Step 1/5: computing P-value for group 1 
Step 2/5: computing P-value for group 2 
Step 3/5: computing P-value for group 10 
Step 4/5: computing P-value for group 19 
Step 5/5: computing P-value for group 6 
  Group Pvalue     TF df   Size Ints    Min    Max
1     1  0.228 38.087  2 21.622    1 27.772 49.394
2     2  0.141  7.528  2  3.122    1  5.214  8.336
3    10  0.949  1.089  2  6.268    1  1.026  7.294
4    19  0.093  2.385  2  0.874    1  1.624  2.498
5     6  0.264  1.380  2  0.750    1  0.884  1.635

Ints is the number of intervals in the truncated chi selection region and Size is the sum of their lengths. Min and Max are the lowest and highest endpoints of the truncation region. No confidence intervals are reported by groupfsInf.

selectiveInference documentation built on Sept. 7, 2019, 9:02 a.m.