estimateSigma: Estimate the noise standard deviation in regression

View source: R/funs.common.R

Description

Estimates the standard deviation of the noise in a linear regression, for use in the selectiveInference package.

Usage

estimateSigma(x, y, intercept=TRUE, standardize=TRUE)

Arguments

x

Matrix of predictors (n by p)

y

Vector of outcomes (length n)

intercept

Should glmnet be run with an intercept? Default is TRUE

standardize

Should glmnet be run with standardized predictors? Default is TRUE

Details

This function estimates the standard deviation of the noise in a linear regression setting. A lasso regression is fit, using cross-validation to choose the tuning parameter lambda. With sample size n, yhat equal to the predicted values, and df equal to the number of nonzero coefficients from the lasso fit, the estimate of sigma is sqrt(sum((y-yhat)^2) / (n-df-1)).

Important: if you are using glmnet to compute the lasso estimate, be sure to use the same settings for the "intercept" and "standardize" arguments in both glmnet and estimateSigma. The same applies to fs or lar, where the argument for standardization is called "normalize".
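The formula above can be sketched directly with glmnet. This is a minimal illustration, not the package's source: the helper name sigma_sketch is hypothetical, and the choice of lambda.min (rather than lambda.1se) as the cross-validated lambda is an assumption that this page does not confirm.

```r
library(glmnet)

# Hypothetical helper illustrating the formula; not the package's code.
sigma_sketch <- function(x, y, intercept = TRUE, standardize = TRUE) {
  y <- as.numeric(y)
  cvfit <- cv.glmnet(x, y, intercept = intercept, standardize = standardize)
  # Assumption: take the lambda that minimizes cross-validated error.
  s <- cvfit$lambda.min
  yhat <- as.numeric(predict(cvfit, newx = x, s = s))
  # Drop the intercept row, then count the nonzero coefficients.
  b <- as.matrix(coef(cvfit, s = s))[-1, 1]
  df <- sum(b != 0)
  n <- length(y)
  list(sigmahat = sqrt(sum((y - yhat)^2) / (n - df - 1)), df = df)
}

set.seed(33)
n <- 50
p <- 10
x <- matrix(rnorm(n * p), n, p)
y <- x %*% c(3, 2, rep(0, p - 2)) + rnorm(n)

est <- sigma_sketch(x, y)
est$sigmahat
est$df
```

Because cross-validation assigns folds at random, the result depends on the seed and need not match estimateSigma exactly.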

Value

sigmahat

The estimate of sigma

df

The degrees of freedom of lasso fit used
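As a brief sketch of reading both returned components while keeping settings consistent with a lar fit: the simulated data and the choice to turn standardization off are assumptions made only for illustration.

```r
library(selectiveInference)

set.seed(33)
n <- 50
p <- 10
x <- matrix(rnorm(n * p), n, p)
y <- as.numeric(x %*% c(3, 2, rep(0, p - 2)) + rnorm(n))

# Use the same intercept/standardization settings in both calls.
# lar names its standardization argument "normalize".
larfit <- lar(x, y, intercept = TRUE, normalize = FALSE)
est <- estimateSigma(x, y, intercept = TRUE, standardize = FALSE)

est$sigmahat  # the estimate of sigma
est$df        # number of nonzero lasso coefficients used in the estimate
```

The value in est$sigmahat can then be passed as the sigma argument of the corresponding inference function.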

Author(s)

Ryan Tibshirani, Rob Tibshirani, Jonathan Taylor, Joshua Loftus, Stephen Reid

References

Stephen Reid, Jerome Friedman, and Rob Tibshirani (2014). A study of error variance estimation in lasso regression. arXiv:1311.5274.

Examples

library(selectiveInference)

set.seed(33)
n = 50
p = 10
sigma = 1
x = matrix(rnorm(n*p),n,p)
beta = c(3,2,rep(0,p-2))
y = x%*%beta + sigma*rnorm(n)

# run forward stepwise
fsfit = fs(x,y)

# estimate sigma
sigmahat = estimateSigma(x,y)$sigmahat

# run sequential inference with estimated sigma
out = fsInf(fsfit,sigma=sigmahat)
out

Example output

Loading required package: glmnet
Loading required package: Matrix
Loading required package: foreach
Loaded glmnet 2.0-16

Loading required package: intervals

Attaching package: 'intervals'

The following object is masked from 'package:Matrix':

    expand

Loading required package: survival

Call:
fsInf(obj = fsfit, sigma = sigmahat)

Standard deviation of noise (specified or estimated) sigma = 1.041

Sequential testing results with alpha = 0.100
 Step Var   Coef Z-score P-value LowConfPt UpConfPt LowTailArea UpTailArea
    1   1  2.317  13.230   0.000     2.014    2.607       0.049      0.049
    2   2  1.703  12.826   0.000     1.484    1.925       0.049      0.049
    3   9 -0.265  -1.660   0.492    -0.796    1.187       0.049      0.050
    4   8 -0.175  -1.140   0.261    -4.888    1.578       0.050      0.050
    5  10  0.173   1.061   0.755   -12.527    3.133       0.050      0.050
    6   4 -0.178  -1.125   0.407   -11.350    7.634       0.050      0.050
    7   7  0.158   0.966   0.764    -9.478    2.189       0.050      0.050
    8   5  0.128   0.884   0.839    -6.922    0.752       0.050      0.050
    9   6 -0.036  -0.222   0.303      -Inf      Inf       0.000      0.000
   10   3  0.037   0.252   0.121    -1.519      Inf       0.050      0.000

Estimated stopping point from ForwardStop rule = 2

selectiveInference documentation built on Sept. 7, 2019, 9:02 a.m.