# estimateSigma: Estimate the noise standard deviation in regression

In selectiveInference: Tools for Post-Selection Inference

## Description

Estimates the standard deviation of the noise in a linear regression, for use in the selectiveInference package.

## Usage

```
estimateSigma(x, y, intercept=TRUE, standardize=TRUE)
```

## Arguments

| Argument | Description |
| --- | --- |
| `x` | Matrix of predictors (n by p) |
| `y` | Vector of outcomes (length n) |
| `intercept` | Should glmnet be run with an intercept? Default is TRUE |
| `standardize` | Should glmnet be run with standardized predictors? Default is TRUE |

## Details

This function estimates the standard deviation of the noise in a linear regression setting. A lasso regression is fit, using cross-validation to choose the tuning parameter lambda. With sample size n, yhat equal to the predicted values, and df equal to the number of nonzero coefficients in the lasso fit, the estimate of sigma is `sqrt(sum((y-yhat)^2) / (n-df-1))`.

Important: if you are using glmnet to compute the lasso estimate, be sure to use the same settings for the "intercept" and "standardize" arguments in both glmnet and estimateSigma. The same applies to fs or lar, where the argument for standardization is called "normalize".
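The estimate described above can be sketched directly with glmnet. This is a minimal illustration under stated assumptions, not the package's own implementation: the helper name `sigma_from_cv_lasso` is hypothetical, and using `lambda.min` as the cross-validated lambda is an assumption, since the text only says that lambda is chosen by cross-validation.

```r
library(glmnet)

# Hypothetical helper (not part of selectiveInference): residual-based
# estimate of sigma from a cross-validated lasso fit.
sigma_from_cv_lasso <- function(x, y, intercept = TRUE, standardize = TRUE) {
  n <- length(y)

  # Cross-validated lasso path; intercept/standardize are passed through
  cvfit <- cv.glmnet(x, y, intercept = intercept, standardize = standardize)

  # Assumption: evaluate the fit at lambda.min
  yhat <- predict(cvfit, newx = x, s = "lambda.min")

  # Coefficients at that lambda; the first row is the intercept, so drop it
  b <- as.matrix(coef(cvfit, s = "lambda.min"))[-1, 1]
  df <- sum(b != 0)

  # sqrt(RSS / (n - df - 1)); only meaningful when n - df - 1 > 0
  sigmahat <- sqrt(sum((y - yhat)^2) / (n - df - 1))
  list(sigmahat = sigmahat, df = df)
}
```

Because cross-validation folds are assigned at random, the result depends on the random seed and need not match `estimateSigma(x, y)$sigmahat` exactly.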

## Value

| Value | Description |
| --- | --- |
| `sigmahat` | The estimate of sigma |
| `df` | The degrees of freedom of the lasso fit used |

## Author(s)

Ryan Tibshirani, Rob Tibshirani, Jonathan Taylor, Joshua Loftus, Stephen Reid

## References

Stephen Reid, Jerome Friedman, and Rob Tibshirani (2014). A study of error variance estimation in lasso regression. arXiv:1311.5274.

## Examples

```
set.seed(33)
n = 50
p = 10
sigma = 1
x = matrix(rnorm(n*p),n,p)
beta = c(3,2,rep(0,p-2))
y = x%*%beta + sigma*rnorm(n)

# run forward stepwise
fsfit = fs(x,y)

# estimate sigma
sigmahat = estimateSigma(x,y)$sigmahat

# run sequential inference with estimated sigma
out = fsInf(fsfit,sigma=sigmahat)
out
```

### Example output

```Loading required package: glmnet

Attaching package: 'intervals'

The following object is masked from 'package:Matrix':

expand

Call:
fsInf(obj = fsfit, sigma = sigmahat)

Standard deviation of noise (specified or estimated) sigma = 1.041

Sequential testing results with alpha = 0.100
Step Var   Coef Z-score P-value LowConfPt UpConfPt LowTailArea UpTailArea
1   1  2.317  13.230   0.000     2.014    2.607       0.049      0.049
2   2  1.703  12.826   0.000     1.484    1.925       0.049      0.049
3   9 -0.265  -1.660   0.492    -0.796    1.187       0.049      0.050
4   8 -0.175  -1.140   0.261    -4.888    1.578       0.050      0.050
5  10  0.173   1.061   0.755   -12.527    3.133       0.050      0.050
6   4 -0.178  -1.125   0.407   -11.350    7.634       0.050      0.050
7   7  0.158   0.966   0.764    -9.478    2.189       0.050      0.050
8   5  0.128   0.884   0.839    -6.922    0.752       0.050      0.050
9   6 -0.036  -0.222   0.303      -Inf      Inf       0.000      0.000
10   3  0.037   0.252   0.121    -1.519      Inf       0.050      0.000

Estimated stopping point from ForwardStop rule = 2
```

selectiveInference documentation built on Sept. 7, 2019, 9:02 a.m.