fixedLassoInf: Inference for the lasso, with a fixed lambda

Description Usage Arguments Details Value Author(s) References Examples

View source: R/funs.fixed.R

Description

Compute p-values and confidence intervals for the lasso estimate, at a fixed value of the tuning parameter lambda

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
fixedLassoInf(x, 
              y, 
              beta, 
              lambda, 
              family = c("gaussian", "binomial", "cox"),
              intercept=TRUE, 
              add.targets=NULL, 
              status=NULL, 
              sigma=NULL, 
              alpha=0.1,
              type=c("partial","full"), 
              tol.beta=1e-5, 
              tol.kkt=0.1,
              gridrange=c(-100,100), 
              bits=NULL, 
              verbose=FALSE, 
              linesearch.try=10) 

Arguments

x

Matrix of predictors (n by p);

y

Vector of outcomes (length n)

beta

Estimated lasso coefficients (e.g., from glmnet). This is of length p (so the intercept is not included as the first component).

Be careful! This function uses the "standard" lasso objective

1/2 \|y - x β\|_2^2 + λ \|β\|_1.

In contrast, glmnet multiplies the first term by a factor of 1/n. So after running glmnet, to extract the beta corresponding to a value lambda, you need to use beta = coef(obj, s=lambda/n)[-1], where obj is the object returned by glmnet (and [-1] removes the intercept, which glmnet always puts in the first component)

lambda

Value of lambda used to compute beta. See the above warning

family

Response type: "gaussian" (default), "binomial", or "cox" (for censored survival data)

sigma

Estimate of error standard deviation. If NULL (default), this is estimated using the mean squared residual of the full least squares fit when n >= 2p, and using the standard deviation of y when n < 2p. In the latter case, the user should use estimateSigma function for a more accurate estimate. Not used for family= "binomial", or "cox"

alpha

Significance level for confidence intervals (target is miscoverage alpha/2 in each tail)

intercept

Was the lasso problem solved (e.g., by glmnet) with an intercept in the model? Default is TRUE. Must be TRUE for "binomial" family. Not used for 'cox" family, where no intercept is assumed.

add.targets

Optional vector of predictors to be included as targets of inference, regardless of whether or not they are selected by the lasso. Default is NULL.

status

Censoring status for Cox model; 1=failurem 0=censored

type

Contrast type for p-values and confidence intervals: default is "partial"—meaning that the contrasts tested are the partial population regression coefficients, within the active set of predictors; the alternative is "full"—meaning that the full population regression coefficients are tested. The latter does not make sense when p > n.

tol.beta

Tolerance for determining if a coefficient is zero

tol.kkt

Tolerance for determining if an entry of the subgradient is zero

gridrange

Grid range for constructing confidence intervals, on the standardized scale

bits

Number of bits to be used for p-value and confidence interval calculations. Default is NULL, in which case standard floating point calculations are performed. When not NULL, multiple precision floating point calculations are performed with the specified number of bits, using the R package Rmpfr (if this package is not installed, then a warning is thrown, and standard floating point calculations are pursued). Note: standard double precision uses 53 bits so, e.g., a choice of 200 bits uses about 4 times double precision. The confidence interval computation is sometimes numerically challenging, and the extra precision can be helpful (though computationally more costly). In particular, extra precision might be tried if the values in the output columns of tailarea differ noticeably from alpha/2.

verbose

Print out progress along the way? Default is FALSE

linesearch.try

When running type="full" (i.e. debiased LASSO) how many attempts in the line search?

Details

This function computes selective p-values and confidence intervals for the lasso, given a fixed value of the tuning parameter lambda. Three different response types are supported: gaussian, binomial and Cox. The confidence interval construction involves numerical search and can be fragile: if the observed statistic is too close to either end of the truncation interval (vlo and vup, see references), then one or possibly both endpoints of the interval of desired coverage cannot be computed, and default to +/- Inf. The output tailarea gives the achieved Gaussian tail areas for the reported intervals—these should be close to alpha/2, and can be used for error-checking purposes.

Important!: Before running glmnet (or some other lasso-solver) x should be centered, that is x <- scale(X,TRUE,FALSE). In addition, if standardization of the predictors is desired, x should be scaled as well: x <- scale(x,TRUE,TRUE). Then when running glmnet, set standardize=F. See example below.

The penalty.factor facility in glmmet– allowing different penalties lambda for each predictor, is not yet implemented in fixedLassoInf. However you can finesse this— see the example below. One caveat- using this approach, a penalty factor of zero (forcing a predictor in) is not allowed.

Note that the coefficients and standard errors reported are unregularized. Eg for the Gaussian, they are the usual least squares estimates and standard errors for the model fit to the active set from the lasso.

Value

type

Type of coefficients tested (partial or full)

lambda

Value of tuning parameter lambda used

pv

One-sided P-values for active variables, uses the fact we have conditioned on the sign.

ci

Confidence intervals

tailarea

Realized tail areas (lower and upper) for each confidence interval

vlo

Lower truncation limits for statistics

vup

Upper truncation limits for statistics

vmat

Linear contrasts that define the observed statistics

y

Vector of outcomes

vars

Variables in active set

sign

Signs of active coefficients

alpha

Desired coverage (alpha/2 in each tail)

sigma

Value of error standard deviation (sigma) used

call

The call to fixedLassoInf

Author(s)

Ryan Tibshirani, Rob Tibshirani, Jonathan Taylor, Joshua Loftus, Stephen Reid

References

Jason Lee, Dennis Sun, Yuekai Sun, and Jonathan Taylor (2013). Exact post-selection inference, with application to the lasso. arXiv:1311.6238.

Jonathan Taylor and Robert Tibshirani (2016) Post-selection inference for L1-penalized likelihood models. arXiv:1602.07358

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
set.seed(43)
n = 50
p = 10
sigma = 1

x = matrix(rnorm(n*p),n,p)
x = scale(x,TRUE,TRUE)

beta = c(3,2,rep(0,p-2))
y = x%*%beta + sigma*rnorm(n)

# first run glmnet
gfit = glmnet(x,y,standardize=FALSE)

# extract coef for a given lambda; note the 1/n factor!
# (and we don't save the intercept term)
lambda = .8
beta = coef(gfit, x=x, y=y, s=lambda/n, exact=TRUE)[-1]

# compute fixed lambda p-values and selection intervals
out = fixedLassoInf(x,y,beta,lambda,sigma=sigma)
out


## as above, but use lar function instead to get initial 
## lasso fit (should get same results)
lfit = lar(x,y,normalize=FALSE)
beta = coef(lfit, s=lambda, mode="lambda")
out2 = fixedLassoInf(x, y, beta, lambda, sigma=sigma)
out2

## mimic different penalty factors by first scaling x
 set.seed(43)
n = 50
p = 10
sigma = 1

x = matrix(rnorm(n*p),n,p)
x=scale(x,TRUE,TRUE)

beta = c(3,2,rep(0,p-2))
y = x%*%beta + sigma*rnorm(n)
pf=c(rep(1,7),rep(.1,3))  #define penalty factors
pf=p*pf/sum(pf)   # penalty factors should be rescaled so they sum to p
xs=scale(x,FALSE,pf) #scale cols of x by penalty factors
# first run glmnet
gfit = glmnet(xs, y, standardize=FALSE)

# extract coef for a given lambda; note the 1/n factor!
# (and we don't save the intercept term)
lambda = .8
beta_hat = coef(gfit, x=xs, y=y, s=lambda/n, exact=TRUE)[-1]

# compute fixed lambda p-values and selection intervals
out = fixedLassoInf(xs,y,beta_hat,lambda,sigma=sigma)

#rescale conf points to undo the penalty factor
out$ci=t(scale(t(out$ci),FALSE,pf[out$vars]))
out

#logistic model
set.seed(43)

n = 50
p = 10
sigma = 1

x = matrix(rnorm(n*p),n,p)
x=scale(x,TRUE,TRUE)

beta = c(3,2,rep(0,p-2))
y = x%*%beta + sigma*rnorm(n)
y=1*(y>mean(y))
# first run glmnet
gfit = glmnet(x,y,standardize=FALSE,family="binomial")

# extract coef for a given lambda; note the 1/n factor!
# (and here  we DO  include the intercept term)
lambda = .8
beta_hat = coef(gfit, x=x, y=y, s=lambda/n, exact=TRUE)

# compute fixed lambda p-values and selection intervals
out = fixedLassoInf(x,y,beta_hat,lambda,family="binomial")
out


# Cox model

set.seed(43)
n = 50
p = 10
sigma = 1

x = matrix(rnorm(n*p), n, p)
x = scale(x, TRUE, TRUE)

beta = c(3,2,rep(0,p-2))
tim = as.vector(x%*%beta + sigma*rnorm(n))
tim= tim-min(tim)+1
status=sample(c(0,1),size=n,replace=TRUE)
# first run glmnet


y = Surv(tim,status)
gfit = glmnet(x, y, standardize=FALSE, family="cox")

# extract coef for a given lambda; note the 1/n factor!

lambda = 1.5
beta_hat = as.numeric(coef(gfit, x=x, y=y, s=lambda/n, exact=TRUE))

# compute fixed lambda p-values and selection intervals
out = fixedLassoInf(x, tim, beta_hat, lambda, status=status, family="cox")
out

# Debiased lasso or "full"

n = 50
p = 100
sigma = 1

x = matrix(rnorm(n*p),n,p)
x = scale(x,TRUE,TRUE)

beta = c(3,2,rep(0,p-2))
y = x%*%beta + sigma*rnorm(n)

# first run glmnet
gfit = glmnet(x, y, standardize=FALSE, intercept=FALSE)

# extract coef for a given lambda; note the 1/n factor!
# (and we don't save the intercept term)
lambda = 2.8
beta = coef(gfit, x=x, y=y, s=lambda/n, exact=TRUE)[-1]

# compute fixed lambda p-values and selection intervals
out = fixedLassoInf(x, y, beta, lambda, sigma=sigma, type='full', intercept=FALSE)
out

# When n > p and "full" we use the full inverse
# instead of Javanmard and Montanari's approximate inverse

n = 200
p = 50
sigma = 1

x = matrix(rnorm(n*p),n,p)
x = scale(x,TRUE,TRUE)

beta = c(3,2,rep(0,p-2))
y = x%*%beta + sigma*rnorm(n)

# first run glmnet
gfit = glmnet(x, y, standardize=FALSE, intercept=FALSE)

# extract coef for a given lambda; note the 1/n factor!
# (and we don't save the intercept term)
lambda = 2.8
beta = coef(gfit, x=x, y=y, s=lambda/n, exact=TRUE)[-1]

# compute fixed lambda p-values and selection intervals
out = fixedLassoInf(x, y, beta, lambda, sigma=sigma, type='full', intercept=FALSE)
out

selectiveInference documentation built on Sept. 7, 2019, 9:02 a.m.