glm.regu.cv: Regularized M-estimation for fitting generalized linear models...

Description Usage Arguments Details Value References Examples

View source: R/regu-est-c.r

Description

This function implements regularized M-estimation for fitting generalized linear models with binary or continuous responses, with tuning based on cross validation.

Usage

glm.regu.cv(fold, nrho = NULL, rho.seq = NULL, y, x, iw = NULL,
  loss = "cal", n.iter = 100, eps = 1e-06, tune.fac = 0.5,
  tune.cut = TRUE, ann.init = TRUE, nz.lab = NULL, permut = NULL)

Arguments

fold

A fold number used for cross validation.

nrho

The number of tuning parameters searched in cross validation.

rho.seq

A vector of tuning parameters searched in cross validation. If both nrho and rho.seq are specified, then rho.seq overrides nrho.

y

An n x 1 response vector.

x

An n x p matrix of covariates, excluding a constant.

iw

An n x 1 weight vector.

loss

A loss function, which can be specified as "gaus" for continuous responses, or "ml" or "cal" for binary responses.

n.iter

The maximum number of iterations allowed as in glm.regu.

eps

The tolerance used to declare convergence as in glm.regu.

tune.fac

The multiplier (factor) used to define rho.seq if only nrho is specified.

tune.cut

Logical; if TRUE, all smaller tuning parameters are skipped once non-convergence is found with a tuning parameter.

ann.init

Logical; if TRUE, the estimates from the previous tuning parameter are used as the initial values when fitting with the current tuning parameter.

nz.lab

A p x 1 logical vector (useful for simulations), indicating which covariates are included when calculating the number of nonzero coefficients.

permut

An n x 1 vector, giving a random permutation of the integers from 1 to n, which is used in cross validation.

Details

Cross validation is performed as described in Tan (2020a, 2020b). If not specified by users, the sequence of tuning parameters searched is defined as a geometric series of length nrho, starting from the value which yields a zero solution, and then decreasing by a factor tune.fac successively.
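The default tuning sequence can be sketched in a few lines of R; here rho.max (the value yielding the zero solution) is a hypothetical placeholder, since glm.regu.cv computes it internally from the data:

```r
# Sketch of the default tuning sequence (assumption: rho.max is known;
# glm.regu.cv derives it internally as the value giving a zero solution).
rho.max <- 1
nrho <- 1 + 10
tune.fac <- 0.5
rho.seq <- rho.max * tune.fac^(0:(nrho - 1))  # geometric series, shrinking by tune.fac each step
```

Each successive value is tune.fac times the previous one, so the search covers several orders of magnitude with only nrho fits.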

After cross validation, two tuning parameters are selected. The first and default choice is the value yielding the smallest average test loss. The second choice is the largest value whose average test loss is within one standard error of the smallest (Hastie, Tibshirani, and Friedman 2016).
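The two selections can be illustrated with made-up cross-validation losses (these numbers are a toy example, not output from an actual fit):

```r
# Toy illustration of the two tuning-parameter selections.
rho     <- 0.5^(0:5)                       # decreasing tuning parameters
err.ave <- c(10, 8, 5.4, 5, 5.6, 6.2)      # hypothetical average test losses
err.sd  <- c(1, 1, 0.6, 0.5, 0.6, 0.7)     # hypothetical standard errors

i.min  <- which.min(err.ave)               # first choice: smallest average test loss
thresh <- err.ave[i.min] + err.sd[i.min]   # one-standard-error threshold
i.1se  <- min(which(err.ave <= thresh))    # largest rho within one SE (rho decreases with index)
sel.rho <- c(rho[i.min], rho[i.1se])
```

The second choice trades a slightly larger test loss for a sparser fit, which is the usual motivation for the one-standard-error rule.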

Value

permut

An n x 1 vector, giving the random permutation used in cross validation.

rho

The vector of tuning parameters, searched in cross validation.

non.conv

A vector indicating the non-convergence status found or imputed if tune.cut=TRUE, for the tuning parameters in cross validation. For each tuning parameter, 0 indicates convergence, 1 non-convergence if exceeding n.iter, and 2 non-convergence if exceeding bt.lim.

err.ave

A vector giving the averages of the test losses in cross validation.

err.sd

A vector giving the standard deviations of the test losses in cross validation.

sel.rho

A vector of two selected tuning parameters by cross validation; see Details.

sel.nz

A vector of numbers of nonzero coefficients estimated for the selected tuning parameters.

sel.bet

The (p+1) x 2 matrix of estimated intercepts and coefficients for the two selected tuning parameters.

sel.fit

The n x 2 matrix of fitted values for the two selected tuning parameters.

References

Hastie, T., Tibshirani, R., and Friedman. J. (2016) The Elements of Statistical Learning (second edition), Springer: New York.

Tan, Z. (2020a) Regularized calibrated estimation of propensity scores with model misspecification and high-dimensional data, Biometrika, 107, 137–158.

Tan, Z. (2020b) Model-assisted inference for treatment effects using regularized calibrated estimation with high-dimensional data, Annals of Statistics, 48, 811–837.

Examples

data(simu.data)
n <- dim(simu.data)[1]
p <- dim(simu.data)[2]-2

y <- simu.data[,1]
tr <- simu.data[,2]
x <- simu.data[,2+1:p]
x <- scale(x)

### Example 1: Regularized maximum likelihood estimation of propensity scores
ps.cv.rml <- glm.regu.cv(fold=5, nrho=1+10, y=tr, x=x, loss="ml")
ps.cv.rml$rho
ps.cv.rml$err.ave
ps.cv.rml$err.sd
ps.cv.rml$sel.rho
ps.cv.rml$sel.nz

fp.cv.rml <- ps.cv.rml$sel.fit[,1]
check.cv.rml <- mn.ipw(x, tr, fp.cv.rml)
check.cv.rml$est

### Example 2: Regularized calibrated estimation of propensity scores
ps.cv.rcal <- glm.regu.cv(fold=5, nrho=1+10, y=tr, x=x, loss="cal")
ps.cv.rcal$rho
ps.cv.rcal$err.ave
ps.cv.rcal$err.sd
ps.cv.rcal$sel.rho
ps.cv.rcal$sel.nz

fp.cv.rcal <- ps.cv.rcal$sel.fit[,1]

check.cv.rcal <- mn.ipw(x, tr, fp.cv.rcal)
check.cv.rcal$est

RCAL documentation built on Nov. 8, 2020, 4:22 p.m.