Fits a generalised linear model with a LASSO penalty, using an iteratively reweighted local linearisation approach, for a given value of the penalty parameter (`lambda`). Can handle the negative binomial family, even with the overdispersion parameter unknown, as well as other GLM families.


`y`: A vector of values for the response variable.

`X`: A design matrix of p explanatory variables.

`family`: The family of the response variable; see `family`.

`lambda`: The penalty parameter applied to slope parameters. Different penalties can be specified for different parameters by supplying `lambda` as a vector whose length equals the number of columns of `X`. If scalar, this penalty is applied uniformly across all parameters except the first (which is assumed to be an intercept).

`weights`: Observation weights. These might be useful if you want to fit a Poisson point process model...

`b.init`: Initial slope estimate. Must be a vector of the same length as the number of columns of `X`.

`phi.init`: Initial estimate of the negative binomial overdispersion parameter. Must be scalar.

`phi.method`: Method of estimating the overdispersion parameter.

`tol`: A vector of two values, specifying the convergence tolerance and the value at which to truncate fitted values.

`n.iter`: Maximum number of iterations to attempt before bailing.

`phi.iter`: Number of iterations spent estimating the negative binomial overdispersion parameter (if applicable) before returning to slope estimation. Default is one step, i.e. iterating between one-step estimates of beta and phi.

This function fits a generalised linear model with a LASSO penalty, sometimes referred to as an L1 penalty or L1 norm, hence the name `glm1`. The model is fit using a local linearisation approach as in Osborne et al. (2000), nested inside iteratively reweighted (penalised) least squares. It is not the fastest implementation around; try `glmnet` if you want something faster (and possibly rougher as an approximation). The main advantage of the `glm1` function is that it has been written to accept any GLM family argument (although it has not yet been tested beyond discrete data!), as well as the negative binomial distribution, which is especially useful for modelling overdispersed counts.

For a negative binomial with unknown overdispersion use `"negative.binomial"`, or if the overdispersion is to be specified, use `negative.binomial(theta)` as in the `MASS` package. Note that the output refers to phi = 1/theta, i.e. the overdispersion is parameterised such that the variance is mu + phi*mu^2. Hence values of phi close to zero suggest little overdispersion, while values over one suggest a lot.
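A minimal sketch of typical usage, assuming `glm1` is on the search path; the simulated data and the choice of `lambda` below are illustrative only, not recommendations:

```r
# Simulate overdispersed counts via MASS::rnegbin with theta = 2 (so phi = 1/theta = 0.5)
library(MASS)
set.seed(1)
n <- 100
X <- cbind(1, matrix(rnorm(n * 3), n, 3))  # intercept plus three predictors
beta <- c(1, 0.8, 0, 0)                    # two slopes are truly zero
y <- rnegbin(n, mu = exp(X %*% beta), theta = 2)

# Fit with unknown overdispersion; a scalar lambda penalises all slopes
# except the first column of X (assumed to be the intercept)
fit <- glm1(y, X, lambda = 1, family = "negative.binomial")

fit$coefficients  # penalised estimates; some slopes may be shrunk to exactly zero
fit$phi           # estimated overdispersion, on the phi = 1/theta parameterisation
```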

`coefficients`: Vector of parameter estimates.

`fitted.values`: Vector of predicted values (on the scale of the original response).

`logLs`: Vector of log-likelihoods at each iteration of the model. The last entry is the log-likelihood of the final fit.

`phis`: Estimated overdispersion parameter at each iteration, for a negative binomial fit.

`phi`: Final estimate of the overdispersion parameter, for a negative binomial fit.

`score`: Vector of score equation values for each parameter in the model.

`counter`: Number of iterations until convergence. Set to `Inf` for a model that did not converge.

`check`: Logical indicating whether the Karush-Kuhn-Tucker conditions are satisfied.
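A hedged sketch of inspecting these components, assuming `fit` holds the result of a `glm1` call:

```r
# Convergence diagnostics: counter is Inf when the fit did not converge
if (is.finite(fit$counter)) {
  cat("Converged in", fit$counter, "iterations\n")
} else {
  cat("Did not converge within n.iter iterations\n")
}

tail(fit$logLs, 1)  # log-likelihood of the final fit
fit$check           # TRUE if the Karush-Kuhn-Tucker conditions are satisfied
fit$score           # score equation values for each parameter
```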

David I. Warton <David.Warton@unsw.edu.au>, Ian W. Renner and Luke Wilson.

Osborne, M.R., Presnell, B. and Turlach, B.A. (2000) On the LASSO and its dual. Journal of Computational and Graphical Statistics, 9, 319-337.

