# brglmFit: Fitting function for 'glm()' for reduced-bias estimation and... In brglm2: Bias Reduction in Generalized Linear Models

 brglmFit R Documentation

## Fitting function for `glm()` for reduced-bias estimation and inference

### Description

`brglmFit()` is a fitting method for `glm()` that fits generalized linear models using implicit and explicit bias reduction methods (Kosmidis, 2014), and other penalized maximum likelihood methods. Currently supported methods include the mean bias-reducing adjusted scores approach in Firth (1993) and Kosmidis & Firth (2009), the median bias-reduction adjusted scores approach in Kenne Pagui et al. (2017), the correction of the asymptotic bias in Cordeiro & McCullagh (1991), the mixed bias-reduction adjusted scores approach in Kosmidis et al (2020), maximum penalized likelihood with powers of the Jeffreys prior as penalty, and maximum likelihood. Estimation is performed using a quasi Fisher scoring iteration (see `vignette("iteration", "brglm2")`, which, in the case of mean-bias reduction, resembles an iterative correction of the asymptotic bias of the Fisher scoring iterates.

### Usage

```brglmFit(
x,
y,
weights = rep(1, nobs),
start = NULL,
etastart = NULL,
mustart = NULL,
offset = rep(0, nobs),
family = gaussian(),
control = list(),
intercept = TRUE,
fixed_totals = NULL,
singular.ok = TRUE
)

brglm_fit(
x,
y,
weights = rep(1, nobs),
start = NULL,
etastart = NULL,
mustart = NULL,
offset = rep(0, nobs),
family = gaussian(),
control = list(),
intercept = TRUE,
fixed_totals = NULL,
singular.ok = TRUE
)
```

### Arguments

 `x` a design matrix of dimension `n * p`. `y` a vector of observations of length `n`. `weights` an optional vector of ‘prior weights’ to be used in the fitting process. Should be `NULL` or a numeric vector. `start` starting values for the parameters in the linear predictor. If `NULL` (default) then the maximum likelihood estimates are calculated and used as starting values. `etastart` applied only when start is not `NULL`. Starting values for the linear predictor to be passed to `glm.fit()` when computing starting values using maximum likelihood. `mustart` applied only when start is not `NULL`. Starting values for the vector of means to be passed to `glm.fit()` when computing starting values using maximum likelihood. `offset` this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be `NULL` or a numeric vector of length equal to the number of cases. One or more `offset` terms can be included in the formula instead or as well, and if more than one is specified their sum is used. See `model.offset`. `family` a description of the error distribution and link function to be used in the model. For `glm` this can be a character string naming a family function, a family function or the result of a call to a family function. For `glm.fit` only the third option is supported. (See `family` for details of family functions.) `control` a list of parameters controlling the fitting process. See `brglmControl()` for details. `intercept` logical. Should an intercept be included in the null model? `fixed_totals` effective only when `family` is `poisson()`. Either `NULL` (no effect) or a vector that indicates which counts must be treated as a group. See Details for more information and `brmultinom()`. `singular.ok` logical. If `FALSE`, a singular model is an error.

### Details

A detailed description of the supported adjustments and the quasi Fisher scoring iteration is given in the iteration vignette (see, `vignette("iteration", "brglm2")` or Kosmidis et al, 2020). A shorter description of the quasi Fisher scoring iteration is also given in one of the vignettes of the enrichwith R package (see, https://cran.r-project.org/package=enrichwith/vignettes/bias.html). Kosmidis and Firth (2010) describe a parallel quasi Newton-Raphson iteration with the same stationary point.

In the special case of generalized linear models for binomial, Poisson and multinomial responses, the adjusted score equation approaches for `type = "AS_mixed"`, `type = "AS_mean"`, and `type = "AS_median"` (see below for what methods each `type` corresponds) return estimates with improved frequentist properties, that are also always finite, even in cases where the maximum likelihood estimates are infinite (e.g. complete and quasi-complete separation in multinomial regression). See, Kosmidis and Firth (2021) for a proof for binomial-response GLMs with Jeffreys-prior penalties to the log-likelihood, which is equivalent to mean bias reduction for logistic regression. See, also, `detectseparation::detect_separation()` and `detectseparation::check_infinite_estimates()` for pre-fit and post-fit methods for the detection of infinite estimates in binomial response generalized linear models.

The type of score adjustment to be used is specified through the `type` argument (see `brglmControl()` for details). The available options are

• `type = "AS_mixed"`: the mixed bias-reducing score adjustments in Kosmidis et al (2020) that result in mean bias reduction for the regression parameters and median bias reduction for the dispersion parameter, if any; default.

• `type = "AS_mean"`: the mean bias-reducing score adjustments in Firth, 1993 and Kosmidis & Firth, 2009. `type = "AS_mixed"` and `type = "AS_mean"` will return the same results when `family` is `binomial()` or `poisson()`, i.e. when the dispersion is fixed

• `type = "AS_median"`: the median bias-reducing score adjustments in Kenne Pagui et al. (2017)

• `type = "MPL_Jeffreys"`: maximum penalized likelihood with powers of the Jeffreys prior as penalty.

• `type = "ML"`: maximum likelihood.

• `type = "correction"`: asymptotic bias correction, as in Cordeiro & McCullagh (1991).

The null deviance is evaluated based on the fitted values using the method specified by the `type` argument (see `brglmControl()`).

The `family` argument of the current version of `brglmFit()` can accept any combination of `"family"` objects and link functions, including families with user-specified link functions, `mis()` links, and `power()` links, but excluding `quasi()`, `quasipoisson()` and `quasibinomial()` families.

The description of `method` argument and the `Fitting functions` section in `glm()` gives information on supplying fitting methods to `glm()`.

`fixed_totals` specifies groups of observations for which the sum of the means of a Poisson model will be held fixed to the observed count for each group. This argument is used internally in `brmultinom()` and `bracl()` for baseline-category logit models and adjacent category logit models, respectively.

`brglm_fit()` is an alias to `brglmFit()`.

### Author(s)

Ioannis Kosmidis `[aut, cre]` ioannis.kosmidis@warwick.ac.uk, Euloge Clovis Kenne Pagui `[ctb]` kenne@stat.unipd.it

### References

Kosmidis I, Firth D (2021). Jeffreys-prior penalty, finiteness and shrinkage in binomial-response generalized linear models. Biometrika, 108, 71-82. doi: 10.1093/biomet/asaa052.

Kosmidis I, Kenne Pagui E C, Sartori N (2020). Mean and median bias reduction in generalized linear models. Statistics and Computing, 30, 43-59. doi: 10.1007/s11222-019-09860-6.

Cordeiro G M, McCullagh P (1991). Bias correction in generalized linear models. Journal of the Royal Statistical Society. Series B (Methodological), 53, 629-643. doi: 10.1111/j.2517-6161.1991.tb01852.x.

Firth D (1993). Bias reduction of maximum likelihood estimates. Biometrika. 80, 27-38. doi: 10.2307/2336755.

Kenne Pagui E C, Salvan A, Sartori N (2017). Median bias reduction of maximum likelihood estimates. Biometrika, 104, 923–938. doi: 10.1093/biomet/asx046.

Kosmidis I, Firth D (2009). Bias reduction in exponential family nonlinear models. Biometrika, 96, 793-804. doi: 10.1093/biomet/asp055.

Kosmidis I, Firth D (2010). A generic algorithm for reducing bias in parametric estimation. Electronic Journal of Statistics, 4, 1097-1112. doi: 10.1214/10-EJS579.

Kosmidis I (2014). Bias in parametric estimation: reduction and useful side-effects. WIRE Computational Statistics, 6, 185-196. doi: 10.1002/wics.1296.

`brglmControl()`, `glm.fit()`, `glm()`

### Examples

```## The lizards example from ?brglm::brglm
data("lizards")
# Fit the model using maximum likelihood
lizardsML <- glm(cbind(grahami, opalinus) ~ height + diameter +
light + time, family = binomial(logit), data = lizards,
method = "glm.fit")
# Mean bias-reduced fit:
lizardsBR_mean <- glm(cbind(grahami, opalinus) ~ height + diameter +
light + time, family = binomial(logit), data = lizards,
method = "brglmFit")
# Median bias-reduced fit:
lizardsBR_median <- glm(cbind(grahami, opalinus) ~ height + diameter +
light + time, family = binomial(logit), data = lizards,
method = "brglmFit", type = "AS_median")
summary(lizardsML)
summary(lizardsBR_median)
summary(lizardsBR_mean)

# Maximum penalized likelihood with Jeffreys prior penatly
lizards_Jeffreys <- glm(cbind(grahami, opalinus) ~ height + diameter +
light + time, family = binomial(logit), data = lizards,
method = "brglmFit", type = "MPL_Jeffreys")
# lizards_Jeffreys is the same fit as lizardsBR_mean (see Firth, 1993)
all.equal(coef(lizardsBR_mean), coef(lizards_Jeffreys))

# Maximum penalized likelihood with powers of the Jeffreys prior as
# penalty. See Kosmidis & Firth (2021) for the finiteness and
# shrinkage properties of the maximum penalized likelihood
# estimators in binomial response models

a <- seq(0, 20, 0.5)
coefs <- sapply(a, function(a) {
out <- glm(cbind(grahami, opalinus) ~ height + diameter +
light + time, family = binomial(logit), data = lizards,
method = "brglmFit", type = "MPL_Jeffreys", a = a)
coef(out)
})
# Illustration of shrinkage as a grows
matplot(a, t(coefs), type = "l", col = 1, lty = 1)
abline(0, 0, col = "grey")

## Another example from
## King, Gary, James E. Alt, Nancy Elizabeth Burns and Michael Laver
## (1990).  "A Unified Model of Cabinet Dissolution in Parliamentary
## Democracies", _American Journal of Political Science_, **34**, 846-870

data("coalition", package = "brglm2")
# The maximum likelihood fit with log link
coalitionML <- glm(duration ~ fract + numst2, family = Gamma, data = coalition)
# The mean bias-reduced fit
coalitionBR_mean <- update(coalitionML, method = "brglmFit")
# The bias-corrected fit
coalitionBC <- update(coalitionML, method = "brglmFit", type = "correction")
# The median bias-corrected fit
coalitionBR_median <- update(coalitionML, method = "brglmFit", type = "AS_median")

## An example with offsets from Venables & Ripley (2002, p.189)
data("anorexia", package = "MASS")

anorexML <- glm(Postwt ~ Prewt + Treat + offset(Prewt),
family = gaussian, data = anorexia)
anorexBC <- update(anorexML, method = "brglmFit", type = "correction")
anorexBR_mean <- update(anorexML, method = "brglmFit")
anorexBR_median <- update(anorexML, method = "brglmFit", type = "AS_median")

# All methods return the same estimates for the regression
# parameters because the maximum likelihood estimator is normally
# distributed around the `true` value under the model (hence, both
# mean and component-wise median unbiased). The Wald tests for
# anorexBC and anorexBR_mean differ from anorexML because the
# bias-reduced estimator of the dispersion is the unbiased, by
# degree of freedom adjustment (divide by n - p), estimator of the
# residual variance. The Wald tests from anorexBR_median are based
# on the median bias-reduced estimator of the dispersion that
# results from a different adjustment of the degrees of freedom
# (divide by n - p - 2/3)
summary(anorexML)
summary(anorexBC)
summary(anorexBR_mean)
summary(anorexBR_median)

## endometrial data from Heinze & Schemper (2002) (see ?endometrial)
data("endometrial", package = "brglm2")
endometrialML <- glm(HG ~ NV + PI + EH, data = endometrial,
family = binomial("probit"))
endometrialBR_mean <- update(endometrialML, method = "brglmFit",
type = "AS_mean")
endometrialBC <- update(endometrialML, method = "brglmFit",
type = "correction")
endometrialBR_median <- update(endometrialML, method = "brglmFit",
type = "AS_median")
summary(endometrialML)
summary(endometrialBC)
summary(endometrialBR_mean)
summary(endometrialBR_median)

```

brglm2 documentation built on Feb. 16, 2023, 7:03 p.m.