betareg | R Documentation |
Fit beta regression models for rates and proportions via maximum likelihood using a parametrization with mean (depending through a link function on the covariates) and precision parameter (called phi).
betareg(formula, data, subset, na.action, weights, offset,
link = c("logit", "probit", "cloglog", "cauchit", "log", "loglog"),
link.phi = NULL, type = c("ML", "BC", "BR"), dist = NULL, nu = NULL,
control = betareg.control(...), model = TRUE,
y = TRUE, x = FALSE, ...)
betareg.fit(x, y, z = NULL, weights = NULL, offset = NULL,
link = "logit", link.phi = "log", type = "ML", control = betareg.control(),
dist = NULL, nu = NULL)
formula |
symbolic description of the model, either of type |
data , subset , na.action |
arguments controlling formula processing
via |
weights |
optional numeric vector of case weights. |
offset |
optional numeric vector with an a priori known component to be
included in the linear predictor for the mean. In |
link |
character specification of the link function in
the mean model (mu). Currently, |
link.phi |
character specification of the link function in
the precision model (phi). Currently, |
type |
character specification of the type of estimator. Currently,
maximum likelihood ( |
dist |
character specification of the response distribution.
Usually, this does not have to be set by the user because by default
the classical |
nu |
numeric. The fixed value of the expected exceedence parameter |
control |
a list of control arguments specified via
|
model , y , x |
logicals. If |
z |
numeric matrix. Regressor matrix for the precision model, defaulting to an intercept only. |
... |
arguments passed to |
Beta regression as suggested by Ferrari and Cribari-Neto (2004) and extended
by Simas, Barreto-Souza, and Rocha (2010) is implemented in betareg.
It is useful in situations where the dependent variable is continuous and restricted to
the unit interval (0, 1), e.g., resulting from rates or proportions. It is modeled to be
beta-distributed with parametrization using mean and precision parameter (called mu and
phi, respectively). The mean mu is linked, as in generalized linear models (GLMs), to the
explanatory variables through a link function and a linear predictor. Additionally, the
precision parameter phi can be linked to another (potentially overlapping) set of
regressors through a second link function, resulting in a model with variable dispersion
(see Cribari-Neto and Zeileis 2010).
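The mean/precision parametrization can be checked numerically with base R alone. The sketch below assumes the usual mapping from Ferrari and Cribari-Neto (2004) to the shape parameters of dbeta, namely shape1 = mu * phi and shape2 = (1 - mu) * phi. The helper name dbeta_mp is made up for this illustration and is not part of the package.

```r
## Hypothetical helper: beta density in mean/precision form
## (assumed mapping: shape1 = mu * phi, shape2 = (1 - mu) * phi)
dbeta_mp <- function(y, mu, phi) {
  dbeta(y, shape1 = mu * phi, shape2 = (1 - mu) * phi)
}

mu <- 0.3
phi <- 20

## numerical mean and variance of the distribution on (0, 1)
m1 <- integrate(function(y) y * dbeta_mp(y, mu, phi), 0, 1)$value
m2 <- integrate(function(y) (y - mu)^2 * dbeta_mp(y, mu, phi), 0, 1)$value

## under this parametrization: mean = mu, variance = mu * (1 - mu) / (1 + phi)
c(mean = m1, variance = m2, variance_theory = mu * (1 - mu) / (1 + phi))
```

A larger phi gives a smaller variance for the same mean, which is why phi is called a precision parameter.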
Estimation is performed by default using maximum likelihood (ML) via optim with
analytical gradients and starting values from an auxiliary linear regression
of the transformed response. Subsequently, the optim result may be enhanced
by an additional Fisher scoring iteration using analytical gradients and expected information.
Alternative estimation methods are bias-corrected (BC) or bias-reduced (BR)
maximum likelihood (see Grün, Kosmidis, and Zeileis 2012). For ML and BC the Fisher
scoring is just a refinement to move the gradients even closer to zero and can be
disabled by setting fsmaxit = 0 in the control arguments. For BR the Fisher scoring
is needed to solve the bias-adjusted estimating equations.
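As a rough illustration of the estimator types, the following sketch fits the same model with type = "ML", "BC", and "BR", plus one ML fit with the Fisher scoring refinement switched off. It assumes that the betareg package with its GasolineYield data is installed; the object names are chosen here for illustration only.

```r
library("betareg")
data("GasolineYield", package = "betareg")

## same mean model, three estimator types
gy_ml <- betareg(yield ~ batch + temp, data = GasolineYield, type = "ML")
gy_bc <- betareg(yield ~ batch + temp, data = GasolineYield, type = "BC")
gy_br <- betareg(yield ~ batch + temp, data = GasolineYield, type = "BR")

## ML without the additional Fisher scoring refinement
gy_nofs <- betareg(yield ~ batch + temp, data = GasolineYield,
  control = betareg.control(fsmaxit = 0))

## compare the estimates side by side (last row is the precision parameter)
cbind(ML = coef(gy_ml), BC = coef(gy_bc), BR = coef(gy_br), ML_nofs = coef(gy_nofs))
```

Setting fsmaxit = 0 is only sensible for ML and BC, since BR relies on the Fisher scoring to solve its estimating equations.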
In the beta regression as introduced by Ferrari and Cribari-Neto (2004), the mean of
the response is linked to a linear predictor described by y ~ x1 + x2 using a link
function while the precision parameter phi is assumed to be
constant. Simas et al. (2010) suggest extending this model by linking phi to an
additional set of regressors (z1 + z2, say). In betareg this can be
specified in a formula of type y ~ x1 + x2 | z1 + z2 where the regressors
in the two parts can be overlapping. In the precision model (for phi), the link
function link.phi is used. The default is a "log" link unless no
precision model is specified. In the latter case (i.e., when the formula is of type
y ~ x1 + x2), the "identity" link is used by default for backward
compatibility.
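The effect of the link.phi default can be seen in a small sketch, assuming the betareg package and its GasolineYield data are available. With a one-part formula the precision coefficient is reported on the identity scale, and requesting link.phi = "log" explicitly should give the same fit with the coefficient on the log scale. Object names are illustrative.

```r
library("betareg")
data("GasolineYield", package = "betareg")

## constant precision: link.phi defaults to "identity"
gy_id <- betareg(yield ~ batch + temp, data = GasolineYield)

## constant precision, but estimated on the log scale
gy_log <- betareg(yield ~ batch + temp, data = GasolineYield, link.phi = "log")

## variable precision depending on temp: link.phi defaults to "log"
gy_var <- betareg(yield ~ batch + temp | temp, data = GasolineYield)

## back-transforming the log-scale estimate should recover the identity-scale one
phi_id <- unname(gy_id$coefficients$precision)
phi_log <- unname(exp(gy_log$coefficients$precision))
c(identity = phi_id, log_backtransformed = phi_log)

## intercept and temp coefficient of the precision model
gy_var$coefficients$precision
```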
Kosmidis and Zeileis (2024) introduce a generalization of the classic beta regression
model with extended support [0, 1].
Specifically, the extended-support beta distribution ("xbeta") leverages an underlying
symmetric four-parameter beta distribution with exceedence parameter nu
to obtain support [-nu, 1 + nu] that is subsequently censored to [0, 1] in order
to obtain point masses at the boundary values 0 and 1. The extended-support
beta mixture distribution ("xbetax") is a continuous mixture of extended-support
beta distributions where the exceedence parameter follows an exponential distribution
with mean nu (rather than a fixed value of nu). The latter "xbetax"
specification is used by default in case of boundary observations at 0 and/or 1.
The "xbeta" specification with fixed nu is mostly for testing and
debugging purposes.
A set of standard extractor functions for fitted model objects is available for
objects of class "betareg", including methods to the generic functions
print, summary, plot, coef, vcov, logLik, residuals, predict, terms,
model.frame, model.matrix, cooks.distance and hatvalues (see influence.measures),
gleverage (new generic), estfun and bread (from the sandwich package), and
coeftest (from the lmtest package).
See predict.betareg, residuals.betareg, plot.betareg,
and summary.betareg for more details on all methods.
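A few of these extractors can be tried on a fitted model as follows. This is only a sketch that assumes the betareg package and its GasolineYield data are installed; all methods are called with their default arguments.

```r
library("betareg")
data("GasolineYield", package = "betareg")

gy <- betareg(yield ~ batch + temp, data = GasolineYield)

coef(gy)                  # estimated coefficients
dim(vcov(gy))             # dimension of their covariance matrix
logLik(gy)                # log-likelihood of the fitted model
head(predict(gy))         # predictions with the default type
head(residuals(gy))       # residuals with the default type
head(cooks.distance(gy))  # influence diagnostics
head(hatvalues(gy))
```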
The main parameters of interest are the coefficients in the linear predictor of the mean
model. The additional parameters in the precision model (phi) can either
be treated as full model parameters (default) or as nuisance parameters. In the latter case
the estimation does not change, only the reported information in output from print,
summary, or coef (among others) will be different. See also betareg.control.
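The difference between the two treatments of phi can be sketched as below. The example assumes that the nuisance-parameter treatment is requested through a phi argument of betareg.control, and that the betareg package with its GasolineYield data is installed; object names are illustrative.

```r
library("betareg")
data("GasolineYield", package = "betareg")

## phi treated as a full model parameter (default)
gy_full <- betareg(yield ~ batch + temp, data = GasolineYield)

## phi treated as a nuisance parameter (assumed: phi argument of betareg.control)
gy_nuis <- betareg(yield ~ batch + temp, data = GasolineYield,
  control = betareg.control(phi = FALSE))

## the stored estimates are the same ...
all.equal(gy_full$coefficients, gy_nuis$coefficients)

## ... but coef() reports the precision coefficient only in the first case
length(coef(gy_full))
length(coef(gy_nuis))
```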
The implemented algorithms for bias correction/reduction follow Kosmidis and Firth (2010).
Technical note: If either bias correction or bias reduction is requested,
the second derivative of the inverse link function is required for link and
link.phi. If the two links are specified by their names (as done by default
in betareg), then the "link-glm" objects are enhanced automatically
by the required additional d2mu.deta function. However, if a "link-glm"
object is supplied directly by the user, it needs to have the d2mu.deta
function or, for backward compatibility, dmu.deta.
The original version of the package was written by Alexandre B. Simas and Andrea V. Rocha (up to version 1.2). Starting from version 2.0-0 the code was rewritten by Achim Zeileis.
betareg returns an object of class "betareg", i.e., a list with components as follows.
For classic beta regressions (dist = "beta") several elements are lists with the names "mean"
and "precision" for the information from the respective submodels. For extended-support
beta regressions (dist = "xbetax" or "xbeta"), the corresponding names are
"mu" and "phi" because they are not exactly the mean and precision due to the
censoring in the response variable.
betareg.fit returns an unclassed list with components up to converged.
coefficients |
a list with elements |
residuals |
a vector of raw residuals (observed - fitted), |
fitted.values |
a vector of fitted means, |
optim |
output from the |
method |
the method argument passed to the |
control |
the control arguments passed to the |
start |
the starting values for the parameters passed to the |
weights |
the weights used (if any), |
offset |
a list of offset vectors used (if any), |
n |
number of observations, |
nobs |
number of observations with non-zero weights, |
df.null |
residual degrees of freedom in the null model (constant mean and dispersion),
i.e., |
df.residual |
residual degrees of freedom in the fitted model, |
phi |
logical indicating whether the precision (phi) coefficients will be
treated as full model parameters or nuisance parameters in subsequent calls to
|
loglik |
log-likelihood of the fitted model, |
vcov |
covariance matrix of all parameters in the model, |
pseudo.r.squared |
pseudo R-squared value (squared correlation of linear predictor and link-transformed response), |
link |
a list with elements |
converged |
logical indicating successful convergence of |
call |
the original function call, |
formula |
the original formula, |
terms |
a list with elements |
levels |
a list with elements |
contrasts |
a list with elements |
model |
the full model frame (if |
y |
the response proportion vector (if |
x |
a list with elements |
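Some of the components listed above can be accessed directly from the fitted object, as in the following sketch. It assumes the betareg package with its GasolineYield data is installed and uses a classic beta regression, so the submodel names are "mean" and "precision".

```r
library("betareg")
data("GasolineYield", package = "betareg")

gy <- betareg(yield ~ batch + temp, data = GasolineYield)

gy$converged            # did the optimization converge?
gy$n                    # number of observations
gy$df.residual          # residual degrees of freedom
gy$loglik               # log-likelihood, same value as logLik(gy)
gy$pseudo.r.squared     # pseudo R-squared
names(gy$coefficients)  # submodel names
gy$coefficients$mean    # coefficients of the mean model
```

In most analyses the extractor functions (coef, logLik, and so on) are preferable to direct component access, because they respect settings such as the treatment of phi.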
Cribari-Neto F, Zeileis A (2010). Beta Regression in R. Journal of Statistical Software, 34(2), 1–24. doi:10.18637/jss.v034.i02
Ferrari SLP, Cribari-Neto F (2004). Beta Regression for Modeling Rates and Proportions. Journal of Applied Statistics, 31(7), 799–815.
Grün B, Kosmidis I, Zeileis A (2012). Extended Beta Regression in R: Shaken, Stirred, Mixed, and Partitioned. Journal of Statistical Software, 48(11), 1–25. doi:10.18637/jss.v048.i11
Kosmidis I, Firth D (2010). A Generic Algorithm for Reducing Bias in Parametric Estimation. Electronic Journal of Statistics, 4, 1097–1112.
Kosmidis I, Zeileis A (2024). Extended-Support Beta Regression for [0, 1] Responses. 2409.07233, arXiv.org E-Print Archive. doi:10.48550/arXiv.2409.07233
Simas AB, Barreto-Souza W, Rocha AV (2010). Improved Estimators for a General Class of Beta Regression Models. Computational Statistics & Data Analysis, 54(2), 348–366.
summary.betareg, predict.betareg, residuals.betareg, Formula
options(digits = 4)
## Section 4 from Ferrari and Cribari-Neto (2004)
data("GasolineYield", package = "betareg")
data("FoodExpenditure", package = "betareg")
## Table 1
gy <- betareg(yield ~ batch + temp, data = GasolineYield)
summary(gy)
## Table 2
fe_lin <- lm(I(food/income) ~ income + persons, data = FoodExpenditure)
library("lmtest")
bptest(fe_lin)
fe_beta <- betareg(I(food/income) ~ income + persons, data = FoodExpenditure)
summary(fe_beta)
## nested model comparisons via Wald and LR tests
fe_beta2 <- betareg(I(food/income) ~ income, data = FoodExpenditure)
lrtest(fe_beta, fe_beta2)
waldtest(fe_beta, fe_beta2)
## Section 3 from online supplements to Simas et al. (2010)
## mean model as in gy above
## precision model with regressor temp
gy2 <- betareg(yield ~ batch + temp | temp, data = GasolineYield)
## MLE column in Table 19
summary(gy2)
## LRT row in Table 18
lrtest(gy, gy2)