Description Usage Arguments Details Value References See Also Examples
Fit beta regression models for rates and proportions via maximum likelihood using a parametrization with mean (depending through a link function on the covariates) and precision parameter (called phi).
1 2 3 4 5 6 7 8 9 10  betareg(formula, data, subset, na.action, weights, offset,
link = c("logit", "probit", "cloglog", "cauchit", "log", "loglog"),
link.phi = NULL, type = c("ML", "BC", "BR"),
dist = c("beta", "cbeta4", "cbetax"), nu = NULL,
control = betareg.control(...), model = TRUE,
y = TRUE, x = FALSE, ...)
betareg.fit(x, y, z = NULL, weights = NULL, offset = NULL,
link = "logit", link.phi = "log", type = "ML", control = betareg.control(),
dist = "beta", nu = NULL)

formula 
symbolic description of the model (of type 
data, subset, na.action 
arguments controlling formula processing
via 
weights 
optional numeric vector of case weights. 
offset 
optional numeric vector with an a priori known component to be
included in the linear predictor for the mean. In 
link 
character specification of the link function in
the mean model (mu). Currently, 
link.phi 
character specification of the link function in
the precision model (phi). Currently, 
type 
character specification of the type of estimator. Currently,
maximum likelihood ( 
dist 
character specification of the response distribution.
If all response observation are in (0, 1) the default is the standard
beta distribution. In the presence of boundary 0/1 observations, a
censored constrained fourparameter beta distribution ( 
nu 
numeric. The value of the (fixed) 
control 
a list of control arguments specified via

model, y, x 
logicals. If 
z 
numeric matrix. Regressor matrix for the precision model, defaulting to an intercept only. 
... 
arguments passed to 
Beta regression as suggested by Ferrari and CribariNeto (2004) and extended
by Simas, BarretoSouza, and Rocha (2010) is implemented in betareg
.
It is useful in situations where the dependent variable is continuous and restricted to
the unit interval (0, 1), e.g., resulting from rates or proportions. It is modeled to be
betadistributed with parametrization using mean and precision parameter (called phi).
The mean is linked, as in generalized linear models (GLMs), to the responses through a link
function and a linear predictor. Additionally, the precision parameter phi can be linked
to another (potentially overlapping) set of regressors through a second link function,
resulting in a model with variable dispersion.
Estimation is performed by maximum likelihood (ML) via optim
using
analytical gradients and (by default) starting values from an auxiliary linear regression
of the transformed response. Subsequently, the optim
result may be enhanced
by an additional Fisher scoring iteration using analytical gradients and expected information.
This slightly improves the optimization by moving the gradients even closer to zero
(for type = "ML"
and "BC"
) or solving the biasadjusted estimating equations
(for type = "BR"
). For the former two estimators, the optional Fisher scoring
can be disabled by setting fsmaxit = 0
in the control arguments. See
CribariNeto and Zeileis (2010) and Grün et al. (2012) for details.
In the beta regression as introduced by Ferrari and CribariNeto (2004), the mean of
the response is linked to a linear predictor described by y ~ x1 + x2
using
a link
function while the precision parameter phi is assumed to be
constant. Simas et al. (2009) suggest to extend this model by linking phi to an
additional set of regressors (z1 + z2
, say): In betareg
this can be
specified in a formula of type y ~ x1 + x2  z1 + z2
where the regressors
in the two parts can be overlapping. In the precision model (for phi), the link
function link.phi
is used. The default is a "log"
link unless no
precision model is specified. In the latter case (i.e., when the formula is of type
y ~ x1 + x2
), the "identity"
link is used by default for backward
compatibility.
Simas et al. (2009) also suggest further extensions (nonlinear specificiations,
bias correction) which are not yet implemented in betareg
. However,
Kosmidis and Firth (2010) discuss general algorithms for bias correction/reduction,
both of which are available in betareg
by setting the type
argument
accordingly. (Technical note: In case, either bias correction or reduction is requested,
the second derivative of the inverse link function is required for link
and
link.phi
. If the two links are specified by their names (as done by default
in betareg
), then the "linkglm"
objects are enhanced automatically
by the required additional d2mu.deta
function. However, if a "linkglm"
object is supplied directly by the user, it needs to have the d2mu.deta
function or, for backward compatibility, dmu.deta
.)
The main parameters of interest are the coefficients in the linear predictor of the mean
model. The additional parameters in the precision model (phi) can either
be treated as full model parameters (default) or as nuisance parameters. In the latter case
the estimation does not change, only the reported information in output from print
,
summary
, or coef
(among others) will be different. See also betareg.control
.
A set of standard extractor functions for fitted model objects is available for
objects of class "betareg"
, including methods to the generic functions
print
, summary
, plot
, coef
,
vcov
, logLik
, residuals
,
predict
, terms
,
model.frame
, model.matrix
,
cooks.distance
and hatvalues
(see influence.measures
),
gleverage
(new generic), estfun
and
bread
(from the sandwich package), and
coeftest
(from the lmtest package).
See predict.betareg
, residuals.betareg
, plot.betareg
,
and summary.betareg
for more details on all methods.
The original version of the package was written by Alexandre B. Simas and Andrea V. Rocha (up to version 1.2). Starting from version 2.00 the code was rewritten by Achim Zeileis.
betareg
returns an object of class "betareg"
, i.e., a list with components as follows.
betareg.fit
returns an unclassed list with components up to converged
.
coefficients 
a list with elements 
residuals 
a vector of raw residuals (observed  fitted), 
fitted.values 
a vector of fitted means, 
optim 
output from the 
method 
the method argument passed to the 
control 
the control arguments passed to the 
start 
the starting values for the parameters passed to the 
weights 
the weights used (if any), 
offset 
a list of offset vectors used (if any), 
n 
number of observations, 
nobs 
number of observations with nonzero weights, 
df.null 
residual degrees of freedom in the null model (constant mean and dispersion),
i.e., 
df.residual 
residual degrees of freedom in the fitted model, 
phi 
logical indicating whether the precision (phi) coefficients will be
treated as full model parameters or nuisance parameters in subsequent calls to

loglik 
loglikelihood of the fitted model, 
vcov 
covariance matrix of all parameters in the model, 
pseudo.r.squared 
pseudo Rsquared value (squared correlation of linear predictor and linktransformed response), 
link 
a list with elements 
converged 
logical indicating successful convergence of 
call 
the original function call, 
formula 
the original formula, 
terms 
a list with elements 
levels 
a list with elements 
contrasts 
a list with elements 
model 
the full model frame (if 
y 
the response proportion vector (if 
x 
a list with elements 
CribariNeto, F., and Zeileis, A. (2010). Beta Regression in R. Journal of Statistical Software, 34(2), 1–24. http://www.jstatsoft.org/v34/i02/.
Ferrari, S.L.P., and CribariNeto, F. (2004). Beta Regression for Modeling Rates and Proportions. Journal of Applied Statistics, 31(7), 799–815.
Grün, B., Kosmidis, I., and Zeileis, A. (2012). Extended Beta Regression in R: Shaken, Stirred, Mixed, and Partitioned. Journal of Statistical Software, 48(11), 1–25. http://www.jstatsoft.org/v48/i11/.
Kosmidis, I., and Firth, D. (2010). A Generic Algorithm for Reducing Bias in Parametric Estimation. Electronic Journal of Statistics, 4, 1097–1112.
Simas, A.B., BarretoSouza, W., and Rocha, A.V. (2010). Improved Estimators for a General Class of Beta Regression Models. Computational Statistics & Data Analysis, 54(2), 348–366.
summary.betareg
, predict.betareg
, residuals.betareg
,
Formula
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33  options(digits = 4)
## Section 4 from Ferrari and CribariNeto (2004)
data("GasolineYield", package = "betareg")
data("FoodExpenditure", package = "betareg")
## Table 1
gy < betareg(yield ~ batch + temp, data = GasolineYield)
summary(gy)
## Table 2
fe_lin < lm(I(food/income) ~ income + persons, data = FoodExpenditure)
library("lmtest")
bptest(fe_lin)
fe_beta < betareg(I(food/income) ~ income + persons, data = FoodExpenditure)
summary(fe_beta)
## nested model comparisons via Wald and LR tests
fe_beta2 < betareg(I(food/income) ~ income, data = FoodExpenditure)
lrtest(fe_beta, fe_beta2)
waldtest(fe_beta, fe_beta2)
## Section 3 from online supplements to Simas et al. (2010)
## mean model as in gy above
## precision model with regressor temp
gy2 < betareg(yield ~ batch + temp  temp, data = GasolineYield)
## MLE column in Table 19
summary(gy2)
## LRT row in Table 18
lrtest(gy, gy2)

Call:
betareg(formula = yield ~ batch + temp, data = GasolineYield)
Standardized weighted residuals 2:
Min 1Q Median 3Q Max
2.875 0.815 0.160 0.838 2.048
Coefficients (mean model with logit link):
Estimate Std. Error z value Pr(>z)
(Intercept) 6.159571 0.182325 33.78 < 2e16 ***
batch1 1.727729 0.101229 17.07 < 2e16 ***
batch2 1.322597 0.117902 11.22 < 2e16 ***
batch3 1.572310 0.116105 13.54 < 2e16 ***
batch4 1.059714 0.102360 10.35 < 2e16 ***
batch5 1.133752 0.103523 10.95 < 2e16 ***
batch6 1.040162 0.106036 9.81 < 2e16 ***
batch7 0.543692 0.109127 4.98 6.3e07 ***
batch8 0.495901 0.108926 4.55 5.3e06 ***
batch9 0.385793 0.118593 3.25 0.0011 **
temp 0.010967 0.000413 26.58 < 2e16 ***
Phi coefficients (precision model with identity link):
Estimate Std. Error z value Pr(>z)
(phi) 440 110 4 6.3e05 ***

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type of estimator: ML (maximum likelihood)
Loglikelihood: 84.8 on 12 Df
Pseudo Rsquared: 0.962
Number of iterations: 51 (BFGS) + 3 (Fisher scoring)
Loading required package: zoo
Attaching package: 'zoo'
The following objects are masked from 'package:base':
as.Date, as.Date.numeric
studentized BreuschPagan test
data: fe_lin
BP = 5.9, df = 2, pvalue = 0.05
Call:
betareg(formula = I(food/income) ~ income + persons, data = FoodExpenditure)
Standardized weighted residuals 2:
Min 1Q Median 3Q Max
2.782 0.444 0.202 0.685 1.876
Coefficients (mean model with logit link):
Estimate Std. Error z value Pr(>z)
(Intercept) 0.62255 0.22385 2.78 0.0054 **
income 0.01230 0.00304 4.05 5.1e05 ***
persons 0.11846 0.03534 3.35 0.0008 ***
Phi coefficients (precision model with identity link):
Estimate Std. Error z value Pr(>z)
(phi) 35.61 8.08 4.41 1e05 ***

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type of estimator: ML (maximum likelihood)
Loglikelihood: 45.3 on 4 Df
Pseudo Rsquared: 0.388
Number of iterations: 28 (BFGS) + 4 (Fisher scoring)
Likelihood ratio test
Model 1: I(food/income) ~ income + persons
Model 2: I(food/income) ~ income
#Df LogLik Df Chisq Pr(>Chisq)
1 4 45.3
2 3 40.5 1 9.65 0.0019 **

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Wald test
Model 1: I(food/income) ~ income + persons
Model 2: I(food/income) ~ income
Res.Df Df Chisq Pr(>Chisq)
1 34
2 35 1 11.2 8e04 ***

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Call:
betareg(formula = yield ~ batch + temp  temp, data = GasolineYield)
Standardized weighted residuals 2:
Min 1Q Median 3Q Max
2.540 0.779 0.117 0.862 2.942
Coefficients (mean model with logit link):
Estimate Std. Error z value Pr(>z)
(Intercept) 5.923236 0.183526 32.27 < 2e16 ***
batch1 1.601988 0.063856 25.09 < 2e16 ***
batch2 1.297266 0.099100 13.09 < 2e16 ***
batch3 1.565338 0.099739 15.69 < 2e16 ***
batch4 1.030072 0.063288 16.28 < 2e16 ***
batch5 1.154163 0.065643 17.58 < 2e16 ***
batch6 1.019445 0.066351 15.36 < 2e16 ***
batch7 0.622259 0.065632 9.48 < 2e16 ***
batch8 0.564583 0.060185 9.38 < 2e16 ***
batch9 0.359439 0.067141 5.35 8.6e08 ***
temp 0.010359 0.000436 23.75 < 2e16 ***
Phi coefficients (precision model with log link):
Estimate Std. Error z value Pr(>z)
(Intercept) 1.36409 1.22578 1.11 0.27
temp 0.01457 0.00362 4.03 5.7e05 ***

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type of estimator: ML (maximum likelihood)
Loglikelihood: 87 on 13 Df
Pseudo Rsquared: 0.952
Number of iterations: 33 (BFGS) + 28 (Fisher scoring)
Likelihood ratio test
Model 1: yield ~ batch + temp
Model 2: yield ~ batch + temp  temp
#Df LogLik Df Chisq Pr(>Chisq)
1 12 84.8
2 13 87.0 1 4.36 0.037 *

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.