PLreg | R Documentation |
PLreg
is used to fit power logit regression model for continuous and bounded variables via maximum likelihood approach.
Both median and dispersion of the response variable are modeled through
parametric functions.
PLreg( formula, data, subset, na.action, family = c("NO", "LO", "TF", "PE", "SN", "SLASH", "Hyp"), zeta = NULL, link = c("logit", "probit", "cloglog", "cauchit", "loglog"), link.sigma = NULL, type = c("pML", "ML"), control = PLreg.control(...), model = TRUE, y = TRUE, x = FALSE, ... ) PLreg.fit( X, y, S = NULL, family, type = "pML", zeta = zeta, link = "logit", link.sigma = "log", control = PLreg.control() )
formula |
a symbolic description of the model. See details for further information. |
data, subset, na.action |
arguments controlling formula processing via |
family |
a description of the symmetric distribution to be used for generating the power logit model.
Supported families include " |
zeta |
a numeric value or numeric vector that represents the extra parameter of the distribution. For the PL-NO and PL-LO models, no extra parameter is needed. |
link |
an optional character that specifies the link function of the median submodel (mu).
The " |
link.sigma |
an optional character that specifies the link function of the dispersion submodel (sigma).
The " |
type |
character specifying the type of estimator for the skewness parameter.
Currently, penalized maximum likelihood (" |
control |
a list of control arguments specified via |
model, y, x |
logicals. If |
... |
arguments passed to |
X |
numeric regressor matrix for the median submodel. |
S |
numeric regressor matrix for the dispersion submodel. |
The power logit regression models, proposed by Queiroz and Ferrari (2022), is useful in
situations when the response variable is continuous and bounded on the unit interval (0, 1).
The median and the dispersion parameters are modeled through parametric link
functions. The models depend on a skewness parameter (called λ). When the skewness parameter is fixed
and equal to 1, the power logit models coincide with the GJS regression models
(Lemonte and Bazan, 2016). Queiroz and Ferrari (2022) suggest using a penalized maximum
likelihood method to estimate the parameters. This method is implemented in
PLreg
by default when λ is not fixed. If convergence is not reached,
maximum likelihood estimation is performed. The estimation
process uses optim
. If no starting values are specified,
the PLreg
function uses those suggested by Queiroz and Ferrari (2022).
This function also fits the log-log regression models by setting λ
at zero (λ = 0 represents λ \rightarrow 0^+).
The formulation of the model has the same structure as in the usual functions
lm
and glm
. The argument
formula
could comprise of three parts (separated by the symbols "~" and "|"),
namely: observed response variable in the unit interval, predictor of the median submodel,
with link function link
and predictor of the dispersion submodel, with link.sigma
link function. If the model has constant dispersion, the third part may be omitted and the link function for sigma
is "log
" by default. The skewness parameter lambda
may be
treated as fixed or not (default). If lambda
is fixed, its value
must be specified in the control
argument.
Some methods are available for objects of class "PLreg
",
see plot.PLreg
, summary.PLreg
,
coef.PLreg
, vcov.PLreg
, and
residuals.PLreg
, for details and other methods.
PLreg
returns an object of class "PLreg
" with the following
components (the PLreg.fit
returns elements up to v
).
coefficients |
a list with the " |
residuals |
a vector of the raw residuals (the difference between the observed and the fitted response). |
fitted.values |
a vector with the fitted values of the median submodel. |
optim |
a list with the output from |
family |
a character specifying the |
method |
the method argument passed to the optim call. |
control |
the control arguments passed to the optim call. |
start |
a vector with the starting values used in the iterative process. |
nobs |
number of observations. |
df.null |
residual degrees of freedom in the null model (constant median and dispersion), i.e., n-3. |
df.residual |
residual degrees of freedom in the fitted model. |
lambda |
value of the skewness parameter lambda
( |
loglik |
log-likelihood of the fitted model. |
loglikp |
penalized profile log-likelihood for lambda. If lambda is
equal to zero, |
vcov |
covariance matrix of all the parameters. |
pseudo.r.squared |
pseudo R-squared value. |
Upsilon.zeta |
an overall goodness-of-fit measure. |
link |
a list with elements " |
converged |
logical indicating successful convergence of the iterative process. |
zeta |
a numeric specifying the value of zeta used in the estimation process. |
type |
a character specifying the estimation method used. |
v |
a vector with the v(z) values for all the observations (see Queiroz and Ferrari(2021)). |
call |
the original function call. |
formula |
the formula used. |
terms |
a list with elements " |
levels |
a list with elements " |
contrasts |
a list with elements " |
model |
the full model frame (if |
y |
the response variable (if |
x |
a list with elements " |
Francisco Felipe de Queiroz (ffelipeq@outlook.com) and Silvia L. P. Ferrari.
Queiroz, F. F. and Ferrari, S. L. P. (2022). Power logit regression
for modeling bounded data. arXiv:2202.01697.
Lemonte, A. J. and Bazan, J. L. (2015). New class of Johnson SB distributions
and its associated regression model for rates and proportions. Biometrical Journal. 58:727-746.
summary.PLreg
, PLreg.control
, residuals.PLreg
#### Body fat data data("bodyfat_Aeolus") #Initial model with zeta = 2 fit1 <- PLreg(percentfat ~ days + sex + year, data = bodyfat_Aeolus, family = "PE", zeta = 2) summary(fit1) # Choosing the best value for zeta # extra.parameter(fit1, lower = 1, upper = 4, grid = 15) # Using zeta = 1.7 fit2 <- PLreg(percentfat ~ days + sex + year, data = bodyfat_Aeolus, family = "PE", zeta = 1.7) summary(fit2) # Fixing lambda = 1 fit3 <- PLreg(percentfat ~ days + sex + year, data = bodyfat_Aeolus, family = "PE", zeta = 1.7, control = PLreg.control(lambda = 1)) summary(fit3) # Comparing the AIC and Upsilon values between fit2 and fit3 AIC(fit2) < AIC(fit3) # TRUE fit2$Upsilon.zeta < fit3$Upsilon.zeta #TRUE #### Firm cost data data("Firm") fitPL <- PLreg(firmcost ~ sizelog + indcost | sizelog + indcost, data = Firm, family = "SLASH", zeta = 2.13) summary(fitPL) #extra.parameter(fitPL, lower = 1.2, upper = 4, grid = 10) #plot(fitPL, type = "standardized") #envelope(fitPL, type = "standardized") fitPL_wo72 <- PLreg(firmcost ~ sizelog + indcost | sizelog + indcost, data = Firm[-72,], family = "SLASH", zeta = 2.13) fitPL_wo15 <- PLreg(firmcost ~ sizelog + indcost | sizelog + indcost, data = Firm[-15,], family = "SLASH", zeta = 2.13) fitPL_wo16 <- PLreg(firmcost ~ sizelog + indcost | sizelog + indcost, data = Firm[-16,], family = "SLASH", zeta = 2.13) coef.mu <- coef(fitPL)[1:3] coef.mu_wo72 <- coef(fitPL_wo72)[1:3] coef.mu_wo15 <- coef(fitPL_wo15)[1:3] coef.mu_wo16 <- coef(fitPL_wo16)[1:3] plot(Firm$indcost, Firm$firmcost, pch = "+", xlab = "indcost", ylab = "firmcost") #identify(Firm$indcost, Firm$firmcost) covariate = matrix(c(rep.int(1, 1000), rep(median(Firm$sizelog), 1000), seq(0, 1.22, length.out = 1000)), ncol = 3) lines(covariate[,3], as.vector(fitPL$link$median$linkinv(covariate%*%coef.mu)), type = "l") lines(covariate[,3], as.vector(fitPL$link$median$linkinv(covariate%*%coef.mu_wo72)), type = "l", lty = 2, col = "blue") lines(covariate[,3], as.vector(fitPL$link$median$linkinv(covariate%*%coef.mu_wo15)), type = "l", lty = 3, col = "red") lines(covariate[,3], as.vector(fitPL$link$median$linkinv(covariate%*%coef.mu_wo16)), type = "l", lty = 4, col = "green") parameters = c("pML", "pML w/o 72", "pML w/o 15", "pML w/o 16") legend(x = 0.5, y = 0.8, legend = parameters, col = c("black", "blue", "red", "green"), lty = c(1, 2, 3, 4), cex = 0.6)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.