# simplexreg: Simplex Generalized Linear Model Regression Function

In simplexreg: Regression Analysis of Proportional Data Using Simplex Distribution

## Description

Regression Analysis of Proportional Data Using Various Types of Simplex Models

## Usage

```r
simplexreg(formula, data, subset, na.action,
  link = c("logit", "probit", "cloglog", "neglog"), corr = "Ind",
  id = NULL, control = simplexreg.control(...), model = TRUE,
  y = TRUE, x = TRUE, ...)

simplexreg.fit(y, x, z = NULL, t = NULL, link = "logit", corr = "Ind",
  id = NULL, control = simplexreg.control())
```

## Arguments

| Argument | Description |
|---|---|
| `formula` | a symbolic description of the model to be fitted, of type `y ~ x` or `y ~ x \| z \| t`; see 'Details' |
| `data` | an optional data frame, list or environment containing the variables in `formula` and `id` |
| `subset`, `na.action` | arguments controlling formula processing via `model.frame` |
| `link` | type of link function for the mean. Currently `"logit"` (logit function), `"probit"` (probit function), `"cloglog"` (complementary log-log function) and `"neglog"` (negative log function) are supported |
| `corr` | the covariance structure, chosen from `"Ind"` (independent structure), `"Exc"` (exchangeable) and `"AR1"` (AR(1)); see 'Details' |
| `id` | a factor identifying the clusters when a GEE model is fitted (`corr = "Exc"` or `"AR1"`). The length of `id` should equal the number of observations. `y`, `x`, `z` and `t` are assumed to be sorted in accordance with the clusters specified by `id` |
| `control` | a list of control arguments specified via `simplexreg.control` |
| `model` | a logical value indicating whether the model frame should be included as a component of the returned value |
| `y`, `x` | For `simplexreg`: logical values indicating whether the response vector and the covariates modelling the mean parameter should be returned as components of the returned value. For `simplexreg.fit`: `x` is the design matrix and `y` is the response vector |
| `z` | regressor matrix modelling the dispersion parameter |
| `t` | time covariate in the correlation structure; see 'Details' |
| `...` | arguments passed to `simplexreg.control` |

## Details

Outcomes in the form of continuous proportions arise in many applied areas. Such data can be modelled using simplex regression; see also `simplex`. The mean and dispersion parameters are each linked to a set of regressors. Regression analysis of the simplex model is implemented in `simplexreg`. If `corr = "Ind"`, a simplex generalized linear model is fitted. Estimation is performed by maximum likelihood using Fisher scoring.
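As a point of reference for the distribution being fitted, the following base R sketch writes out the simplex density of Barndorff-Nielsen and Jorgensen (1991) by hand. The function names `simplex_deviance` and `simplex_density` are illustrative only and are not part of the package; the parameter values are arbitrary.

```r
# Unit deviance of the simplex distribution:
#   d(y; mu) = (y - mu)^2 / (y (1 - y) mu^2 (1 - mu)^2)
simplex_deviance <- function(y, mu) {
  (y - mu)^2 / (y * (1 - y) * mu^2 * (1 - mu)^2)
}

# Density with mean mu and dispersion sigma2, for y in (0, 1):
#   f(y) = [2 pi sigma2 {y (1 - y)}^3]^(-1/2) * exp(-d(y; mu) / (2 sigma2))
simplex_density <- function(y, mu, sigma2) {
  (2 * pi * sigma2 * (y * (1 - y))^3)^(-1 / 2) *
    exp(-simplex_deviance(y, mu) / (2 * sigma2))
}

# Sanity check: the density should integrate to roughly one over (0, 1).
total <- integrate(simplex_density, lower = 0, upper = 1,
                   mu = 0.4, sigma2 = 1)$value
```

Writing the density in terms of the unit deviance makes the roles of the two parameters visible: `mu` locates the distribution, while `sigma2` scales the deviance and therefore controls the spread.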

In addition to simplex generalized linear models, this function provides generalized estimating equations (GEE) techniques for modelling longitudinal proportional responses. Exchangeable and AR(1) correlation structures are available. Both parameter estimation and residual analysis are supported.

Model specification follows the approach used by `betareg`, the model fitting function for beta regression in the package betareg. For simplex regression models with homogeneous dispersion, the mean of the response is linked to a linear predictor described by `y ~ x1 + x2` through a `link` function. Four link functions are available for linking the regressors to the mean. For the dispersion, however, the link function is restricted to the logarithm. To model the dispersion, the regressors for the dispersion parameter should be specified in a formula of type `y ~ x1 + x2 | z1 + z2`, where `z1` and `z2` are linked to the dispersion parameter σ^2.
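The four mean links and the log link for the dispersion can be written out in base R as a rough illustration. This is a sketch, not the package's internal code: the list `mean_links`, the matrix `Z` and the coefficients `gamma` are made-up names and values, and `"neglog"` is assumed here to mean g(μ) = -log(μ), which is a reading of its name rather than something this page states.

```r
# Candidate link functions g(mu) mapping a mean in (0, 1) to the linear predictor.
mean_links <- list(
  logit   = function(mu) log(mu / (1 - mu)),
  probit  = function(mu) qnorm(mu),
  cloglog = function(mu) log(-log(1 - mu)),
  neglog  = function(mu) -log(mu)        # assumption: negative log of the mean
)

mu  <- c(0.1, 0.5, 0.9)
eta <- sapply(mean_links, function(g) g(mu))  # one column per link

# Dispersion uses a log link: log(sigma^2) = Z %*% gamma (hypothetical values).
Z      <- cbind(1, c(-1, 0, 1))
gamma  <- c(0.5, 0.2)
sigma2 <- as.vector(exp(Z %*% gamma))         # always positive
```

Each row of `eta` shows how the same mean value lands on a different linear-predictor scale depending on the link chosen.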

Model specification is somewhat more involved for longitudinal proportional responses. Song et al. (2004) proposed a marginal simplex model consisting of three components: the population-average effects, the pattern of dispersion and the correlation structure. Let the percentage responses for the ith subject be y_{ij}, observed at time t_{ij}. If `corr = "AR1"`, the working covariance matrix of y_{ij}, j = 1, 2, ..., n_i, is

{exp(α * |t_{ik} - t_{ij}|)}_{kj}

where α < 0 and exp(α) is the lag-1 autocorrelation. If `corr = "Exc"`, the covariance matrix is (1 - exp(α)) I + exp(α) 1, where I is the identity matrix and 1 is the matrix with all elements equal to one.
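The two working structures above can be built directly in base R. The helper names `ar1_matrix` and `exc_matrix`, the observation times and the value of α are all chosen for illustration and do not come from the package.

```r
# AR(1) structure: entry (k, j) is exp(alpha * |t_k - t_j|), with alpha < 0.
ar1_matrix <- function(times, alpha) {
  stopifnot(alpha < 0)
  exp(alpha * abs(outer(times, times, "-")))
}

# Exchangeable structure: (1 - exp(alpha)) * I + exp(alpha) * (matrix of ones).
exc_matrix <- function(n, alpha) {
  rho <- exp(alpha)
  (1 - rho) * diag(n) + rho * matrix(1, n, n)
}

times <- c(0, 1, 2, 4)   # hypothetical, unequally spaced observation times
alpha <- log(0.5)        # so the lag-1 autocorrelation exp(alpha) is 0.5

R_ar1 <- ar1_matrix(times, alpha)
R_exc <- exc_matrix(length(times), alpha)
```

Under the AR(1) form the dependence decays with the time gap, so unequally spaced visits are handled naturally, whereas the exchangeable form assigns the same value exp(α) to every pair of observations within a subject.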

For homogeneous dispersion, the formula should be of the form `y ~ x1 + x2 | 1 | t`, where `t` is the time covariate. Otherwise, the formula should be of the form `y ~ x1 + x2 | z1 + z2 | t`.

## Value

| Component | Description |
|---|---|
| `fixef` | estimates of coefficients modelling the mean as well as the standard deviation |
| `dispar` | estimates of coefficients modelling dispersion as well as the standard deviation |
| `Dispersion` | estimate of the dispersion parameter |
| `appstdPerr` | approximated standard deviations of the regression coefficients |
| `stdPerr` | exact standard deviations of the regression coefficients |
| `meanmu` | estimate of the mean parameter |
| `adjvar` | adjusted dependent variable s_i. Details can be found in McCullagh and Nelder (1989) |
| `stdscor` | standardised score residuals. Details can be found in Song et al. (2004) |
| `predict` | predicted values of g(μ_i), where g is the link function and μ_i the mean parameter |
| `loglike` | value of the maximum log-likelihood function |
| `deviance` | value of the deviance |
| `call` | the original function call |
| `formula` | the original formula |
| `terms` | a list with elements `"mean"` and `"dispersion"` containing the term objects for the model |
| `levels` | a list with elements `"mean"` and `"dispersion"` containing the levels of the categorical regressors |
| `link` | type of function linking to the mean |
| `type` | `"homo"` for homogeneous dispersion, `"hetero"` for heterogeneous dispersion |
| `model` | the full model frame (if `model = TRUE`) |
| `y` | response vector (if `y = TRUE`) |
| `x` | a list with elements `mean`, `dispersion`, `time` and `id` containing the corresponding variables |
| `n` | number of proportional observations |
| `iter` | number of Fisher iterations |
| `...` | argument passed to `simplexreg.control` |

## Author(s)

Zhenguo Qiu, Peng Zhang and Chengchun Shi

## References

Barndorff-Nielsen, O.E. and Jorgensen, B. (1991) Some parametric models on the simplex. Journal of Multivariate Analysis, 39: 106–116

Jorgensen, B. (1997) The Theory of Dispersion Models. London: Chapman and Hall

McCullagh, P. and Nelder, J. (1989) Generalized Linear Models. London: Chapman and Hall

Song, P., Qiu, Z. and Tan, M. (2004) Modelling Heterogeneous Dispersion in Marginal Models for Longitudinal Proportional Data. Biometrical Journal, 46: 540–553

Zhang, P., Qiu, Z. and Shi, C. (2016) simplexreg: An R Package for Regression Analysis of Proportional Data Using the Simplex Distribution. Journal of Statistical Software, 71: 1–21

## See Also

`simplex`

## Examples

```r
library(simplexreg)

# GLM models: homogeneous dispersion, then dispersion modelled by age
data("sdac", package = "simplexreg")
sim.glm1 <- simplexreg(rcd ~ ageadj + chemo, link = "logit", data = sdac)
sim.glm2 <- simplexreg(rcd ~ ageadj + chemo | age, link = "logit", data = sdac)

# GEE models: exchangeable structure, then AR(1) with heterogeneous dispersion
data("retinal", package = "simplexreg")
sim.gee1 <- simplexreg(Gas ~ LogT + LogT2 + Level | 1 | Time, link = "logit",
  corr = "Exc", id = ID, data = retinal)
sim.gee2 <- simplexreg(Gas ~ LogT + LogT2 + Level | LogT + Level | Time,
  link = "logit", corr = "AR1", id = ID, data = retinal)
```