glmcat: Generalized linear models for categorical responses

View source: R/GLMcat2.R

Description

Estimate generalized linear models implemented under the unified specification (ratio, cdf, Z), where ratio represents the ratio of probabilities (reference, cumulative, adjacent, or sequential), cdf the cumulative distribution function used in the link function, and Z the design matrix, which must be specified through the parallel and the threshold arguments.
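To make the ratio component concrete, the following base R sketch computes the four ratios from a vector of category probabilities, using the definitions in Peyhardi, Trottier, and Guédon (2015). The probabilities are made up for illustration, and nothing here calls GLMcat itself; in the model, each ratio is then set equal to the cdf evaluated at the linear predictor.

```r
# Illustrative probabilities for J = 4 ordered categories (made-up values).
p <- c(0.1, 0.2, 0.3, 0.4)
J <- length(p)
j <- seq_len(J - 1)  # the J - 1 non-redundant ratios

# Reference ratio: category j against the reference category J.
reference <- p[j] / (p[j] + p[J])
# Adjacent ratio: category j against the next category j + 1.
adjacent <- p[j] / (p[j] + p[j + 1])
# Cumulative ratio: P(Y <= j).
cumulative <- cumsum(p)[j]
# Sequential ratio: P(Y = j | Y >= j).
sequential <- p[j] / rev(cumsum(rev(p)))[j]

ratios <- rbind(reference, adjacent, cumulative, sequential)
print(round(ratios, 3))
```

Only the cumulative, adjacent, and sequential ratios depend on the order of the categories, which is why the ordering arguments matter for those three.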

Usage

glmcat(
  formula,
  data,
  ratio = c("reference", "cumulative", "sequential", "adjacent"),
  cdf = list(),
  parallel = NA,
  categories_order = NA,
  ref_category = NA,
  threshold = c("standard", "symmetric", "equidistant"),
  control = list(),
  normalization = 1,
  na.action = "na.omit",
  find_nu = FALSE,
  ...
)

Arguments

formula

a symbolic description of the model to be fit. An expression of the form 'y ~ predictors' is interpreted as a specification that the response 'y' is modeled by a linear predictor specified by 'predictors'.

data

a data frame, with the dependent variable stored as a factor.

ratio

a string indicating the ratio (equivalent to the family). Options are: reference, adjacent, cumulative, and sequential. The user must specify the desired ratio, as there is no default value.

cdf

The inverse distribution function to be used as part of the link function.
- If the distribution has no parameters to specify, enter its name as a string, e.g., 'cdf = "normal"'. The default value is 'cdf = "logistic"'.
- If there are parameters to specify, enter a list. For example, for the Student distribution: 'cdf = list("student", df=2)'. For the noncentral Student distribution: 'cdf = list("noncentralt", df=2, mu=1)'.
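A minimal sketch of the two ways to pass cdf, reusing the DisturbedDreams data and the reference-ratio call from the Examples section. The Student call copies the list form quoted above; the object names fit_logistic and fit_student are illustrative only.

```r
library(GLMcat)
data(DisturbedDreams)

# cdf without parameters: pass its name as a string.
fit_logistic <- glmcat(formula = Level ~ Age, data = DisturbedDreams,
    ref_category = "Very.severe",
    cdf = "logistic", ratio = "reference")

# cdf with parameters: pass a list (here Student with 2 degrees of freedom).
fit_student <- glmcat(formula = Level ~ Age, data = DisturbedDreams,
    ref_category = "Very.severe",
    cdf = list("student", df = 2), ratio = "reference")
```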

parallel

a character vector indicating the name of the variables with a parallel effect. If a variable is categorical, specify the name and the level of the variable as a string, e.g., '"namelevel"'.

categories_order

a character vector indicating the incremental order of the categories, e.g., 'c("a", "b", "c")' for 'a < b < c'. Alphabetical order is assumed by default. Order is relevant for adjacent, cumulative, and sequential ratio.
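As a hedged illustration, the sketch below fits a cumulative-ratio model with an explicit category order and a parallel effect for Age. It reads the level names from the data instead of hard-coding them, and it assumes that the stored level order of Level is the intended severity order; check this before relying on the fit. The names level_order and fit_cum are illustrative.

```r
library(GLMcat)
data(DisturbedDreams)

# Take the level names from the data and state the order explicitly.
# Assumption: the stored level order is the intended severity order.
level_order <- levels(factor(DisturbedDreams$Level))

# Cumulative ratio with a logistic cdf and a parallel effect for Age.
fit_cum <- glmcat(formula = Level ~ Age, data = DisturbedDreams,
    ratio = "cumulative", cdf = "logistic",
    categories_order = level_order,
    parallel = "Age")
```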

ref_category

a string indicating the reference category. This option is suitable for models with reference ratio.

threshold

a restriction to impose on the thresholds. Options are: 'standard', 'equidistant', or 'symmetric'. This is valid only for the cumulative ratio.

control

a list of control parameters for the estimation algorithm.
- 'maxit': The maximum number of iterations for the Fisher scoring algorithm.
- 'epsilon': A double to change the convergence criterion of GLMcat models.
- 'beta_init': An appropriately sized vector for the initial iteration of the algorithm.
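The control list might be passed as in the sketch below, which reuses the reference-ratio call from the Examples section. The values of maxit and epsilon are arbitrary illustrations and not recommended settings; fit_ctrl is an illustrative name.

```r
library(GLMcat)
data(DisturbedDreams)

# Allow up to 50 Fisher scoring iterations and tighten the
# convergence criterion (illustrative values only).
fit_ctrl <- glmcat(formula = Level ~ Age, data = DisturbedDreams,
    ref_category = "Very.severe",
    cdf = "logistic", ratio = "reference",
    control = list(maxit = 50, epsilon = 1e-8))
```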

normalization

the quantile to use for the normalization of the estimated coefficients when the logistic distribution is used as the base cumulative distribution function.

na.action

an argument to handle missing data. Available options are 'na.omit', 'na.fail', and 'na.exclude'. It does not include the 'na.pass' option.

find_nu

a logical argument to indicate whether the user intends to utilize the Student CDF and seeks an optimization algorithm to identify an optimal degrees of freedom setting for the model.

...

additional arguments. Note: If the 'reference' ratio is used, a warning is issued if the response is an ordered factor. Note: If any other ratio is used, a warning is issued if the response is not ordered, and the order of its levels defaults to the alphanumeric natural order.

Details

Fitting models for categorical responses

This function fits generalized linear models for categorical responses using the unified specification framework introduced by Peyhardi, Trottier, and Guédon (2015).

References

Peyhardi J, Trottier C, Guédon Y (2015). “A new specification of generalized linear models for categorical responses.” Biometrika, 102(4), 889–906. doi:10.1093/biomet/asv042.

See Also

summary.glmcat

Examples

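library(GLMcat)
data(DisturbedDreams)
# Reference-ratio model with a logistic cdf;
# "Very.severe" is the reference category.
ref_log_com <- glmcat(formula = Level ~ Age, data = DisturbedDreams,
    ref_category = "Very.severe",
    cdf = "logistic", ratio = "reference")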

GLMcat documentation built on Sept. 30, 2024, 5:08 p.m.