Home

/

GitHub

/

cglm: Fit Generalized Models to cluster correlated data

cglm: Fit Generalized Models to cluster correlated data
In NianchuanChen/chandwichGLM: Chandler-Bate Sandwich Loglikelihood Adjustment for Generalized Linear Models

Description Usage Arguments Details Value References See Also Examples

cglm has the same functionality as glm except that cluster correlated data can be analyzed using the methodology of Chandler and Bate (2007), implemented by the chandwich package.

cglm(formula, family = gaussian, data, weights, subset, na.action,
  start = NULL, etastart, mustart, offset, control = list(...),
  model = TRUE, method = "glm.fit", x = FALSE, y = TRUE,
  contrasts = NULL, cluster = NULL, use_alg = TRUE, ...)

`formula`	an object of class `"formula"` (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under ‘Details’.
`family`	a description of the error distribution and link function to be used in the model. For `glm` this can be a character string naming a family function, a family function or the result of a call to a family function. For `glm.fit` only the third option is supported. (See `family` for details of family functions.)
`data`	an optional data frame, list or environment (or object coercible by `as.data.frame` to a data frame) containing the variables in the model. If not found in `data`, the variables are taken from `environment(formula)`, typically the environment from which `glm` is called.
`weights`	an optional vector of ‘prior weights’ to be used in the fitting process. Should be `NULL` or a numeric vector.
`subset`	an optional vector specifying a subset of observations to be used in the fitting process.
`na.action`	a function which indicates what should happen when the data contain `NA`s. The default is set by the `na.action` setting of `options`, and is `na.fail` if that is unset. The ‘factory-fresh’ default is `na.omit`. Another possible value is `NULL`, no action. Value `na.exclude` can be useful.
`start`	starting values for the parameters in the linear predictor.
`etastart`	starting values for the linear predictor.
`mustart`	starting values for the vector of means.
`offset`	this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be `NULL` or a numeric vector of length equal to the number of cases. One or more `offset` terms can be included in the formula instead or as well, and if more than one is specified their sum is used. See `model.offset`.
`control`	a list of parameters for controlling the fitting process. For `glm.fit` this is passed to `glm.control`.
`model`	a logical value indicating whether model frame should be included as a component of the returned value.
`method`	the method to be used in fitting the model. The default method `"glm.fit"` uses iteratively reweighted least squares (IWLS): the alternative `"model.frame"` returns the model frame and does no fitting. User-supplied fitting functions can be supplied either as a function or a character string naming a function, with a function which takes the same arguments as `glm.fit`. If specified as a character string it is looked up from within the stats namespace.
`x`	For `glm`: logical values indicating whether the response vector and model matrix used in the fitting process should be returned as components of the returned value. For `glm.fit`: `x` is a design matrix of dimension `n * p`, and `y` is a vector of observations of length `n`.
`y`	For `glm`: logical values indicating whether the response vector and model matrix used in the fitting process should be returned as components of the returned value. For `glm.fit`: `x` is a design matrix of dimension `n * p`, and `y` is a vector of observations of length `n`.
`contrasts`	an optional list. See the `contrasts.arg` of `model.matrix.default`.
`cluster`	A vector or factor indicating from which cluster the respective loglikelihood contributions from `loglik` originate. Must have the same length as the vector returned by `loglik`. If `cluster` is not supplied then it is set inside `adjust_loglik` under the assumption that each observation forms its own cluster.
`use_alg`	A logical scalar. Whether or not to provide to `adjust_loglik` via the arguments `alg_deriv` and `alg_hess` functions to evalulate the derivatives (scores) of the loglikelihood with respect to the parameters and the Hessian of the loglikelihood respectively.
`...`	For `glm`: arguments to be used to form the default `control` argument if it is not supplied directly. For `weights`: further arguments passed to or from other methods.

Three adjustments to the independence loglikelihood described in Chandler and Bate (2007) are available. The vertical adjustment is described in Section 6 and two horizontal adjustments are described in Sections 3.2 to 3.4. See the descriptions of type and, for the horizontal adjustments, the descriptions of C_cholesky and C_spectral, in the documentation of adjust_loglik.

The adjustments involve first and second derivatives of the loglikelihood with respect to the model parameters. If use_alg = FALSE then these are estimated using jacobian and optimHess.

If the argument cluster is not provided then the adjustment to the loglikelihood is based on a robust sandwich estimator for the covariance matrix of the parameter estimators, under the assumption that each observation forms its own cluster.

If you only wish to adjust parameter covariance matrices, that is, you are not interested in an adjusted loglikelihood, then the sandwich package will provide equivalent results and is more flexible than chandwichGLM. The vignettes in the sandwich package give a very clear account of this general area.

An object of class "chandwich", returned by adjust_loglik. The function summary (i.e. summary.chandwich) can be used to obtain or print a summary of the results. The function conf_intervals can be used to calculate confidence intervals for each parameter. plot.confint can be used to illustrate the an individual confidence interval in a plot. The function conf_region, and its plot method plot.confreg, can be used to plot confidence regions for pairs of parameters.

Chandler, R. E. and Bate, S. (2007). Inference for clustered data using the independence loglikelihood. Biometrika, 94(1), 167-183. http://dx.doi.org/10.1093/biomet/asm015

glm, chandwich, adjust_loglik.

### Section 5.2 of sandwich vignette at
# https://cran.r-project.org/web/packages/sandwich/vignettes/sandwich-OOP.pdf

## Load the dataset Affairs from the package AER
data("Affairs", package = "AER")

## Fit Binomial regression model
fm_probit <- glm(I(affairs > 0) ~ age + yearsmarried + religiousness +
                 occupation + rating, data = Affairs,
                 family = binomial(link = "probit"))

# Fit the model using clgm(), adjusting the log-likelihood
# The default cluster is used: adjusted SEs viewed as model-robust SEs
c_probit <- cglm(I(affairs > 0) ~ age + yearsmarried + religiousness +
                occupation + rating, data = Affairs,
                family = binomial(link = "probit"))

library(sandwich)
## Check that
# (1) the unadjusted SEs are close to those from chandwichGLM
# (2) the adjusted SEs from sandwich are close to those from chandwichGLM
coeftest(fm_probit)
coeftest(fm_probit, vcov = sandwich)
summary(c_probit)

## Extra facilities provided by chandwich

# Confidence intervals based on the adjusted log-likelihood
cints <- chandwich::conf_intervals(c_probit)
cints
plot(cints, which_par = "occupation")

# Confidence region for a pair of parameters (slow to run)
conf <- chandwich::conf_region(c_probit, which_pars = 2:3)
plot(conf, conf = c(50, 75, 95, 99))

# Adjusted likelihood ratio test to compare nested models
# The p-value is different from, but in this case very similar to,
# the p-value from the Wald test from coeftest() above
chandwich::compare_models(c_probit, fixed_pars = "occupation")

### Poisson GLM

## Example from the help file for stats::glm()
## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
d.AD <- data.frame(treatment, outcome, counts)

glm.D93 <- glm(counts ~ outcome + treatment, family = poisson)
cglm1 <- cglm(counts ~ outcome + treatment, family = poisson)
cglm2 <- cglm(counts ~ outcome + treatment, family = poisson,
              use_alg = FALSE)
summary(glm.D93)
summary(cglm1)
summary(cglm2)

# Confidence intervals for each parameter
cints <- chandwich::conf_intervals(cglm1)
cints
plot(cints, which_par = 2)

# A confidence region for a pair of parameters
conf <- chandwich::conf_region(cglm1, which_pars = 1:2)
plot(conf, conf = c(50, 75, 95, 99))

### Binomial GLM

# Response vector (0/1 numeric or, in this case, a yes/no factor)
w <- glm(degree ~ religion + gender + age, data = carData::WVS,
         family = binomial)
wc <- cglm(degree ~ religion + gender + age, data = carData::WVS,
           family = binomial)
summary(w)
summary(wc)

# Response (two-column) matrix (number of successes, number of failures)

# [Temporary example from the internet]

cuse <- read.table("http://data.princeton.edu/wws509/datasets/cuse.dat",
                   header=TRUE)
lrfit <- glm(cbind(using, notUsing) ~ age + education + wantsMore,
             family = binomial, data = cuse)

NianchuanChen/chandwichGLM documentation built on May 10, 2019, 3:15 a.m.

NianchuanChen/chandwichGLM index

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

NianchuanChen/chandwichGLM
Chandler-Bate Sandwich Loglikelihood Adjustment for Generalized Linear Models

cglm: Fit Generalized Models to cluster correlated data
In NianchuanChen/chandwichGLM: Chandler-Bate Sandwich Loglikelihood Adjustment for Generalized Linear Models

Description

Usage

Arguments

Details

Value

References

See Also

Examples

Related to cglm in NianchuanChen/chandwichGLM...

R Package Documentation

Browse R Packages

We want your feedback!

NianchuanChen/chandwichGLM Chandler-Bate Sandwich Loglikelihood Adjustment for Generalized Linear Models

cglm: Fit Generalized Models to cluster correlated data In NianchuanChen/chandwichGLM: Chandler-Bate Sandwich Loglikelihood Adjustment for Generalized Linear Models

Description

Usage

Arguments

Details

Value

References

See Also

Examples

Related to cglm in NianchuanChen/chandwichGLM...

R Package Documentation

Browse R Packages

We want your feedback!

NianchuanChen/chandwichGLM
Chandler-Bate Sandwich Loglikelihood Adjustment for Generalized Linear Models

cglm: Fit Generalized Models to cluster correlated data
In NianchuanChen/chandwichGLM: Chandler-Bate Sandwich Loglikelihood Adjustment for Generalized Linear Models