drm: Combined regression and association models for clustered...

In drm: Regression and association models for repeated categorical data


drm fits a combined regression and association model for longitudinal or otherwise clustered categorical responses, using the dependence ratio as the measure of association.

drm(formula, family=binomial, data=sys.parent(), weights, offset,
subset=NULL, na.action, start=NULL, link="cum", dep="I", Ncond=TRUE,
Lclass=2, dropout=FALSE, drop.x=NULL, save.profiles=TRUE, pmatrix=NULL,
print.level=2, iterlim=200, ...)

`formula`	a formula expression as for other regression models. In addition the cluster term has to be specified in the expression by `cluster()` and if using temporal association structure the temporal term has to be specified by `Time()`. See examples below and the documentation of `lm` and `formula` for further details.
`family`	a description of the link function to be used in the model for a binary response. Default is logit link. See `family` for details. For an ordinal response, link is defined for the cumulative probabilities when `link`-argument is set to "cum". See `link` below.
`data`	an optional data frame containing the variables in the model.
`weights`	an optional vector of weights to be used in the fitting process. Only equal weights within cluster are allowed.
`offset`	this can be used to specify an a priori known component to be included in the linear predictor during fitting.
`subset`	an optional vector specifying a subset of observations to be used in the fitting process.
`na.action`	a function which indicates what should happen when the data contain NAs. The default is `na.include`, after which the analysis assumes that the data are missing at random (MAR) if `dropout=FALSE`, and not at random if `dropout = TRUE`. See `dropout` below.
`start`	an optional vector of starting values for the parameters. By default, the starting values are estimated with the `glm` procedure, assuming independence.
`link`	this can be used to specify alternative link functions for nominal and ordinal responses. By default "cum", after which the link is specified through `family = binomial(link=?)` for the cumulative probabilities. Alternative links include adjacent category logit "acl" and baseline category logit "bcl" (baseline category being the last category). For "bcl", the regression parameters are estimated for each logit level. For a binary response, this argument is ignored.
`dep`	`dep` defines the association structure. The default is independence "I". Other singular options are for the exchangeable association: Necessary factor "N", Latent categorical factor "L", Latent Beta-distributed propensity "B" (binary response), Latent Dirichlet-distributed propensities "D" (multicategorical response), and for the temporal association: first order Markov "M", and second order Markov "M2" (binary response). By default, Markov structure for the adjacent 2-way dependence ratios is assumed to be stationary. Superpositions of these structures can be imposed, such as "NL", "NB","ND", "NM", "LM", "NLM","NM2". See [3-7] for further details. Parameter restrictions, covariates and functional forms for the association parameters can also be specified. In that case the `dep`-argument must be a list. See examples below. For the interpretation of the association parameters, see the documentation of the support function `getass.drm`.
`Ncond`	logical argument defining whether the regression model is marginal or conditional when the association is "N". The default is `TRUE`, i.e. the regression estimation is conditional on {N=1}. If covariates are used for the "N"-association, it is advisable to set `Ncond=FALSE`, since otherwise the interpretation of the regression parameters is less clear.
`Lclass`	number of latent classes in the population when the association is "L". Default is 2. Available only for binary response. Note that in the current implementation, the conditional probabilities are not calculated for `Lclass`>2. To check the validity of the model, the user needs to check whether the estimated conditional probabilities fall within 0 and 1. See the example in `getass.drm` for parameter interpretation and how to calculate the conditional probabilities.
`dropout`	logical argument. For monotone missing patterns in longitudinal studies, this argument allows a selection model (see [8] for details) to be imposed on top of the regression and association model, to investigate the sensitivity of the results to missingness. The model formula notation is: `logit(hz(drop.cur)) = (Intercept)d+response.cur+response.prev`, where `response.cur` denotes the effect of the current, possibly missing response value and `response.prev` denotes the effect of the previous response value. MCAR, MAR and MNAR models can be specified by imposing restrictions on the selection model parameters in the `dep` argument, as for the association parameters. See `dep` above and examples below. If the response is a factor, the effects of the factor levels are estimated relative to the lowest level.
`drop.x`	an optional covariate vector for the selection model. The covariate's previous value (notation: `covariate.prev`) is used in the selection model.
`save.profiles`	logical argument defining whether the fitted values of all possible profiles are saved. If `FALSE`, only the indicator vector (-1 for a negative, 1 for a positive profile) over all units will be saved. If the cluster size is large, using `save.profiles=TRUE` may result in a very large object.
`pmatrix`	a character object specifying the name of the matrix for all possible profiles, created using `profiles.drm`. If the cluster size is large, this speeds up the estimation in case several models are fitted. See examples below.
`print.level`	level of printing during numerical optimisation. The default is 2. See `nlm` for further details.
`iterlim`	maximum iteration limit for the numerical maximisation. See `nlm` for further details.
`...`	other arguments passed to `nlm`, e.g. controlling the convergence. See `nlm` for further details.
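
The selection model described under `dropout` can be sketched in base R. The coefficient values below are hypothetical and chosen only for illustration, and `dropout_hazard` is not part of the package; the sketch simply evaluates the stated formula for a binary response under the MCAR, MAR and MNAR restrictions used in the examples.

```r
# Hypothetical helper (not part of drm): dropout hazard on the logit scale,
# logit(hz) = intercept + b_cur * y_cur + b_prev * y_prev
dropout_hazard <- function(intercept, b_cur, b_prev, y_cur, y_prev) {
  plogis(intercept + b_cur * y_cur + b_prev * y_prev)
}

# All combinations of previous and current binary response
grid <- expand.grid(y_prev = 0:1, y_cur = 0:1)

# Illustrative coefficient values only; these are not estimates
mcar <- with(grid, dropout_hazard(-4, 0, 0, y_cur, y_prev))  # both effects fixed at 0
mar  <- with(grid, dropout_hazard(-4, 0, 1, y_cur, y_prev))  # current effect fixed at 0
mnar <- with(grid, dropout_hazard(-4, 1, 1, y_cur, y_prev))  # no restrictions

cbind(grid, mcar, mar, mnar)
```

Under MCAR the hazard is constant, under MAR it varies only with the previous (observed) response, and under MNAR it also varies with the current, possibly unobserved, response.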

drm gives maximum likelihood estimates for the combined regression and association model by decomposing the joint probability of the responses in a cluster into univariate marginal or cumulative probabilities and dependence ratios of all orders. See [1] and [5] for further details. The dimensionality of the association part is reduced by imposing a model for the association structure through the dep argument. See getass.drm and [3-7] for details. Furthermore, a selection model can be added on top of the regression and association model. See the examples below and [5] and [8] for details.
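
As a rough illustration of the quantity being modelled, the second-order dependence ratio of two binary responses is the joint success probability divided by the product of the marginal success probabilities, so a value of 1 corresponds to independence. The base R sketch below estimates it empirically; the helper `dep_ratio` and the simulated data are assumptions made for illustration and are not the package's `depratio` function.

```r
# Hypothetical helper: empirical second-order dependence ratio of two 0/1 vectors,
# P(Y1 = 1, Y2 = 1) / (P(Y1 = 1) * P(Y2 = 1))
dep_ratio <- function(a, b) mean(a * b) / (mean(a) * mean(b))

# Simulated clusters of size two that share a Beta-distributed propensity,
# which induces positive within-cluster association
set.seed(1)
n  <- 20000
p  <- rbeta(n, 2, 6)
y1 <- rbinom(n, 1, p)
y2 <- rbinom(n, 1, p)

# For a shared Beta(a, b) propensity the theoretical value is
# E[p^2] / E[p]^2 = (a + 1)(a + b) / (a (a + b + 1)) = 4/3 here
tau12 <- dep_ratio(y1, y2)
tau12

# A perfectly balanced, independent layout gives exactly 1
dep_ratio(c(1, 1, 0, 0), c(1, 0, 1, 0))
```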

drm returns an object of class drm. The function summary (i.e., summary.drm) can be used to obtain or print a summary of the results. The generic accessor function coefficients can be used to extract coefficients.

An object of class drm is a list containing at least the following components:

`coefficients`	a named vector of regression, and possibly association and selection model coefficients.
`cov.scaled`	a variance-covariance matrix of the parameter estimates.
`fitted.marginals`	the fitted values for the univariate means, obtained by transforming the linear predictors by the inverse of the link function.
`fitted.conditionals`	in case of "L"-structure, the fitted values for the conditional univariate means, otherwise NULL. Not yet implemented for `Lclass`>2; see also `getass.drm`.
`fitted.profiles`	the fitted response profile probabilities within each cluster, calculated by using the maximum likelihood estimates from the model. See also `save.profiles` above. Note that within each cluster, the order of the responses is by Time for Markov structures, and for exchangeable structures with missing values, by response value, with missing values (NA) last.
`deviance`	minus twice the maximised log-likelihood.
`aic`	An Information Criterion: minus twice the maximised log-likelihood plus twice the number of coefficients. Not available if the likelihood is weighted with the dropout probabilities.
`niter`	the number of iterations that `nlm` used.
`code`	convergence code from `nlm`. See `nlm` for details.
`call`	the matched call.
`terms`	the ‘terms’ object used.
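
The components above are enough to build a Wald table and to reproduce the information criterion by hand. The sketch below uses a mock list rather than a real fit: the object `fit` and every number in it are invented for illustration, and only the component names are taken from the list above.

```r
# Mock object carrying the component names listed above; all values are invented
fit <- list(
  coefficients = c("(Intercept)" = -1.2, age = 0.30, smoking = 0.25),
  cov.scaled   = diag(c(0.04, 0.01, 0.0625)),
  deviance     = 512.4
)

est <- fit$coefficients
se  <- sqrt(diag(fit$cov.scaled))   # standard errors from the covariance matrix

# Wald z statistics and two-sided normal p-values
wald <- cbind(estimate  = est,
              std.error = se,
              z         = est / se,
              p.value   = 2 * pnorm(-abs(est / se)))
wald

# Minus twice the maximised log-likelihood plus twice the number of coefficients
aic <- fit$deviance + 2 * length(est)
aic
```

With a real fitted object, the same expressions would apply to its `coefficients`, `cov.scaled` and `deviance` components, assuming the likelihood is not weighted with dropout probabilities.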

The maximum likelihood estimates may sometimes lead to negative fitted probabilities, in which case both generic print methods issue a warning. The model is then considered to be misspecified, and its specification should be changed.
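
To see how negative fitted probabilities can arise, consider two binary responses with marginal success probabilities mu1 and mu2 and dependence ratio tau. The joint success probability is tau * mu1 * mu2, and the remaining profile probabilities follow by subtraction, so a dependence ratio that is too large for the given marginals forces a negative value. The helper `pair_profiles` below is a hypothetical illustration in base R, not a function of the package.

```r
# Hypothetical helper: profile probabilities of a pair of binary responses
# implied by two marginals and a second-order dependence ratio
pair_profiles <- function(mu1, mu2, tau) {
  p11 <- tau * mu1 * mu2
  c("11" = p11,
    "10" = mu1 - p11,
    "01" = mu2 - p11,
    "00" = 1 - mu1 - mu2 + p11)
}

ok  <- pair_profiles(0.3, 0.4, 1.5)  # p11 = 0.18, all profiles within [0, 1]
bad <- pair_profiles(0.3, 0.4, 3.0)  # p11 = 0.36 exceeds mu1 = 0.3

ok
bad
any(bad < 0)  # the "10" profile is negative, so this specification is invalid
```

Both vectors still sum to one; only the admissible range of the individual profile probabilities is violated in the second case.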

Jukka Jokinen, jukka.jokinen@helsinki.fi

1. Ekholm A, Smith PWF, McDonald JW. Marginal regression analysis of a multivariate binary response. Biometrika 1995; 82(4):847-854.

2. Ekholm A, Skinner C. The Muscatine children's obesity data reanalysed using pattern mixture models. Applied Statistics 1998; 47:251-263.

3. Ekholm A, McDonald JW, Smith PWF. Association models for a multivariate binary response. Biometrics 2000; 56:712-718.

4. Ekholm A, Jokinen J, Kilpi T. Combining regression and association modelling on longitudinal data on bacterial carriage. Statistics in Medicine 2002; 21:773-791.

5. Ekholm A, Jokinen J, McDonald JW, Smith PWF. Joint regression and association modelling of longitudinal ordinal data. Biometrics 2003; 59:795-803.

6. Jokinen J, McDonald JW, Smith PWF. Meaningful regression and association models for clustered ordinal data. Sociological Methodology 2006; 36:173-199.

7. Jokinen J. Fast estimation algorithm for likelihood-based analysis of repeated categorical responses. Computational Statistics and Data Analysis 2006; 51:1509-1522.

8. Diggle PJ, Kenward MJ. Informative dropout in longitudinal data analysis. Applied Statistics 1994; 43:49-94.

getass.drm, nlm, cluster, Time, profiles.drm, depratio

############################################
## Examples for binary responses
############################################
## Wheeze among Steubenville children (see [3]):
## Latent Beta-distributed propensity
data(wheeze)
fit1 <- drm(wheeze~I(age>9)+smoking+cluster(id),data=wheeze,dep="B", print=0)

## Obesity among Muscatine children (see [2]):
## Analysis for completers: M2 for girls
data(obese)
fit2 <- drm(obese~age+cluster(id)+Time(year), subset=sex=="female",
            dep="M2",data=obese)

## Not run: 
## Muscatine children continued (see [3]):
## LM for boys and girls separately
fit3 <- drm(obese~age+cluster(id)+Time(age), subset=sex=="male",
            dep="LM",data=obese)

fit4 <- drm(obese~age+cluster(id)+Time(age), subset=sex=="female",
            dep="LM",data=obese)

## End(Not run)
############################################
## Examples for ordinal responses
############################################
## Movie critic example (see [6]):
## Latent Dirichlet propensities with baseline category link.
data(movie)

options(contrasts=c("contr.treatment","contr.treatment"))
fit5 <- drm(y~critic+cluster(movie), data=movie, dep="D", link="bcl")

## Longitudinal dataset on teenage marijuana use (see [6]):
## Superposition of structures N, L and M for the girls.
data(marijuana)

fit6 <- drm(y~age+cluster(id)+Time(age), data=marijuana,
            subset=sex=="female", dep=list("NLM", ~kappa1==1,
            ~kappa2==0, ~tau12==1, ~tau21==1, ~tau11==tau22))

## Parameter restrictions with functions using M-structure for the boys.
## Plot the second order dependence ratios:
plot(depratio(y~cluster(id)+Time(age), data=marijuana,
     subset=sex=="male"))

## fit the model in [6]:
fit7 <- drm(y~age+cluster(id)+Time(age), data=marijuana,
            subset=sex=="male", dep=list("M", 
            tau12~function(a=1,b=0) a+b*c(0:3),
            tau21~function(a=1,b=0) a+b*c(0:3)))

## Not run: 
##############################################
## Covariates for the association (see [7]):
##############################################
data(madras)

## plot empirical 2nd order dependence ratios with bootstrap CI's
tau.madras <- depratio(symptom~cluster(id)+Time(month), data=madras,
                       boot.ci = TRUE, n.boot = 1000)
plot(tau.madras, log="y", ylim=c(1,40), plot.ci=TRUE)

## create matrix for profiles:
W.madras <- profiles.drm(n.categories=2, n.repetitions=12, "M")

## create four-level covariate, combining age and sex:
madras$age.sex <- factor(paste(madras$age,madras$sex,sep="."))

## fit the model in [7], Section 4:
fit8 <- drm(symptom~age+sex+month+month:age+month:sex+cluster(id)+Time(month),
            data=madras, Ncond=FALSE, save.profiles=FALSE, pmatrix="W.madras",
            dep=list("NM",nu~nu:age.sex,
                     tau~function(a0=0,a1=0) 1+a0*exp(a1*c(0:10))), print=2)

###################################################
## Dropout model on top of regression & association 
###################################################
## Continue with the madras data.
## fit a model without the dropout model:
fit9 <- drm(symptom~age+sex+month+month:age+month:sex+cluster(id)+Time(month),
            data=madras, save.profiles=FALSE, pmatrix="W.madras", print=0,
            dep=list("NM", tau~function(a0=0,a1=0) 1+a0*exp(a1*c(0:10))))

## A dropout model assuming MCAR for the thought disorders:

mcar <- drm(symptom~age+sex+month+month:age+month:sex+cluster(id)+Time(month),
            data=madras, save.profiles=FALSE, pmatrix="W.madras",
            dep=list("NM", tau~function(a0=0,a1=0) 1+a0*exp(a1*c(0:10)),
                     ~symptom.cur==0,~symptom.prev==0),
            dropout=TRUE, start=c(coef(fit9), -4))

## A dropout model assuming MAR; including sex as a covariate:

mar <- drm(symptom~age+sex+month+month:age+month:sex+cluster(id)+Time(month),
           data=madras, save.profiles=FALSE, pmatrix="W.madras",
           dep=list("NM", tau~function(a0=0,a1=0) 1+a0*exp(a1*c(0:10)),
                    ~symptom.cur==0), dropout=TRUE, drop.x=sex,
           start=c(coef(mcar),0,0))

## A dropout model assuming MNAR and sex as a covariate:

mnar <- drm(symptom~age+sex+month+month:age+month:sex+cluster(id)+Time(month),
            data=madras, save.profiles=FALSE, pmatrix="W.madras",
            dep=list("NM", tau~function(a0=0,a1=0) 1+a0*exp(a1*c(0:10))),
            dropout=TRUE, drop.x=sex, start=c(coef(mcar),0,0,0))

## print out coefficients and std.errors:
coef(summary(mnar))

## End(Not run)
## std.error of `symptom.cur' all over the place; too few dropouts
## for a comprehensive evaluation of the dropout mechanism
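
The functional forms passed through `dep` in the examples above are ordinary R functions of the association parameters. The sketch below evaluates the form used for `tau` in the madras models outside of `drm`; the parameter values are hypothetical, and reading the 11 returned values as one dependence ratio per adjacent pair of the 12 monthly occasions is an assumption based on `n.repetitions=12`.

```r
# Same functional form as in the madras examples: 1 + a0 * exp(a1 * t), t = 0, ..., 10
tau_fun <- function(a0 = 0, a1 = 0) 1 + a0 * exp(a1 * c(0:10))

# At the default argument values every element equals 1
tau_default <- tau_fun()
tau_default

# Hypothetical parameter values: the ratio starts at 1 + a0 and decays towards 1
tau_decay <- tau_fun(a0 = 5, a1 = -0.3)
round(tau_decay, 3)
```

A negative `a1` therefore describes association between adjacent occasions that weakens over the course of follow-up, while `a0` sets its initial excess over 1.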

drm documentation built on May 29, 2017, 7:24 p.m.
