cvam | R Documentation |
Fits log-linear models to categorical variables by three methods: maximizing the loglikelihood or log-posterior density by Expectation-Maximization (EM) algorithms, simulating the posterior distribution by a Markov chain Monte Carlo (MCMC) algorithm, and creating random draws of parameters from an approximate Bayesian posterior distribution. The factors in the model may have missing or coarsened values.
cvam(obj, ...)

## S3 method for class 'formula'
cvam(obj, data, freq, weight, subPop, stratum, cluster,
    nest = FALSE, prior = cvamPrior(),
    method = c("EM", "MCMC", "approxBayes", "mfSeen", "mfTrue",
        "mfPrior", "modelMatrix"),
    control = list(), omitData = FALSE, saturated = FALSE,
    modelMatrix = NULL, offset = NULL, strZero = NULL,
    startVal = NULL, estimate = NULL, ...)

## S3 method for class 'cvam'
cvam(obj, method = obj$method, control = NULL, startVal = NULL,
    estimate = NULL, ...)
obj |
an object used to select a method: either a model
formula or the result from a previous call to |
data |
an optional data frame, list or environment (or object
coercible to a data frame by |
freq |
an optional variable for holding integer frequencies when the
observations are grouped. If |
weight |
an optional numeric variable containing survey weights, which
are used when computing pseudo-maximum likelihood (PML) estimates
from survey data. If |
subPop |
an optional logical variable indicating membership in a subpopulation for computing PML estimates from survey data. |
stratum |
an optional factor variable indicating the sampling stratum to which a unit belongs, used when computing linearized variance estimates for parameter estimates under a with-replacement (WR) survey design; see DETAILS. |
cluster |
an optional factor variable indicating the primary (first-stage) sampling cluster to which a unit belongs, used when computing linearized variance estimates for parameters under a with-replacement (WR) survey design; see DETAILS. |
nest |
if TRUE, duplicate values of the cluster variable appearing in different strata are assumed to refer to different clusters. |
prior |
an object produced by |
method |
a procedure for fitting the model:
|
control |
a named list containing control parameters which are
passed to |
omitData |
if |
saturated |
if |
modelMatrix |
an optional model matrix that defines the
log-linear model. In ordinary circumstances, |
offset |
an optional numeric vector of length
|
strZero |
an optional logical vector of length
|
startVal |
an optional vector of starting values for the model
parameters. If |
estimate |
an optional formula or list of formulas of the kind
expected by |
... |
values to be passed to the methods. |
A log-linear model is specified by a one-sided formula that determines which associations among the variables are allowed. For example, ~ A + B + C implies that A, B and C are mutually independent; ~ A*B + A*C implies that B and C are conditionally independent given A; and so on. Variables in a model may be factors or coarsened factors, and missing values are permitted. All models are fit using a surrogate Poisson formulation which is appropriate for Poisson, multinomial or product-multinomial sampling. A formula may contain a vertical bar to specify variables to be regarded as fixed; for example, ~ A*B + A*C | A fixes the variable A. Fixing variables does not change the model fitting procedure; the only difference is that, after the model has been fit, the cell probabilities are scaled to sum to one within every combination of levels of the fixed variables.
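The rescaling that follows from fixing a variable can be illustrated with base R alone. The sketch below is not cvam's internal code: it uses the observed proportions in the built-in UCBAdmissions table as a stand-in for fitted cell probabilities, and treats Dept as the fixed variable purely for illustration.

```r
# Base R only; observed proportions stand in for fitted cell probabilities.
tab <- UCBAdmissions                    # three-way table: Admit x Gender x Dept

# Joint cell proportions: the whole table sums to one.
joint <- prop.table(tab)

# Conditional proportions given Dept (the third dimension):
# each Dept slice is rescaled so that it sums to one on its own.
cond <- prop.table(tab, margin = 3)

sum(joint)              # total over all cells
apply(cond, 3, sum)     # one total per level of Dept
```

With Dept fixed, the quantities in each slice are read as probabilities of Admit and Gender given Dept rather than as joint probabilities over all three variables.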
If cvam is called with a cvam object as its first argument, then the data, model and prior distribution will be taken from the previous run, and (unless startVal is supplied) starting values will be set to the final parameter values from the previous run.
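A minimal sketch of this restart behavior, assuming the cvam package is installed and using the crime data from the package's own examples. The object name fitAgain is hypothetical.

```r
library(cvam)

# Initial EM fit; method defaults to "EM".
fit <- cvam(~ V1 * V2, data = crime, freq = n)

# Passing the fitted object back in reuses its data, model and prior.
# Because startVal is not supplied, the new run starts from the final
# parameter values of the previous run.
fitAgain <- cvam(fit)

class(fitAgain)
```

The same pattern is how a run with a different method is launched from an existing fit, by adding the method argument to the second call.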
If method is "EM" and survey weights are supplied through weight, then cvam performs pseudo-maximum likelihood (PML) estimation. The target of PML is the set of parameters that would be obtained if the given model were fit to all units in the finite population (or, if subPop is given, the subpopulation). If saturated=FALSE, then standard errors for log-linear coefficients are computed using a linearization method that assumes the first stage of sampling within strata was carried out with replacement (WR). Although WR sampling is rarely done in actual surveys, it is often assumed for variance estimation, and if the first-stage sampling was actually done without replacement the resulting standard errors tend to be conservative. The WR survey design information is provided through weight, stratum and cluster. The stratum and cluster variables are coerced to factors. If stratum is omitted, then the population is regarded as a single stratum. If cluster is omitted, then each sample unit is treated as a cluster.
If method is "EM", "MCMC" or "approxBayes", an object of class c("cvam","list") containing the results of a model fit. For other values of method, the requested object is returned without fitting a model.
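As a hedged illustration of the non-fitting methods, the sketch below requests the model matrix for the conditional-independence model on the UCBAdmissions data. It assumes the cvam package is installed. The object name mm is hypothetical, and the exact structure of the returned object is not described in this page, so the code only inspects it.

```r
library(cvam)

dF <- as.data.frame(UCBAdmissions)

# method = "modelMatrix" asks for the model matrix itself;
# according to the text above, no model is fit in this case.
mm <- cvam(~ Dept*Gender + Dept*Admit, data = dF, freq = Freq,
           method = "modelMatrix")

# Inspect what came back without assuming its exact layout.
class(mm)
dim(mm)
```

Inspecting the returned object this way can be a useful check that a formula defines the intended model before running EM or MCMC.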
Joe Schafer Joseph.L.Schafer@census.gov
Extended descriptions and examples for all major functions are provided in two vignettes, Understanding Coarsened Factors in cvam and Log-Linear Modeling with Missing and Coarsened Values Using the cvam Package.
coarsened, cvamPrior, cvamControl, cvamEstimate, get.coef, summary.cvam
# convert U.C. Berkeley admissions three-way table to data frame,
# fit model of conditional independence, display summary
# compare the fit to the saturated model
dF <- as.data.frame(UCBAdmissions)
fit <- cvam( ~ Dept*Gender + Dept*Admit, data=dF, freq=Freq )
summary(fit)
fitSat <- cvam( ~ Dept*Gender*Admit, data=dF, freq=Freq )
anova(fit, fitSat, pval=TRUE)

# fit non-independence model to crime data; then run MCMC for
# 5000 iterations, creating 10 multiple imputations of the frequencies
# for the 2x2 complete-data table
fit <- cvam( ~ V1 * V2, data=crime, freq=n )
set.seed(56182)
fitMCMC <- cvam(fit, method="MCMC",
   control=list( iterMCMC=5000, imputeEvery=500) )
get.imputedFreq(fitMCMC)