mi.eval: Multiple-imputation evaluation
In bucky: Bucky's Archive for Data Analysis in the Social Sciences

View source: R/mult-imp.R

mi.eval

R Documentation

Multiple-imputation evaluation

Description

Evaluation of an expression across multiply imputed data sets.

Usage

mi.eval(EXPR, robust, cluster, coef., vcov., df.=NULL,
parallel=FALSE, lazy=NULL, ...)

Arguments

`EXPR`	An R expression to evaluate. This expression must contain a `data` argument that specifies either a list containing the imputed data sets or an object of class `amelia`, `mids`, or `imputationList`.
`robust`	Whether to use Huber-White robust standard errors. The default is `TRUE` if `cluster` is specified and `FALSE` otherwise.
`cluster`	A vector specifying clusters for the purpose of computing clustered robust standard errors. This can be a variable inside the imputed data set. If unspecified, standard errors are not clustered. If specified, `robust` cannot be `FALSE`.
`coef.`	The function used to get a numeric vector of coefficient estimates when evaluated on an object returned from evaluating `EXPR` for each data set. The default is to use `coef`.
`vcov.`	The function that returns a numeric matrix giving the variance-covariance matrix when evaluated on an object returned from evaluating `EXPR` for each data set. The default is to use `vcovCR` if `cluster` is specified, `vcovHC` if `robust=TRUE` and `cluster` is not specified, and `vcov` otherwise.
`df.`	Either the degrees of freedom for each model or a function that calculates degrees of freedom on an object returned from evaluating `EXPR` for each data set. The default value of `NULL` uses the minimum result of applying `df.residual`, provided that it returns a numeric value when applied to the object returned by `EXPR` and that this object is not of class `glm`; otherwise it uses `Inf`.
`parallel`	A logical indicating whether to evaluate `EXPR` across data sets in parallel using `mclapply`. Otherwise, evaluation is done serially using `lapply`. `NULL` means to use parallel evaluation if and only if the 'parallel' package can be loaded and `getOption("mc.cores", detectCores()-1L)` is greater than 1.
`lazy`	A logical indicating whether to use lazy evaluation to avoid copying all imputed data sets into memory. When the `data` argument to `EXPR` generates the multiply imputed data set, this is generally a bad idea because it means redoing the imputation multiple times. The default value of `NULL` means to use lazy evaluation if and only if the `data` argument to `EXPR` is a `name`.
`...`	Any additional arguments to be passed to `lapply` or `mclapply` when evaluating `EXPR` across data sets.
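
The default described for `df.` can be hard to parse from prose alone, so the following is a minimal base-R sketch of one reading of it. The helper `default_df` is hypothetical and is not part of 'bucky'; the package's actual implementation may differ in detail.

```r
## Hypothetical helper, not part of 'bucky': one reading of the documented
## default used when 'df.' is NULL.
default_df <- function(fits) {
    dfs <- lapply(fits, function(fit) {
        ## glm fits are treated as having infinite degrees of freedom
        if (inherits(fit, "glm")) return(Inf)
        ## Fall back to Inf when df.residual() gives no usable number
        d <- tryCatch(df.residual(fit), error = function(e) NULL)
        if (is.numeric(d) && length(d) == 1L && !is.na(d)) d else Inf
    })
    ## Take the minimum across the per-data-set fits
    min(unlist(dfs))
}

## Toy fits standing in for the per-data-set results of EXPR
fits_lm  <- list(lm(dist ~ speed, data = cars),
                 lm(dist ~ speed, data = cars[-1, ]))
fits_glm <- list(glm(am ~ wt, family = binomial, data = mtcars))

default_df(fits_lm)   # smallest residual df across the two lm fits
default_df(fits_glm)  # Inf, because the fits are glm objects
```

Supplying a number or a function as `df.` bypasses this default entirely.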

Details

This function evaluates an R command for each of several multiply imputed data sets and combines the results across data sets into a single set of estimates. This is similar to the functionality provided by with.mids, but it also works with multiply imputed data sets generated by other packages such as 'Amelia', as well as those from 'mice'.
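
To illustrate what combining results across data sets involves, here is a minimal base-R sketch that pools coefficient vectors and variance-covariance matrices with Rubin's rules. It assumes that the standard Rubin's rules are the combining method, which this page does not state explicitly. The helper `pool_rubin` and the toy "imputed" data sets are hypothetical and are not part of 'bucky'.

```r
## Hypothetical helper, not part of 'bucky': pool m sets of estimates
## using Rubin's rules (assumed, not confirmed, to match mi.eval).
pool_rubin <- function(coefs, vcovs) {
    m <- length(coefs)
    Q <- do.call(rbind, coefs)        # m x k matrix of estimates
    qbar <- colMeans(Q)               # pooled point estimates
    W <- Reduce(`+`, vcovs) / m       # within-imputation variance
    B <- if (m > 1) stats::cov(Q) else matrix(0, ncol(Q), ncol(Q))
    total <- W + (1 + 1 / m) * B      # total variance
    list(coefficients = qbar, vcov = total)
}

## Toy stand-in for imputed data sets: noisy copies of 'cars'
set.seed(1)
imputed <- lapply(1:5, function(i) {
    transform(cars, dist = dist + rnorm(nrow(cars)))
})

## Fit the same model to each data set, then pool
fits <- lapply(imputed, function(d) lm(dist ~ speed, data = d))
pooled <- pool_rubin(lapply(fits, coef), lapply(fits, vcov))

pooled$coefficients
sqrt(diag(pooled$vcov))               # pooled standard errors
```

The `coef.` and `vcov.` arguments of mi.eval play the role of the `coef` and `vcov` extractors used in this sketch.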

For generating formatted tables of regression coefficients, the returned objects should be compatible with the 'texreg' package. When used with lm, glm, or a few other types of models, these objects are also compatible with the 'stargazer' package.

Value

An object of class mi.estimates containing the coefficient estimates, variance-covariance matrix, and related information.

Examples

if (require("Amelia")) {
    ## Load data
    data(africa)
    africa$civlib <- factor(round(africa$civlib*6), ordered=TRUE)

    ## Estimate a linear model on the original (unimputed) data for comparison
    model0 <- lm(trade ~ log(gdp_pc), data=africa, subset=year==1973)
    summary(model0)

    ## Impute using Amelia
    a.out <- amelia(x = africa, cs = "country", ts = "year",
                    logs = "gdp_pc", ord="civlib")

    ## Estimate a linear model using imputed data sets
    model1 <- mi.eval(lm(trade ~ log(gdp_pc), data=a.out, subset=year==1973))

    ## Show estimates
    model1
    coef(model1)

    ## Show summary information
    summary(model1)

    if (require("MASS")) {
        ## Estimate an ordered logit model
        model2 <- mi.eval(polr(civlib ~ log(gdp_pc) + log(population),
                               data=a.out))
        summary(model2)

        ## Also show thresholds by including thresholds with coefficients
        model3 <- mi.eval(polr(civlib ~ log(gdp_pc) + log(population),
                               data=a.out),
                          coef.=function(x) c(x$coefficients, x$zeta))
        summary(model3)
    }
}

bucky documentation built on March 26, 2022, 1:12 a.m.