mice.impute.poisson: Multiple Imputation of Poisson Distributed Count Data

mice.impute.poisson    R Documentation

Multiple Imputation of Poisson Distributed Count Data

Description

Imputes univariate missing count data using a Poisson GLM, following either the Bayesian regression or the bootstrap regression (function suffix .boot) approach to multiple imputation.

Usage

mice.impute.poisson(y, ry, x, wy = NULL, EV = TRUE, ...)

mice.impute.poisson.boot(y, ry, x, wy = NULL, EV = TRUE, ...)

mice.impute.pois(y, ry, x, wy = NULL, EV = TRUE, ...)

mice.impute.pois.boot(y, ry, x, wy = NULL, EV = TRUE, ...)

Arguments

y

Numeric vector with incomplete data

ry

Response pattern of y (TRUE=observed, FALSE=missing)

x

Matrix with length(y) rows containing complete covariates

wy

Logical vector of length length(y). A TRUE value indicates locations in y for which imputations are created. Default is !ry

EV

Should automatic outlier handling of imputed values be enabled? Default is TRUE: extreme imputations are identified and replaced by imputations obtained by predictive mean matching (function mice.impute.midastouch()).

...

Other named arguments.

Details

A Poisson GLM assumes that the mean of the count variable is equal to its variance (equidispersion assumption). For details, see Zeileis, Kleiber, & Jackman (2008), or Hilbe (2007). The Bayesian method consists of the following steps:

  1. Fit the model, and find bhat, the posterior mean, and V(bhat), the posterior variance of model parameters b.

  2. Draw b.star from N(bhat,V(bhat)).

  3. Compute fitted values using exp(x[!ry, ] %*% b.star)

  4. Simulate imputations from a Poisson distribution with mean parameter lambda being the respective fitted value from step 3.
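The four steps above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package source: the simulated data, the object names (obs, X.mis, lambda, imps) and the explicit intercept column are hypothetical, and glm() with vcov() is used in place of glm.fit only because it gives direct access to the coefficient covariance matrix.

```r
## Illustrative data (assumption): two complete covariates, about 20% of y missing
set.seed(1)
n  <- 500
x  <- cbind(x1 = rnorm(n), x2 = rnorm(n))
y  <- rpois(n, exp(0.5 + 0.4 * x[, "x1"] - 0.2 * x[, "x2"]))
ry <- runif(n) > 0.2                     # TRUE = observed, FALSE = missing
y[!ry] <- NA

## Step 1: fit the Poisson GLM to the observed cases
obs  <- data.frame(y = y[ry], x[ry, , drop = FALSE])
fit  <- glm(y ~ x1 + x2, family = poisson, data = obs)
bhat <- coef(fit)                        # point estimates
V    <- vcov(fit)                        # their covariance matrix

## Step 2: draw b.star from N(bhat, V) via the Cholesky factor of V
b.star <- as.vector(bhat + t(chol(V)) %*% rnorm(length(bhat)))

## Step 3: fitted values for the incomplete cases (intercept column added by hand)
X.mis  <- cbind(1, x[!ry, , drop = FALSE])
lambda <- as.vector(exp(X.mis %*% b.star))

## Step 4: simulate imputations from Poisson(lambda)
imps  <- rpois(sum(!ry), lambda)
y.imp <- y
y.imp[!ry] <- imps
```

Drawing b.star rather than plugging in bhat is what propagates parameter uncertainty into the imputations; repeating steps 2 to 4 yields one set of imputations per completed data set.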

The function fits the model with the standard glm.fit function and the poisson family. The bootstrap method draws a bootstrap sample from y[ry] and x[ry, ] and consists of the following steps:

  1. Fit the model to the bootstrap sample and get model parameters b.star.

  2. Compute fitted values using exp(x[!ry, ] %*% b.star)

  3. Simulate imputations from a Poisson distribution with mean parameter lambda being the respective fitted value from step 2.
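A comparable sketch of the bootstrap variant is shown below, again as an illustration rather than the package source. The simulated data and the names boot, boot.dat, lambda and imps are hypothetical, and glm() stands in for glm.fit for brevity.

```r
## Illustrative data (assumption): two complete covariates, about 20% of y missing
set.seed(2)
n  <- 500
x  <- cbind(x1 = rnorm(n), x2 = rnorm(n))
y  <- rpois(n, exp(0.5 + 0.4 * x[, "x1"] - 0.2 * x[, "x2"]))
ry <- runif(n) > 0.2                     # TRUE = observed, FALSE = missing
y[!ry] <- NA

## Draw a bootstrap sample (with replacement) from the observed cases
idx.obs  <- which(ry)
boot     <- sample(idx.obs, length(idx.obs), replace = TRUE)
boot.dat <- data.frame(y = y[boot], x[boot, , drop = FALSE])

## Step 1: fit the Poisson GLM to the bootstrap sample
b.star <- coef(glm(y ~ x1 + x2, family = poisson, data = boot.dat))

## Step 2: fitted values for the incomplete cases (intercept column added by hand)
lambda <- as.vector(exp(cbind(1, x[!ry, , drop = FALSE]) %*% b.star))

## Step 3: simulate imputations from Poisson(lambda)
imps <- rpois(sum(!ry), lambda)
```

Here the resampling step replaces the parametric draw from N(bhat, V(bhat)): parameter uncertainty enters through the variation of b.star across bootstrap samples.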

Value

Numeric vector of length sum(!ry) with imputations

Functions

  • mice.impute.poisson: Bayesian regression variant

  • mice.impute.poisson.boot: Bootstrap variant

  • mice.impute.pois: Identical to mice.impute.poisson; included for backward compatibility

  • mice.impute.pois.boot: Identical to mice.impute.poisson.boot; included for backward compatibility

Author(s)

Kristian Kleinke

References

  • Hilbe, J. M. (2007). Negative binomial regression. Cambridge: Cambridge University Press.

  • Kleinke, K., & Reinecke, J. (2013). countimp 1.0 – A multiple imputation package for incomplete count data [Technical Report]. University of Bielefeld, Faculty of Sociology, available from www.uni-bielefeld.de/soz/kds/pdf/countimp.pdf.

  • Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York: Wiley.

  • Zeileis, A., Kleiber, C., & Jackman, S. (2008). Regression models for count data in R. Journal of Statistical Software, 27(8), 1–25.

Examples

## simulate Poisson distributed data
set.seed( 1234 )
b0 <- 1
b1 <- .75
b2 <- -.25
b3 <- .5
N <- 5000
x1 <- rnorm(N)
x2 <- rnorm(N)
x3 <- rnorm(N)
lam <- exp( b0 + b1 * x1 + b2 * x2 + b3 * x3 )
y <- rpois( N, lam )
POIS <- data.frame( y, x1, x2, x3 )

## introduce MAR missingness to simulated data
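## data:     complete data frame
## pos:      column that receives the missing values
## Z:        column that drives the missingness
## pmis:     overall proportion of missing values in column pos
## strength: shares of the missing values taken from rows with Z below
##           and above its mean, respectively
generate.md <- function( data, pos = 1, Z = 2, pmis = .5, strength = c( .5, .5 ) )
{
  total <- round( pmis * nrow(data) )            # number of values to delete
  sm <- which( data[,Z] < mean( data[,Z] ) )     # rows with Z below its mean
  gr <- which( data[,Z] > mean( data[,Z] ) )     # rows with Z above its mean
  sel.sm <- sample( sm, round( strength[1] * total ) )
  sel.gr <- sample( gr, round( strength[2] * total ) )
  sel <- c( sel.sm, sel.gr )
  data[sel,pos] <- NA
  return(data)
}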
MPOIS <- generate.md( POIS, pmis = .2, strength = c( .2, .8) )

## impute missing data (requires the countimp package to be attached)
library( countimp )
imp <- countimp( MPOIS, method = c( "poisson" ,"" ,"" ,"" ))

kkleinke/countimp documentation built on Nov. 5, 2024, 11:51 a.m.