mice.impute.poisson: Multiple Imputation of Poisson Distributed Count Data

mice.impute.poisson

R Documentation

Multiple Imputation of Poisson Distributed Count Data

Description

Imputes univariate missing data based on a Poisson GLM, following either the Bayesian regression or the bootstrap regression (function suffix `.boot`) MI approach.

Usage

mice.impute.poisson(y, ry, x, wy = NULL, EV = TRUE, ...)

mice.impute.poisson.boot(y, ry, x, wy = NULL, EV = TRUE, ...)

mice.impute.pois(y, ry, x, wy = NULL, EV = TRUE, ...)

mice.impute.pois.boot(y, ry, x, wy = NULL, EV = TRUE, ...)

Arguments

`y`	Numeric vector with incomplete data
`ry`	Response pattern of `y` (`TRUE`=observed, `FALSE`=missing)
`x`	Matrix with `length(y)` rows containing complete covariates
`wy`	Logical vector of length `length(y)`. A `TRUE` value indicates locations in `y` for which imputations are created. Default is `!ry`
`EV`	Logical. Should automatic outlier handling of imputed values be enabled? Default is `TRUE`: extreme imputations are identified and replaced by imputations obtained by predictive mean matching (function `mice.impute.midastouch()`)
`...`	Other named arguments.
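
As a rough illustration of how these arguments relate to each other, the following base-R sketch builds `y`, `ry`, `x`, and `wy` from a small made-up data frame. The data frame `toy` and its values are assumptions for illustration only; in normal use `mice()` constructs these arguments itself when it calls a `mice.impute.*` function.

```r
# Toy data (hypothetical values): a count outcome with two missing entries
toy <- data.frame(
  y  = c(2, NA, 0, 5, NA, 1),
  x1 = c(0.3, -1.2, 0.8, 1.5, -0.4, 0.1),
  x2 = c(1.0, 0.2, -0.7, 0.4, 0.9, -1.1)
)

y  <- toy$y                            # incomplete count variable
ry <- !is.na(y)                        # TRUE = observed, FALSE = missing
x  <- as.matrix(toy[, c("x1", "x2")])  # complete covariates, length(y) rows
wy <- !ry                              # default: impute exactly the missing entries

sum(wy)  # number of imputations that would be created
```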

Details

A Poisson GLM assumes that the mean of the count variable is equal to its variance (equidispersion assumption). For details, see Zeileis, Kleiber, & Jackman (2008), or Hilbe (2007). The Bayesian method consists of the following steps:

1. Fit the model and find bhat, the posterior mean, and V(bhat), the posterior variance, of the model parameters b.
2. Draw b.star from N(bhat, V(bhat)).
3. Compute fitted values using exp(x[!ry, ] %*% b.star).
4. Simulate imputations from a Poisson distribution whose mean parameter lambda is the respective fitted value from step 3.
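
The Bayesian regression steps can be sketched in base R as follows. This is a minimal sketch and not the package's implementation: the helper name `pois_bayes_sketch` is hypothetical, the maximum likelihood estimate and its estimated covariance matrix stand in for bhat and V(bhat), `x` is assumed to contain no intercept column (one is added), and `glm()` is used instead of `glm.fit()` only because it gives convenient access to `vcov()`. Outlier handling (`EV`) and the `wy` argument are left out.

```r
# Hypothetical helper for illustration; not the countimp implementation.
# Assumes x has no intercept column, so one is added here.
pois_bayes_sketch <- function(y, ry, x) {
  X     <- cbind(Intercept = 1, as.matrix(x))
  X_obs <- X[ry, , drop = FALSE]
  y_obs <- y[ry]

  # Step 1: fit the Poisson GLM to the observed cases
  fit  <- glm(y_obs ~ X_obs - 1, family = poisson())
  bhat <- coef(fit)
  V    <- vcov(fit)

  # Step 2: draw b.star from N(bhat, V) via the Cholesky factor of V
  b.star <- bhat + drop(t(chol(V)) %*% rnorm(length(bhat)))

  # Step 3: fitted values (Poisson means) for the missing cases
  lambda <- exp(drop(X[!ry, , drop = FALSE] %*% b.star))

  # Step 4: simulate one imputation per missing case
  rpois(sum(!ry), lambda)
}

# Usage on simulated data with 150 of 500 values of y set to missing
set.seed(42)
n_bayes  <- 500
x_bayes  <- cbind(x1 = rnorm(n_bayes), x2 = rnorm(n_bayes))
y_bayes  <- rpois(n_bayes, exp(0.5 + 0.4 * x_bayes[, "x1"] - 0.2 * x_bayes[, "x2"]))
y_bayes[sample(n_bayes, 150)] <- NA
ry_bayes <- !is.na(y_bayes)
imp_bayes <- pois_bayes_sketch(y_bayes, ry_bayes, x_bayes)
```

Because b.star is redrawn on every call, repeated calls yield different imputations, which is how parameter uncertainty is carried into the imputed values.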

The function uses the standard glm.fit function with the poisson family. The bootstrap method first draws a bootstrap sample from y[ry] and x[ry, ] and then proceeds as follows:

1. Fit the model to the bootstrap sample and get model parameters b.star.
2. Compute fitted values using exp(x[!ry, ] %*% b.star).
3. Simulate imputations from a Poisson distribution with the fitted values as mean parameters.
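
The bootstrap steps can be sketched in the same way. Again, this is a minimal sketch under stated assumptions rather than the package's code: the helper name `pois_boot_sketch` is hypothetical, `x` is assumed to contain no intercept column, and outlier handling (`EV`) and the `wy` argument are omitted.

```r
# Hypothetical helper for illustration; not the countimp implementation.
# Assumes x has no intercept column, so one is added here.
pois_boot_sketch <- function(y, ry, x) {
  X   <- cbind(Intercept = 1, as.matrix(x))
  obs <- which(ry)

  # Draw a bootstrap sample (with replacement) from the observed cases
  idx <- obs[sample.int(length(obs), replace = TRUE)]

  # Step 1: fit the Poisson GLM to the bootstrap sample
  fit    <- glm.fit(X[idx, , drop = FALSE], y[idx], family = poisson())
  b.star <- fit$coefficients

  # Step 2: fitted values (Poisson means) for the missing cases
  lambda <- exp(drop(X[!ry, , drop = FALSE] %*% b.star))

  # Step 3: simulate one imputation per missing case
  rpois(sum(!ry), lambda)
}

# Usage on simulated data with 150 of 500 values of y set to missing
set.seed(7)
n_boot  <- 500
x_boot  <- cbind(x1 = rnorm(n_boot), x2 = rnorm(n_boot))
y_boot  <- rpois(n_boot, exp(0.5 + 0.4 * x_boot[, "x1"] - 0.2 * x_boot[, "x2"]))
y_boot[sample(n_boot, 150)] <- NA
ry_boot <- !is.na(y_boot)
imp_boot <- pois_boot_sketch(y_boot, ry_boot, x_boot)
```

Here the resampling of observed cases, rather than a draw from N(bhat, V(bhat)), is what introduces parameter uncertainty into b.star.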

Value

Numeric vector of length sum(!ry) with imputations

Functions

mice.impute.poisson: Bayesian regression variant
mice.impute.poisson.boot: Bootstrap variant
mice.impute.pois: Identical to mice.impute.poisson; included for backward compatibility
mice.impute.pois.boot: Identical to mice.impute.poisson.boot; included for backward compatibility

Author(s)

Kristian Kleinke

References

Hilbe, J. M. (2007). Negative binomial regression. Cambridge: Cambridge University Press.
Kleinke, K., & Reinecke, J. (2013). countimp 1.0 – A multiple imputation package for incomplete count data [Technical Report]. University of Bielefeld, Faculty of Sociology, available from www.uni-bielefeld.de/soz/kds/pdf/countimp.pdf.
Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York: Wiley.
Zeileis, A., Kleiber, C., & Jackman, S. (2008). Regression models for count data in R. Journal of Statistical Software, 27(8), 1–25.

Examples

## simulate Poisson distributed data
set.seed( 1234 )
b0 <- 1
b1 <- .75
b2 <- -.25
b3 <- .5
N <- 5000
x1 <- rnorm(N)
x2 <- rnorm(N)
x3 <- rnorm(N)
lam <- exp( b0 + b1 * x1 + b2 * x2 + b3 * x3 )
y <- rpois( N, lam )
POIS <- data.frame( y, x1, x2, x3 )

## introduce MAR missingness to simulated data
generate.md <- function( data, pos = 1, Z = 2, pmis = .5, strength = c( .5, .5 ) ) 
{
 total <- round( pmis * nrow(data) )
 sm <- which( data[,Z] < mean( data[,Z] ) )
 gr <- which( data[,Z] > mean( data[,Z] ) )
 sel.sm <- sample( sm, round( strength[1] * total ) )
 sel.gr <- sample( gr, round( strength[2] * total ) )
 sel <- c( sel.sm, sel.gr )
 data[sel,pos] <- NA
 return(data)
}
MPOIS <- generate.md( POIS, pmis = .2, strength = c( .2, .8) )

## impute missing data
library( countimp )
imp <- countimp( MPOIS, method = c( "poisson" ,"" ,"" ,"" ))

kkleinke/countimp documentation built on Nov. 5, 2024, 11:51 a.m.