mice.impute.poisson | R Documentation |
Imputes univariate missing data based on a Poisson GLM, following either the Bayesian regression or the bootstrap regression (suffix .boot) MI approach.
mice.impute.poisson(y, ry, x, wy = NULL, EV = TRUE, ...)
mice.impute.poisson.boot(y, ry, x, wy = NULL, EV = TRUE, ...)
mice.impute.pois(y, ry, x, wy = NULL, EV = TRUE, ...)
mice.impute.pois.boot(y, ry, x, wy = NULL, EV = TRUE, ...)
y |
Numeric vector with incomplete data |
ry |
Response pattern of |
x |
matrix with |
wy |
Logical vector of length |
EV |
should automatic outlier handling of imputed values be enabled? Default is |
... |
Other named arguments. |
A Poisson GLM assumes that the mean of the count variable is equal to its variance (equidispersion assumption). For details, see Zeileis, Kleiber, & Jackman (2008), or Hilbe (2007). The Bayesian method consists of the following steps:
1. Fit the model, and find bhat, the posterior mean, and V(bhat), the posterior variance of the model parameters b.
2. Draw b.star from N(bhat, V(bhat)).
3. Compute fitted values using exp(x[!ry, ] %*% b.star).
4. Simulate imputations from a Poisson distribution whose mean parameter lambda is the respective fitted value from step 3.
The function uses the standard glm.fit function with the poisson family.
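The four steps of the Bayesian method can be sketched in base R as follows. This is an illustrative sketch, not the package source: the toy data, the intercept column in x, and the names V, lambda, and imp are assumptions made for the example, and V(bhat) is approximated here by the inverse of the weighted cross-product of the design matrix (dispersion fixed at 1).

```r
# Illustrative sketch only; toy data and object names are assumptions.
set.seed(1)
n  <- 200
x  <- cbind(1, rnorm(n))                    # design matrix incl. intercept (assumed)
y  <- rpois(n, exp(0.5 + 0.3 * x[, 2]))     # Poisson-distributed outcome
ry <- runif(n) > 0.2                        # TRUE = observed, FALSE = missing
y[!ry] <- NA

# Step 1: fit the Poisson GLM to the observed cases
xobs <- x[ry, , drop = FALSE]
fit  <- glm.fit(xobs, y[ry], family = poisson())
bhat <- fit$coefficients
# Approximate V(bhat) as (X' W X)^-1 using the final working weights
V    <- solve(crossprod(xobs * sqrt(fit$weights)))

# Step 2: draw b.star from N(bhat, V) via the Cholesky factor of V
b.star <- bhat + drop(t(chol(V)) %*% rnorm(length(bhat)))

# Step 3: fitted values for the cases to be imputed
lambda <- drop(exp(x[!ry, , drop = FALSE] %*% b.star))

# Step 4: simulate imputations from Poisson(lambda)
imp <- rpois(sum(!ry), lambda)
```

Drawing b.star rather than plugging in bhat is what propagates parameter uncertainty into the imputations.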
The bootstrap method draws a bootstrap sample from y[ry] and x[ry, ] and consists of the following steps:
1. Fit the model to the bootstrap sample and get model parameters b.star.
2. Compute fitted values using exp(x[!ry, ] %*% b.star).
3. Simulate imputations from a Poisson distribution.
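The bootstrap steps can be sketched in the same way. Again, this is only a sketch under assumptions, not the package code: the toy data and the names idx, lambda, and imp.boot are hypothetical, and the sketch assumes the fitted values from step 2 serve as the Poisson means in step 3.

```r
# Illustrative sketch only; toy data and object names are assumptions.
set.seed(2)
n  <- 200
x  <- cbind(1, rnorm(n))                    # design matrix incl. intercept (assumed)
y  <- rpois(n, exp(0.5 + 0.3 * x[, 2]))
ry <- runif(n) > 0.2                        # TRUE = observed, FALSE = missing
y[!ry] <- NA

# Draw a bootstrap sample (with replacement) from the observed cases
obs <- which(ry)
idx <- sample(obs, length(obs), replace = TRUE)

# Step 1: fit the model to the bootstrap sample and take its coefficients
fit    <- glm.fit(x[idx, , drop = FALSE], y[idx], family = poisson())
b.star <- fit$coefficients

# Step 2: fitted values for the cases to be imputed
lambda <- drop(exp(x[!ry, , drop = FALSE] %*% b.star))

# Step 3: simulate imputations from Poisson(lambda)
imp.boot <- rpois(sum(!ry), lambda)
```

Here the resampling of observed cases takes the place of the draw from N(bhat, V(bhat)) as the source of parameter uncertainty.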
Numeric vector of length sum(!ry) with imputations
mice.impute.poisson: Bayesian regression variant
mice.impute.poisson.boot: Bootstrap variant
mice.impute.pois: Identical to mice.impute.poisson; included for backward compatibility
mice.impute.pois.boot: Identical to mice.impute.poisson.boot; included for backward compatibility
Kristian Kleinke
Hilbe, J. M. (2007). Negative binomial regression. Cambridge: Cambridge University Press.
Kleinke, K., & Reinecke, J. (2013). countimp 1.0 – A multiple imputation package for incomplete count data [Technical Report]. University of Bielefeld, Faculty of Sociology, available from www.uni-bielefeld.de/soz/kds/pdf/countimp.pdf.
Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York: Wiley.
Zeileis, A., Kleiber, C., & Jackman, S. (2008). Regression models for count data in R. Journal of Statistical Software, 27(8), 1–25.
## simulate Poisson distributed data
set.seed( 1234 )
b0 <- 1
b1 <- .75
b2 <- -.25
b3 <- .5
N <- 5000
x1 <- rnorm(N)
x2 <- rnorm(N)
x3 <- rnorm(N)
lam <- exp( b0 + b1 * x1 + b2 * x2 + b3 * x3 )
y <- rpois( N, lam )
POIS <- data.frame( y, x1, x2, x3 )
## introduce MAR missingness to simulated data
generate.md <- function( data, pos = 1, Z = 2, pmis = .5, strength = c( .5, .5 ) )
{
  ## number of values to delete in column `pos`
  total <- round( pmis * nrow(data) )
  ## rows below / above the mean of the covariate in column `Z`
  sm <- which( data[,Z] < mean( data[,Z] ) )
  gr <- which( data[,Z] > mean( data[,Z] ) )
  ## split the deletions between the two groups according to `strength`
  sel.sm <- sample( sm, round( strength[1] * total ) )
  sel.gr <- sample( gr, round( strength[2] * total ) )
  sel <- c( sel.sm, sel.gr )
  data[sel,pos] <- NA
  return(data)
}
MPOIS <- generate.md( POIS, pmis = .2, strength = c( .2, .8) )
## impute missing data
imp <- countimp( MPOIS, method = c( "poisson" ,"" ,"" ,"" ))